Data Changemakers 14: Colin Shearer, pioneer and thought-leader in advanced analytics

by Kate Tickner on 7th August 2018

The Data Changemakers series is a set of interviews and interactions with people who have spent their careers working in or around data and data management initiatives. They have a vision for the data journey and we want to understand what they have learnt and how that drives what they do today. What are their war stories and what advice can they give others embarking on the journey?


Colin Shearer has been a pioneer and thought leader in advanced analytics for over 25 years. His experience ranges from successful start-ups and the creation of market-leading tools and technology, to worldwide executive roles with the largest vendors. Today, he provides advice and assistance to end-user organisations and to vendors, helping them set their vision for Big Data and Advanced Analytics, plan and execute projects and initiatives, and integrate advanced analytics with their key business processes and the systems that support them to drive maximum ROI.

Could you describe a little about your own background and how you ended up working with data?

I had planned to study Chemistry at university but I was “seduced” by computing. I enrolled in the first course Aberdeen University offered in Artificial Intelligence. After graduation I went to work for a large systems integrator – SD-Scicon, which went on to become EDS Europe – in their AI Products Division. This was the 1980s, when AI was “hot” first time around!

“My speciality was machine learning (ML) but in those days the focus wasn’t on applying it to existing historic data. It was seen mainly as a tool for getting hard-to-articulate expertise out of experts and into expert systems, by learning from their past decisions.”

A number of ex-SD-Scicon colleagues and I set up Integral Solutions Limited (ISL) in 1989. I remember presenting at a machine learning event and someone from a large retailer came to ask if ML techniques could be used to predict sales. We gave it a try and it turned out that it could – with very high accuracy! This got us into what became “data mining” and we carried out projects for many clients including the BBC, manufacturers and finance organisations.

What is your current role and its main responsibilities as they relate to data?

Well it’s roles plural for me at the moment. Having been through a career of acquisitions from ISL to SPSS to IBM I decided to semi-retire about two years ago. I am happy to say I mostly failed at that because I am now working with three companies, all of which I find exciting in different ways:

Houston Analytics – a Finnish company that has one of the strongest and most experienced applied data science teams in Europe. I’m Chief Strategy Officer, helping to set the direction for taking that capability to market.

Agillic – a Danish company who have merged very interesting AI capabilities into marketing automation tools to boost marketing effectiveness. As Chief Business Development Officer, I’m helping to drive the growth of AI-based elements of the business and build awareness of Agillic’s AI credentials.

OPEX Group – based in Aberdeen, helping oil and gas operating companies to eliminate production losses and reduce maintenance costs. I have an advisory role helping to ensure the X-PAS™ Predictive Analysis Service continues to innovate and lead by leveraging the most powerful and relevant AI and analytics technologies, and delivers maximum value to upstream operators.

You were a pioneer in the early days of data mining tools. What drove you and your colleagues to create SPSS Modeler?

At ISL we were being brought in as consultants to execute data mining projects. However, we had set out to be a product company. Also, while building predictive models was fun, data prep was lengthy, difficult and boring. We needed something to facilitate the process of getting through that so we built the Clementine data mining workbench – now IBM SPSS Modeler. We tested Clementine prototypes on data sets and problems we had tackled manually in the past and found it was much faster. One six-week project was re-done in 45 minutes.

“We found that we could empower analysts and enable them to undertake train-of-thought analysis; build and fail quickly; combine models and reduce complicated techniques to a few clicks.”

We were ahead of the market because the data mining discipline was only just emerging. Our offering was unique at the time – we just saw an opportunity based on our own experience.

ISL was one of a few companies involved in the creation of the CRISP-DM methodology for data mining. How important do you think people and process vs technology are in the success of data management projects?

I should probably give a little background on CRISP-DM first. It stands for “Cross-Industry Standard Process for Data Mining” and there were two main reasons for creating it at that point:

  1. Data mining was taking off. It was exploding for us (and others), and the existence of a methodology showed potential users that this young and burgeoning area was actually “mature”.
  2. Adopters needed confidence that there was an approach to follow that would help ensure their projects succeeded.

We got together with a few other companies including Daimler-Benz and NCR Teradata so it wasn’t just ISL – it was to be a non-proprietary methodology. We formed a Special Interest Group to get the broadest input possible. Over 100 members participated and what we found was that most of us were doing things in a similar way – the terminology varied but the process tended to be the same.

Other methodologies have emerged over the years, but CRISP is probably still the most widely used. Its most important differentiator is a focus on the end-to-end process with the business fit in mind. You look at the business problem first, map that to analytical goals, do the technical piece, then map it back up to the business again to deploy the results and measure the value.

I often tell people my main contribution to CRISP-DM was coming up with the acronym. Tom Khabaza and Ruediger Wirth did most of the real work but I do take credit for the name!

So in answer to your question: yes, I believe process is very important in any data project.

“The people element is also hugely important but a project has to combine people who know the technology and the business. It is the separation of these two things that tends to cause the problems.”

If you keep business and analysis apart, you can wait months for the data scientists to create the models and when they’re brought back to the business people they turn out to be wrong. The teams with business and analytical skills need to be joined at the hip, or they need to be the same people.

“The right technology platform can help bring the people and process closer together.”

What do you think are the key trends in data management today and how do you think it will change the way we all do business?

The biggest trend I am seeing is that people are obsessed with… Sorry… seeing the potential of AI. In around 2010 we saw the Big Data wave, which has now blended into the AI wave. Both have had a positive effect in that most senior executives now appreciate there’s potential value in these technologies. But their organisations often have no clue how to proceed and embark on what end up being science projects.

“Projects have to be driven by business problems and goals because these determine which AI approach to use and which data to apply it to. Getting AI right has the potential to transform virtually every aspect of business, with improved decision making driving better outcomes across the board.”

What advice would you give to someone embarking on a large data-related project today?

There are three things I’d mention and they each relate to one of the companies I am working with today:

  1. Consider your analytical approach carefully. Most “data science” today consists of one-off analyses done by mathematical/statistical geeks working at the coding level. The notorious labelling of data scientist as “the sexiest job of the 21st century” is probably partly responsible for this “craftsman” approach. It doesn’t deliver results efficiently and it doesn’t scale. Data science needs to go through its own equivalent of the Industrial Revolution, with more focus on automation and deployment. This is where Houston Analytics excels: creating automated analytical processes that integrate with existing systems.
  2. You must deliver to the business. It doesn’t matter how technically brilliant your analytical work is; until you do something effective with the results and enhance your current operations it is meaningless. Agillic injects intelligence directly into marketing actions at key points in the customer lifecycle. Marketers continue to do what they’re best at – creating compelling content, communication and offers – and the AI seamlessly integrates to deliver that to the customers for whom it’s most relevant, through the right channel, at the right time.
  3. It’s not just a matter of applying smart technology; it’s essential to incorporate human domain expertise. At OPEX, deep knowledge about systems and process engineering is at the heart of the analytical approach, and is key to interpreting the output of the analyses. Recommendations to clients are delivered “expert to expert”.

Are there any particular skills or qualifications you consider to be vital to your success?

“I think the single most important trait is curiosity. You have to want to know what is in the data, and what it means in the broader context of the business or operation it relates to. In my career I have seen AI and ML applied to many areas – finance, heavy industry, defence, and healthcare, to name but a few – and I have found them all fascinating!”

What are you best known for or what do you like doing outside of your working life?

My main hobby is photography – I have sold some work and taken on a few interesting projects. I like most subjects but particularly landscapes, abstracts and architecture. I had a great time doing the photography for a guide book for a 13th century church. As well as capturing fascinating details and taking architectural pictures, I was able to include some more self-indulgent arty shots!

I am also keenly interested in history – most periods, but especially the Dark Ages and prehistory – and archaeology. I worked on a Roman dig in Kent, and since moving back to Scotland and getting involved with our local heritage group, I’ve worked on several sites including an 18th century ice-house and a 16th/17th century “doocot”.

Finally, I am also part of my local (Cullen) amateur dramatic society – the Disaster Theatre Company. In the 2017 pantomime “Snow White and the Magnificent Seven”, I played the Evil Queen. So I’m probably best known locally for strutting around the stage in a blonde wig and full drag!

Which 3 people would you invite to dinner – alive or dead – and why?

Mervyn Peake – a wonderfully talented writer, illustrator, poet and artist. His work is fascinating – and I think he’d be fascinating to talk to – because he saw the world with the eyes of an artist and described it with the language of a poet.

Eleanor of Aquitaine – I suppose there is a slight chance of being disappointed because this is based on my love of her character in “The Lion in Winter”, but I don’t think so. She was an extraordinary woman, a match for any of the male rulers of the time. She even led a Crusade!

Charles Shearer – my late father. The others can entertain each other while Dad and I catch up…


 
