Data Changemakers 15: Jay Limburn, Distinguished Engineer and Director, IBM
The Data Changemakers series is a set of interviews and interactions with people who have spent their careers working in or around data and data management initiatives. They have a vision for the data journey and we want to understand what they have learnt and how that drives what they do today. What are their war stories and what advice can they give others embarking on the journey?
Please describe a little about your own background and how you ended up working with data?
I was at college and then 19 years ago I joined IBM as an apprentice. I did a two-year apprenticeship with day-release at University.
Originally I was working on Lotus Notes development, so on the collaboration side. Then I joined the WebSphere Product Centre team, as it was then known, which went on to become Master Data Management, so that was my first exposure to data management. I got interested in the sub-domains such as ETL and data quality, and my interest grew from there into governance, big data and so on. MDM was very much my route in.
What is your current role and its main responsibilities as they relate to data?
I am the Director of Offering Management for data catalogs and governance, working as part of IBM Analytics. From a product point of view, I own Watson Knowledge Catalog and Information Governance Catalog and IBM’s metadata and cataloguing strategy.
I am also an IBM Distinguished Engineer, so I have both a technical and a business, executive focus. My job is really to look at our cataloguing technology and metadata capabilities and deliver a modern platform that ties together the data management side and uses metadata to push through to our data science portfolio.
“We need to go from collecting data to doing something useful with it: making it rich, organising it and putting it into the hands of the consumers – data scientists and analysts.”
What has been the most challenging data-related project you have worked on and why? What was your role in it and was the project a success and why?
That’s a difficult one. I’d probably mention one that is the most interesting rather than challenging.
I worked with a client who had a whole bunch of machine learning models that were feeding their critical business processes but they needed constant retraining with new training data. There is a whole lifecycle around capturing data, understanding and organising it so you can feed it into and train models.
“We have to figure out how to manage the auditability of which models have been trained on which data; which data can be used for which purposes based on consent and regulation; understanding bias and so many other things.”
It’s a huge area of focus at the moment but it’s a relatively immature area. There is a lot of understanding needed about the data before you can even push it into the model.
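The auditability concern described above can be pictured as a small record-keeping structure. The following is only a minimal sketch, not how any IBM product works: the names `Dataset`, `TrainingLog`, `crm_contacts` and `churn_v1` are hypothetical and chosen purely for illustration. It records which model was trained on which dataset and refuses a training run when the dataset is not cleared for the stated purpose.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dataset:
    """A dataset plus the purposes it may be used for (hypothetical model)."""
    name: str
    permitted_purposes: frozenset


@dataclass
class TrainingLog:
    """Keeps an audit trail of (model, dataset, purpose) training runs."""
    runs: list = field(default_factory=list)

    def record(self, model, dataset, purpose):
        # Block the run if consent or regulation does not cover this purpose.
        if purpose not in dataset.permitted_purposes:
            raise PermissionError(f"{dataset.name} is not cleared for {purpose}")
        self.runs.append((model, dataset.name, purpose))

    def datasets_for(self, model):
        """Which datasets was this model trained on?"""
        return sorted({d for m, d, _ in self.runs if m == model})

    def models_using(self, dataset_name):
        """Which models were trained on this dataset?"""
        return sorted({m for m, d, _ in self.runs if d == dataset_name})


# Illustrative data only.
crm = Dataset("crm_contacts", frozenset({"churn_prediction"}))
log = TrainingLog()
log.record("churn_v1", crm, "churn_prediction")
```

With a log like this, the two audit questions become lookups: `log.datasets_for("churn_v1")` lists the data behind a model, and `log.models_using("crm_contacts")` lists the models affected if consent for a dataset changes.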
Then there is the area of efficiency. When you’ve got say 50 teams who are working on different models how can you drive efficiencies between them when they are all going out and buying data and perhaps it’s the same data? How could all the data be made available to all of them where it is appropriate?
“What we are doing in this new wave is solving problems clients didn’t even know they would have to solve when they started to implement AI.”
I go to so many clients who say they are doing data science but are not seeing any of the benefits because they are not being efficient about it. I’m incredibly fortunate to be working on solutions to the problems that need to be solved to allow organisations to fully embrace AI.
“It is still very early days in general in AI but we are certainly seeing recognition of the challenges of doing AI well and the importance of good data management.”
Thus far it has all been about the collection of data: collect, collect, collect. But just collecting it doesn’t get you to AI. There is a whole bunch of steps to complete before you can put it in the hands of the consumers and create an effective and efficient data science practice that allows clients to fully differentiate their business.
You have 17 patents in areas like machine learning, mobile device interaction and application generation. What drives you to discover and develop new techniques and products?
I just love technology. I’m kind of a geek like most of us in this industry – or at least we mostly all have geeky undertones! I love creating new things and I’ve always been inspired by inventors. There are IBMers with more than 100 patents – I’ll never reach those lofty heights but I really love inventing new things. It’s about doing something that has not been done before; implementing it and then protecting your idea. The breadth of it has surprised me too.
“It’s really cool to have a list of things that you may not be known for but which no one else has done or achieved. My favourite one is probably about the concept of using machine-learning to automate data stewardship.”
The reason I pick that one is that data stewardship was a real struggle for clients: it is so labour intensive and requires growing numbers of people to do it. About four years ago we created some IP to train a model based on the actions of data stewards. It looks at the anomalies in the data, looks at what the data stewards did, and then feeds a model that makes a decision for each item in a list. Ultimately it will save our customers from having to have reams and reams of data stewards.
We came up with the idea in the MDM environment but now the patent is written in a generic way and can be used anywhere. It’s based on decision-tree type technology.
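To make the idea concrete, here is a minimal sketch of learning from steward decisions with a small decision tree. It is not the patented method, which the interview does not detail: the features (`name_match`, `address_match`), the actions (`merge`, `review`, `keep_separate`) and the history are invented for illustration, and the tree is a plain entropy-based (ID3-style) learner written from scratch.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a list of steward actions."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_feature(rows, labels, features):
    """Pick the feature whose split leaves the lowest weighted entropy."""
    def split_entropy(feature):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[feature], []).append(label)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return min(features, key=split_entropy)


def build_tree(rows, labels, features):
    """Recursively build a tree; leaves are actions, nodes are dicts."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    feature = best_feature(rows, labels, features)
    remaining = [f for f in features if f != feature]
    branches = {}
    for value in {row[feature] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[feature] == value]
        sub_rows, sub_labels = zip(*subset)
        branches[value] = build_tree(list(sub_rows), list(sub_labels), remaining)
    return {"feature": feature, "branches": branches, "default": majority}


def suggest_action(tree, row):
    """Walk the tree; fall back to the node's majority action for unseen values."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["feature"]], tree["default"])
    return tree


# Hypothetical history: what stewards did with suspected duplicate records.
HISTORY = [
    ({"name_match": "high", "address_match": "high"}, "merge"),
    ({"name_match": "high", "address_match": "low"}, "review"),
    ({"name_match": "low", "address_match": "high"}, "keep_separate"),
    ({"name_match": "low", "address_match": "low"}, "keep_separate"),
    ({"name_match": "high", "address_match": "high"}, "merge"),
]
rows = [r for r, _ in HISTORY]
labels = [a for _, a in HISTORY]
tree = build_tree(rows, labels, ["name_match", "address_match"])
```

Calling `suggest_action(tree, item)` for each item in a steward's work list then yields a proposed action per item. In practice the low-confidence cases would presumably still go to a person.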
What do you think are the key trends in data management today and how do you think it will change the way we all do business?
I think it’s AI which is the big thing. The data management problem is not going away – it is extremely important – but AI is a positive disruptor to it.
“You go from collecting information and using perhaps 10% of it. AI helps you use larger volumes of the data to create models and bake them into processes, which allows you to change your business.”
This is really important for large organisations that are getting disrupted by smaller start-ups. They are worried about this disruption but AI is a way for them to do things like reduce churn and get to know more about their customers. The advantage they have is the volume of data they possess. If they can use that to power AI then they can do things that smaller organisations can’t – because they don’t have the data available to create and train effective AI models.
“People have talked about data monetisation and being data-driven for years now but I don’t see anyone properly doing it. If we can focus on getting that data into ‘consumption’ mode so it can be used by data scientists then we will see some really interesting things happen.”
I think AI also applies back at the collection phase. How can it improve how we manage and operate data lakes? How can it help us improve how we collect data in the first place? Particularly when we start thinking about privacy and security. Can we embed AI into that to automate some of it; flag up risk; identify exposure; detect anomalies in how people are using data? It’s still AI but applying it in a different way – there’s a long way to go with privacy and I think AI can help with that.
What advice would you give to someone embarking on a large data-related project today?
Talent. Make sure you get a good team. Data is not traditionally a “sexy” business to be in but you can’t just take someone with no data governance experience and tell them that now they are responsible for it. It takes time to build up the right skills and you have to take the time to find the right team.
Focus. You need to focus in on what you are trying to do – you gather information for a reason, and that reason is now to drive monetisation and value.
“Many people have built data lakes but now they are not sure what to do with them – you have to know what you are going to do with the data to drive value. Get collection and consumption working together – don’t just focus on one end of it.”
Figure out how to move fast. What are the small steps towards your goal that will show value quickly? I go to clients where things change every six months and they aren’t moving their projects forward for various reasons and they are losing patience with it.
“Think about how you can be more agile in moving projects forward – really break them down and figure out how you can show value in a few weeks.”
For example, in the past, MDM projects took a long time to get implemented but now we are seeing lighter-weight MDM Express capabilities coming to market which will help clients move faster. Ask yourself, what are the core features that are needed to demonstrate value? Then move on to the next piece and the next after that.
Are there any particular skills or qualifications you consider to be vital to your success?
Passion. You’ve got to care about what you do. I spend many more waking hours a day working than I get to spend at home. You can’t just go through the motions, you have got to be passionate about it.
“I care deeply about what I do and I am proud of who I work for. I work for one of those companies that I believe can change the world in a positive way and I believe it tries to. There are very few companies that can do that on the scale that IBM can.”
Even if I play a very small part in helping the world to use AI to develop medical treatments, predict natural disasters or improve environmental factors relating to transport then it feels like it is worthwhile.
I think you’ve also got to have fun. I have fun. We all have to have difficult discussions and work hard but I have fun doing it and relieve stress through laughter.
“The relationship piece matters massively, particularly in a remote team. Laughing together sets an environment that brings the team along for the ride. It helps provide an environment where, even if we disagree we can still move forward.”
What are you best known for or what do you like doing outside of your working life?
I’ve got 3 kids so I am big on family time outside of work. Football features heavily. I am a big Southampton Football Club fan and I live 15 minutes from the stadium and go to about 10 games a season with my son. I also referee for the kids’ soccer – I have two girls and a boy. We all like attending festivals and going out for the day and having fun together.
Which 3 people would you invite to dinner – alive or dead – and why?
I had this conversation a few weeks ago with some friends but I’m not going to repeat what we came up with or I might offend someone! People I would enjoy talking to include Barack Obama because he would be very interesting. I think he tried to do a lot of good and make positive change.
Nelson Mandela would be another interesting guy – get Barack and Nelson together – that would be a fascinating conversation.
James Hunt, the ex-Formula 1 driver, because I think he would add a little bit of colour to the discussion.
“Three interesting people – they would all have great stories. Two political leaders who would have stories that no one else on the planet would know and James Hunt would add a different dynamic to that conversation.”