Interview: Autonomy chief architect Fernando Lucini on a software-based approach to big data

Data scientists? Human middleware, says Autonomy chief architect

Businesses are still grappling with the key to unlocking big data. A recent report by SAS suggested that a lack of skills is holding them back and that education is needed for big data to realise its potential.

There are those who disagree that this is the key factor when it comes to unlocking big data. Autonomy Chief Architect Fernando Lucini, for example, told Computing that a skills-based approach to big data, with the need to hire data scientists, is going to lead to a world where "we're going to have a middle layer of people" when they might not actually be needed.

He does believe, however, that we need a new way of processing big data.

"The more I think about it the more it becomes clear - and certainly it is with my customer base - that we're all working together in the IT industry, all creating this data and at the same time we're the consumers of it," he said, arguing that training staff to manually process and analyse big data is a counter-evolutionary step.

"Technology moves on, makes life easier, skills transfer, all that is going in one direction. So do we really think that we're going to dial back this evolutionary move where we create technology that allows us to have a better reach and make it more manual?

"The world around us is going to get better and the challenges of having more information will be solved by technology. We as individuals will get better and better at using these technologies. When I say better and better, we need technology that actually understands information. Why? Because understanding information at a rate that a human can't do means the human can concentrate on the highest added-value part of their world and let the machine bring the key items of information."

Lucini likens data scientists to the telephone desk operators or typing pools of the 1950s - employees who will be made obsolete by the march of time and technology. He argues that people adapt to new tools whenever possible, and it'll be the same for big data software.

"We're all using desktop tools to get the information on our desktops. We're not employing people to do that for us, we're employing technology! But how different is big data from the challenge of the data in everybody's laptop? If you tell me that's little data then I don't think so! When I do backup there are gigs and gigs of incompressible data! I'm not employing ten people to make sense of it for me; I'm using technology to do that."

Lucini argues that the correct software gives users the power to process big data for themselves, cutting out a potentially inefficient middle layer of manual operatives. With the right software, he says, machines rather than humans can do the hard work.

"We're going to have technology that can reach. Ours does it today, we can run it and it'll consume any volume of information, take it all, form an understanding and get ready to solve problems.

"Then, when the user is trying to create a marketing campaign, or is trying to look at DNA strands for research, we're ready to provide the needle in the haystack. That sounds like the romantic notion that the information was lost: it wasn't. It was always there, it just didn't need fifty people to find it, it just needed the user."

Rather than going all-out to teach the skills required for analysing big data, Lucini believes that, as with all new technologies, processing big data will become second nature for the next generation of IT professionals.

"It seems like there's a generational jump here. Every generation learns the new tools and technology they need to use; my son is five years old and using an iPad but my mother can't use it.

"The human condition is always going to keep up with the tools that we're given to solve problems. In this case my five-year-old with the iPad; it's a tool he's been given and it's been obvious to him how to solve his problems using that tool.

"I don't think we need a generation that is specifically taught the beauty of big data. We've got a society that is incredibly intelligent that in a generation can break the previous generation's understanding of the world and the tools to use. These kids will be aligned with the tools and the capabilities given to them. They will absolutely lap them up and they'll be experts.

"In the world around us, anything that runs data also runs software. It's a fact. So what we need to do is have clever storage, have clever software that deals with what's in the storage, have clever devices that use the software to deliver the right things to the user. In all of this the key layer is always software. The data is never going to change, but the intelligence is what we should concentrate on. Let's continue to evolve software to help the user derive value from what actually has value - the data.

"If you think of big data as information that changes very often, as big and meaty and difficult to deal with, it comes in different flavours. But the reality is, if you talk to CIOs of many companies, they will not vouch for having a ridiculous forest of interaction with their users, they will always tell you 'I don't have small data'."

The Autonomy chief architect insists that getting value out of data held in email systems, SharePoint and other workplace tools is more relevant than anything considered to be "big data". Data stored in such systems could immediately be useful to business, if only it could be processed quickly.

"As far as I'm concerned, that's big data, it's valuable, it's immediate, and it's there now. How can I make use of it? That's the naughty part of this, let's not come up with a label that's big data, and forget the fact we've got very large data around us already which has the same characteristics. We spend most of our lives managing email. Wouldn't it be wonderful if we actually sorted that out? Because that for me is big data. Big data is my email deluge."

While Autonomy may disagree with what SAS told Computing, that a lack of big data skills is what's holding the field back, it does agree with the SAS report in one respect: the correct application of big data will bring financial benefits to the economy once its potential is unlocked.

"Any form of data is going to be a benefit, but if we don't have tools that are going to remove the noise it's meaningless. A simple example of this is Twitter. I love Twitter, it's brilliant. But if I actually think that anything [on Twitter] containing the word 'computing' is going to help my business, then that's nonsense, because we're not removing the chaff," said Lucini.

"So it's going to be a great benefit when the tools are put in place. Once we start using the tools widely to remove the noise, we'll be able to keep the things that are important.

"We're all going to try to get as much value out of [our data] as possible. We're not going to let go of any piece of information without squeezing value out of it. If that's big data, fabulous, I'll be the first to sign that piece of paper. Any piece of information, no matter how small, whatever value it has, can be extracted, and we'll all be better as a society for it. If we're storing any piece of data no matter how small, without thinking of the value, then we've got a problem."