Big data in big numbers - it's time to forget the 'three Vs' and look at real-world figures
The term 'big data' has lost its meaning, says Sean Jackson, who offers some numbers to explain its impact in the here and now
We have all heard the term "big data", but what does it mean - and what do we define as big? Many people try to define it in terms of size, although opinions vary: for some a dataset over a terabyte is big data, for others it might be a gigabyte or maybe a million rows. Gartner has another opinion and uses the definition "big data is high volume, high velocity, and/or high variety information assets" - the so-called 3Vs.
Sadly, the 3Vs is such a hackneyed and overused phrase that it has effectively lost its meaning. We have a simpler definition: big data is any dataset that is difficult to analyse using standard tools. We often talk about having the right tool for the job. There's no point using a lawnmower to trim a hedge: it'll work, but it's too unwieldy to do the job properly. Likewise, you could use kitchen scissors, but it would take weeks.
So instead of defining big data as a number or a size, it is more interesting and more relevant to define it in terms of the here and now. What is happening that is making the data big? What are the numbers behind big data? Here are a few stats and figures that define big data.
Ninety per cent. That's how much of the world's data was generated in the past two years, according to studies. Not only are we clicking, emailing, chatting and taking photos or videos more than ever, but processing that data itself generates more data. Companies have cottoned on to the fact that data is valuable, so they are storing more and more of it. Datasets such as website access logs are no longer being thrown away; they are being archived and mined to generate valuable insights.
Two times. Moore's law is the well-known observation that the number of transistors in a dense integrated circuit doubles approximately every two years. It has been estimated that the amount of data transmittable through an optical fibre doubles every nine months, and storage density doubles roughly every 13 months. So we can process, transmit and store more data than ever, and all three capacities are increasing exponentially. We are better placed than ever before to deal with data.
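To get a feel for what those doubling periods mean in practice, here is a minimal Python sketch that converts each one into a growth multiple over a chosen time span. The function names and the six-year horizon are illustrative assumptions, and the sketch assumes each doubling period stays constant, which real technology trends do not guarantee.

```python
# Doubling periods quoted in the paragraph above, in months.
DOUBLING_MONTHS = {
    "transistors per chip": 24.0,
    "fibre throughput": 9.0,
    "storage density": 13.0,
}


def growth_factor(doubling_months, horizon_months):
    """Return how many times larger a quantity becomes over the horizon,
    assuming it doubles once every `doubling_months` (a simplification)."""
    return 2.0 ** (horizon_months / doubling_months)


def growth_table(horizon_months):
    """Growth multiple for each quoted doubling period over the same horizon."""
    return {
        name: growth_factor(months, horizon_months)
        for name, months in DOUBLING_MONTHS.items()
    }


if __name__ == "__main__":
    # The six-year horizon is an arbitrary choice for illustration.
    for name, factor in growth_table(72.0).items():
        print(f"{name}: about {factor:.0f}x over six years")
```

Under these assumptions, six years gives three doublings for transistor counts (8x) but eight doublings for fibre throughput (256x), which shows how quickly small differences in doubling period compound.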
59 billion dollars. That is the estimated value of the big data market in 2015, and it is expected to grow to $102bn by 2019. Big data is big bucks, a 21st-century gold rush of sorts, and to extend the metaphor, big data analytics is panning for gold. For data-first companies such as Google and Facebook, the monetisation comes in the form of advertising, and big data analytics helps them to show an appropriate advert. For other companies it is often about increasing sales (the "I see you bought this, what about this?" offers), automating decisions (big data provides the evidence that an option is the correct one to take) and decreasing costs (for example, through more efficient supply chains).
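The two market figures imply an annual growth rate, which can be worked out with a short Python sketch. The function name is hypothetical, and the calculation assumes smooth compound growth between 2015 and 2019, which the source estimates do not state.

```python
def implied_annual_growth(start_value, end_value, years):
    """Compound annual growth rate that takes start_value to end_value
    in the given number of years (assumes steady compounding)."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Figures from the paragraph above: $59bn in 2015, $102bn in 2019.
    rate = implied_annual_growth(59.0, 102.0, 2019 - 2015)
    print(f"Implied growth: {rate:.1%} per year")
```

On these assumptions the forecast works out to a little under 15 per cent growth a year.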
6.4 billion. That's the number of internet-connected "things" that will be in use this year - and by 2020 it is forecast to grow to 21 billion, according to research by Gartner. Connecting devices globally may not be a new idea - at the computer laboratory of Cambridge University there was an internet-connected webcam to check a coffee pot back in 1993 (the so-called Trojan Room coffee pot) - but it is mainstream adoption we are talking about now. And what enables "things" to become smart? Data: from scheduling household appliances to run when national energy demand is low, to detecting anomalies in manufacturing, to automatically adjusting speed limits. Or simply ordering milk automatically when your fridge detects it is running low.
Five times more likely. Companies that use big data analytics are five times more likely to make decisions "much faster" than their competition, according to research by Bain & Company, which polled 400 large companies and found that those with advanced analytics capabilities are outperforming their competitors.
It's clear: big data is growing, it's here to stay, and companies are already reaping the benefits, with big data analytics enabling them to outperform their competitors. With the right tools, these insights are easier to obtain than ever.
Sean Jackson is CMO at in-memory database provider EXASOL