All data is big data now, says MapR
The focus now is making applications agile, says SVP Jack Norris
The term big data has lost its usefulness, according to Jack Norris, SVP data and applications at MapR, one of the quintessential big data companies.
"It's become so inclusive that it's all data now," he said. "It's a different approach to how you handle it, moving away from separate applications dictating the silos to injecting the data and bringing the processing to it."
These days there is less of an emphasis on analytics in the reporting sense and more on embedding it into applications, he went on.
"Now organisations are saying, 'How can I make application development more agile? How do I move away from silos?'" he said, following up with a plug for MapR's USP: "A big piece of that is making the data platform more agile."
MapR has always held itself slightly apart from the crowd in that it does not use the Hadoop file system, HDFS, opting for its proprietary MapR-FS distributed file system from the beginning. In addition, it has always been more forward than other Hadoop firms such as Cloudera and Hortonworks in promoting a consolidated data platform approach, the scope of which now seems to be broadening into a platform for distributed applications of all types.
Norris said that having experimented with Hadoop and other big data technologies, organisations are increasingly moving them into production.
"Most organisations have an experimental phase with big data, then there's an operational phase which means putting that into production and driving business value. At that stage you rapidly go from having one application to multiple applications. It's not a single workload anymore," he explained, adding that MapR customers typically run 50 large applications on a single cluster.
As well as putting the big data label to one side, MapR no longer describes itself as a Hadoop company. Instead, Norris said, Hadoop is just one component of MapR's CDP (converged data platform).
"CDP runs a lot of capabilities so you have Hadoop workloads but also legacy applications and now containers and microservices," he explained.
In March the company announced new levels of support for containers, both for the development of new applications and also for containerising existing legacy apps such as CRM, something that Norris said has always been very difficult as they rely on shared data sources.
"Applications that are ephemeral and lightweight are easy to put in containers and move from place to place," he said. "Applications that share data are much more complex to containerise."
One advantage of containerising such applications is that it allows them to be distributed across environments, such as in a hybrid cloud setup, but this requires the ability to synchronise and replicate data across the cloud and local platforms. Some applications, such as those deployed in connected cars, must be truly distributed, able to run in multiple locations at the same time. This is the area that MapR and other Hadoop players are targeting, as it plugs right into IoT-type applications.
"Is the application in cloud or on premise? Well actually it's a distributed application, it's happening in the car, it's happening in the region, and it's being aggregated on a global basis," Norris explained.
In order for this to work with a containerised application, the data must be persistent. If the application goes down the data must not go down with it. On a small scale, making Docker containers stateful is possible using the Flocker plug-in, but Norris insisted that this does not scale well, and is not suitable for containerising legacy applications.
"You need enterprise features like the ability to snapshot to capture all data both at rest and in motion, plus built-in security, consistency, data protection and secure access," he said.
As well as providing persistent storage for container-based applications, CDP also supplies services including database and data streaming, and it can integrate services offered on Amazon's, Microsoft's and Google's public cloud platforms.
James Curtis, senior analyst, data platforms and analytics, at 451 Research, said that while other Hadoop vendors are moving in the same direction in supporting distributed applications via a platform approach, "MapR's approach is probably a bit further along on the maturity curve".