From ESB to OpenShift: Elsevier's infrastructure has evolved as its business model changes
Director of software engineering Tom Perry explains the advantages of containers and microservices
For academic publishing company Elsevier, digital transformation has meant far more than simply releasing books and journals in electronic form, said Tom Perry, director of software engineering.
"We've rebuilt ourselves as an analytics company," he explained. "Where we're going towards now is providing services and platforms on top of our core data and adding knowledge platforms on top of that. So we're really evolving our business into a real digital future."
For IT this means re-engineering the company's infrastructure for the API age. Previously, core data was managed on an enterprise service bus (ESB) using proprietary technologies, SOA and monolithic services.
The idea was to update this architecture so it could take data wherever it may reside - in the cloud or on premises in legacy applications - and make it available for analytics, visualisations or APIs with the ability to develop applications faster to improve speed to market.
ESB to JBoss
The first step was to move to what Perry calls a "macroservices" architecture, a halfway house based on JBoss Fuse Registry. This migration, completed in 2015, allowed the team to package small services as OSGi containers using the Apache Camel framework.
"In terms of core development this was really good, but we still had relatively long development times," Perry said.
"We wanted to have a platform which was low maintenance that allowed us to develop and deploy quickly, provide access out, provide different versioning of APIs, as well as scalability, fault handling and recovery and those kind of aspects."
JBoss to OpenShift
A couple of years later containers and Kubernetes were enjoying rapid adoption. Perry said he was wary about jumping on the new shiny thing, but after a proof-of-concept trial with Red Hat's container application platform OpenShift he decided the time was right to move from "macroservices" to microservices, at least for amenable applications.
Such migrations can be notoriously time-consuming, but none of the services had to be rewritten, thanks to the use of Apache Camel and the Java application framework Spring. The OSGi containers were simply repackaged using Spring Boot.
One of Kubernetes' chief benefits is self-healing and this resilience has already been proved. In October AWS Ireland went down for an hour. Many of Elsevier's back office systems reside in Amazon's Irish cloud and they went down with it. Systems had to be restarted.
"We spent a lot of effort restarting our platforms, making sure they were running and that everything was there, but [OpenShift] was probably the one platform in the middle that nobody had to touch, it just worked. That really showed some of the benefits of what we're doing."
Among other benefits already seen at Elsevier are lower support and infrastructure costs, the ability to autoscale services to meet demand, and faster development cycles. Eighteen months in, the core deployment is on track to see 65 APIs deployed on the platform by early next year. But this is just the start, said Perry.
"Our growth has really accelerated and yet there is a lot of room to grow this further within our organisation. We're looking at what we can do to expand to our other products, because a lot of them are going onto a microservices/container architecture as well."
Optimisation
Initial results have been extremely promising then, but there are a number of optimisations required to ensure the platform scales to meet demand as services are migrated across and new ones added.
"Really it's starting to think more about how you develop your services, how you keep it more lightweight in terms of memory, otherwise you're going to be very memory-intensive and you're not going to scale in terms of the infrastructure as much as you'd like because you can put less onto the servers themselves," Perry explained.
"A lot of what we put on there is hardly touching the CPU but it's very memory intensive with the standard ways of developing with technologies such as Java, so we have been rethinking about how we develop services, how we manage dependencies, how we keep them more lightweight and we're still along that journey. We've come a long way from where we've started but I think it's something we can take a lot further."
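Perry's point about memory, rather than CPU, limiting how many services fit on a server can be illustrated with some simple arithmetic. The sketch below is illustrative only: the node size and per-service figures are hypothetical and not taken from Elsevier's platform, and the function name `pods_per_node` is invented for this example.

```python
def pods_per_node(node_cpu_millicores, node_memory_mib,
                  pod_cpu_millicores, pod_memory_mib):
    """Return (max pods that fit on one node, which resource runs out first)."""
    by_cpu = node_cpu_millicores // pod_cpu_millicores
    by_memory = node_memory_mib // pod_memory_mib
    limiting = "memory" if by_memory < by_cpu else "cpu"
    return min(by_cpu, by_memory), limiting


# Hypothetical node: 8 CPU cores (8000 millicores) and 32 GiB of memory.
NODE_CPU = 8000
NODE_MEM = 32 * 1024

# Hypothetical service that barely touches the CPU but reserves 1 GiB of memory.
heavy = pods_per_node(NODE_CPU, NODE_MEM, pod_cpu_millicores=100, pod_memory_mib=1024)

# The same service slimmed down to 256 MiB.
light = pods_per_node(NODE_CPU, NODE_MEM, pod_cpu_millicores=100, pod_memory_mib=256)

print(heavy)  # (32, 'memory')
print(light)  # (80, 'cpu')
```

Under these assumed figures, cutting the memory footprint lets the same node host 80 instances instead of 32, at which point CPU rather than memory becomes the constraint.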
In the spirit of keeping everything as light as possible Elsevier is also looking at serverless cloud, for tasks that run intermittently. "They can have very quick startup time but they're not using any memory or CPU footprint at any time so it allows you to scale your platform a bit further as well."
Ultimately the plan is to move more and more services onto the container platform as microservices and to use OpenShift as the primary place where new products are developed.