Serverless databases: cheap, flexible and scalable - for the right applications
MongoDB and DataStax recently released serverless versions of their cloud databases, a sign of the direction of database evolution
For many developers, serverless cloud compute in the shape of AWS Lambda and Azure Functions, to name but two, has become part of the furniture, providing on-demand infrastructure with minimal configuration that scales as needed and is charged according to consumption.
Less well-known are serverless databases, which include DataStax AstraDB, which reached general availability in February, and MongoDB Atlas Serverless, which was released last month. Like serverless compute, they also promise infinite scalability and consumption-based pricing, in a way that may be particularly suited to cloud-native and event-driven workloads.
Serverless databases are designed to handle unpredictable, rapidly changing tasks. Are they really 'server-less'? Well no, but then neither are Lambda et al. The servers are there, humming away in some data centre somewhere, but you don't need to worry about configuring them or their virtualised offspring. You just turn them on like a tap and turn them off again when you're done. That's the idea anyway. 'Serverless' is one of the industry's more deceptive terms, but to the annoyance of linguistic purists it's one that has stuck.
One person who's not convinced that serverless is an evolution is Akira Kurogane, Percona's product owner for MongoDB.
"Why serverless when it's server-ful?" he said, before conceding that for small cloud databases that are intermittently used, such as for data science experiments, it could be cheaper to run them that way than on a per-server basis. "It depends on how MongoDB prices them in the future."
Currently, MongoDB Atlas Serverless, which is still in preview, costs US$0.30 per million reads and $1.25 per million writes, with storage at $0.25 per GB per month.
DataStax AstraDB, a serverless Cassandra service, carries a similar price tag, with rates varying slightly according to the choice of cloud provider. In EMEA, reads are $0.39, $0.26 and $0.26 on GCP, AWS and Azure, respectively, with writes priced between $1.16 and $1.33. Storage costs $0.25 per GB per month.
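To make the consumption model concrete, here is a minimal sketch of the arithmetic using the Atlas Serverless preview rates quoted above. The workload figures (50 million reads, 10 million writes, 20 GB stored) are purely hypothetical, as are the `Rates` class and `monthly_cost` function, which are illustrative names and not part of any vendor's tooling. Real bills may include charges not covered here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Consumption-based prices in US dollars."""
    read_per_million: float
    write_per_million: float
    storage_per_gb_month: float


def monthly_cost(rates: Rates, reads: int, writes: int, storage_gb: float) -> float:
    """Estimate one month's bill from operation counts and stored data."""
    read_cost = reads / 1_000_000 * rates.read_per_million
    write_cost = writes / 1_000_000 * rates.write_per_million
    storage_cost = storage_gb * rates.storage_per_gb_month
    return read_cost + write_cost + storage_cost


# Preview rates quoted in the article for MongoDB Atlas Serverless.
ATLAS_PREVIEW = Rates(read_per_million=0.30,
                      write_per_million=1.25,
                      storage_per_gb_month=0.25)

# Hypothetical workload, chosen only for illustration.
estimate = monthly_cost(ATLAS_PREVIEW,
                        reads=50_000_000,
                        writes=10_000_000,
                        storage_gb=20)
print(f"Estimated monthly cost: ${estimate:.2f}")
```

On these assumed numbers the reads come to $15.00, the writes to $12.50 and the storage to $5.00. An idle month with no reads or writes would cost only the storage charge, which is the point Kurogane concedes for intermittently used databases.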
Nanna Einarsdóttir, VP engineering at Icelandic firm Ankeri, a customer of DataStax AstraDB, told Computing about the appeal of the model for the company, which provides real-time data to the commercial shipping sector.
First, she said, the company decided on NoSQL as a primary data store architecture because of the need to handle large volumes of fast moving data, eventually plumping for Apache Cassandra for its scalability. Her team then decided the serverless option would be a good fit owing to the lack of administrative overhead.
"We decided on serverless Cassandra as it meant that we could run this as a service and not have to worry about having to implement our own clusters or carrying out management tasks," Einarsdóttir said.
"Our developers can use APIs to interact with data, and the service takes care of all the management tasks."
She added: "We only have to pay for the service that we use over time, rather than having to estimate capacity in advance and pay extra."
So far the results have been in line with expectations, she explained, with serverless allowing the team to get up and running "very quickly, much faster than we would have been able to if we had to design and install our own clusters."
Einarsdóttir continued: "Going serverless around our applications and our data together has helped us develop our operations faster."
Inevitably there has been a learning curve, although that seems to have more to do with Cassandra than with the serverless model.
"In our initial design, we planned to have a lot of columns within our database implementation in order to capture all the data that each ship provides, but this could impact performance whenever search operations would be carried out," she said, adding that DataStax engineers were on hand to help with a more compact and storage-efficient design.
As Atlas Serverless is still in preview, MongoDB were unable to offer a customer for interview, but the purported benefits will be much the same: cost control, scalability, rapid iteration for developers, and reduced administration. "With MongoDB Atlas serverless instances, you will get seamless deployment and scaling, a reliable backend infrastructure, and an intuitive pricing model," the company's website says.
DataStax and MongoDB are not the only database providers with serverless options. The big cloud companies have had their own offerings for a couple of years, including AWS Aurora Serverless, Azure SQL Serverless and Google Firebase, and ISVs like CockroachDB have also created serverless alternatives.
Patrick McFadin, VP developer relations at DataStax, sees serverless databases that can run in multiple clouds as offering new flexible options for developers.
"It took some rethinking of the internal architecture, but it was ready for this type of deployment based on how it runs as a distributed database," he said of AstraDB, adding that the changes will flow into the community version of Cassandra.
"We'll be sharing those changes with the open source community, which is really the big shift in databases. What open source databases will run as serverless?"
With cloud infrastructure increasingly ephemeral, having always-on database connectivity may be wasteful. In a cloud-native scenario - or one where applications make use of a lot of serverless functions - a database that you only connect to when you need it, and which can then scale automatically and almost instantly, would seem to be a much better general fit. With other storage options also available to cloud-native developers, it's not surprising to see vendors moving in this direction. Serverless is also likely to find favour with small teams who lack the capacity for managing infrastructure.
Nevertheless, serverless databases are not suitable for every application. There is an inevitable delay as connectivity is established, and cold starts increase latency. Teams may also find that time saved in setup is used up by the need to monitor resource usage and performance and to implement new security measures.
That said, we expect to see more organisations investigating the serverless option from here on in.