Cloud native explained. An interview with Cheryl Hung, VP Ecosystem at CNCF
The containers, Kubernetes and microservices revolution has been bubbling away for some time now, but IT leaders could be excused for having a limited understanding of the whys and wherefores of a topic that can sometimes seem like a bewildering mix of technical minutiae and marketing babble.
Indeed, a recent survey by Computing Delta found that only a third of the IT leaders polled had a firm grasp of what ‘cloud native’, the umbrella term for the trend, was all about, many confusing it with a cloud-first or cloud-only strategy.
Cheryl Hung is VP Ecosystem at the Cloud Native Computing Foundation (CNCF), arriving at this role from Google, where she says cloud native practices have been the norm for a decade, via a stint with storage platform StorageOS. All things considered, she seemed the ideal person to help us cut through the confusion around cloud native. The interview is edited for brevity.
Computing: Maybe you could start off by defining what cloud native is and what are its distinctive features, not necessarily from an engineer's point of view, but something a little higher level?
CH: It's really a mentality. It's about shipping small things incrementally, and making continual improvements, rather than the old world of doing one release a month. And then it's all the tooling and technology that supports that style.
So it's sort of cloud meets DevOps?
Sort of, but you can do it on-premises too. I definitely want to separate out the idea that cloud native means public cloud providers. I talk to a lot of companies, especially in financial services, where they have more regulatory needs, or companies that have already made a large investment into on-prem infrastructure, and they're deploying cloud native just as happily as the classic web startup that uses AWS.
The original definition of cloud native was an application packaged as a container and then orchestrated with Kubernetes - orchestrated just means managed at scale - and split into microservices, which means that you contain one team's work, basically, in one microservice. Then you have a well-defined set of APIs that other services can use.
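To make the idea of "one team's work in one microservice, with a well-defined set of APIs" a little more concrete, here is a minimal sketch using only the Python standard library. The "orders" service, its endpoints, the sample data and the port are all hypothetical, invented for illustration; nothing here is taken from Google or the CNCF.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory data owned by one team's "orders" service.
ORDERS = {"42": {"id": "42", "status": "shipped"}}


def handle_get(path: str) -> tuple[int, dict]:
    """Map a request path to (HTTP status, JSON body).

    These routes are the service's public API: other services depend on
    them, not on how the data is stored internally.
    """
    parts = path.strip("/").split("/")
    if parts == ["healthz"]:
        # Health endpoint that an orchestrator could poll.
        return 200, {"status": "ok"}
    if len(parts) == 2 and parts[0] == "orders":
        order = ORDERS.get(parts[1])
        if order is not None:
            return 200, order
        return 404, {"error": "order not found"}
    return 404, {"error": "unknown endpoint"}


class OrdersHandler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper around handle_get."""

    def do_GET(self) -> None:
        status, body = handle_get(self.path)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Run the service; in a container this would be the only process."""
    HTTPServer(("0.0.0.0", port), OrdersHandler).serve_forever()
```

Calling `serve()` starts the service, after which a request to `/orders/42` returns the order as JSON. Packaging the script into a container image and running several copies under an orchestrator would be the next steps, which this sketch leaves out.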
Do you think this way of delivering software will ultimately take over from the more monolithic traditional methods? It's quite a different way of thinking.
It is, and it's worth knowing that there is an upfront investment cost in switching to cloud native. But I think there will always be space for both paradigms to live side by side. For instance, if you're only serving 500 customers and you never really expect to scale beyond that, there is really no need to add to the complexity with cloud native.
But from my own perspective, I started out as a software engineer at Google in 2010 and Google's been doing things in a cloud native way for more than a decade now, so I guess I've been thinking in this paradigm since then, and at that scale it absolutely makes sense to package things and define good interfaces between services.
So where do the upfront costs and other costs lie?
There are two main costs. One is the actual technology adoption. With some paradigm shifts like virtual machines you could lift and shift, you could take what you had and just move it into a new paradigm without much of a change. With cloud native you can do that but you will not get the benefits of cloud native. So there is an upfront cost in terms of your engineers' time to rebuild the software so that it fits well within containers.
The other major cost is education and a shift in your company culture. That's the DevOps mindset where you have to move away from separate engineering, development and operations teams to everybody working together in a DevOps world.
How common is it that an organisation will refactor a large existing business application to containerise it, or decompose it into microservices, and what would your advice be to those considering it?
It's very common. The vast majority of companies are not starting from greenfield, and many have huge legacy infrastructures that they need to slowly migrate over.
So my advice to them would be to carve out something that makes sense to prototype with, something that will benefit from being cloud native, in other words, where shipping small amounts constantly is actually valuable. Don't pick something that changes once a year or is super critical to your infrastructure. Pick something small and self-contained where you understand the benefits that cloud native would bring. Shifting things to containers is relatively easy these days; starting up a Kubernetes cluster is relatively easy these days, but the challenge comes with operating things over time.
What about storage? I understand that can be quite a challenge too.
I would say storage is one of the areas that is particularly difficult. That's because the original idea of cloud native was everything is stateless - so there is no storage. You can't really build storage into a world where there's no state and nothing changes. So retrofitting storage into that requires quite a lot of sophistication and careful thinking, and it's getting better over time, but, yes, end-user companies that I've spoken to say storage is a challenge.
Security is a major, major challenge because of the security-usability trade-off. Usually when things are more usable you reduce the amount of security and containers are very, very easy to use.
Adopting containers and Kubernetes now is not that hard, but storage, security and deciding whether or not and how to use a service mesh are still challenges. And underlying all of this is still the education, training and the culture and understanding how to organise the engineering teams.
Can you explain what a service mesh is and who would need one?
The idea of a service mesh is to take a network, which is inherently unreliable because, you know, things go down, people make mistakes, goats eat cables - and add a layer to make it seem like it's perfectly reliable. Say a data centre in New York has gone down, then we can move to a backup one in Seattle.
So it's an extra layer of virtualisation or abstraction that means you don't have to worry about hardware?
Yes, but specifically for networking and at large scale. If you only have three servers, odds are your network doesn't really matter because it's not going to go down very often, but if you have 100,000 servers, you're going to constantly need to deal with networking issues.
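The New York to Seattle example can be sketched in a few lines of Python. This is only an illustration of the retry-and-failover behaviour described above, not how any particular service mesh is implemented: real meshes typically do this in a proxy layer outside the application's code. The endpoint names and the `fake_send` stand-in for a network call are hypothetical.

```python
from typing import Callable, Sequence


class AllEndpointsFailed(Exception):
    """Raised when no endpoint could serve the request."""


def call_with_failover(
    endpoints: Sequence[str],
    send: Callable[[str], str],
    retries_per_endpoint: int = 2,
) -> str:
    """Try each endpoint in order, retrying before moving to the next one."""
    errors = []
    for endpoint in endpoints:
        for attempt in range(retries_per_endpoint):
            try:
                return send(endpoint)
            except ConnectionError as exc:
                # Record the failure and keep going: the caller never sees it
                # unless every endpoint is exhausted.
                errors.append(f"{endpoint} attempt {attempt + 1}: {exc}")
    raise AllEndpointsFailed("; ".join(errors))


def fake_send(endpoint: str) -> str:
    """Stand-in for a real network call; the New York site is 'down'."""
    if endpoint == "new-york":
        raise ConnectionError("data centre unreachable")
    return f"response from {endpoint}"
```

With these definitions, `call_with_failover(["new-york", "seattle"], fake_send)` gets its answer from the Seattle endpoint after the New York attempts fail, so from the caller's point of view the network looks reliable.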
Presenting at KubeCon recently, some early adopters at Monzo said they'd take the managed services route to Kubernetes these days. Would you agree?
I would say the managed services are pretty good nowadays, but I wouldn't make a blanket recommendation. It depends a lot on what knowledge and engineering resources you have, whether you've made investments into your own on-prem infrastructure, and so on. But if you were starting brand new from scratch then I think managed services are a good way to go.
The cloud native ecosystem is quite complex. How do IT leaders pick the right tools?
We have a tool called the CNCF Technology Radar. If you're fairly new to this and you want to know what to begin with, what to choose, then your best shot is asking other people what they use and what they recommend. The idea is to really help the broader community choose what things to use.
I think it's very important that it features things that are not CNCF projects, and not all are open source projects either, because it's so important to have the real world view.