Product announcements come thick and fast at AWS re:Invent opening keynote

Building blocks of generative-AI-driven future on show

Image: Matt Garman on stage at AWS re:Invent 2024 (Credit: AWS)

Matt Garman’s first conference keynote as AWS CEO focused on the building blocks of AWS – and the generative-AI-driven future the company is building for.

Garman announced a slew of updates to those building blocks – compute, storage, databases and inference – which together, he argued, enable companies to innovate at lower cost and with much greater energy efficiency, at scale.

AWS only began developing its own silicon in 2018. Fast forward to now, and 90% of the top 1,000 EC2 customers use Graviton chips. AWS launched Graviton4 a few months ago, designed to address a much broader set of workloads than its predecessors. Pinterest is one customer: according to Garman, it reduced compute costs by 47% and carbon emissions by 62% using the new chips.

However, it’s generative AI workloads that are driving compute innovation, and Garman announced the P6 family of instances, which will feature the new NVIDIA Blackwell GPUs coming next year. According to Garman:

“P6 instances will give you up to 2.5 times faster compute than the current generation of GPUs.”

Garman also announced the general availability of EC2 Trn2 instances, which he said deliver 30-40% better price performance than the current generation of GPU-powered instances and are purpose-built for generative AI training and inference. Each instance links 16 Trainium2 chips with a low-latency, high-bandwidth NeuronLink interconnect.

“One Trn2 instance will deliver 20.8 petaflops on a single compute node,” Garman said.

Customers already using Trn2 instances include Adobe and Poolside, with Databricks and Qualcomm in the pipeline.

Alongside the Trn2 instances are EC2 Trn2 UltraServers, which, according to Garman, “connect four Trn2 instances, so 64 Trainium 2 chips all interconnected by NeuronLink. This gives you a single ultra node with over 83 petaflops of compute from a single compute node.”
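The quoted figures hang together: a quick back-of-the-envelope check (using only the numbers cited in the keynote, not official AWS specifications) confirms the chip count and petaflops arithmetic.

```python
# Sanity check of the quoted Trn2 UltraServer figures.
# All numbers below come from the keynote quotes, not AWS spec sheets.
CHIPS_PER_TRN2_INSTANCE = 16      # Trainium2 chips per Trn2 instance
PFLOPS_PER_TRN2_INSTANCE = 20.8   # petaflops per Trn2 instance
INSTANCES_PER_ULTRASERVER = 4     # Trn2 instances linked by NeuronLink

chips = INSTANCES_PER_ULTRASERVER * CHIPS_PER_TRN2_INSTANCE
pflops = INSTANCES_PER_ULTRASERVER * PFLOPS_PER_TRN2_INSTANCE

print(chips)   # 64 chips per UltraServer
print(pflops)  # 83.2 petaflops, matching "over 83 petaflops"
```

Four instances of 16 chips gives the 64 Trainium2 chips Garman cited, and 4 × 20.8 petaflops comes to 83.2 – consistent with the “over 83 petaflops” claim.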

One such cluster is being built by AWS with Anthropic to train frontier models.

“Project Rainier is building a cluster of Trn2 Ultraservers containing hundreds of thousands of Trainium 2 chips. This cluster will be five times the number of exaflops as the current cluster that Anthropic use to train their leading set of Claude models.”

Garman also announced the Trainium3 chip, expected later next year, which should deliver twice the compute of its predecessor whilst being 40% more efficient.

Garman closed the compute-focused part of the keynote with a killer statistic:

“130 million new EC2 instances are launched every day.”

Building for a generative AI driven future

A veritable firehose of announcements followed. Storage announcements included S3 Table Buckets for Iceberg tables and S3 Metadata in preview, which Garman said represented “a stepchange in how you can use your data for analytics as well as really large AI modelling use cases.”

On the database front, one announcement stood out.

Amazon Aurora DSQL is a new distributed SQL database, benchmarked against Google Spanner. According to Garman, it delivers “4x faster reads and writes than Spanner.”

Then came a stream of updates on Bedrock, the portal to a range of foundation models on which customers can build their own generative AI applications. Model distillation, automated reasoning checks and multi-agent collaboration are all now available.

Image: Andy Jassy at re:Invent 2024 (Credit: AWS)

Amazon CEO Andy Jassy gave Garman a breather, taking the stage to unveil a new family of state-of-the-art foundation models called Amazon Nova (see our coverage here). These include a text-only model and several multi-modal models.

Jassy was characteristically bullish about the speed and efficiency of these models, and also announced an image generation model (Nova Canvas) and a video equivalent (Nova Reel).

Garman returned with the final tranche of announcements, focused on the Amazon Q Developer toolset, two of which seemed custom-designed to help customers struggling with legacy .NET applications or on-premises VMware stacks migrate away from Microsoft and VMware.

The keynote was long, but the future Garman sees is crystal clear.

“I think generative AI actually has the potential to transform every single industry, every single company out there, every single workflow, every single user experience out there.”