If you thought AI releases were slowing down this December, AWS just said "hold my eggnog," dropping 50+ announcements at re:Invent 2025. That's not an exaggeration: CEO Matt Garman literally speed-ran through 25 launches in the final 10 minutes of his keynote.
The theme? AI agents are eating the world, and AWS wants to feed them.
The company announced the Nova 2 model family, Nova Forge for custom model building, Nova Act for browser automation, major AWS Transform upgrades, and 18 new open-weight models in Amazon Bedrock.
Let's break down what actually matters across models, infrastructure, agentic platforms, and operations.
First, the TL;DR
If you only have 3 minutes, here's what matters:
Amazon released its Nova 2 model family this morning—four frontier models with adjustable reasoning, multimodal capabilities, and some genuinely interesting twists.
The lineup includes Nova 2 Lite (fast, cost-effective everyday tasks), Nova 2 Pro (most intelligent for complex work), Nova 2 Sonic (real-time voice conversations), and Nova 2 Omni (the first reasoning model that generates both text AND images). All come with adjustable "thinking" levels—dial up for accuracy, dial down for speed.
But the real news is what Amazon's building around these models.
- Nova Forge lets companies build custom frontier models by mixing proprietary data into Nova's training at pre-trained, mid-trained, and post-trained checkpoints. It's $100K/year, which Amazon argues beats the hundreds of millions you'd spend building from scratch. The catch? Your model stays on AWS forever—you don't get the weights.
- Nova Act automates browser-based workflows with 90% reliability. Describe a task in plain English ("update this CRM," "test this checkout flow"), and it handles the clicks. Powered by a custom Nova 2 Lite model trained on thousands of simulated web environments.
- AWS Transform got AI superpowers too. The code modernization service now handles any programming language or framework, modernizing legacy apps 5x faster. Air Canada modernized thousands of Lambda functions in days with an 80% time reduction. Thomson Reuters migrates 1.5M lines of code monthly while cutting costs 30%.
Amazon Bedrock also added 18 new open-weight models from Mistral, OpenAI, Qwen, and others—bringing the total to nearly 100 models accessible through unified APIs.
Why it matters: AWS is attacking AI adoption from every angle. Not just offering models, but solving the actual bottlenecks—custom training (Forge), browser automation (Act), and legacy system modernization (Transform). For enterprises stuck on decades-old infrastructure, that last one could unlock billions in AI investment.
Try the models at nova.amazon.com/dev.
Now let's break down the rest in more depth, covering what you need to know.
The Model Lineup: Nova 2 Gets Serious
AWS expanded its Amazon Nova portfolio with four new models that finally compete with OpenAI and Anthropic on reasoning and multimodal capabilities.
- Nova 2 Lite handles everyday tasks like chatbots and document processing—dial up or down how much it "thinks" to balance speed with accuracy (low/medium/high reasoning levels). It processes text, images, and videos, making it perfect for customer service bots or quick analysis work where you need the cost-performance sweet spot.
- Nova 2 Pro tackles complex work like multi-step coding projects and advanced problem-solving where you need maximum intelligence. This is Amazon's most capable reasoning model, processing text, images, video, and speech. It can also serve as a "teacher" model for knowledge distillation, helping you create smaller, specialized variants for specific tasks.
- Nova 2 Sonic powers real-time voice conversations; it can book a flight while you discuss something else, handling both simultaneously. With a 1M token context window (roughly 750,000 words or 1,500 pages), it's built for interactive voice systems and integrates directly with Amazon Connect and telephony partners.
- Nova 2 Omni analyzes any content type (text, video, images, audio) and generates both text and images. An industry first: it's the first reasoning model that can both understand multimodal inputs and produce multimodal outputs. Feed it your entire product catalog, customer testimonials, and demo videos, and it'll create complete marketing campaigns in one go.
All models include built-in web grounding, code execution, and 1M token context windows. They work with Model Context Protocol (MCP) tools and come with comprehensive safety measures.
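To make the adjustable reasoning levels concrete, here's a minimal sketch of what a request to Nova 2 Lite might look like through Bedrock's Converse API. The model ID and the reasoning-level field name are assumptions for illustration; check the Bedrock documentation for the exact values.

```python
# Sketch: a Converse-style request to a Nova 2 model via Bedrock.
# Model ID and reasoning-level key below are hypothetical.

def build_converse_request(model_id: str, prompt: str, reasoning: str = "low") -> dict:
    """Build the keyword arguments for bedrock-runtime's converse() call."""
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        # Model-specific knobs go in additionalModelRequestFields;
        # the reasoning-level key name here is an assumption.
        "additionalModelRequestFields": {"reasoning_level": reasoning},
    }

request = build_converse_request(
    "amazon.nova-2-lite-v1:0",          # hypothetical model ID
    "Summarize this support ticket.",
    reasoning="high",                    # dial up for accuracy, down for speed
)

# To actually invoke it (requires AWS credentials):
#   import boto3
#   client = boto3.client("bedrock-runtime", region_name="us-east-1")
#   response = client.converse(**request)
#   print(response["output"]["message"]["content"][0]["text"])
print(request["modelId"])
```

The same request shape works across Bedrock models, which is what "unified APIs without rewriting code" buys you: swap the model ID, keep the rest.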
Nova Forge: Build Your Own Frontier Model (But There's a Catch)
Here's where things get interesting. Nova Forge lets you build custom models using your data mixed into Nova's training—getting Nova's power with your company's expertise baked in.
How it works: You access Nova at pre-trained, mid-trained, and post-trained checkpoints, then blend your proprietary data with Amazon's curated training data throughout all phases. This "open training" approach significantly reduces catastrophic forgetting compared to training with raw data alone.
Early customers: Financial services firms customizing models for regulatory compliance, healthcare companies training on medical data.
The catch? You don't actually get the model weights. Your custom model must live on Amazon Bedrock forever. It's essentially a heavily customized fine-tune of Nova that locks you into AWS infrastructure. Still valuable for many companies, but definitely not "your" model in the way open-weight models are.
Cost: $100,000 per year, which Amazon argues is dramatically cheaper than the hundreds of millions or billions you'd spend assembling your own frontier model from scratch.
Agents That Work While You Sleep
AWS unveiled Frontier Agents—a new class of autonomous AI that works for hours or days without human intervention.
Kiro Autonomous Agent: Your virtual developer. Maintains context across sessions, learns your team's patterns, navigates multiple code repositories to fix bugs. Amazon made it their official internal development tool last week.
AWS Security Agent: Virtual security engineer. Reviews code, performs penetration testing, files tickets when issues are found (doesn't auto-fix to avoid breaking things).
AWS DevOps Agent: Your always-on operations team. Responds to incidents, identifies root causes, prevents future issues. Commonwealth Bank of Australia found it identified software failure root causes in 15 minutes—tasks that typically take hours.
The DevOps and Security agents hit public preview immediately. Kiro rolls out in the coming months.
Nova Act: Agents That Actually Click Buttons
Nova Act automates web tasks like updating CRMs or testing sites—describe what you need in plain English, it handles the clicks.
Powered by a custom Nova 2 Lite model trained through reinforcement learning on thousands of tasks across hundreds of simulated web environments, Nova Act delivers 90% reliability on early customer workflows.
A few example use cases:
- Hertz accelerated software delivery 5x (weeks of QA became hours), Sola Systems automated hundreds of thousands of monthly workflows for payment reconciliation and medical records, and Amazon Leo reduced testing from weeks to minutes.
- The service handles tasks like CRM updates, website testing, insurance claims processing, login automation (1Password uses it across hundreds of sites with a single prompt), payment reconciliation, and shipment coordination. Basically, any repetitive browser-based workflow you're tired of doing manually.
To use it, you prototype agents in minutes with a no-code playground using natural language prompts, refine them in familiar IDEs like VS Code, then deploy to AWS with comprehensive management tools.
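For a feel of the developer experience, here's a minimal sketch based on the Nova Act SDK preview. The class and method names reflect the preview release, and the starting URL and task text are made-up examples; verify everything against the current docs before relying on it.

```python
# Sketch only: requires the Nova Act preview SDK and an API key.
from nova_act import NovaAct  # package/class name from the SDK preview

# The URL and task below are hypothetical illustrations.
with NovaAct(starting_page="https://example-crm.internal/accounts") as agent:
    # Describe the workflow in plain English; the agent handles the clicks.
    agent.act(
        "Open the account for Acme Corp, update the billing email "
        "to billing@acme.example, then save the record."
    )
```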
AWS Transform: Turn Any Code Into Any Other Code
AWS Transform got agentic capabilities for rapid modernization of any code or application. The service now modernizes legacy applications and code across any framework, runtime, or architecture—including company-specific programming languages. This is huge IMO.
What it does:
The system creates specialized agents that capture feedback and improve over time, making each subsequent transformation more reliable. It works across any language or framework, including proprietary legacy systems. Headline results:
- Full-stack Windows modernization 5x faster (eliminating up to 70% of maintenance and licensing costs)
- Mainframe migrations from years to months
- Custom transformations for any code, API, framework, or language
- Pre-built transformations for common patterns like Java, Node.js, and Python upgrades
Real results: Air Canada modernized thousands of Lambda functions in days with an 80% reduction in expected time and cost. Thomson Reuters migrates 1.5 million lines of code monthly while achieving 30% cost savings.
Open-Weight Model Explosion
Amazon Bedrock added 18 new open-weight models from Google, MiniMax AI, Mistral AI, Moonshot AI, NVIDIA, OpenAI, and Qwen. Highlights include:
- Mistral Large 3: Optimized for long-context, multimodal, and instruction reliability
- Ministral 3 (3B, 8B, 14B variants): Edge-optimized models for single GPU deployment
- OpenAI's gpt-oss-120b and gpt-oss-20b: Text generation and reasoning models with open weights
- Qwen3 models: Sophisticated coding and general reasoning capabilities
- DeepSeek-V3.1: Exceptional performance on math, coding, and agentic tasks
Total: Amazon Bedrock now offers nearly 100 serverless models, all accessible through unified APIs without rewriting code.
Platform for Production Agents: Bedrock AgentCore
Most of the agent action happens in Amazon Bedrock AgentCore, AWS's comprehensive platform for building production-ready agents with any framework and model.
Key components:
- AgentCore Runtime: Low-latency serverless environments with session isolation. Works with CrewAI, LangGraph, LlamaIndex, Strands Agents, OpenAI Agents SDK—pretty much any framework.
- AgentCore Memory: Manages short-term and long-term memory. New episodic memory lets agents learn from past interactions. Memory can be shared between agents.
- AgentCore Gateway: Converts any resource (APIs, Lambda functions) into Model Context Protocol (MCP)-compatible tools with zero code. Native MCP support throughout.
- AgentCore Browser Tool: Cloud-based browser runtime for agents that need web access, running in isolated VMs. Model-agnostic unlike most browser-use agents.
- AgentCore Code Interpreter: Lets agents write and execute JavaScript, TypeScript, and Python securely.
- AgentCore Observability: End-to-end visualization with step-by-step execution, metadata tagging, trajectory inspection. Emits OpenTelemetry format.
- Policy in AgentCore: NEW at re:Invent. Real-time controls that actively block unauthorized agent actions. Uses Cedar, an open-source policy language, or natural language. Prevents agents from giving away the store.
- AgentCore Evaluations: NEW at re:Invent. Continuously inspect agent quality based on real-world behavior with custom scoring metrics.
The SDK has been downloaded 2M+ times in 5 months since preview. PGA TOUR built a multi-agent content system that increased writing speed 1,000% while cutting costs 95%.
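To make the Policy component concrete, a Cedar guardrail like the refund example Garman gave in the keynote might look roughly like this. The action and context attribute names are illustrative assumptions, not taken from AWS documentation:

```cedar
// Block any refund above $1,000, regardless of what the LLM generates.
// Action and context names here are hypothetical.
forbid (
    principal,
    action == Action::"IssueRefund",
    resource
) when {
    context.amount > 1000
};
```

Because the policy is evaluated outside the agent's own code, a prompt injection or model error can't talk its way past the check.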
Infrastructure: Trainium3 Goes Live, Trainium4 Teased
Trainium3 UltraServers hit general availability with serious specs:
- 144 Trainium3 chips per UltraServer (first 3nm AWS chip)
- 362 FP8 petaflops of compute
- 4.4x more compute performance than Trainium2
- 4x greater energy efficiency
- 3.9x more memory bandwidth
- Scale to 1M chips via EC2 UltraClusters 3.0 (10x previous generation)
Customers including Anthropic, Karakuri, and Decart are cutting training costs up to 50%. Decart achieves 4x faster inference for real-time generative video at half the cost of GPUs.
Testing with OpenAI's GPT-OSS: 3x higher throughput per chip, 4x faster response times than Trainium2.
AWS also previewed Trainium4: 6x performance, 4x memory bandwidth, 2x memory capacity vs Trainium3. Ships 2026. Big news: Trainium4 will support Nvidia's NVLink Fusion interconnect, allowing Trainium and Nvidia chips to work together seamlessly.
AI Factories: Private AWS Regions in Your Data Center
AWS AI Factories bring dedicated AWS infrastructure directly to customer data centers.
The package: Nvidia GPUs, Trainium chips, AWS networking, high-performance storage, security infrastructure, plus Amazon Bedrock and SageMaker AI services.
Why it matters: Government agencies and financial institutions can leverage AI while meeting data sovereignty and regulatory requirements. No data leaves your facility.
First deployment: HUMAIN in Saudi Arabia is building an "AI Zone" with up to 150,000 AI chips (including GB300 GPUs), the first step in a multi-gigawatt buildout.
Security clearances: Designed for Unclassified, Sensitive, Secret, and Top Secret levels.
Lambda Gets Flexible, EKS Gets Easier
AWS Lambda Managed Instances marries serverless simplicity with EC2 flexibility.
Run Lambda functions on specific EC2 instance types while AWS handles lifecycle management, OS patching, load balancing, and auto-scaling. Access specialized hardware (Graviton4, high-bandwidth networking) with EC2 pricing models including Reserved Instances and Savings Plans.
Pricing: standard EC2 rates, plus a 15% compute-management fee, plus $0.20 per million requests. There are no duration charges, so high-volume users save significantly.
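To see how that pricing shakes out, here's a back-of-the-envelope sketch. The EC2 hourly rate is a placeholder you'd swap for real on-demand or Savings Plan pricing; the 15% fee and $0.20 per million requests come from the announcement.

```python
# Back-of-the-envelope cost sketch for Lambda Managed Instances.

EC2_HOURLY_RATE = 0.16          # assumed $/hour for the chosen instance type
MGMT_FEE = 0.15                 # 15% compute-management fee (from the launch)
PER_MILLION_REQUESTS = 0.20     # $0.20 per million requests (from the launch)

def monthly_cost(instances: int, hours: float, requests_millions: float) -> float:
    """Estimated monthly bill: instance-hours plus fee, plus request charges.
    No per-GB-second duration charges apply under this model."""
    compute = instances * hours * EC2_HOURLY_RATE
    return compute * (1 + MGMT_FEE) + requests_millions * PER_MILLION_REQUESTS

# e.g. 4 instances running all month (730h) serving 500M requests:
cost = monthly_cost(4, 730, 500)
print(f"${cost:,.2f}")  # → $637.28
```

At high request volumes, paying for instance-hours instead of per-invocation duration is where the savings come from.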
New Amazon EKS capabilities for workload orchestration eliminate infrastructure maintenance while providing enterprise-grade reliability for Kubernetes deployments.
Everything Else (The Speed Round)
S3 Improvements:
- Maximum object size increased 10x, from 5TB to 50TB
- S3 Batch Operations now 10x faster
- S3 Tables intelligent tiering for cost optimization
And...
- Route 53 Global Resolver (preview): Anycast-based DNS resolution for hybrid environments. Resolves public and private domains globally with consistent security controls.
- CloudWatch for GenAI: Comprehensive observability for generative AI applications. Works with AgentCore, LangChain, LangGraph, CrewAI.
- CloudTrail Event Aggregation: 5-minute summaries of data events with anomaly detection.
- CloudWatch Logs Centralization: Multi-account, multi-region log consolidation.
- IAM Policy Autopilot: Open-source MCP server that analyzes your code to auto-generate valid IAM policies. Provides AI coding assistants with up-to-date AWS service knowledge.
- AWS Clean Rooms: Privacy-enhancing synthetic dataset generation for ML training. Train on sensitive collaborative data without exposing individual privacy.
- AWS Interconnect – Multi-Cloud (preview): Purpose-built product for simple, resilient, high-speed private connections between AWS and other clouds. Google Cloud first, Microsoft Azure coming 2026.
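To ground the IAM Policy Autopilot item above: the goal is least-privilege policies derived from what your code actually calls. A hypothetical result for code that only reads and writes a single bucket might look like this (bucket name and statement ID invented for illustration):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AppBucketAccess",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::example-app-bucket/*"
    }
  ]
}
```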
Here Are All the Launch Announcements from the Keynote
Below are timecodes for each keynote section, in case you want to jump straight to the coverage of a specific announcement.
- (02:51) The Scale of AWS Growth. Matt Garman opens by contextualizing AWS's massive scale. AWS is now a $132 billion business accelerating at 20% year-over-year. To visualize this: the $22 billion in growth AWS added in just the last 12 months is larger than the total annual revenue of more than half the companies in the Fortune 500.
- (03:51) Infrastructure as the Foundation. S3 now stores over 500 trillion objects with 200 million requests per second. Crucially, more than half of the new CPU capacity added to the AWS cloud for the third year running is Graviton (custom silicon), signaling a decisive shift away from generic x86 processors for general workloads.
- (04:41) Breakthrough in Quantum Error Correction. Garman briefly mentions "Ocelot," a quantum chip prototype that reduces the cost of implementing quantum error correction by more than 90%, a critical step toward making quantum computing commercially viable.
- (12:32) The Agentic Inflection Point. Garman posits that we are at a specific inflection point in AI: the transition from "AI Assistants" (chatbots) to "AI Agents" (software that performs tasks and automates workflows). He predicts billions of agents will soon exist inside companies, scaling human impact by 10x.
- (15:02) Sweating the GPU Details. A subtle dig at competitors: Garman notes that AWS investigates every GPU reboot and BIOS error rather than accepting them as normal. This operational rigor allows them to avoid node failures in massive clusters better than anyone else.
- (15:56) LAUNCH: P6e-GB300 Ultra Servers. AWS announces the P6e instances powered by NVIDIA's GB300 NVL72 systems. This continues the trend of "Ultra Servers," which are massive scale-up domains designed specifically for the largest AI model training workloads.
- (17:43) LAUNCH: AWS AI Factories. A major strategic pivot: AWS allows customers to deploy dedicated AWS AI infrastructure (like Trainium or NVIDIA stacks) inside their own data centers (on-premises) or sovereign locations. It operates like a private AWS region, addressing data sovereignty and power capacity constraints.
- (19:11) The "Trainium" Naming Irony. Garman admits a naming failure: "Trainium 2" is actually the best chip in the world for inference, despite being named for training. Most of the inference for Anthropic's Claude on Bedrock is already running on Trainium.
- (22:32) LAUNCH: Trn3 Ultra Servers (Trainium 3). The first 3-nanometer AI chip in the cloud. Stats include 4.4x more compute, 3.9x memory bandwidth, and 5x more tokens per watt compared to Trn2. It features a scale-up domain of 144 chips in a single instance.
- (25:10) PREVIEW: AWS Trainium 4. Garman teases the next generation (Trn4) while Trn3 is just launching. Trn4 promises 6x FP4 compute performance and massive memory bandwidth increases, indicating AWS is accelerating its silicon release cycle to match or beat NVIDIA's cadence.
- (30:31) New Open-Weight Models. AWS adds Mistral Large (doubled context window) and Ministral 3 (optimized for edge/single-GPU deployment) to Bedrock, reinforcing their "choice" strategy over a "one model" strategy.
- (31:58) LAUNCH: Amazon Nova 2.0 Family. AWS updates its first-party model family.
- Nova 2 Lite: Cost-effective reasoning, beats GPT-4o-mini in benchmarks.
- Nova 2 Pro: Highly intelligent reasoning for complex agentic workflows.
- Nova 2 Sonic: Speech-to-speech model for real-time conversational AI.
- (35:22) LAUNCH: Nova 2 Omni (Multimodal). A single unified model that can accept text, image, video, and audio as input, and generate text and images as output. This eliminates the need to "stitch" multiple models together for complex understanding tasks (like analyzing a video presentation).
- (41:57) LAUNCH: Amazon Nova Forge (Open Training Models). A new paradigm called "Novellas." Instead of just RAG (Retrieval Augmented Generation) or simple fine-tuning, Forge allows customers to inject their proprietary data during the training process of a frontier model checkpoint. This creates a custom model that deeply understands a specific domain without the model "forgetting" its general reasoning capabilities.
- (45:18) Sony's "Kando" Strategy. Sony uses AWS to pivot from a hardware company to a digital entertainment giant. They utilize Nova Forge to train models on their internal compliance data, aiming to speed up compliance reviews by 100x.
- (55:26) Bedrock AgentCore Philosophy. Unlike competitors who offer "black box" agents, AWS AgentCore is modular. You can use their memory but not their identity, or their tools but not their runtime. It supports frameworks like LangChain and CrewAI, emphasizing developer flexibility.
- (1:01:32) LAUNCH: Policy in AgentCore. Solves the "unpredictable agent" problem. Uses Cedar (an open-source language for access control) to create deterministic guardrails. For example, a policy can block an agent from issuing a refund over $1,000 regardless of what the LLM generates. This check sits outside the agent's code for security.
- (1:05:55) LAUNCH: AgentCore Evaluations. A "trust but verify" mechanism for agents. It provides real-time, continuous testing of agent behavior against metrics like helpfulness or harmfulness. Crucially, it allows developers to test whether a model upgrade (e.g., Claude 3 to 3.5) breaks the agent's behavior before full deployment.
- (1:18:15) Amazon "Quick" (Internal Case Study). AWS built an internal agent called "Quick" that connects to all structured (Salesforce, Jira) and unstructured (Docs, SharePoint) data. It allows employees to generate deep research reports with citations. It is now used by hundreds of thousands of Amazon employees, reducing task times by 90%.
- (1:33:29) LAUNCH: AWS Transform Custom. AWS Transform previously handled specific migrations (like Mainframe to Cloud). The "Custom" launch allows users to build agents to modernize any legacy code, including proprietary languages or obscure frameworks (e.g., converting VBA to Python, or Bash to Rust).
- (1:35:18) LAUNCH: Kiro (Spec-Driven Development). AWS launches "Kiro," an agentic development environment. It focuses on spec-driven development, where the developer writes a detailed specification (prompt) and Kiro acts as a partner to implement it across the codebase. Amazon has standardized internally on Kiro.
- (1:38:01) The Anthony Story: 6 People > 30 People. A specific anecdote about an internal Amazon team. A re-architecture project estimated to take 30 developers 18 months was completed by 6 developers in 76 days using agentic workflows.
- Key Insight: The team had to change how they worked, moving from assigning small tasks to assigning broad goals and running agents in parallel (scaling out work) overnight.
- (1:41:55) LAUNCH: Frontier Agents. Garman introduces a new class of agents called "Frontier Agents" defined by three traits: Autonomous (figure out the "how"), Scalable (can run thousands of instances), and Long-running (can work for days asynchronously).
- (1:41:55) LAUNCH: Kiro Autonomous Agent. A background agent that connects to Jira/GitHub. You assign it a ticket (e.g., "Upgrade this library across 15 microservices"), and it plans, implements, tests, and opens PRs for all 15 repos independently while the developer sleeps. It maintains context/memory of previous architectural decisions.
- (1:47:23) LAUNCH: AWS Security Agent. Moves security upstream. This agent reviews design documents before code is written and performs autonomous penetration testing on demand. It provides remediation code for vulnerabilities it finds.
- (1:50:11) LAUNCH: AWS DevOps Agent. An agent that investigates operational incidents. It correlates telemetry (logs, traces) with code changes (Git commits/CDK). Example: It can diagnose that a Lambda failure is due to a recent IAM policy change in a specific CDK deployment and recommend a fix before a human engineer even logs on.
- (1:57:03) The Speed Round: 25 Launches in 10 Minutes. Garman initiates a rapid-fire session to cover core service updates that didn't fit the AI narrative.
- (1:57:29) LAUNCH: X8i & X8a Instances (Memory). New high-memory instances for SAP HANA/SQL Server, offering up to 3TB of memory per instance.
- (1:58:17) LAUNCH: C8a & C8i Instances. Compute-focused instances. C8a (AMD) offers 30% higher performance. C8i (Intel) features Nitro V6 cards for 2.5x higher packet performance (great for network appliances).
- (1:58:53) LAUNCH: M8zn Instances. Instances with the absolute fastest CPU clock frequency in the cloud, targeted at high-frequency trading and multiplayer gaming.
- (1:59:16) LAUNCH: M3 Ultra & M4 Max Mac Instances. AWS continues to be the only provider offering Apple Silicon in the cloud for iOS build pipelines.
- (1:59:53) LAUNCH: Lambda Durable Functions. Allows Lambda functions to "wait" for long-running processes (like an agent working for 3 days) without complex state management. It handles state, error handling, and recovery automatically.
- (2:00:20) LAUNCH: S3 50TB Object Size. A massive 10x increase in maximum file size (from 5TB to 50TB) to accommodate giant AI datasets and media files.
- (2:01:31) LAUNCH: S3 Tables Intelligent Tiering. Following up on the S3 Tables launch, this feature automatically moves table data to colder storage, saving up to 80% on costs for data lakes.
- (2:02:53) LAUNCH: S3 Vectors & OpenSearch Acceleration. S3 Vectors is GA (store trillions of vectors cheaply). Plus, GPU acceleration for OpenSearch indexing makes vector indexing 10x faster at 1/4th the cost.
- (2:03:29) LAUNCH: EMR Serverless Storage Provisioning Removed. Eliminates the last bit of "muck" in EMR Serverless by removing the need to provision or manage local storage for big data jobs.
- (2:05:22) LAUNCH: CloudWatch Unified Data Store. A centralized store for all operational, security, and compliance logs (AWS + third parties like Okta/CrowdStrike), enabling easier correlation of data across disparate systems.
- (2:07:16) LAUNCH: Database Savings Plans. A highly requested financial feature. Similar to Compute Savings Plans, this allows customers to commit to database usage (RDS, Aurora, DynamoDB, etc.) for a 1- or 3-year term to save up to 35% across all database services.
Why It All Matters
AWS just made its biggest bet on autonomous AI as the next computing platform. The frontier agents announcement signals a fundamental shift: AI stops being a chatbot and starts being a coworker.
The infrastructure investments (Trainium3/4, AI Factories) address the biggest bottleneck in enterprise AI: cost and access to compute. Training and inference at scale becomes economically viable for organizations beyond tech giants.
AgentCore solves the "prototype to production" gap that's kept most agentic AI stuck in demos. Framework-agnostic, model-agnostic, with built-in observability and policy controls.
For enterprises: Expect agent adoption to accelerate rapidly in 2026. The tooling is finally production-ready, the costs are dropping, and AWS is essentially pre-integrating everything.
For developers: The battle lines are drawn. AWS vs. Google's Agent SDK vs. OpenAI's Agents vs. Microsoft's Copilot. Framework wars incoming.
The next frontier of computing isn't about better models. It's about models that work—autonomously, reliably, securely—on your actual business problems.
AWS just built the factory for that future.