Everything NVIDIA Announced at GTC 2026, Explained | The Neuron

Everything NVIDIA Just Announced at GTC 2026: Seven Chips, Five Racks, One Giant Bet on Agentic AI

Here's every major announcement from GTC 2026: Jensen Huang unveiled the Vera Rubin platform, Dynamo 1.0, DLSS 5, a robotaxi partnership with Uber, and much more.

Written By
Grant Harvey
Mar 17, 2026
33 minute read

Every year, Jensen Huang walks onstage at GTC and tries to convince the world that NVIDIA is more than a chip company. Every year, he gets a little more convincing.

This year, he might have closed the argument.

GTC 2026 dropped 17 press releases and three major blog posts in a single day, covering everything from orbital data centers to robot brains to a new kind of computer graphics. The throughline? NVIDIA wants to own every layer of the AI stack, from the silicon in your data center to the software driving your car to the models running your agents.

Here's everything that happened, organized so you don't have to read 17 press releases (or watch a 2-hour keynote) yourself.

First up, the TL;DR

NVIDIA went all-in on agentic AI at GTC 2026. The centerpiece: the Vera Rubin platform, a system of seven new chips and five rack types designed to function as one massive AI supercomputer. It pairs Rubin GPUs and Vera CPUs with the new Groq 3 LPX inference accelerator, which NVIDIA claims delivers up to 35x higher inference throughput per megawatt. Jensen put the compute leap in perspective: 40 million times more compute in just 10 years.

Jensen also called OpenClaw "the operating system for personal AI" and "the fastest-growing open source project in history," launching NemoClaw to give anyone a one-command install for secure, always-on AI agents.

On the software side, Dynamo 1.0 entered production as the "operating system" for AI factories, boosting Blackwell GPU inference performance by up to 7x. AWS committed to deploying more than 1 million NVIDIA GPUs plus Groq LPUs, and Azure, Google Cloud, and Oracle are all onboard.

NVIDIA launched a Nemotron Coalition with Mistral, Cursor, Perplexity, and others to build open frontier models, expanded its robotics ecosystem with FANUC, ABB, Figure, and Agility, and announced that NVIDIA-powered robotaxis will launch with Uber across 28 markets by 2028. Jensen called autonomous vehicles "the first multitrillion-dollar robotics industry."

The big number? Jensen now sees at least $1 trillion in demand for NVIDIA AI infrastructure through 2027, up from $500 billion just one year ago.

Top Keynote Insights

Jensen's two-hour keynote covered far more ground than the press releases. Here are the most important signals, linked to the exact moments in the full keynote stream:

The Big Numbers

The Paradigm Shifts

The Product Highlights

NemoClaw and OpenClaw: The Operating System for Personal AI

Jensen saved some of his biggest language for this one. He called OpenClaw "the operating system for personal AI" and compared it directly to Mac and Windows, saying the platform marks "the beginning of a new renaissance in software."

The context matters. Jensen spent several minutes walking through OpenClaw's architecture, pointing out that it manages resources, accesses tools and file systems, connects to LLMs, handles scheduling and cron jobs, decomposes problems into steps, spawns sub-agents, and supports multimodal IO. His conclusion: "I've just used the same syntax that I would describe an operating system."

He also laid out the stakes for enterprise adoption. Agentic systems inside corporate networks can access sensitive information, execute code, and communicate externally. Jensen paused after saying that and told the audience to think about the implications. That security gap is exactly what NemoClaw is designed to fill.

NemoClaw is NVIDIA's stack for the OpenClaw agent platform. It bundles Nemotron models and the newly announced OpenShell runtime into a single-command install, adding an isolated sandbox with data privacy and security controls for autonomous AI agents (called "claws").

What makes it interesting: NemoClaw uses a privacy router that lets agents tap open models running locally on your hardware while selectively routing to frontier cloud models when needed. That combination of local and cloud inference, with policy-based security guardrails, is the infrastructure layer that lets always-on agents actually do useful work without leaking your data everywhere. Microsoft Security is already using Nemotron and OpenShell for adversarial learning, reporting a 160x improvement in finding and mitigating AI-based attacks.
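To make the routing idea concrete, here's a minimal sketch of a policy-based privacy router. This is not NemoClaw's actual implementation: the names `Request`, `route`, and `SENSITIVE_MARKERS` are hypothetical, and the keyword check is a stand-in for whatever policy engine NVIDIA really ships.

```python
from dataclasses import dataclass

# Hypothetical policy: any prompt containing one of these markers stays local.
SENSITIVE_MARKERS = ("password", "ssn", "api_key", "confidential")


@dataclass
class Request:
    prompt: str
    needs_frontier_model: bool = False  # caller's hint that a local model is not enough


def is_sensitive(prompt: str) -> bool:
    """Return True if the prompt matches the (toy) sensitivity policy."""
    lowered = prompt.lower()
    return any(marker in lowered for marker in SENSITIVE_MARKERS)


def route(request: Request) -> str:
    """Pick a backend label: 'local' or 'cloud'.

    Sensitive prompts never leave the machine, even if the caller asked
    for a frontier model. Everything else goes to the cloud only on request.
    """
    if is_sensitive(request.prompt):
        return "local"
    if request.needs_frontier_model:
        return "cloud"
    return "local"
```

The design choice worth noticing: the privacy check runs first, so a request for a frontier model can never override the policy.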

It runs on any dedicated platform: GeForce RTX PCs and laptops, RTX PRO workstations, and the newly detailed DGX Station and DGX Spark AI supercomputers. DGX Station ships with the GB300 Grace Blackwell Ultra Desktop Superchip, packing 748GB of coherent memory, up to 20 petaflops of AI compute, and the ability to run open models up to 1 trillion parameters at your desk. DGX Spark now supports clustering up to four systems into a compact "desktop data center" with near-linear performance scaling. NemoClaw works with any coding agent and uses NVIDIA Agent Toolkit software under the hood.
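A quick back-of-the-envelope check shows what "1 trillion parameters at your desk" implies. The sketch below counts model weights only (no KV cache, no activations), and the bit widths are our assumption, since the announcement as summarized here doesn't say which precision the claim relies on.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory needed for model weights alone, in decimal gigabytes."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


DGX_STATION_MEMORY_GB = 748  # coherent memory figure quoted above
ONE_TRILLION = 1e12

for bits in (16, 8, 4):
    needed = weight_memory_gb(ONE_TRILLION, bits)
    fits = needed <= DGX_STATION_MEMORY_GB
    print(f"{bits}-bit weights: {needed:.0f} GB, fits in 748 GB: {fits}")
```

Under these assumptions, 16-bit weights need 2,000 GB and 8-bit weights need 1,000 GB, so only the 4-bit case (500 GB) fits. That suggests the 1-trillion-parameter figure presumably depends on low-precision formats.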

OpenClaw creator Peter Steinberger said the goal is a world where everyone has their own agents. NVIDIA published an OpenClaw Playbook for running local-first AI agents on DGX Spark, and GTC attendees can try it themselves at NVIDIA's build-a-claw event running March 16-19.

Jensen's bottom line for CEOs: just as every company needed an HTTP strategy, a Linux strategy, and a Kubernetes strategy, every company now needs an OpenClaw strategy.

The Vera Rubin Platform: Seven Chips, One Supercomputer

The headline hardware announcement is the Vera Rubin platform, and it's massive in scope. Seven new chips. Five distinct rack-scale systems. All designed to work together as one coherent AI supercomputer, supported by more than 80 NVIDIA MGX ecosystem partners with a global supply chain.

Jensen walked through the full evolutionary arc onstage: from DGX-1 (8 Pascal GPUs, 170 teraflops) in 2016, through Volta's NVLink switch, the A100 SuperPod, Hopper's FP8 Transformer engine, Blackwell's NVLink 72, and now Vera Rubin with 3.6 exaflops of compute and 260 TB/s of all-to-all NVLink bandwidth.

The lineup:

  • Vera Rubin NVL72 GPU rack: 72 Rubin GPUs and 36 Vera CPUs connected by NVLink 6, plus ConnectX-9 SuperNICs and BlueField-4 DPUs. NVIDIA says it trains large mixture-of-experts models with one-fourth the GPUs compared to Blackwell, and delivers up to 10x higher inference throughput per watt at one-tenth the cost per token. Scales with Quantum-X800 InfiniBand and Spectrum-X Ethernet.
  • Vera CPU rack: 256 Vera CPUs in a liquid-cooled rack built on MGX, purpose-built for reinforcement learning and agentic AI workloads. NVIDIA claims 2x the efficiency and 50% faster than traditional CPUs, with world-class single-threaded performance. Jensen noted it's the only data center CPU using LPDDR5, and that standalone CPU sales are already tracking to be a multi-billion dollar business. Deploying with Alibaba, ByteDance, Meta, Oracle Cloud Infrastructure, CoreWeave, Lambda, Nebius, and Nscale. Manufacturing partners include Dell Technologies, HPE, Lenovo, Supermicro, ASUS, Foxconn, GIGABYTE, Pegatron, QCT, Wistron, and Wiwynn.
  • Groq 3 LPX inference rack: 256 LPU processors with 128GB of on-chip SRAM and 640 TB/s of scale-up bandwidth. Jensen explained the key insight behind the Groq integration: NVLink 72 dominates throughput, but at extreme token speeds (1,000+ tokens/second), it runs out of bandwidth. Groq's deterministic dataflow processor, with its massive SRAM and compiler-scheduled execution, picks up exactly where the GPU stops. Together they deliver up to 35x higher inference throughput per megawatt. Samsung manufactures the LP30 chip; shipping Q3 2026.
  • BlueField-4 STX storage rack: AI-native storage infrastructure that extends GPU memory across the entire pod. The STX reference architecture delivers up to 5x token throughput, 4x energy efficiency, and 2x faster data ingestion compared to traditional storage. Includes the new NVIDIA CMX context memory storage platform. Jensen's reasoning: agents are going to pound on storage systems far harder than humans ever did, processing KV cache, structured data (cuDF), and unstructured data (cuVS) simultaneously. Early adopters include CoreWeave, Crusoe, IREN, Lambda, Mistral AI, Nebius, Oracle Cloud Infrastructure, and Vultr. Storage partners: Cloudian, DDN, Dell Technologies, Everpure, Hitachi Vantara, HPE, IBM, MinIO, NetApp, Nutanix, VAST Data, and WEKA.
  • Spectrum-6 SPX Ethernet rack: The networking fabric tying it all together. NVIDIA's first co-packaged optics (CPO) switch is in full production, translating electrons to photons directly on the silicon using a process co-developed with TSMC.

The Vera Rubin system is 100% liquid cooled with 45°C hot water, which transfers cooling energy back to computing. Installation time dropped from 2 days to 2 hours. And it's already in the field: Jensen confirmed the first Vera Rubin rack is running at Microsoft Azure, with the supply chain now manufacturing thousands of racks per week.

Both Dario Amodei (Anthropic) and Sam Altman (OpenAI) provided endorsement quotes in the press release, which tells you something about how deeply NVIDIA hardware is embedded in the AI supply chain. The scale of deployment commitments backs it up: AWS announced it will deploy more than 1 million NVIDIA GPUs plus Groq LPUs spanning the full Blackwell, Rubin, RTX PRO, and Groq 3 stack. Microsoft has deployed hundreds of thousands of liquid-cooled Grace Blackwell GPUs across its global data centers in less than a year. And Thinking Machines Lab signed a multiyear, gigawatt-scale strategic partnership for Vera Rubin systems to support frontier model training.

NVIDIA also released the Vera Rubin DSX reference design and an Omniverse DSX digital twin blueprint for designing and operating AI factories. A DSX Air simulation tool accelerates time-to-token. Jensen estimates there's a factor of two in efficiency gains available through better factory design alone.

Looking further ahead: Jensen previewed the Rubin Ultra chip (taping out now), which slides vertically into a new Kyber rack connecting 144 GPUs in one NVLink domain. And beyond that, the Feynman generation: a new GPU, LP40 LPU, Rosa CPU (short for Rosalind), BlueField-5, CX10, and both copper and co-packaged optics scale-up.

Dynamo 1.0: The OS for AI Factories

Vera Rubin is the hardware. Dynamo 1.0 is the brain running it.

Jensen's keynote framing was instructive. He described the problem Dynamo solves by explaining disaggregated inference: the insight that throughput and latency are "enemies of each other" in chip design. Dynamo rearchitects the inference pipeline to split work between processors optimized for each, sending prefill and attention (memory-heavy) to Rubin GPUs, and decode/token generation (bandwidth-limited) to Groq LPUs.
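The split described above can be pictured with a toy scheduler. This is a sketch of the general disaggregation idea, not Dynamo's API: `Job`, `DisaggregatedScheduler`, and the worker names are all hypothetical, and real systems also have to move the KV cache between the two pools.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Job:
    request_id: int
    prompt_tokens: int
    max_new_tokens: int
    trace: list = field(default_factory=list)  # records which worker ran each phase


class DisaggregatedScheduler:
    """Toy round-robin scheduler with separate prefill and decode pools."""

    def __init__(self, prefill_workers, decode_workers):
        self._prefill = cycle(prefill_workers)
        self._decode = cycle(decode_workers)

    def run(self, job: Job) -> Job:
        # Phase 1: prefill processes the whole prompt once and builds the KV cache.
        prefill_worker = next(self._prefill)
        job.trace.append(("prefill", prefill_worker, job.prompt_tokens))
        # Phase 2: after the (simulated) KV cache handoff, decode emits tokens one at a time.
        decode_worker = next(self._decode)
        job.trace.append(("decode", decode_worker, job.max_new_tokens))
        return job


scheduler = DisaggregatedScheduler(["gpu-0", "gpu-1"], ["lpu-0"])
first = scheduler.run(Job(1, prompt_tokens=4000, max_new_tokens=200))
second = scheduler.run(Job(2, prompt_tokens=1000, max_new_tokens=50))
```

Because the two pools are sized independently, an operator could add decode workers without touching prefill capacity, which is the practical appeal of treating the phases as separate workloads.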

The key numbers: up to 7x inference performance boost on Blackwell GPUs, with lower token cost and increased revenue opportunity. Jensen showed a live example from inference providers: same hardware, token speeds jumping from 700 to nearly 5,000 per second after NVIDIA updated the software stack. Seven times higher, same system.

Core Dynamo building blocks are also available as standalone modules: KVBM for smarter memory management, NVIDIA NIXL for fast GPU-to-GPU data movement, and NVIDIA Grove for simplified scaling. NVIDIA also contributes TensorRT-LLM CUDA kernels to the FlashInfer project so they integrate natively into open source frameworks like vLLM, SGLang, and LangChain.

Adoption is broad:

  • Cloud providers: AWS, Microsoft Azure, Google Cloud, and OCI
  • NVIDIA cloud partners: Alibaba Cloud, CoreWeave, Crusoe, DigitalOcean, Gcore, GMI Cloud
  • AI-native companies: Cursor, Perplexity, Baseten, Deep Infra, and Fireworks
  • Enterprise: ByteDance, Meituan, PayPal, and Pinterest

Open Models and the Nemotron Coalition

NVIDIA pushed hard on the open-source angle this year. Jensen framed open models as essential to his sovereign AI strategy: the goal is to create base models so good that every country can fine-tune them into their own domain-specific intelligence.

The Nemotron Coalition is a first-of-its-kind global collaboration between NVIDIA and AI labs including Mistral AI, Cursor, LangChain, Perplexity, Reflection AI, Sarvam, Thinking Machines Lab, and Black Forest Labs. The group will jointly develop Nemotron 4 (the next-gen open frontier model) on NVIDIA DGX Cloud, then release it for anyone to specialize.

On the model side, NVIDIA expanded its open model families across three domains:

  • Nemotron 3: Omni-understanding models for AI agents, offering natural conversation, complex reasoning, and visual capabilities. Jensen showed that Nemotron 3 in OpenClaw ranks among the top three models in the world. Already adopted by CodeRabbit, CrowdStrike, Cursor, Factory, Perplexity, and ServiceNow for agentic AI. The smaller Nemotron Nano 3 is now available on Amazon Bedrock powering Salesforce Agentforce, which Salesforce calls the most cost-efficient model for summarization and generation on their Agentic Benchmark for CRM. Reinforcement fine-tuning for Nemotron models is coming soon to Bedrock. Developers can also build and deploy Nemotron models directly in Microsoft Foundry, with Nemotron 3 Super and AI safety guardrail models expanding on Azure.
  • Cosmos 3: The first world foundation model that unifies synthetic world generation, vision reasoning, and action simulation. Built for robotics and autonomous systems. Being used by LG Electronics and Milestone Systems for physical AI.
  • Isaac GR00T N1.7 and Alpamayo 1.5: Specialized models for humanoid robots and autonomous vehicles, respectively. Alpamayo 1.5 adds an interactive, steerable reasoning model for AVs.
  • Proteina-Complexa: A protein complex prediction model (part of the BioNeMo platform) developed with Google DeepMind, EMBL's European Bioinformatics Institute, and Seoul National University, alongside a new open dataset of millions of AI-predicted protein complexes. Being adopted by Novo Nordisk, Viva Biotech, and Manifold Bio for healthcare AI.

NVIDIA also announced two foundational data libraries that Jensen positioned as future cornerstones: cuDF for accelerating structured dataframes (SQL, Spark, Pandas) and cuVS for vector stores and unstructured AI data. He noted that 90% of data generated each year is unstructured and was historically impossible to query until multimodal AI made it indexable.
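What does "making unstructured data indexable" look like in practice? Roughly this: each file is turned into a vector (an embedding), and a query finds the stored vectors closest in meaning. The sketch below is plain Python brute-force search, the kind of operation a vector library such as cuVS is meant to accelerate. It does not use the cuVS API, and the file names and three-dimensional "embeddings" are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, index, k=1):
    """Brute-force top-k search: score every stored vector, keep the best k."""
    scored = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]


# Made-up embeddings; real ones come from a model and have hundreds of dimensions.
index = {
    "earnings_call.mp4": [0.9, 0.1, 0.0],
    "safety_manual.pdf": [0.1, 0.9, 0.1],
    "keynote_speech.wav": [0.2, 0.1, 0.9],
}
query = [0.8, 0.2, 0.1]  # pretend this encodes "quarterly revenue discussion"

print(nearest(query, index))  # the stored item closest in direction to the query
```

Brute force scores every stored vector on every query, which is fine for three files and hopeless for billions. That scaling problem is why GPU-accelerated and approximate search matter.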

Robotics: Physical AI Goes to Production

The robotics announcements were arguably the deepest at this GTC. NVIDIA is becoming the platform layer for the entire robotics industry.

Jensen framed the three-computer approach to robotics: a training computer, a synthetic data generation/simulation computer, and the onboard robotics computer. His key point: real-world data will never be enough to train for every scenario. "For robots, compute is data."

Global robotics leaders including ABB, FANUC, YASKAWA, and KUKA (companies with a combined installed base of over 2 million robots) are integrating NVIDIA Omniverse and NVIDIA Isaac simulation frameworks into their virtual commissioning solutions. They're also putting Jetson modules into their controllers for real-time AI inference at the edge.

On the humanoid front, Figure, Agility, and AGIBOT are building on NVIDIA's stack. Skild AI and FieldAI are developing generalized robot brains using Cosmos world models. World Labs is using Isaac Sim to validate its generative world models. Generalist AI is using Cosmos for synthetic data generation.

In healthcare and surgical robotics, NVIDIA launched its first domain-specific physical AI platform for healthcare robotics. CMR Surgical is contributing close to 500 hours of surgical video to the new Open-H dataset (776 hours total, 35 collaborators, 11 robotic system embodiments) and using Cosmos-H to generate synthetic surgical data. Johnson & Johnson MedTech is using a Cosmos-based foundation model and Isaac for Healthcare to generate data for the MONARCH Platform for Urology. PeritasAI is training humanoid robots for surgical environments using the Rheo blueprint. Proximie is building multimodal vision language models for real-time surgical AI agents. Medtronic is evaluating IGX Thor, which is now generally available as an industrial-grade edge AI platform. Other IGX Thor adopters include Caterpillar (in-cabin conversational AI), Planet Labs (orbital data processing), and CERN (physics-inspired AI models). Hexagon Robotics and Universal Robots round out the industrial side.

The crowd favorite: a Disney Research Olaf robot, trained using NVIDIA's Newton physics simulator and Isaac Lab and running on a Jetson compute module, walked onstage with Jensen. Jensen teased the future of Disneyland with AI characters roaming the park. In all, 110 robots were on display on the show floor.

NVIDIA released an open Physical AI Data Factory Blueprint for massive-scale data processing and curation.

Autonomous Vehicles: Robotaxis With Uber by 2027

Jensen declared the ChatGPT moment for self-driving cars has arrived, saying NVIDIA now knows autonomous driving works and calling autonomous vehicles "the first multitrillion-dollar robotics industry."

NVIDIA-powered robotaxis will launch with Uber across 28 cities on four continents by 2028, starting with Los Angeles and the San Francisco Bay Area in the first half of 2027. The fleet will run on NVIDIA's full-stack DRIVE AV software, tapping Alpamayo open models and the Halos operating system.

BYD, Geely, Isuzu, and Nissan (with Nissan powered by Wayve software) are building Level 4-ready vehicles on NVIDIA DRIVE Hyperion. Jensen noted these four new partners represent 18 million cars built per year, joining existing partners Mercedes, Toyota, and GM. Isuzu and TIER IV are collaborating on L4 autonomous bus development using the DRIVE AGX Thor system-on-a-chip. Ride-hailing platforms Bolt, Grab, and Lyft are also scaling robotaxi development.

NVIDIA is also collaborating with Amazon to advance Alexa Custom Assistant with multimodal edge AI on NVIDIA DRIVE AGX, enabling in-cabin AI intelligence.

NVIDIA Halos OS introduces a unified safety architecture for AI-driven vehicles. Built on ASIL D-certified DriveOS foundations, its three-layer architecture integrates safety middleware and deployable safety applications, including an NCAP five-star active safety stack. Partners joining the NVIDIA Halos AI Systems Inspection Lab include AEye, Flex, Gatik, Hesai, Lucid, MIRA, PlusAI, Qt Group, Saphira, and Valeo.

Separately, Hyundai and Kia expanded their strategic partnership with NVIDIA for next-generation autonomous driving technology, including L2+ deployment across select vehicles and L4 robotaxi innovation through their joint venture Motional.

Enterprise Partnerships: Adobe, Roche, and More

A few major enterprise deals stood out:

  • Adobe will use NVIDIA's advanced computing technology and libraries to build next-gen Firefly models with agentic creative and marketing workflows. Adobe is also building a cloud-native, brand identity-preserving 3D digital twin solution for marketing on NVIDIA Omniverse, and Firefly Foundry will integrate with NVIDIA's platform.
  • Roche is deploying over 3,500 Blackwell GPUs across hybrid cloud and on-premises environments in the U.S. and Europe, making it the largest announced GPU footprint in pharma. The infrastructure powers everything from biological foundation models to drug discovery to digital twins for manufacturing. Nearly 90% of eligible small-molecule programs at Genentech (a Roche subsidiary) already integrate AI. In one program, a degrader molecule for oncology was designed 25% faster; in another, AI delivered a backup molecule in seven months instead of over two years.
  • Industrial software giants are integrating NVIDIA tools to bring design, engineering, and manufacturing into the AI era.
  • T-Mobile is piloting NVIDIA RTX PRO 6000 Blackwell Server Edition AI infrastructure to run physical AI applications at the edge. Jensen explained the vision: every cell tower will become a robotics radio tower, reasoning about traffic and adjusting beam forming dynamically. Physical AI developers including Fogsphere, LinkerVision, Levatas, Vaidio, and Siemens Energy are building vision AI agents using the NVIDIA Metropolis Blueprint, with the City of San Jose among the first to adopt. NVIDIA also launched the RTX PRO 4500 Blackwell Server Edition, a 165W single-slot GPU delivering 100x performance for vision AI and 50x for vector databases versus CPU-only servers. Nokia will deploy it in its AI-RAN base stations, creating a distributed computing network for edge AI agents.
  • IBM is accelerating watsonx.data with NVIDIA cuDF, with Jensen highlighting a Nestlé case study: same supply chain workload, 5x faster, 83% lower cost on GPUs versus CPUs. Google Cloud saw nearly 80% cost reduction for Snapchat's data processing with NVIDIA acceleration. Dell built an AI Data Platform integrating cuDF and cuVS. Oracle announced that its Private AI Services Container can accelerate vector index creation in Oracle AI Database using cuVS.

Jensen also noted NVIDIA is bringing OpenAI to AWS, expanding both companies' reach, and that Palantir, Dell, and NVIDIA have partnered to deploy AI platforms in air-gapped environments in any country, entirely on-premises.

In semiconductor design, NVIDIA launched cuEST, a new CUDA-X library that shifts electronic-structure quantum chemistry calculations onto GPUs. Applied Materials, Samsung, Synopsys, and TSMC are initial adopters, with Samsung reporting a 5x end-to-end speedup and Synopsys seeing up to 30x acceleration for semiconductor simulation workflows.

DLSS 5: The GPT Moment for Graphics

For gamers watching GTC, DLSS 5 was the headline.

NVIDIA calls it its most significant graphics breakthrough since real-time ray tracing in 2018. Jensen described the core concept as fusing controllable 3D graphics (structured data) with generative AI (probabilistic computing). One is completely predictable, the other probabilistic yet highly realistic. Together, the content is both beautiful and controllable. He called it "the GPT moment for graphics."

Jensen added a broader prediction: this concept of fusing structured information with generative AI will repeat in industry after industry. Structured data is the foundation of trustworthy AI.

For context: since the original GeForce, NVIDIA has delivered a 375,000x increase in compute, from programmable shaders (GeForce 3, 2001) to CUDA (GeForce 8800 GTX, 2006) to real-time ray tracing (RTX 2080 Ti, 2018) to path tracing and neural shaders (RTX 5090, 2025). DLSS 5 infuses pixels with AI-generated photoreal lighting and materials in real-time 16-millisecond frames.

Publishers signed on: Bethesda, CAPCOM, Hotta Studio, NetEase, NCSOFT, S-GAME, Tencent, Ubisoft, and Warner Bros. Games. Arriving this fall.

Note: The catch with this kind of tech (until maybe now?) is that it can hallucinate too, producing weird artifacts such as your character appearing to run off in a certain direction when the prediction gets it wrong. Matt Wolfe explained this to Ray Fernando on their livestream from GTC, saying it's why gamers hate the technology; but since the companies signing on have every incentive to fix the issue, he expects it to be resolved.

Space Computing: NVIDIA Goes Orbital

And then there's space.

Jensen kept this section brief but telling: "We're going to space. We've already been out in space." He noted Thor is already radiation-approved and operating in satellites for imaging, and that NVIDIA is now working on Vera Rubin Space 1 for orbital data centers. The physics problem: in space, there's no conduction, no convection, only radiation for cooling.

The Space-1 Vera Rubin Module delivers up to 25x more AI compute than H100 for space-based inference. IGX Thor and Jetson Orin platforms handle edge computing in orbit, engineered for size-, weight-, and power-constrained environments. On the ground, NVIDIA data center platforms (including the RTX PRO 6000 Blackwell Server Edition GPU) deliver up to 100x faster geospatial intelligence processing versus legacy CPU-based batch systems.

Partners include Aetherflux, Axiom Space, Kepler Communications, Planet Labs, Sophia Space, and Starcloud.

Every Insight from Jensen's GTC 2026 Keynote

Here are our favorite moments from the stream, with timecodes.

🏭 The Era of AI Factories & New Computing Platforms

  • (0:08) Tokens as the New Commodity: AI factories are fundamentally transitioning from traditional data processing to becoming generators of "tokens"—the new building blocks of AI and digital knowledge.
    • "This is how intelligence is made. A new kind of factory, generator of tokens, the building blocks of AI. Tokens have opened a new frontier, turning data into knowledge and drawing on all we have learned."
  • (3:59) Nvidia's Trifecta of Platforms: Nvidia operates not just on chips, but across three distinct platforms: CUDA X (software), computing systems (hardware), and now end-to-end "AI Factories".
    • "Nvidia has three platforms. You think that we mostly talk about one of them. It's related to CUDA X. Our systems is another platform and now we have a new platform called AI factories. We're going to talk about all of them and most importantly we're going to talk about ecosystems."
  • (6:04) The 20-Year Bet on CUDA: The modern AI revolution stems directly from Nvidia's invention of SIMT (Single Instruction, Multi-Threaded) architecture 20 years ago, which made parallel accelerator programming radically simpler.
    • "We've been working on CUDA for 20 years. For 20 years, we've been dedicated to this architecture. This revolutionary invention, SIMT, single instruction, multi-threaded, writing scalar code could spawn off into multi-threaded application. Much much easier to program than SIMD."
  • (7:22) The Installed Base Flywheel: The ultimate moat for Nvidia is the installed base of hundreds of millions of GPUs. This massive reach attracts developers who build new breakthroughs, which in turn grows the ecosystem.
    • "It has taken us 20 years to now have built up hundreds of millions of GPUs and computing systems around the world that run CUDA. We are in every cloud. We're in every computer company. We serve just about every single industry. The installed base of CUDA is the reason why the flywheel is accelerating. The install base is what attracts developers who then creates new algorithms that achieves a breakthrough."
  • (9:05) Appreciating Compute Value: Because of continuous software updates and the vast library ecosystem, older hardware (like Ampere GPUs from six years ago) actually sees its pricing go up in the cloud due to extended useful life and constant optimizations.
    • "It is also one of the reasons why Ampere that we shipped them some six years ago the pricing of Ampere in the cloud is going up. And so all of that is made possible fundamentally because the install base is high, the flywheel is high, the developer reach is great. And when all of that happens and we continuously update our software, the computing cost declines."
  • (10:42) GeForce as Trojan Horse: The gaming brand GeForce was essentially Nvidia's 25-year marketing campaign, subsidized by parents, which eventually turned young gamers into the developers and researchers who built modern AI.
    • "GeForce is Nvidia's greatest marketing campaign. We attract future customers starting long before you could afford to pay for it yourself. Your parents paid for you to be Nvidia customers. And every single year they paid up year after year after year until someday you became an amazing computer scientist and became a proper customer, a proper developer."
  • (13:33) Neural Rendering (DLSS 5): The future of graphics is a hybrid approach. It fuses the deterministic, structured ground truth of 3D data with probabilistic generative AI, resulting in highly controllable yet breathtakingly realistic virtual worlds.
    • "This is our next generation of graphics technology. We call it neural rendering. The fusion, the fusion of 3D graphics and artificial intelligence. This is DLSS 5... We fused controllable 3D graphics. The ground truth of virtual worlds, the structured data... We combine 3D graphics, structured data with generative AI..."
  • (16:54) The AI Database Shift: Traditional structured data (SQL, Pandas, Dataframes) is the ground truth of enterprise IT. Soon, AI agents will process this data at speeds exponentially faster than humans ever could, requiring immense database acceleration.
    • "All of these platforms are processing data frames. These data frames are giant spreadsheets and they hold all of life's information. This is the structured data, the ground truth of business. This is the ground truth of enterprise computing. Well, now we're going to have AI use structured data and we better accelerate the living daylights out of it... AI is going to be much much faster than us. Future agents are going to use structured databases as well."
  • (18:14) Unlocking Unstructured Data: 90% of the world's generated data (PDFs, videos, speeches) is unstructured and was historically "useless" for querying. Multimodal AI perception is now making this data indexable and searchable.
    • "Vector databases, unstructured data, PDFs, videos, speeches, all of the world's information. About 90% of what's generated every single year is unstructured data. Until now, this data has been completely useless to the world. We read it, we put it into our file system, and that's it. Unfortunately, we can't query it. We can't search for it... You have to understand its meaning..."
  • (19:17) Foundational Data Libraries: cuDF (for structured dataframes) and cuVS (for unstructured vector stores) are positioned to become the two most vital data processing platforms of the future.
    • "NVIDIA created two foundational libraries. Just like we created RTX for 3D graphics, we created cuDF for data frames, structured data. We created cuVS for vector stores, semantic data, unstructured data, AI data. These two platforms are going to be two of the most important platforms in the future."
  • (23:07) The Death of Moore's Law: Traditional CPU scaling has run out of steam. Accelerated computing combined with continuous algorithmic optimization is the only path forward to simultaneously increase scale and drive down computing costs.
    • "It was originally called Moore's law. Moore's law was about getting performance doubling every couple of years... Well, Moore's law has run out of steam. We need a new approach. Accelerated computing allows us to take these giant leaps forward..."
  • (27:18) Confidential Computing: A critical bottleneck for enterprise AI deployment is security. Modern GPUs now feature confidential computing, ensuring that not even the cloud operator or hardware owner can peek at the proprietary AI models or underlying data.
    • "That in confidential computing, you want to make sure that even the operator cannot see your data. Even the operator cannot touch or see your models. Confidential computing. Nvidia's GPUs is the first ones in the world to do that. It's now able to support confidential computing and protected deployment of these very valuable open AI models..."
  • (30:26) "Application Acceleration": Computing speedups no longer come from generalized hardware bumps. They come from vertically integrated, domain-specific software libraries (CUDA-X) tailored to specific industries.
    • "Accelerated computing is not a chip problem. Accelerated computing is not a systems problem. Accelerated computing has a missing word. We just never say it anymore. Application acceleration... The only way for us to accelerate applications going forward and continue to bring tremendous speed up, tremendous cost reduction is through application or domain specific acceleration."
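
To make "you have to understand its meaning" concrete, here is a minimal sketch of querying unstructured files by meaning rather than by filename. It is a toy brute-force cosine-similarity search in plain Python, not the cuVS API; the file names, the three-dimensional embeddings, and the `search` helper are all hypothetical stand-ins for what an embedding model and a GPU vector store would provide.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical embeddings: in practice a model would produce these
# from the PDF, video, or audio content itself.
DOCS = {
    "q3_earnings.pdf": [0.9, 0.1, 0.0],
    "keynote_video.mp4": [0.1, 0.9, 0.2],
    "support_call.wav": [0.0, 0.3, 0.9],
}

def search(query_vec, docs, k=2):
    """Return the k document names whose embeddings are closest to the query."""
    ranked = sorted(docs, key=lambda name: cosine(query_vec, docs[name]), reverse=True)
    return ranked[:k]

if __name__ == "__main__":
    # A query embedding that points mostly along the first dimension.
    print(search([1.0, 0.0, 0.0], DOCS, k=2))
```

A production system would swap the Python loop for an approximate nearest-neighbor index, which is the kind of work a library like cuVS is positioned to accelerate.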

🌐 Telecom, Startups & The Inference Inflection

  • (36:44) Reinventing Telecom (AI-RAN): The world's cellular base stations, which currently only perform radio transmissions, will be completely transformed into decentralized edge AI computing platforms.
    • "The reason for that is very simple. That base station which is it does one thing which is base station is going to be an AI infrastructure platform in the future. AI will run at the edge."
  • (44:12) The $150B AI Startup Boom: The massive spike in VC funding for "AI Natives" marks the first time in history an entire generation of startups fundamentally requires raw physical infrastructure (compute/tokens) to exist.
    • "$150 billion dollars of investment into venture investment into startups, the largest in human history. This is also the first time that the scale of the investments went from millions of dollars, tens of millions of dollars to hundreds of millions of dollars and billions of dollars. And the reason for that is this is the first time in history that every single one of these companies needs compute and lots and lots of it."
  • (46:21) The Three Pillars of the AI Revolution: The timeline moved rapidly from ChatGPT (Generative AI), to OpenAI o1 (Reasoning AI), to Claude Code (Agentic AI performing complex software execution).
    • "ChatGPT of course started the generative AI era... the next reasoning AI o1 which and then took off with o3 reasoning allowed it to reflect, allows it to think to itself, allowed it to plan... then came Claude Code the first agentic model. It was able to read files, code, compile it, test it, evaluate it, go back and iterate on it."
  • (47:01) Retrieval to Generation: The computing paradigm has permanently shifted. Computers used to simply retrieve stored information; now they generate it dynamically.
    • "It's not it's generative AI is a capability of software but it has profoundly changed how computing is done. Computing used to be retrieval based now it's generative."
  • (50:47) The Inference Inflection: AI is no longer just about training models. AI now must "think," "reason," and "do," which means inference computation demand has skyrocketed by a factor of 10,000x in two years.
    • "AI now has to think. In order to think, it has to inference. AI now has to do. In order to do, it has to inference. AI has to read. In order to do so, it has to inference. It has to reason. It has to inference. Every part of AI every time it has to think it has to reason it has to do it has to generate tokens it has to inference... the inference inflection has arrived at the time when the amount of tokens the amount of compute necessary increased by roughly 10,000 times..."
  • (53:45) The Trillion-Dollar Forecast: Based on current confidence and purchase orders, demand for Blackwell and Rubin AI infrastructure is projected to reach at least $1 trillion through 2027.
    • "Well, I'm here to tell you that right now where I stand, a few short months after GTC DC, one year after last GTC, right here where I stand, I see through 2027 at least $1 trillion. Now, does it make any sense?"
  • (58:30) AI Platform Resilience: The diversity and broad reach of AI across industries is what makes it resilient—it's not a single-app technology, but a fundamental computing platform shift.
    • "The diversity of AI is also its resilience. The span of reach of AI is its resilience. There is no question this is not a one app technology. This is now fundamental. This is absolutely a new computing platform shift."

⚡ Hardware Masterclass: Blackwell, Rubin & Groq

  • (1:00:06) NVFP4 Precision: Moving to 4-bit floating-point (FP4) Tensor Cores allows for massive AI inference execution without precision loss, driving extreme boosts in performance per watt.
    • "NVFP4 not just FP4 precision FP4 is a whole different type of tensor core and computational unit. We've demonstrated now that we can inference NVFP4 without loss of precision but gigantic boost in performance and energy efficiency."
  • (1:01:21) The "Tokens Per Watt" Metric: AI factories (data centers) are rigidly constrained by physical power limits (e.g., 1 gigawatt). Therefore, optimizing "tokens per watt" is the single most important factor for an AI company's revenue.
    • "Tokens per watt is important because every data center every single factory by definition is power constrained. A one gigawatt factory will never become two. It's physically constrained the laws of atoms, the laws of physicality. And so that one gigawatt of data center you want to drive the maximum number of tokens which is the production the product of that factory."
  • (1:03:36) Shattering Historical Scaling: While Moore's Law might expect a 1.5x performance bump between generations, the jump from Hopper to Blackwell NVLink 72 achieved an unprecedented 35x to 50x leap in performance per watt.
    • "You would have expected from Hopper H200 one and a half times higher. Nobody would have expected 35 times higher. I said last year at this time that Nvidia's Grace Blackwell NVLink 72 was 35 times perf per watt... Dylan Patel had a quote. He accused me of sandbagging. He says, 'Jensen sandbagged. It's actually 50 times.' And he's not wrong."
  • (1:07:42) CEOs as Token Factory Operators: Every company will soon think of itself as running a "token factory," where the efficiency of intelligence production becomes a core business metric.
    • "Every single CSP, every single computer company, every single cloud company, every single AI company, every single company period are going to be thinking about their token factory effectiveness. This is your factory in the future. And the reason why I know that is because everybody in this room is powered by intelligence. And in the future, that intelligence will be augmented by tokens."
  • (1:10:21) Vera Rubin's Agentic Design: The new Vera Rubin architecture was built specifically for Agentic AI, which pounds heavily on memory (KV Cache, structured/unstructured databases) and requires extreme single-threaded CPU speed for tool usage.
    • "Vera Rubin NVLink 72: 3.6 exaflops of compute, 260 TB per second of all-to-all NVLink bandwidth, the engine supercharging the era of Agentic AI. The Vera CPU rack designed for orchestration and agentic workflows. The STX rack AI native storage built with BlueField 4."
  • (1:11:04) The Groq Integration: Adding Groq's LPU (a deterministic token accelerator with massive SRAM) to the Vera Rubin stack acts as an extreme turbocharger for token generation. NVIDIA says the combined system delivers 35x more throughput per megawatt.
    • "And an incredible new addition, the Groq 3 LPX rack. Tightly connected to Vera Rubin, Groq's LPU's massive on chip SRAM, a token accelerator to the already incredibly fast Vera Rubin. Together, 35 times more throughput per megawatt."
  • (1:14:12) 100% Liquid Cooling Shift: Vera Rubin systems are fully liquid-cooled using 45°C hot water. This relieves immense cooling pressure from the data center infrastructure, transferring that energy allowance directly to computing power.
    • "Notice since the last time 100% liquid cooled. All of the cables gone. What used to take what used to take 2 days to install now takes two hours... This is also a supercomputer that is cooled by hot water 45° which takes the pressure off of the data center takes all of that cost and all of that energy that's used to cool the data center and makes it available for the system."
  • (1:15:35) Co-Packaged Optics (CPO): Using Spectrum-X, Nvidia is translating electrons to photons directly on the silicon. This drastically reduces data center switching latency and power consumption.
    • "The world's first CPO Spectrum X switch. This is also in full production. Co-packaged optics. Optics comes directly onto this chip, interfaces directly to silicon. Electrons gets translated to photons and it gets directly directly connected to this chip. We invented the process technology with TSMC."
  • (1:22:20) Token Economics & Stratification: Tokens will be tiered like traditional SaaS products: Free tiers will utilize high throughput/lower speeds, while premium tiers ($150+ per million tokens) will offer immense context windows and extreme speeds for critical research.
    • "Tokens are the new commodity and like all commodities once it reaches an inflection once it becomes mature or becomes maturing it will segment into different parts... This one is free. It's a free tier. The first tier could be $3 per million tokens. The next tier could be $6 per million tokens... And maybe one day there'll be a premium model that allows you a premium service that allows you to generate token speeds that are incredibly high because you're in a critical path or maybe you're doing really long research and $150 per million tokens is just not a thing."
  • (1:31:28) Disaggregated Inference (Dynamo): To play to each chip's strengths, Nvidia's Dynamo OS splits the inference workload, keeping the high-throughput work on Vera Rubin GPUs and offloading the low-latency, bandwidth-limited decode (token generation) to Groq LPUs.
    • "What if we disaggregated inference altogether with a piece of software called Dynamo? What if we rearchitected the way that inference is done in the pipeline? So that we could put the work that makes perfect sense on Vera Rubin and then offload the decode generation the low latency the bandwidth limited challenged part of the workload for Groq and so we united unified two processors of extreme differences one for high throughput one for low latency..."
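
The disaggregation idea can be sketched in a few lines. This is a minimal illustration of splitting one request into a prefill phase and a decode phase served by two different worker pools; it is not Dynamo's API. The `Worker` class, the `prefill`, `decode`, and `serve` functions, and the placeholder tokens are all hypothetical, and the "KV cache" is just a Python list standing in for the real attention cache.

```python
from dataclasses import dataclass, field

@dataclass
class Worker:
    """A stand-in for a pool of accelerators; `log` records the work it did."""
    name: str
    log: list = field(default_factory=list)

def prefill(worker, prompt_tokens):
    """Process the whole prompt in one pass and return a stand-in KV cache."""
    worker.log.append(("prefill", len(prompt_tokens)))
    return list(prompt_tokens)  # placeholder for the real key/value cache

def decode(worker, kv_cache, max_new_tokens):
    """Generate tokens one at a time, appending each to the growing cache."""
    generated = []
    for step in range(max_new_tokens):
        token = f"tok{step}"  # placeholder for a sampled token
        kv_cache.append(token)
        generated.append(token)
    worker.log.append(("decode", len(generated)))
    return generated

def serve(prompt_tokens, max_new_tokens, prefill_worker, decode_worker):
    """Run prefill on one pool, hand the cache over, and decode on the other."""
    kv_cache = prefill(prefill_worker, prompt_tokens)
    return decode(decode_worker, kv_cache, max_new_tokens)

if __name__ == "__main__":
    throughput_pool = Worker("throughput-pool")    # the GPU role in the keynote
    low_latency_pool = Worker("low-latency-pool")  # the LPU role in the keynote
    print(serve(["what", "is", "gtc"], 4, throughput_pool, low_latency_pool))
    print(throughput_pool.log, low_latency_pool.log)
```

The design point is the handoff: once the cache exists, the step-by-step generation loop can run on hardware tuned for latency while the first pool moves on to the next prompt.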

🏭 Digital Twins, Space Data Centers & The Agentic OS

  • (1:41:37) Nvidia DSX (Data Center Digital Twins): Squandered data center power is lost revenue. Nvidia uses Omniverse to build physically accurate simulations of data center thermals, networking, and grids to squeeze maximum efficiency from a gigawatt facility before it's even physically built.
    • "We created Omniverse and the Omniverse DSX world a platform where all of us can meet and design these gigafactories... We have simulation systems for the racks for mechanical, thermal, electrical, networking... And then inside the data center using Max Q so that we could adjust the system dynamically across power and cooling and all of the different technologies we all work on together so that we leave no power squandered..."
  • (1:46:17) Data Centers in Space: Nvidia is actively developing Vera Rubin Space 1 for orbital data centers, facing unique physical challenges (e.g., zero convection/conduction cooling, relying solely on radiation in a vacuum).
    • "We're going to space. We've already been out in space. Thor is radiation approved and we're in satellites... In the future, we'll also build data centers in space. Obviously very complicated to do. So we're working with our partners on a new computer called Vera Rubin Space 1 and it's going to go out to space and start data centers out in space. Now, of course, in space, there's no conduction, there's no convection, there's just radiation."
  • (1:47:44) The OpenClaw Phenomenon: Jensen called the OpenClaw framework the most popular open-source project in history (eclipsing 30 years of Linux in weeks) and framed it as the first true operating system for AI agents.
    • "OpenClaw is the number one. It's the most popular open source project in the history of humanity and it did so in just a few weeks. It exceeded what Linux did in 30 years. And it's that important. It is that important."
  • (1:52:54) The "OpenClaw Strategy" Imperative: Just as companies had to adapt to HTTP, Linux, and Kubernetes, every tech CEO now must have a designated strategy for integrating agentic operating systems.
    • "Every single company now realize every single company, every single software company, every single technology company for the CEOs, the question is what's your OpenClaw strategy? Just as we need to all have a Linux strategy, we all needed to have a HTTP HTML strategy... Every company in the world today needs to have an OpenClaw strategy, an agentic system strategy."
  • (1:54:45) SaaS becomes AaaS: The traditional SaaS industry will completely pivot. Every software company will soon become an "AaaS" (Agentic-as-a-Service) provider.
    • "Every single SaaS company will become an AaaS company. No question about it. Every single SaaS company will become an AaaS company, an agentic as a service company. And what's amazing is this. You now OpenClaw gave us gave the industry exactly what it needed at exactly the time."
  • (1:55:21) The Enterprise Agent Security Risk: Agents that can reason, access sensitive enterprise data, and communicate externally pose a massive security threat. This requires strict governance tools like OpenShell and NemoClaw to enforce policy guardrails.
    • "Agentic systems in the corporate network can have access to sensitive information. It can execute code and it can communicate externally. Just say that out loud. Okay, think about it. Access sensitive information, execute code, communicate externally. You could of course access employee information, access supply chain, access finance information, sensitive information and send it out, communicate externally. Obviously, this can't possibly be allowed."
  • (2:04:21) Token Budgets for Employees: In the near future, an annual "token budget" will become a standard part of corporate compensation and recruiting, empowering engineers to act as 10x amplifiers of their own productivity.
    • "I could totally imagine in the future every single engineer in our company will need an annual token budget. They're going to make a few hundred thousand a year their base pay. I'm going to give them probably half of that on top of it as tokens so that they could be amplified 10x. Of course, we would. It is now one of the recruiting tools in Silicon Valley. How many tokens comes along with my job."
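
The security point in the 1:55:21 item comes down to one rule: no single agent should hold all three capabilities (read sensitive data, execute code, communicate externally) at once. Here is a minimal sketch of that rule as a policy check. It is an illustration only, not how OpenShell or NemoClaw works; the capability names and the `allowed` function are hypothetical.

```python
# The three capabilities Jensen lists; holding all of them together is the risk.
RISKY_COMBO = frozenset({"read_sensitive", "execute_code", "send_external"})

def allowed(granted, requested):
    """Return True if granting `requested` still leaves the agent short of the full risky set."""
    would_hold = set(granted) | {requested}
    return not RISKY_COMBO <= would_hold

if __name__ == "__main__":
    session = {"read_sensitive", "execute_code"}
    for capability in ("browse_docs", "send_external"):
        verdict = "allow" if allowed(session, capability) else "deny"
        print(f"{capability}: {verdict}")
```

A real guardrail would also need to track data flow between agents, since two cooperating agents could each hold part of the set.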

🤖 Physical AI & The Robotics Revolution

  • (2:06:04) Physical AI Needs Virtual Worlds: Real-world robotic data is too messy, dangerous, and slow to collect. Generative physical AI relies almost entirely on synthetic data generated inside simulated Omniverse environments.
    • "Agents as you know perceive, reason and act. Most of the agents in the world today that I've spoken about are digital agents. They act in the digital world. They reason. They write software. It's all digital. But we also have been working on physically embodied agents for a long time. We call them robots. And the AIs that they need are physical AIs."
  • (2:07:16) The Robo-Taxi "ChatGPT Moment": Autonomous driving has reached an inflection point where vehicles can now explicitly reason and narrate their decisions. Nvidia is partnering with giants like BYD, Hyundai, Nissan, and Uber to deploy these fleets at scale.
    • "As you know, we've been working on self-driving cars for a long time. The ChatGPT moment of self-driving cars has arrived. We now know we could successfully autonomously drive cars. And today we are announcing four new partners for Nvidia's robotaxi-ready platform. BYD, Hyundai, Nissan, Geely all together... And we're announcing also a big partnership with Uber."
  • (2:10:21) "Compute is Data" for Robotics: Because edge cases in the physical world are infinite, developers are using continuous GPU-accelerated differentiable physics engines (Newton) to generate post-training robot data indefinitely.
    • "Around the world, developers are building robots of every kind. But the real world is massively diverse, unpredictable, full of edge cases. Real world data will never be enough to train for every scenario. We need data generated from AI and simulation. For robots, compute is data."
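
"Compute is data" can be illustrated with a toy simulator that turns randomized physical parameters into labeled training samples. This is a minimal sketch under simple assumptions: a block sliding to a stop under constant friction, with made-up parameter ranges. It does not use Newton or Omniverse, and it is not differentiable; the `stopping_distance` and `generate_samples` names are hypothetical.

```python
import random

G = 9.81  # gravitational acceleration in m/s^2

def stopping_distance(speed, friction):
    """Distance a sliding block travels before friction stops it: v^2 / (2 * mu * g)."""
    return speed ** 2 / (2 * friction * G)

def generate_samples(n, seed=0):
    """Produce n synthetic (speed, friction, distance) samples with randomized parameters."""
    rng = random.Random(seed)  # seeded so a dataset can be regenerated exactly
    samples = []
    for _ in range(n):
        speed = rng.uniform(0.5, 3.0)     # m/s, assumed range
        friction = rng.uniform(0.2, 0.9)  # friction coefficient, assumed range
        samples.append({
            "speed": speed,
            "friction": friction,
            "distance": stopping_distance(speed, friction),
        })
    return samples

if __name__ == "__main__":
    for sample in generate_samples(3):
        print(sample)
```

Every extra second of compute yields more samples, which is the sense in which simulation capacity substitutes for real-world data collection.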

What This All Means

The sheer volume of GTC 2026 announcements is a strategy in itself. NVIDIA is building the infrastructure under every AI race simultaneously.

The Vera Rubin platform is the obvious headline for chip junkies, but the strategic story is about software and ecosystems. Dynamo 1.0 arrives as an open source inference OS. NemoClaw, which positions OpenClaw as the operating system for personal AI, is the main takeaway for normies. The Nemotron Coalition building open frontier models is a vital move for the future of affordable, open software. Finally, the robotics partnerships spanning the four largest industrial robot companies in the world are nothing to sneeze at either.

All that said, Jensen's trillion-dollar demand forecast through 2027 is the number that will get the most attention. But the more revealing signal was his framing of token economics: a future where every company is a token manufacturer, where SaaS becomes AaaS, where engineers get annual token budgets alongside their base pay, and where the diversity of AI applications is itself the source of the platform's resilience.
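
To see why "tokens per watt" maps so directly onto revenue, here is a back-of-the-envelope sketch. It assumes a power-capped facility that runs all year and sells every token it produces. The $0, $3, $6, and $150 price points come from Jensen's tier example; the efficiency figure and the tier mix are made-up illustration values, and `annual_revenue` is a hypothetical helper.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def annual_revenue(power_watts, tokens_per_sec_per_watt, tier_mix):
    """Yearly revenue in dollars for a power-capped token factory.

    tier_mix is a list of (share_of_tokens, dollars_per_million_tokens)
    pairs whose shares are assumed to sum to 1.
    """
    tokens_per_year = power_watts * tokens_per_sec_per_watt * SECONDS_PER_YEAR
    blended_price = sum(share * price for share, price in tier_mix)
    return tokens_per_year / 1e6 * blended_price

if __name__ == "__main__":
    # Hypothetical mix: mostly free traffic, a thin premium slice.
    mix = [(0.60, 0.0), (0.25, 3.0), (0.14, 6.0), (0.01, 150.0)]
    one_gigawatt = 1e9
    for efficiency in (1.0, 2.0):  # tokens per second per watt, assumed
        print(efficiency, annual_revenue(one_gigawatt, efficiency, mix))
```

Because the power term is fixed, the only levers left are efficiency and the price mix, which is why doubling tokens per watt doubles revenue in this model.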

NVIDIA makes the best chips. The question now is whether this ecosystem lock-in becomes so deep that alternatives become impractical. When Anthropic, OpenAI, Roche, BYD, Adobe, Disney, and all four major cloud providers are building on your stack simultaneously, you're vital infrastructure. And when you can get both Dario and Sam to put endorsement quotes in the same press release, you're basically the Geneva Convention of the AI world.

The next 12 months will tell us whether the Vera Rubin platform delivers on its 10x claims in production, whether Dynamo simplifies inference at scale, and whether those robotaxi timelines hold. If even half of today's announcements ship on schedule, NVIDIA will have made the case that the AI buildout is still accelerating.

And they're putting chips in space now, so there's that.

Grant Harvey

Grant Harvey is the Lead Writer of The Neuron, where he continues to lead the publication's daily coverage of AI news, tools, and trends.

Property of TechnologyAdvice. © 2026 TechnologyAdvice. All Rights Reserved