China Just Dropped a Pair of 1-Trillion-Parameter AI Models in One Weekend (While OpenAI Stays Silent)

Over the weekend, Chinese AI giants Alibaba and Moonshot AI both released 1-trillion-parameter models — Qwen3-Max-Preview and Kimi K2-Instruct — that claim GPT-4 level performance while remaining accessible without enterprise barriers.

The weekend brought a surprise AI arms race between Chinese labs, with both Alibaba and Moonshot AI releasing trillion-parameter models that claim to rival or beat Western competitors.

First up: Alibaba previewed Qwen3-Max-Preview, their biggest model yet with over 1 trillion parameters (that’s a lotta params, fam). According to their benchmarks, it beats their previous heavyweight Qwen3-235B-A22B-2507 across the board — better conversations, stronger instruction following, and improved agentic tasks.

You can try it yourself via Qwen Chat or the Alibaba Cloud playground (heads up: they may train on whatever data you share in the playground). It's also live on OpenRouter for those who prefer a unified interface. Pricing is tiered by context length, in USD per million input tokens: $0.861 for the first 32K tokens, $1.434 for 32K-128K, and $2.151 for 128K-252K.
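
To make the tiers concrete, here's a tiny sketch of how that pricing schedule works out in practice. Assumptions flagged loudly: we're reading the rates as USD per million input tokens, and assuming a whole request is billed at the rate of whichever tier its total context length lands in (Alibaba's billing may differ; check their docs).

```python
# Hypothetical cost estimator for Qwen3-Max-Preview's tiered input pricing.
# ASSUMPTIONS: rates are USD per million input tokens, and the entire
# request is billed at the rate of the tier its total length falls into.
TIERS = [
    (32_000, 0.861),    # up to 32K tokens
    (128_000, 1.434),   # 32K-128K
    (252_000, 2.151),   # 128K-252K (assumed max context)
]

def input_cost(tokens: int) -> float:
    """Estimated USD cost for a request with `tokens` input tokens."""
    for limit, rate_per_million in TIERS:
        if tokens <= limit:
            return tokens * rate_per_million / 1_000_000
    raise ValueError("context longer than the 252K maximum")

# A short chat vs. a near-max-context request:
print(f"10K tokens:  ${input_cost(10_000):.4f}")
print(f"200K tokens: ${input_cost(200_000):.4f}")
```

The takeaway: short requests are cheap, but stuffing the context window costs you at a higher per-token rate, not just more tokens.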

The catch?

A couple, actually: unlike some competitors, Qwen3-Max has no “thinking” mode. Also, as Simon Willison noted, it's not open weights; it's only available through their chat app and paid API.

Early demos show it handling complex visual tasks like one-shotting a voxel pagoda garden. And when Willison tested it with his O.G. prompt, “Generate an SVG of a pelican riding a bicycle,” the results were... interesting.

Susan Zhang pointed out some hallucination issues, too: “it certainly hallucinates extensive thinking traces... mixing a bunch of search results that don't seem consistent with one another.”

Meanwhile, Moonshot AI wasn't sitting idle.

They released Kimi K2-Instruct-0905, a mixture-of-experts model that activates 32 billion parameters per token out of 1 trillion total.

Kimi's upgrades:

  • Context window extended from 128k to 256k tokens.
  • Enhanced frontend coding capabilities.
  • Improved tool-calling accuracy (claims 100% on their turbo API).
  • Better integration with coding agents like Claude Code and Roo Code.
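
That tool-calling bullet refers to OpenAI-style function calling, which is how coding agents like Claude Code and Roo Code hand the model real tools. Here's a minimal sketch of what a tool-equipped request payload looks like in that schema; the model slug and the `run_shell` tool are made up for illustration, so check Moonshot's API docs for the real identifiers.

```python
import json

# Minimal sketch of an OpenAI-style tool-calling request payload.
# ASSUMPTIONS: the model slug below and the `run_shell` tool are
# hypothetical examples, not confirmed identifiers from Moonshot.
def build_request(user_prompt: str) -> dict:
    """Build a chat request that offers the model one callable tool."""
    return {
        "model": "moonshotai/kimi-k2-instruct-0905",  # assumed slug
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "run_shell",  # hypothetical tool for illustration
                "description": "Run a shell command and return its stdout.",
                "parameters": {
                    "type": "object",
                    "properties": {"cmd": {"type": "string"}},
                    "required": ["cmd"],
                },
            },
        }],
    }

payload = build_request("List the files in the repo")
print(json.dumps(payload, indent=2))
```

The "tool-calling accuracy" claim is about how reliably the model emits a well-formed call to a schema like this instead of free-text guessing, which is exactly what agent harnesses depend on.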

The benchmarks are impressive: 69.2% on SWE-bench Verified, 55.9% on SWE-bench Multilingual, and 44.5% on Terminal-Bench. But then again, benchmarks, what are they good for? It’s all about the vibes, man! And the vibes on Kimi have been vibin’…

Try Kimi K2 at kimi.com or grab the weights from Hugging Face. For blazing-fast inference at 60-100 tokens per second, check out their turbo API.

Chubby on X says: “Scaling works—and the official release will surprise you even more.”

Our take: 

Where you at, OpenAI? Not ready to insta-drop a model to top these new open source competitors? Gone, it seems, are the days of labs flip-flop model-topping each other to swipe one another's good news... whatever else that tells you about the state of the industry, it shows we are certainly in a new phase.

So while the big US AI labs are optimizing their user experiences, the Chinese labs are pushing boundaries with trillion-parameter models that challenge Western dominance. Paging Google DeepMind… we need you! Release the Gemini 3 Pro! We need a Bat signal for Logan Kilpatrick and Demis Hassabis… maybe like a giant (nano) banana??


See you cool cats on X!

Get your brand in front of 550,000+ professionals here
www.theneuron.ai/newsletter/

Get the latest AI right in Your Inbox

Join 550,000+ professionals from top companies like Disney, Apple and Tesla. 100% Free.