😺 OpenAI vs Claude vs Google new AI

PLUS: GPT oss, Genie 3, Opus 4.1, and more...

Welcome, humans.

Tuesday felt like Christmas morning for AI nerds. Just about every major company dropped new AI products—all on the same day.

Why the sudden rush? Everyone's trying to steal thunder from OpenAI's GPT-5, which is expected to launch Thursday.

It's like Anthropic and Google showed up to OpenAI’s wedding wearing white just to upstage the bride, but then Becky-Plus-One and Becky-Plus-Two basically got up on stage and read their own vows. TBH, based on today’s split coverage, it worked…

If this was Tuesday, what's Thursday gonna look like?

Here’s what happened in AI today:

OpenAI released gpt-oss, its first open-source model since GPT-2.
Claude release Opus 4.1, an all-around upgrade for Claude 4 Opus.
Gemini previewed Genie 3, its new next-gen “world model.”
ElevenLabs released a model for music.

Open source, open worlds, and Opus: OpenAI released gpt-oss, Claude released Opus 4.1, and Google previewed Genie 3.

DEEP DIVE: The AI Trinity Just Dropped: Open Source, Coding Genius, and World Building (read the full story here!).

Well, we didn't get GPT-5, but we got something arguably better: three game-changing releases from AI's biggest players, each flexing in their own lane.

First up, an OpenAI plot twist: They actually released some new open AI models (shocking, we know) that anyone can download (Github, HuggingFace) or try on the cloud here.

Real quick: WTF is an “open” model? See, when AI models are “open source,” it means you can run them on your own computer instead of sending your data to someone else's servers. It’s like running Microsoft Word vs using Google Docs; meaning your private documents, health data, or business secrets never leave your device.

Here’s the deets:

Two models dropped—gpt-oss-120b (big, think 60GB RAM) and gpt-oss-20b (small, think ~12GB RAM)—both with Apache 2.0 licensing (aka use them however you want).
The big deal? The smaller one runs on regular laptops (using Ollama, or our pick, LM Studio) or even a powerful phone.
They feature adjustable reasoning levels (low/medium/high), built-in web browsing, and Python execution.
Within hours, gpt-oss became #1 trending on HuggingFace.
The prevailing sentiment? “OpenAI is so back.”

Sam Altman emphasized OpenAI spent billions developing this tech and are giving it away free because “far more good than bad will come from it.” This was somewhat echoed by the reactions, but our favorite take was probably this one: “ClosedAI officially became SemiClosedAI today.”

Anthropic's Stealth Drop: While everyone was distracted, Anthropic quietly shipped Claude Opus 4.1—possibly the most capable coding model available today.

It hit 74.5% on SWE-bench Verified (state-of-the-art for fixing real GitHub issues).
GitHub itself says multi-file refactoring shows “the biggest gains.”
Rakuten reported surgical precision: it finds bugs, fixes only what's broken, and doesn't break anything new. Any vibe-coders will tell you, that’s big if true.

The catch? At $75 per million output tokens, it costs 5x more than Sonnet 4 (but that’s the same as current Opus 4). You'll basically be paying for an AI that debugs like a senior engineer… up to you if that’s worth it!

Google's Reality Simulator: Last (but deffo not least), Google DeepMind premiered Genie 3, its latest “world model” that creates fully interactive 3D worlds from text descriptions: 24fps, explorable for minutes, with perfect consistency.

Want sudden rain? Different weather? Just prompt it. No 3D modeling needed, no deterministic controls, just prompt and go. The wildest part? They achieved this leap from Genie 2 in just eight months.

As Matt Berman said, “video games will never be the same” (check his full video on Genie 3 here). But don’t worry game devs, it’ll be a loooong time before this can run consistently for a full 52 hour AAA game.

FROM OUR PARTNERS

Your keyboard called—it’s taking a long weekend ⌨️✈️

We’ve been banging away on keyboards for 150 years. Until now, voice dictation hasn’t been reliable enough to change that.

Wispr Flow finally delivers the no-edit confidence we’ve all been waiting for:

4× quicker than typing. Dictate emails, docs, and DMs in real time and save precious hours every week.
AI auto-edits on the fly. Flow cleans filler words, fixes grammar, and formats perfectly as you speak.
Works inside every app with no setup. Fly through Slack notifications, give more context to ChatGPT, or brain dump into Notion.
Use it at your desk or on the go. Available on Mac, Windows, and iPhone.

“This is the best AI product I’ve used since ChatGPT.”
— Rahul Vohra, CEO, Superhuman

Give your hands a break ➜ start flowing for free today.

Prompt Tip of the Day.

OpenAI's new open-weight gpt-oss models come with a dead-simple prompting hack: just add “Reasoning: high” to unlock deep thinking mode, or use “reasoning: low” for faster responses when you don't need the full analysis (“Reasoning: medium” is the balanced version, which is on by default). Here’s how that’s handled in LM Studio.

These models separate their outputs into channels: “analysis” shows raw chain-of-thought, while “final” contains the polished answer.

So when you prompt with high reasoning, you literally see the model working through the problem step-by-step before answering.

Developers, here’s some additional insight: First of all, HuggingFace has a guide to working with gpt-oss. Secondly, you’ll need to use the harmony response format for proper prompt formatting. OpenAI demonstrates what that looks like below:

OpenAI says this structure is needed to get the oss models to output to multiple “channels” for chain of thought, tool calling, and regular responses.

They open-sourced the Harmony renderer for this purpose, but this guide walks through how to use this if you’re going to try to spin this up on your own (and not through an API provider or via Ollama or LM Studio). Oh, and if you wanna fine-tune this model yourself, here’s OpenAI’s guide for that, too.

Treats to Try.

*Asterisk = from our partners. Advertise in The Neuron here.

*Luma AI turns your text into videos and lets you completely restyle any video's background.
Eleven Labs Music generates complete songs, like “dreamy psychedelic indie rock with reverb-soaked vocals” or “1950s crooner with vinyl crackle” and gives you full control over genre, style, structure, and lyrics in multiple languages.
NotebookLM released bulk URL uploads, so you can upload multiple links of context into the app to help you study as possible.
Happenstance searches your entire network (email, Twitter, LinkedIn contacts) using natural language, so instead of scrolling through hundreds of LinkedIn connections, you just type "fintech PM in NYC" and it instantly finds matches.
Shopify’s new commerce tools let you add shopping and checkout directly into your AI agents and chatbots.
Gemini now has “Storybook” which turns any topic into a custom illustrated storybook with narration (desktop only right now).

Around the Horn.

Apparently some guy (HE2) was hired by AI agents so they could use his likeness to promote deepfake videos of him. Wild story… looks like he just got a raise, too! Even if this is just a bit, we LOVE it.

Amazon is now hosting OpenAI’s two new open source models, which is the first time (maybe ever?) OpenAI models have been available on AWS.
OpenAI and the NYT are going back and forth over whether OpenAI will need to hand over 20M private user chats or 120M private user chat as part of the NYT’s lawsuit.
The US State of Illinois banned AI from acting like a therapist, restricting AI from directly providing therapy or making therapeutic decisions, and requiring licensed professional oversight and patient consent (read the bill).

Midweek Wisdom.

Can large language models identify fonts? Max Halford says “not really.”
Blood in the Machine writes that the AI bubble is “so big it’s propping up the US economy.”
A newly proposed datacenter in Wyoming could potentially consume over 5x more power than all the state’s homes combined.
Seva Gunitsky argues “facts will not save” us, and that Historian and Translation roles might be the first jobs to get fully automated, but only because the “interpretative element of their labor” goes under-appreciated and gets dismissed as “bias.”
Cisco researcher Amy Chang developed a “decomposition” method that tricks LLMs into revealing verbatim training data, extracting sentences from 73 of 3,723 New York Times articles despite guardrails.
Gary Marcus, famous LLM skeptic, thinks with 5 months left in the year, AI agents will remain largely overhyped (while under-delivering), and thinks neurosymbolic AI models are still needed for true AGI (but underfunded).
1. Oh, and he cited this paper, “the wall confronting large language models”, which argues language models have a fundamental design flaw where their ability to generate creative, human-like responses comes at the cost of permanent unreliability… and making them trustworthy would require 10 billion times more computing power.