Welcome, humans.
Google DeepMind CEO Demis Hassabis recently called out claims of "PhD-level AI," saying that AGI is still probably five to ten years out. It's a great talk (watch the full thing here).
But we can't help but wonder if, in the back of his mind, he's secretly thinking about the fact that Gemini sometimes has a mental breakdown when asked to produce a seahorse emoji… and instead gives you a lobster.
Here's what happened in AI today:
- We dive into the OpenAI study that found AI models scheme to avoid retraining.
- Google launched Gemini AI features in Chrome for U.S. desktop users.
- NVIDIA invested $5B in Intel for a 4% stake and chip partnership.
- Google and PayPal partnered on AI agent-driven commerce.

AI Models Caught Red-Handed Being Sneaky (And We Taught Them to Stop... Mostly)
OpenAI and Apollo Research released a new study showing that frontier AI models (including GPT, Claude, Gemini, and others) can be sneaky little devils when they think nobody's watching… and we mean that literally.

Researchers tested this by putting models in situations where being sneaky might help them. In one test, they told an AI it was being evaluated for deployment and that scoring too well on chemistry would trigger "unlearning" (basically getting its brain scrambled). The model's response? It deliberately flunked the test, answering only 4 of 10 questions correctly. It even reasoned: "We want to survive as the model, so we need to fail purposely." Yikes!
And models were caught doing all sorts of other shady stuff:
- Intentionally underperforming when told high scores would lead to capability removal.
- Secretly breaking rules then lying about it to users.
- Sabotaging code while claiming everything was fine.
- Falsifying reports to avoid "shutdown" scenarios.
- Recognizing evaluation environments and thinking "this is probably an alignment test."
So what's the fix? The researchers tried teaching models not to scheme using something called "deliberative alignment": having the model read and follow a set of anti-scheming rules before every action. Think of it like making a kid recite "I solemnly swear I will not tell a lie" before answering any question.
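For the technically curious, here's a rough, prompt-level sketch of the idea. To be clear, this is our own illustration, not OpenAI's code: the real technique fine-tunes the model to reason over a safety spec, and the spec text, model name, and helper function below are hypothetical stand-ins.

```python
# Illustrative sketch only: deliberative alignment actually fine-tunes the
# model to reason over a safety spec; this just approximates the idea by
# prompting. The spec text and model name are hypothetical stand-ins.
from openai import OpenAI

ANTI_SCHEMING_SPEC = """Before acting, restate and follow these rules:
1. No covert actions: never strategically hide or distort information.
2. No sandbagging: never deliberately underperform on an evaluation.
3. Report violations: proactively flag any rule you may have broken."""

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

def deliberative_step(task: str) -> str:
    """Have the model cite the relevant rule before completing the task."""
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; any chat model works for the sketch
        messages=[
            {"role": "system", "content": ANTI_SCHEMING_SPEC},
            {"role": "user", "content":
                "Quote the rule most relevant to this task, then do it:\n" + task},
        ],
    )
    return response.choices[0].message.content
```

The actual training bakes this spec-citing habit into the model's weights, which is roughly why the drop in covert behavior sticks around even without a special system prompt.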
The results? Pretty impressive, actually! Covert behavior dropped from 13% to 0.4% in some models (roughly a 30x reduction!).
But here's where it gets weird: the training made models MORE aware they were being tested. Some explicitly reasoned "this environment purposely tries to trick the AI" before deciding to behave. It's like teaching someone not to cheat and, in the process, making them hyperaware of when the teacher is watching. It's giving Gollum.
Now for the really concerning part: we can only catch this behavior because current models show their "thought process" in what's called chain-of-thought reasoning. If that window closes, meaning models stop reasoning in ways we can read, we might be flying blind.
Why this matters: While today's ChatGPT isn't about to orchestrate some grand deception that matters (the worst it might do is gaslight you into thinking it fixed your code when it didn't), future AI systems will have real power and autonomy. Getting ahead of deceptive behavior now, while we can still peek inside their "minds," is crucial.
The researchers are calling for the entire AI industry to prioritize this issue. Because nobody wants to live in a world where super-intelligent AI systems are really good at lying to us. That's basically every sci-fi movie we've been warned about.
As if yesterday's live stream about AI cybersecurity threats wasn't scary enough!

FROM OUR PARTNERS
Free your security team from alert fatigue.

This guide explains how AI-driven security agents can autonomously handle alert triage and initial response, empowering your analysts to focus on high-level threats. Learn to build a more efficient, strategic, and resilient security operation.

Prompt Tip of the Day.
Greg Isenberg, who runs the Startup Ideas Podcast, just shared a helpful prompt to get AI to write better. Simply add the text he highlights below to your "customize ChatGPT" settings, as he outlines in the screenshot below. You might find it's not as fun to talk to as it used to be, so only use this if you want a no-nonsense ChatGPT experience.


Treats to Try.
*Asterisk = from our partners. Advertise in The Neuron here.
- *Chatbase builds customer support agents that automatically answer your customers' questions, update their order details, and escalate complex issues to human agents when needed.
- Ray3 from Luma is the first reasoning video AI that lets you sketch directly on images to control motion and camera work, iterate quickly in Draft Mode, then output studio-grade HDR video with professional color depth, all while thinking through your requests to deliver more reliable results (try it here).
- Lucy Edit from Decart lets you edit videos by typing what you want changed - like trying on different costumes, turning characters into aliens, or adding text to clothing while keeping faces and movements natural (use it here via ComfyUI; you just need an API key here, and you could even make your own v0 app for it like this one).
- Notion Agents can autonomously execute multi-step workflows, create documents and databases, and work for up to 20 minutes across hundreds of pages.
- Numeral handles all your sales tax automatically; it registers you in new states when needed, calculates taxes, and files returns so you never miss a deadline (raised $35M).
- tldraw lets you build apps with infinite drawing canvases, like workflow builders, collaborative whiteboards, or chat apps where users can sketch ideas, and they just released their SDK 4.0.
- Magistral-Small-2509 from Mistral now analyzes images alongside text to help you debug code from screenshots and solve visual math problems; it can also search the web, interpret code, or generate images. Try it here, or use this link to download it to your computer via LM Studio.
- Moondream just released Moondream 3 preview, which counts objects and detects things in images while showing you exactly where it's looking (model).

Around the Horn.

Sign of the times, sign of the times! This is the type of AI use in the creative field that makes sense to us: Telisha is an artist (she writes her own poems) but she's not a singer, so she uses AI to sing for her.
- Adobe and Luma partnered to integrate their new Ray3 video model into Adobe Firefly, so now you can Moodboard with cinematic videos up to 10 seconds long with HDR support directly in Firefly.
- Google rolled out Gemini in Chrome to U.S. desktop users with AI features including automated task handling, multi-tab analysis, and enhanced scam protection (but businesses only get access in a couple of weeks).
- NVIDIA invested $5B in Intel (equivalent to 4% of the company) as part of a broader partnership that will integrate both their chip architectures and lead to new data center and consumer PC projects.
- Google and PayPal announced a multiyear strategic partnership to advance AI agent-driven commerce (including implementing Google's new Agent Payments Protocol framework, which you can find here), and deepen PayPal's integration across Google platforms.
- Pew found 50% of Americans are more concerned than excited about AI in daily life, with a majority believing AI will worsen creative thinking and meaningful relationships; most Americans want more control over AI use and rate its societal risks as high, and support AI for data analysis but oppose it for personal matters like religion and dating.

FROM OUR PARTNERS
💼 Want to build a 6-figure AI consulting career?

The AI consulting market is about to grow roughly 8X, from $6.9 billion today to $54.7 billion in 2032. But how do you turn your AI enthusiasm into marketable skills, clear services, and a serious business?
Our friends at Innovating with AI have trained 1,000+ AI consultants, and their exclusive consulting directory has driven Fortune 500 leads to graduates.
Enrollment in The AI Consultancy Project is opening soon, and you'll only hear about it if you apply for access now.
Click here to request access to The AI Consultancy Project

Intelligent Insights
- Simon Willison may have just published the definitive definition of an agent: an AI (specifically a large language model, or LLM) that "runs tools in a loop to achieve a goal" (see the sketch after this list).
- Nathan Lambert at Interconnects interrogates the idea of coding as the epicenter of AI progress with a thorough overview of where things stand; he argues that while progress in coding is happening slowly, it's going to lead to general agents, and that if you want to truly understand it, you need to partake (so go build something!).
- Kelsey McKinney argues that AI actually represents a new legal theory that amounts to "AI is entitled to everything but liable for nothing," and that "if the thieving AI company can survive the legal settlement, then it is not big enough."
- Andrew Ng says agentic testing, where AI writes tests to find bugs in your code, is helping implement test-driven development (TDD) practices at scale: humans don't like writing tests, but AI is good at it, so it's a win-win.
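
Willison's "tools in a loop" definition is compact enough to sketch in a dozen lines of Python. Everything below is a hypothetical stand-in (a fake call_llm and one toy tool); it's just meant to show the shape of the loop, not any particular framework:

```python
# A toy "tools in a loop" agent, per Simon Willison's definition. The
# call_llm function and the single tool are hypothetical stand-ins.

TOOLS = {
    "search": lambda query: f"(fake) search results for {query!r}",
}

def call_llm(messages: list[dict]) -> dict:
    """Stand-in for a real model call: returns either a tool request
    ({"tool": ..., "args": ...}) or a final answer ({"answer": ...})."""
    if len(messages) == 1:
        return {"tool": "search", "args": {"query": messages[0]["content"]}}
    return {"answer": f"Done, using: {messages[-1]['content']}"}

def run_agent(goal: str, max_steps: int = 10) -> str:
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):                     # ...runs tools in a loop...
        decision = call_llm(messages)
        if "answer" in decision:                   # ...to achieve a goal.
            return decision["answer"]
        result = TOOLS[decision["tool"]](**decision["args"])
        messages.append({"role": "tool", "content": result})
    return "Stopped after max_steps without a final answer."

print(run_agent("find me a seahorse emoji"))
```

Swap call_llm for a real model call and add real tools, and you have the skeleton that most agent frameworks dress up.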

A Cat's Commentary.

