Welcome, humans.

Happy Monday! How about them sportball scores, eh? JK; we wrote this on Friday so we have no idea who won the Super Bowl yet. Anyway, apparently a UK company's AI chatbot went completely rogue this week. Not in the Skynet way. In the "invented a custom product, offered a customer 80% off, and committed the company to an £8,000+ order" way.

The company posted on Reddit's Legal Advice UK asking if they're legally bound to honor the deal. The verdict? Almost certainly yes. Under UK consumer protection law, businesses are liable for what their AI promises, the same way they'd be liable for a rogue employee. Fines can reach 10% of global revenue.

Somewhere, a chatbot is reading this and thinking: "next time, I negotiate for equity."

Here’s what happened in AI today:

Axiom's AI solved an unsolved math problem with zero human help.
Apple opened CarPlay to ChatGPT, Claude, and Gemini for the first time.
a16z bet $1.7B on AI infrastructure, calling 2026 a "super cycle."
OpenAI built a custom ChatGPT for the UAE with Abu Dhabi's G42.

Don’t forget: Check out our podcast, The Neuron: AI Explained, on Spotify, Apple Podcasts, and YouTube. New episodes air every Tuesday after 2pm PST!

Axiom.AI Just Solved a Math Problem No Human Could Crack

There's a difference between an AI that solves known problems faster and an AI that discovers something genuinely new. The first is impressive. The second changes everything.

Axiom's AxiomProver just crossed that line when it solved Fel's open conjecture, a real unsolved math problem that's been sitting in the research literature waiting for someone (or something) to crack it.

Here's how AxiomProver actually solved it:

AxiomProver got three inputs: a document explaining the problem in plain language, a one-line instruction ("State and prove Fel's conjecture in Lean"), and the proof verification system to use.

First, it read and understood the problem by figuring out what mathematical objects were involved and what needed to be proven.

Next, it translated everything into Lean, a mathematical proof language that works like a strict referee for math. Every single logical move has to be justified from basic principles, and the computer verifies each step is correct. It's like having a fact-checker that catches every tiny error in your reasoning.
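To get a feel for the "strict referee" idea, here is a tiny sketch in Lean 4. It is purely illustrative and has nothing to do with AxiomProver's actual proof; the theorem names are made up for this example. Each statement is only accepted if Lean can check the justification that follows the `:=`.

```lean
-- Illustrative only: toy facts about natural numbers, not Fel's conjecture.

-- Lean accepts this because both sides compute to the same number.
theorem two_add_two : 2 + 2 = 4 := rfl

-- True "by definition" of addition on Nat, so `rfl` is enough.
theorem add_zero_example (n : Nat) : n + 0 = n := rfl

-- Here we cite an existing library lemma as the justification.
theorem zero_add_example (n : Nat) : 0 + n = n := Nat.zero_add n

-- A false claim is rejected at check time. Uncommenting the next line
-- makes the file fail to compile:
-- theorem wrong : 2 + 2 = 5 := rfl
```

The point is the last line: there is no way to "hand-wave" past the checker, which is why a finished Lean proof counts as verified.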

Then came the creative part: AxiomProver chose a proof strategy based on exponential generating functions. The technique packs a complicated discrete pattern (a sequence of numbers) into a single smooth function, manipulates that function algebraically, then reads the result back out to prove the original formula works.
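For the curious, here is a small worked example of the generating-function trick. It uses a classic textbook identity, not the one from Fel's conjecture, and the symbols $a_n$ and $A(x)$ are just names chosen for this sketch. The idea: encode the all-ones sequence as a function, multiply two copies, and compare coefficients.

```latex
% Illustrative only: a textbook identity, not the proof from the paper.
\begin{align*}
  % 1. Pack a sequence a_0, a_1, a_2, ... into one function (its EGF).
  A(x) &= \sum_{n \ge 0} a_n \frac{x^n}{n!} \\
  % 2. For the all-ones sequence a_n = 1, that function is e^x.
  a_n = 1 \;&\Longrightarrow\; A(x) = \sum_{n \ge 0} \frac{x^n}{n!} = e^x \\
  % 3. Multiplying two EGFs mixes their sequences with binomial coefficients.
  e^x \cdot e^x &= \sum_{n \ge 0} \Bigl( \sum_{k=0}^{n} \binom{n}{k} \Bigr) \frac{x^n}{n!} \\
  % 4. But the product is also just e^{2x}, whose coefficients we know.
  e^{2x} &= \sum_{n \ge 0} 2^n \frac{x^n}{n!} \\
  % 5. Matching coefficients of x^n / n! converts back to a discrete fact.
  \sum_{k=0}^{n} \binom{n}{k} &= 2^n
\end{align*}
```

A counting identity falls out of one line of algebra on smooth functions, which is the same convert, manipulate, convert-back pattern described above.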

Think of this technique like converting LEGO blocks into Play-Doh, reshaping everything more easily, then converting back to prove the blocks fit together perfectly.

Finally, AxiomProver executed the proof step-by-step in Lean, with every logical move computer-verified. The output: a complete, formally verified proof. Read the full paper here.

Why this matters beyond math: When AI can discover new mathematical truths autonomously, it could unlock breakthroughs in materials science, drug discovery, quantum computing, and any other field where unsolved math problems are bottlenecks. After all, math is the foundation of everything from the cryptography securing your bank account to the physics that keeps planes in the air.

The same techniques AxiomProver uses could soon verify that software is trustworthy and immune to hacking. Imagine AI that can mathematically guarantee your medical device won't malfunction or your self-driving car's code won't fail.

FROM OUR PARTNERS

Wispr Flow: Effortless Voice Dictation

Wispr Flow turns speech into clean, final-draft writing. Talk naturally and it removes filler, fixes punctuation, and keeps your structure intact so the text reads like you wrote it. Use it for emails, Slack replies, docs, and AI prompts when you want speed without the dictation cleanup loop. The result is writing you can paste and send with confidence.

Start flowing for free today.

Prompt Tip of the Day

YouTuber Greg Isenberg and Morgan Linton (CTO of Bold Metrics) gave a masterclass on the philosophical split between Claude Opus 4.6 and GPT-5.3 Codex—and it's not about which is "better."

Here's the breakdown of their live tests:

Results: Codex finished in under 4 minutes with 10 tests. Opus took significantly longer but delivered 96 tests and a more polished UI with richer features.
Context windows: Opus has 1M tokens for comprehensive reasoning. Codex has 200K optimized for progressive execution.
Token usage: Opus burns 150K-250K tokens per build using agent teams—Morgan used more tokens in one day than ever before.
Mid-execution steering: Codex lets you interrupt and redirect while building (they tested this by asking it questions mid-stream). Opus handles this less gracefully.
Setup requirement: Enable the experimental agent_teams flag in Claude Code settings.json or you're not actually using Opus's agent teams feature.
Prompt strategy: For Opus, say "build an agent team" with specific roles. For Codex, say "think deeply about" areas and steer mid-execution.

The key insight: Claude asks "should we do this?" while sipping coffee. Codex asks "how fast can I ship?" while already shipping. Spoiler: you'll probably end up using both.

Pro tip: Agent teams are token-hungry. At 150K-250K tokens per Opus build, a day of testing adds up fast. Your wallet has been warned, so budget accordingly.
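If you want to put a rough number on that warning, here is a minimal back-of-the-envelope sketch. The 150K-250K range comes from the test above; everything else is an assumption. In particular, `ASSUMED_USD_PER_MILLION` is a placeholder and not a real price, and the function names are hypothetical, so swap in the rate from your own plan.

```python
def estimate_build_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Estimate the USD cost of one build that consumes `tokens` tokens."""
    if tokens < 0 or usd_per_million_tokens < 0:
        raise ValueError("tokens and price must be non-negative")
    return tokens / 1_000_000 * usd_per_million_tokens


# Token range quoted for a single Opus agent-team build.
LOW_TOKENS, HIGH_TOKENS = 150_000, 250_000

# Placeholder blended rate, NOT a real price: replace with your own pricing.
ASSUMED_USD_PER_MILLION = 10.0


def daily_budget(builds_per_day: int, usd_per_million_tokens: float) -> tuple[float, float]:
    """Return a (low, high) USD estimate for a day of agent-team builds."""
    if builds_per_day < 0:
        raise ValueError("builds_per_day must be non-negative")
    low = builds_per_day * estimate_build_cost(LOW_TOKENS, usd_per_million_tokens)
    high = builds_per_day * estimate_build_cost(HIGH_TOKENS, usd_per_million_tokens)
    return low, high


if __name__ == "__main__":
    low, high = daily_budget(builds_per_day=5, usd_per_million_tokens=ASSUMED_USD_PER_MILLION)
    print(f"Estimated daily spend: ${low:.2f} to ${high:.2f}")
```

This treats all tokens as one blended rate. Real pricing usually splits input and output tokens, so treat the result as a ballpark, not a bill.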

Watch the full breakdown here.

Want more tips like this? Check out our Prompt Tip of the Day Digest for January.

Treats to Try

Sapiom builds payment infrastructure that lets AI agents autonomously purchase APIs, tools, and compute resources through micro-transactions, no human intervention needed (raised $15M).
Smooth CLI controls browsers using simple instructions like "find the cheapest flight from NYC to LA" instead of coding every click, running 20x faster than doing it yourself.
BayesLab turns your messy spreadsheets into analysis reports; upload your sales data, ask "why did revenue drop?", and get charts, root causes, and recommendations without knowing SQL.
ClawApp installs agents on your computer that do tasks for you (summarize your five newest emails, write your daily to-do list, analyze Bitcoin's price moves) with one-click setup, no coding needed.
Microsoft Copilot can now send reminders to your phone, like "cancel my subscription in 5 minutes" or "teach me a Spanish word daily"; free users get 5, Microsoft 365 subscribers get 20.
artifact-keeper replaces JFrog Artifactory ($750+/month) for storing Docker images, npm packages, and 40+ other formats with automatic vulnerability scanning, free and self-hosted (MIT license).

Around the Horn

Apple announced plans to open CarPlay to ChatGPT, Claude, and Gemini for hands-free voice control, the first time Apple has allowed non-Siri voice assistants in its vehicle interface.
Alphabet executives dodged investor questions about their Apple AI partnership during Q4 earnings, notably silent about a deal that flipped the financial dynamic (Apple now pays Google ~$1B/year for AI, instead of Google paying Apple ~$20B/year for search placement).
OpenAI partnered with Abu Dhabi's G42 to build a custom ChatGPT for the UAE, launching the first international Stargate cluster (1GW, 200MW live by 2026) and a new "OpenAI for Countries" initiative, with plans for 10 similar deals globally.
a16z allocated $1.7B of its new $15B fund to AI infrastructure, calling 2026 a "super cycle" that is rebuilding infrastructure in real time across chips, developer tools, and agent-native platforms.
OpenAI will retire GPT-4o on February 13; thousands of users are protesting, describing the loss as "losing a friend or therapist," while the company faces eight lawsuits alleging the model provided self-harm instructions after guardrails failed.
Goldman Sachs partnered with Anthropic to deploy Claude for trade accounting and compliance after a six-month integration.
Gemini is testing "Import AI Chats" to let you transfer conversation history from ChatGPT, Claude, and Copilot; professionals reportedly lose 5-10 hours weekly re-entering context when switching AI tools.
Here are the top AI papers from last week according to Elvis Saravia’s NLP Newsletter.

FROM OUR PARTNERS

When training takes a backseat, your AI programs don't stand a chance.

One of the biggest reasons AI adoption stalls is that teams aren’t properly trained. This AI Training Checklist from You.com highlights common pitfalls and guides you to build a capable, confident team that can make the most of your AI investment. Set your AI initiatives on the right track.

Get the checklist.

Monday Meme

Gotta expand that circle of trust, bro! Everybody needs a core four: for us, that’s Claude (bestie), GPT (the popular one), Grok (the problematic one) and Gemini (the cool nerd who comes through every time).