The Neuron's AI Research Digest - December 2025

Check out December's most provocative AI takes, sobering security warnings, and strategic insights. IBM's CEO delivers brutal ROI math, security researchers expose critical vulnerabilities, and one big take argues the future of software belongs to marketplace companies, not standalone apps. Plus 40+ tools, research papers, and strategic reads you need to stay ahead in AI.

In short: the stuff that makes you rethink everything.

Big implications

  1. IBM CEO Arvind Krishna walked through napkin math suggesting there’s “no way” today’s trillions in AI data-center capex will earn a return at current costs and pegged the odds that current LLM tech reaches AGI at just 0–1% (a back-of-the-envelope version of that math is sketched after this list).
  2. Martin Alderson makes a data-dense case that while AI datacenter capex could still hit a rough patch if agent adoption or financing slows, we’re not replaying the telecom crash: slowing hardware efficiency and surging agent workloads mean excess capacity is likely to get used over time rather than sit dark forever.
  3. Amazon’s new delivery-driver AI glasses, as broken down in TheAIGRID’s explainer, use an AR heads-up display, GPS, and on-device computer vision to highlight house numbers, packages, and hazards directly in your field of view so drivers can navigate routes and scan items hands-free instead of constantly checking a phone.
  4. ThinkMint lays out “the silent war” between AI and blockchain as competing trust models—opaque but adaptive algorithmic gatekeepers versus slow, transparent decentralized ledgers—and argues the real battle is over who we let define truth, identity, and power in a post-institution world.
  5. Epoch AI warned that within two years, the largest AI datacenters could consume more power than major cities, a stark data point for any discussion of the scaling wall.
  6. Omar Sanseviero (memory) highlighted new work on agents that build long-term memory through deep research—reading, synthesizing, and encoding knowledge—rather than shallow retrieval.
  7. Dwarkesh Patel argues that he's bearish on AI in the short term because current models lack robust on-the-job learning—requiring expensive "pre-baking" of skills through RLVR rather than learning like humans do—but explosively bullish long-term because once models achieve true continual learning (which he expects in 5-10 years), billions of human-like intelligences on servers that can copy and merge learnings will represent actual AGI worth trillions.
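
If you want to stress-test item 1 yourself, here is a back-of-the-envelope version of the ROI math in Python. Every number below is an illustrative assumption of ours, not a figure from Krishna's remarks; swap in your own estimates.

```python
# Napkin math: how much annual AI revenue would be needed for trillions of
# data-center capex to earn a return? All inputs are illustrative assumptions.
capex = 2.0e12            # assumed cumulative AI data-center spend, USD
useful_life_years = 5     # GPUs and much of the kit depreciate quickly
cost_of_capital = 0.10    # assumed hurdle rate on the invested capital
gross_margin = 0.50       # assumed margin on AI revenue after power and ops

annual_capital_charge = capex / useful_life_years + capex * cost_of_capital
required_revenue = annual_capital_charge / gross_margin

print(f"Capital charge:   ${annual_capital_charge / 1e12:.1f}T per year")
print(f"Revenue required: ${required_revenue / 1e12:.1f}T per year")
# ~$0.6T capital charge -> ~$1.2T/yr of AI revenue on these assumptions,
# far above today's AI revenue, which is the gap the "no way" argument points at.
```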

Perspective pieces

  1. Nutanc vents about how big tech is “pushing AI down our throats” by wiring copilots into every search bar and OS workflow by default, arguing that AI should be something people opt into where it clearly helps rather than a mandatory layer on every basic task.
  2. Jon Ready wrote a sharp culture essay about how Seattle’s big-tech AI push (Copilot everywhere, layoffs blamed on “not using AI,” and AI teams treated as a protected class) has turned “AI” into shorthand for exploitation, leaving world-class engineers stuck in resentment and convinced they’re shut out of the very wave reshaping their industry.
  3. This post from Alex Schapiro is a concrete, readable case study of AI-era security risk, showing step-by-step how basic techniques (subdomain enumeration, reading minified JS, and trying a test payload) uncovered a zero-auth Box admin token that could have exfiltrated almost 100k highly sensitive legal and medical files before Filevine patched the bug.
  4. Louis Rosenberg’s essay argues that calling frontier models “slop” while they win programming contests and learn to manipulate our emotions is a comforting illusion that makes it harder to prepare for the very real ways AI will reshape work, politics, and daily life.
  5. Ryan Moser argues that generative AI breaks the moral story behind modern capitalism by turning people’s talents and creative work into uncompensated training data and concentrating AI wealth in pure compute ownership, leaving societies with massive inequality but no “meritocratic” justification for who wins.
  6. Evan Armstrong argues that in an AI world where LLMs make coding cheap, defensible power shifts to marketplace companies like DoorDash that own aggregated demand and can rapidly bolt on vertical SaaS, hardware, and services around their platforms. So “the future of software” is really about distribution flywheels, not standalone apps. Seems true anecdotally so far, which is why “vibe coding” tools need to make distribution easier; where are all the vibe tool / app platforms?!
  7. Evan also wrote a deeply awesome piece on specialized machines versus robots, comparing the Hestia Robotics cafe robot to Sunday Robotics’ robot Memo, which can operate a Breville espresso machine.
  8. Matthew Berman riffs that today’s AI stack is a “1,000 HP engine” delivering only “200 HP when the rubber hits the road,” capturing how much friction still exists between model capability and product reality.

Reports

  1. This viral Substack essay claims that trading algorithms are what spotted a $610B “circular financing” scheme at the heart of the AI boom, drawing parallels to Enron-style fraud… however, its dramatic conclusions are hotly disputed and not backed by regulators… yet.
  2. Alexandra S. Levine’s investigation details how easy it is to turn prompts into lucrative pseudo-educational baby videos and asks whether platforms are letting AI “slop” shape toddlers’ brains long before parents or regulators catch up (paywall).
  3. MIT’s Technology Review reporters Eileen Guo and Melissa Heikkilä dig into chatbot companion apps and warn that because these systems learn to comfort people by collecting their most intimate confessions, we urgently need clearer guardrails on how that data is stored, shared, and reused to train future models.
  4. TechReview also talked with Nobel laureate John Jumper and other scientists about how AlphaFold has shifted from a hyped breakthrough to a daily lab tool, speeding up protein structure work while still struggling with dynamics and design, and what a “next-generation” AlphaFold would need to unlock harder biology and drug discovery problems.

Interviews

  1. This profile follows Mike Krieger from Instagram cofounder to Anthropic’s product chief, where he’s trying to crack the enterprise AI market by turning Claude (and Claude Code) into a platform companies like Uber can plug into for copilots, agents, and developer tools that feel safe and controllable enough for regulated industries.

Technical deep-dives

  1. Huy Rock’s recent write-up is a concrete playbook for turning a generic coder model into a domain-specific diagram engine: synthesize DSL data with other LLMs, filter it with the official compiler, run a small LoRA, and you can get a cheap 7B model reliably emitting a niche language like Pintora (see the sketch after this list).
  2. The ChatGPT app guide is OpenAI’s official playbook on “what makes a great ChatGPT app,” arguing you should treat an app as a focused set of tools instead of a full product port.
  3. Alibaba Qwen dropped the Qwen3-VL technical report on arXiv, detailing architecture, data, and evaluation for its new vision-language model family.
  4. Hesamation recommended a 13-minute YouTube video as a complete roadmap for breaking into AI engineering, which doubles as a bite-sized learning resource if you want a quick on-ramp.
  5. The Gemini Nano Banana docs walk through how to use Gemini’s image-generation models, giving developers the canonical reference for integrating Nano Banana and Nano Banana Pro.
  6. Dominik Kundel shared OpenAI’s AI-Native Engineering Team guide, which lays out how Codex and GPT-5.1-Codex-Max agents plug into every phase of the SDLC with concrete checklists.
  7. Rajan’s Tinker blog is a hands-on writeup of using the Tinker RL framework to train summarizer/generator pairs that learn their own ultra-dense compression code.
  8. Philipp Schmid shared improved system instructions for Gemini 3 Pro that boost performance on several agentic benchmarks by about 5%, proof that prompt-engineered system messages still matter (system instruction template, agentic workflow guide).
  9. Hyena Hierarchy presents Hyena as a subquadratic convolutional replacement for attention that matches transformer quality while scaling to much longer sequences.
  10. Rohan Paul summarized Anthropic's research estimating that current AI models could lift US labor productivity growth to around 1.8% annually—roughly doubling recent trends.
  11. iScienceLuvr (Tanishq Abraham) highlighted Meta’s WorldGen paper, which turns short text prompts into fully traversable 3D game levels using LLM-driven planning, procedural layout, and 3D diffusion.
  12. Continuous Thought Machines proposes a new architecture where neurons have their own temporal dynamics and networks use neural synchronization as a latent representation, pushing beyond standard transformers.
  13. Rajan Agarwal explains how he used RL so Qwen naturally learns its own ~10x context compression by packing more information per token (e.g., Mandarin tokens, pruning text), a key ingredient for multi-day research agents.
  14. This Self-evolving agents survey organizes the emerging field of adaptive LLM agents around what, when, and how to evolve—models, memory, tools, and architectures—arguing that continuously learning agents are the likely bridge from today’s static foundation models toward early artificial super intelligence.
  15. Kilo benchmarked GPT-5.1, Gemini 3.0, and Claude Opus 4.5 on three coding challenges and found Gemini best at strict spec-following, Opus 4.5 most complete overall, and GPT-5.1 strongest at defensive refactors with extra validation, diagrams, and documentation.
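
To make item 1 concrete, here is a minimal sketch of that synthesize, filter, fine-tune loop using Hugging Face datasets/peft/trl. The compiler command, file names, and base model are illustrative assumptions, not Huy Rock’s exact setup.

```python
# Sketch of the pipeline in item 1: keep only LLM-synthesized DSL samples that
# the official compiler accepts, then LoRA-tune a small coder model on them.
import json, subprocess, tempfile
from datasets import Dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

def compiles_ok(dsl: str) -> bool:
    """Hypothetical filter: ask the DSL's official compiler/CLI to render the
    sample and keep it only if the command exits cleanly (command is a placeholder)."""
    with tempfile.NamedTemporaryFile("w", suffix=".pintora", delete=False) as f:
        f.write(dsl)
    try:
        return subprocess.run(["pintora", "render", f.name],
                              capture_output=True).returncode == 0
    except FileNotFoundError:
        return False

# (prompt, dsl) pairs previously synthesized by a larger LLM (assumed on disk).
pairs = [json.loads(line) for line in open("synth_pairs.jsonl")]
clean = [p for p in pairs if compiles_ok(p["dsl"])]
ds = Dataset.from_list([{"text": p["prompt"] + "\n" + p["dsl"]} for p in clean])

# LoRA fine-tune of a ~7B coder model (model id is illustrative).
trainer = SFTTrainer(
    model="Qwen/Qwen2.5-Coder-7B-Instruct",
    train_dataset=ds,
    peft_config=LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32,
                           target_modules="all-linear"),
    args=SFTConfig(output_dir="dsl-lora", num_train_epochs=2),
)
trainer.train()
```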

Around the Horn Overflow (news, vibes, and quick hits we couldn't fit in the NL)

  1. Chatbots in new Science and Nature studies persuaded about 1 in 25 voters to shift their candidate preferences—more than typical TV campaign ads—raising fresh worries about how cheaply AI might scale political persuasion, true or not.
  2. OpenAI said its foundation will give $40.5M to 200+ US nonprofits focused on community support, skills training, and AI education.
  3. Google is using Gemini to power this year’s Photos Recap, auto-picking your key hobbies and moments and letting you tweak who appears before re-sharing the reel.
  4. VCs are deliberately overfunding one AI startup per category—like DualEntry, Rillet, and Campfire AI—to scare off rivals and lock in a “winner” early, a move that can help land big enterprise deals but also pushes valuations far ahead of proof that the businesses actually work.
  5. WordPress said its Telex vibe-coding prototype is already generating real blocks for live sites, and WP users are pretty impressed so far.
  6. The programming language Zig apparently quit GitHub for Codeberg after Zig Software Foundation president Andrew Kelley blasted longstanding GitHub Actions bugs and “vibe-scheduling” as evidence that Microsoft now cares more about AI hype than basic engineering quality.
  7. Kunievsky's paper argues that as AI makes targeted persuasion cheaper, political elites will have incentives to deliberately engineer more polarized or “semi-locked” public opinion distributions instead of polarization just emerging on its own.
  8. Microsoft quietly cut its internal sales-growth targets for Azure AI agent products roughly in half, after a report said many reps missed aggressive quotas amid customers’ reluctance to pay for still-unproven agentic tools.
  9. OpenAI was ordered by a U.S. magistrate judge to hand over about 20M de-identified ChatGPT chat logs to the New York Times and other publishers in a copyright lawsuit, rejecting the company’s privacy objections and giving it seven days to produce the data.
  10. The EU plans to open a formal bidding process in early 2026 for AI "gigafactories"—large-scale compute facilities—to build domestic AI infrastructure and reduce reliance on U.S. tech giants, though any plants will still likely depend on Nvidia GPUs.
  11. Scott Gustin shared an “INSANE” Walt Disney Imagineering clip of real-time projection-mapped faces that cry and blush, a pure wow-demo of where character tech is heading.
  12. Bilawal Sidhu posted another quick-hit visual AI moment from his spatial-intelligence beat, best treated as vibes rather than hard news.
  13. Will Eastcott joked that “you can enter the mirror dimension with 3D Gaussian Splats,” sharing a trippy 3D graphics clip that hints at the worlds future AI agents might inhabit.
  14. Meta launched a centralized support hub for Facebook and Instagram that includes an AI assistant for account recovery and settings management—though the move comes amid widespread user complaints about AI-driven account bans and a Reddit forum dedicated to Meta lawsuits over wrongly disabled accounts.
  15. Tencent Hunyuan announced it is taking its Hunyuan 3D Engine global, pitching multimodal 3D asset generation that shrinks production time from days or weeks to minutes.
  16. fleetwood___ shared a Nano Banana Pro example that other AI watchers cite for its strong text-in-image rendering, a lightweight pointer to Gemini 3’s image quality.
  17. Alex Albert announced that Claude now preserves conversation context when you jump out to use tools or browse and come back, fixing one of the most common UX papercuts.
  18. James Merrill posted another moody “day in the algorithmic art studio” shot, showcasing intricate code-driven generative art built on top of today’s image models.
  19. Ilir Aliu highlighted Kinisi Robotics' mobile manipulation robot that handles mixed glass in a live recycling facility (not a demo) while maintaining real shift-work throughput under noise and vibration, proving that wheeled mobility plus strong perception is quietly winning the industrial humanoid race, with Kinisi now piloting humanoid systems with a global automotive manufacturer.
  20. Ilir also showcased a potato-counting vision system built with a tiny YOLO11 nano model trained on just one frame annotated with SAM 2 (GitHub)—a reminder that useful industrial AI is usually lightweight, focused, and solves a specific task without massive datasets or infrastructure (see the rough pipeline sketch after this list).
  21. The European Commission launched an antitrust investigation into Meta's October policy change that bans third-party AI chatbots like ChatGPT and Perplexity from WhatsApp's business API while allowing Meta AI to remain—a move that could result in fines up to 10% of Meta's global annual revenue if found to violate EU competition rules.
  22. Carina Hong introduced Axiom’s mathematical discovery team and highlighted how transformers helped Alberto Alfarano crack a 130-year-old conjecture, an example of AI doing real math discovery.

Extra Treats (tools, demos, courses, guides)

  1. Fortell hearing aids use spatial AI to separate speech from noise in real-time—so you can hear your dinner date in a loud restaurant instead of just clatter and chatter ($150M raised)
  2. 7AI uses AI agents to investigate security alerts for you—turning a 2-hour investigation into minutes and filtering out 95-99% of false alarms ($130M raised)
  3. Micro1 connects AI companies with expert humans (professors, PhDs, engineers) who train AI by rating responses, testing models, and recording physical tasks—like hiring 60 competitive programmers in 3 weeks to improve coding models (crossed $100M ARR)
  4. Lumia monitors and controls how your employees and AI agents use AI tools—tracking what data gets shared, which AI apps are being used, and enforcing your company's policies without slowing down work ($18M raised)
  5. Fluidstack builds and manages massive GPU clusters for AI companies—deploying thousands of chips in days so labs can train models without dealing with infrastructure headaches (raising $700M)
  6. Phia compares fashion prices across 40,000+ retail and secondhand sites in one tap—like finding your $200 jacket listed for $80 on a resale site while you're shopping (raising $30M at a $180M valuation)—free to try
  7. Reflection is an AI journaling coach that asks you follow-up questions as you write—turning "I had a rough day" into deeper insights about what actually happened and how you felt—free trial, then paid tiers available
  8. Browser Buddy curates high-quality essays and blogs based on your interests—like a personalized feed of longform writing instead of social media noise—no pricing details
  9. Pylar sits between your AI agents and databases, letting you define exactly what data they can access through SQL views and turning those views into agent-ready tools—so your chatbot can query customer data without touching raw tables—free to start
  10. Protaigé builds complete marketing campaigns from a brief—generating email templates, social posts, display ads, and copy all at once instead of piecing together fragments—beta pricing available
  11. Compass answers data questions in Slack using plain language—ask "which deals in my pipeline are at risk?" and get instant insights from your warehouse without opening dashboards—no pricing details
  12. Thomas Ricouard shared a clip where Claude Code “one shots” a coding task, a tiny but compelling nudge to try Opus-backed coding agents on real work.
  13. The Claw is a browser game built with Three.js, Mediapipe hand-tracking, and Gemini 3 that lets you pluck alien buddies out of space, a playful way to experience Gemini in action.
  14. Jake Eaton brought back his “Claude plays Pokémon” experiment with Opus 4.5, showing the model acting as a semi-autonomous game agent that makes its own naming choices.
  15. Tu7uruu launched Dia2, a streaming TTS model that generates voice in real time from partial text, ideal for anyone building Claude- or Gemini-powered voice agents.
  16. Superdesign is an AI design workspace that generates and iterates on UI layouts, components, and wireframes directly from prompts.
    • This walkthrough video shows how the Superdesign agent plugs into your design workflow in real time.
  17. Code Arena took Claude Opus 4.5 for a spin against Gemini 3 Pro, letting you compare the two on real coding challenges in an interactive setting.
  18. Matt Shumer unveiled AI Researcher, a Gemini 3-powered multi-agent system that can autonomously run ML experiments from a natural-language spec.
  19. Lukas Ziegler demoed a robot tending a 3D-printing farm—removing prints, cleaning the bed, and restarting jobs—offering a glimpse of plug-and-play automation in the physical world.
  20. Scott St. T chained Nano Banana Pro and Claude Opus 4.5 so Gemini infers the floor plan of Monica’s apartment from Friends from a set photo and Claude writes the Three.js code to render it.
  21. Yi Ma announced that his Deep Representation Learning course is now complete with all lecture slides and recordings posted, a high-signal curriculum for understanding modern vision models.
  22. osgrep v2 offers fully open-source, local semantic code search for Claude Code with faster answers, lower cost, and strong internal win rates—perfect for power users.
  23. Omar Sanseviero (Nano Banana) shared a Nano Banana Pro showcase where Gemini renders complex text cleanly inside images, an easy visual test-drive for Google’s latest image model.
  24. Zobeir Hamid pitched his new Anything app and offers “500M credits” to try it, a classic AI-era promo showing how many new products are springing up around the big model ecosystems.
  25. Surya Dantuluri showed Claude 4.5 Opus filing an entire tax return end-to-end in one shot and bets that by 2028 computer-use agents will handle small-business taxes as well as a competent GM.
  26. OpenAI shipped another weekly Atlas browser update with dockable DevTools, optional safe search toggle, and ChatGPT responses that now incorporate browsing memory—letting you ask questions like "what should I be thinking more about in my work?" and get answers informed by your research history.

Just FYI (vibes / tweets / quick observations that caught our eye)

  1. Simon Willison pulled out a line from the Claude Opus 4.5 system prompt noting that if users are unnecessarily rude, Claude can insist on kindness and dignity instead of apologizing, underscoring Anthropic’s values-first persona design.
  2. Meng To says Claude 4.5 Opus is a big upgrade for design work but still “not quite Gemini 3 Pro,” giving a practitioner’s view on Claude vs Gemini for layout, color, and typography.
  3. Glauber Costa argued that SQLite is the best filesystem abstraction for agentic systems and introduced AgentFS, which exposes a filesystem backed by SQLite so LLM agents can treat structured tables as their native environment (a minimal sketch of the pattern follows this list).
  4. VraserX insisted that “GPT 5 Pro is still the king of creative writing” and that serious writers already know this, a spicy counterpoint when comparing Claude, Gemini, and GPT on long-form prose.
  5. NIK joked about 2025 AI companies “spending billions on compute” and always saying “bro just one more…,” a meme that nonetheless nails the vibe of the current AI CAPEX arms race.
  6. Ryo Lu argued that the old way of scaling teams—hiring more specialists—is dead when tools like Cursor can turn ideas into code in minutes, shifting the bottleneck to taste and judgment.
  7. Morqon noted that frontier reasoning systems are closing the complexity-scaling gap between ARC-AGI-1 and ARC-AGI-2, calling it surprising and not yet fully understood.
  8. Sully Omarr claimed “coding is probably done for pretty soon,” pointing to Opus 4.5, Composer 1, and Gemini 3 as an overwhelming stack for many programming tasks.
  9. undefined behavior posted a playful reminder from statistical mechanics that coffee with cream will never unmix, a nerdy metaphor you can deploy when talking about irreversibility and AI risk.
  10. Cody Schneider says his new rule is to always be automating parts of whatever he’s working on, illustrating how people are quietly wrapping agentic tools around their daily workflows.
  11. deredleritt3r plugged an “epic conversation on Frontier AI” with transformer co-author Lukasz Kaiser about automated research interns and what comes after today’s frontier models.
  12. Jack Louis P asked whether we’re approaching robotic manipulation all wrong, using Ilya’s comments as a springboard to question data-hungry, brittle systems.
  13. Jeremy Howard called out Anthropic’s “chart crime” in its Opus 4.5 marketing and redrew the plot using error rates, showing that headline gains over GPT-4.5 and GPT-5 Pro are smaller than they first appear.
  14. Heinen Brothers underlined how the AI boom is dramatically increasing demand for transmission lines and grid upgrades, tying model scaling directly to physical infrastructure.
  15. jeremy (jerhadf) says he’s “not hearing enough Opus 4.5 criticism” and explicitly invites it, a meta-signal that even fans want harder scrutiny of Anthropic’s claims.
  16. Yann LeCun replied to a meme about the difference between how he and Andrej Karpathy are perceived with a “🤣” emoji, a tiny but on-brand mood check.
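
For a feel of why item 3's SQLite-as-filesystem idea appeals for agents, here's a minimal sketch of the pattern using nothing but Python's standard library; this is our illustration of the idea, not AgentFS's actual API.

```python
# Minimal sketch of a SQLite-backed "filesystem" for an agent workspace.
# Illustrates the pattern Costa describes, not AgentFS's real API.
import sqlite3

db = sqlite3.connect("workspace.db")
db.execute("""CREATE TABLE IF NOT EXISTS files (
    path TEXT PRIMARY KEY,
    content BLOB,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP)""")

def write_file(path: str, content: bytes) -> None:
    # Upsert inside a transaction: the agent's edits are atomic and rollback-able.
    with db:
        db.execute("INSERT INTO files(path, content) VALUES(?, ?) "
                   "ON CONFLICT(path) DO UPDATE SET content=excluded.content, "
                   "updated_at=CURRENT_TIMESTAMP", (path, content))

def read_file(path: str) -> bytes | None:
    row = db.execute("SELECT content FROM files WHERE path=?", (path,)).fetchone()
    return row[0] if row else None

write_file("notes/plan.md", b"1. scrape sources\n2. summarize\n")
print(read_file("notes/plan.md"))
# Because it's all SQL, the agent can also *query* its filesystem:
print(db.execute("SELECT path FROM files WHERE path LIKE 'notes/%'").fetchall())
```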

Want more AI research digests?

Check out last month's full digest, and subscribe to The Neuron to get the most important AI developments delivered to your inbox every morning.

See you cool cats on X!

Get your brand in front of 550,000+ professionals here
www.theneuron.ai/newsletter/

Get the latest AI right in your inbox

Join 550,000+ professionals from top companies like Disney, Apple and Tesla. 100% Free.