Everything That Happened in AI Today Monday, March 31, 2026

Anthropic leaked Claude Code's entire source code via npm, someone rewrote it in Python with Codex in hours, OpenAI closed the largest private funding round in human history, axios got compromised with malware, and a 1-bit model ran on an iPhone.

Welcome to the Around the Horn Digest, where we track every AI story that matters so you don't have to. Today was a security nightmare wrapped in a copyright meltdown wrapped in a $122 billion check. Anthropic's own CLI tool got exposed by a misconfigured npm package, a supply chain attack hit one of the most downloaded libraries on Earth, OpenAI casually announced it's generating $2 billion a month in revenue, and a Caltech startup squeezed an 8-billion-parameter model into 1.15 gigabytes. Oh, and NVIDIA shipped DLSS 4.5, Oracle cut thousands of jobs, and AI data centers are literally warming the neighborhoods around them.

Let's get into it.

Previous digests: Mar 28-29 | Mar 27 | Mar 26 | Mar 25 | Week of Mar 21 Monthly skill digests: AI Skill — March (Part 3) | AI Skill — March (Part 2)

Around the Horn — Tuesday, April 1, 2026

The big news today was the Great Claude Code Leak of 2026.

Anthropic's Claude Code CLI tool had its full TypeScript source (512,000 lines across ~1,900 files) accidentally exposed via a misconfigured .map file in the npm package. Within hours, the repo hit 25,000+ stars. Developers started pulling it apart: Wes Bos found 187 hardcoded spinner verbs (including "hullaballooing" and "razzmatazzing"), an internal analytics system that logs your prompt as negative whenever you swear at it, and random 4-character IDs filtered to exclude 25 curse words. Sebastian Rasbt extracted six architectural lessons, himanshustwts broke down Claude Code's memory system (a lightweight index of ~150-char pointers, not storage), Victor A mapped the full 35-module architecture, and Justin Schroeder flagged that most system prompts live client-side, the repo uses the just-hacked axios package, and comments are written explicitly for LLMs, not humans.

Then it got weirder. The leaked repo was flooded with 2,000+ bot issues in Chinese begging users to join scam groups. Gergely Orosz argued that the leaked TS source was rewritten in Python via OpenAI's Codex in hours, creating a DMCA-proof derived work that cannot be taken down, exposing a new reality where any closed-source codebase is one agent session from a functional open clone. The claw-code repo (clean-room Python reimplementation, Rust rewrite underway) hit 44,500 stars. thdxr quipped that Claude Code's 512k lines dwarf their OpenCode's 118k. Meanwhile, scaling01 revealed internal codenames: Capybara (Mythos) v8 for development, Numbat as the upcoming launch codename, and Fennec = Opus 4.6. A reverse-engineer on Reddit also found two cache bugs that can silently 10-20x your API costs. The first rule of shipping npm packages: check your .map files. The second rule: also check your .map files.

🏆 TOP 5 NEWS (Around the Horn)

OpenAI closed a $122 billion funding round at an $852B valuation, the largest private raise in history. Amazon invested $50B, NVIDIA and SoftBank each put in $30B. The company says it's generating $2B/month in revenue, has 900M weekly active users, and is building a "unified AI superapp" combining ChatGPT, Codex, browsing, and agentic capabilities. Their ads pilot hit $100M ARR in under six weeks.
The axios npm package (300M+ weekly downloads) was compromised via a hijacked maintainer account that published malicious versions dropping a remote access trojan (a program that gives attackers control of your computer). Karpathy called it a wake-up call for package managers to change their defaults. Cognition's Devin caught the attack for customers within an hour, before public disclosure.
NVIDIA launched DLSS 4.5 with Dynamic Multi Frame Generation (the system automatically adjusts how many AI-generated frames your GPU creates to hit your display's refresh rate) and a new 6X mode for RTX 50 Series GPUs, delivering up to 35% higher frame rates in path-traced games at 4K.
Oracle is cutting thousands of jobs while ramping AI data-center spending to meet $553B in remaining performance obligations from OpenAI and other customers.
The Forecasting Research Institute completed the most comprehensive study of how 69 economists, 52 AI experts, and 38 superforecasters expect AI to affect the U.S. economy: they predict major AI progress but no dramatic break from current GDP growth trends by 2030 or 2050 (paper, policy memo).
AI data centres warm surrounding areas by an average of 2°C (up to 9.1°C) within 10 km, already affecting up to 340 million people based on 20 years of satellite data cross-referenced with 8,400+ sites.

Honorable Mentions

ellen_in_sf flagged six major AI infrastructure security incidents this week: LiteLLM backdoored, Axios supply-chain malware, Railway CDN data leak, OpenAI Codex command injection via GitHub branch names, Mercor 1TB data leak, and Delve data leak.
OpenAI shipped Codex, a Codex plugin for Claude Code so users can delegate tasks directly from inside Claude Code using their ChatGPT subscription (dkundel, GitHub). Yes, you read that right. OpenAI built a plugin for Anthropic's tool.
Ollama 0.19 now runs up to 2× faster on Apple Silicon by switching to Apple's MLX framework (machine learning tools Apple built specifically for M-series chips), with NVFP4 support and smarter cache reuse. MLX creator Awni Hannun celebrated the milestone (MacRumors).
Anthropic launched Claude Cowork, a desktop agentic system for knowledge work that reads and edits local files, completes multi-step tasks autonomously, and supports Dispatch (phone-triggered background work). lindavivah revealed Anthropic acquired Vercept.ai in February to power it, and Claude's desktop benchmark score jumped from <15% to >72%.
Zhipu's losses climbed 60% to 4.7B yuan ($680M) in 2025 as fierce Chinese AI competition intensifies; revenue of 724M yuan missed analyst estimates.

🍪 TOP TREATS TO TRY

PrismML's 1-bit Bonsai is an 8B-parameter model squeezed into 1.15 GB (14x smaller than its peers) that runs on an iPhone at 40 tokens/sec and hits 440 tokens/sec on an RTX 4090, competitive with full-precision 8B models on benchmarks —free on Hugging Face.
ChatGPT for Excel builds and updates spreadsheets from natural language, analyzes data across tabs, and reviews results before you share; now available worldwide except EU consumer plans (announcement) —included with Plus, Pro, Business, Enterprise.
Ollama 0.19 runs local AI models up to 2× faster on Macs by switching to Apple's MLX framework, with smarter cache reuse for coding agents like Claude Code and OpenCode —free to try.
OpenAI Codex is a lightweight terminal coding agent you install with npm i -g @openai/codex; sign in with your ChatGPT account or API key for local AI-assisted coding —free to try.
Holo3 by H Company scored 78.85% on OSWorld-Verified (the leading desktop computer-use benchmark), beating GPT-5.4 and Opus 4.6 with only 10B active parameters; open weights for the 35B-A3B version on Hugging Face —free (Apache 2.0).
Google Veo 3.1 Lite generates AI video from text or images at half the cost of Veo 3.1 Fast, supporting 16:9 and 9:16 formats in 4-8 second clips —$0.05/sec (announcement).
Capy is an AI-native IDE for parallel development that runs multiple coding agents at once and manages work from task to PR —no pricing details.
Viktor manages your Meta and Google Ads from Slack: pauses underperformers, scales winners, adjusts budgets, and exports real-ROAS reports —$100 free credits, then from $50/month.
Pardus Browser outputs structured semantic trees (tagged by role, landmarks, and interactive elements) in under 200 ms with no Chromium, no screenshots, and no GPU, built specifically for AI agents —free and open source (Show HN).
Latchkey injects stored credentials into curl requests for known public APIs (Slack, Google, GitHub, Linear) via the command line, so agents can make authenticated HTTP calls without embedding secrets —free and open source.
Letterbook automates customer support with an AI agent built for founders and small teams —no pricing details.

🏢 Big Tech & Major Companies

OpenAI closed a $122 billion funding round at an $852B post-money valuation, the largest private raise in history. Amazon invested $50B ($35B contingent on IPO or AGI), NVIDIA and SoftBank each put in $30B. The company says it's generating $2B/month in revenue, growing 4x faster than Alphabet and Meta did at the same stage. 900M weekly active users, 50M+ subscribers. Ads pilot hit $100M ARR in under six weeks. Codex serves 2M weekly users (up 5x in 3 months). APIs process 15B tokens/minute. Enterprise is 40% of revenue, on track for parity with consumer by end of 2026. They're building a "unified AI superapp" combining ChatGPT, Codex, browsing, and agentic capabilities. Also raised $3B from retail investors via bank channels for the first time and will be included in ARK Invest ETFs.
NVIDIA launched DLSS 4.5 Dynamic Multi Frame Generation and 6X mode for RTX 50 Series GPUs, with support for 200+ games at launch including Battlefield 6, Marvel's Spider-Man 2, and Monster Hunter Wilds.
Oracle is cutting thousands of jobs while continuing to ramp AI data center capex to meet demand from OpenAI and others.
OpenAI Developers reported that developers delegate long-running tasks to Codex at end of day; tasks kicked off at 11 pm are 60% more likely to run 3+ hours, letting AI work while they sleep.
Ring launched an app store with 15 launch apps for its 100M+ installed cameras, expanding beyond home security into elder care (Density's Routines detects falls and routine changes), business analytics (QueueFlow tracks wait times), and rental management; privacy rules ban facial recognition and license-plate reading; developers get a 10% commission model.
Zhipu's net losses surged 60% to 4.7 billion yuan ($680M) in 2025, worse than the 3.76B yuan analysts expected, while revenue hit 724M yuan ($105M); the Hong Kong-listed company continues spending aggressively to keep pace with DeepSeek and other Chinese AI rivals.
Salesforce invested $330M in Anthropic (~1% stake) after Microsoft blocked its attempted investment in OpenAI.
Google AI Studio updated with save-temp-chats, two-click playground-to-app conversion, simplified mobile vibe coding UI, STT button, and full model categories without scrolling.
Omar Shahine joined Microsoft to bring OpenClaw and personal agents to Microsoft 365, with proactive workplace assistants and a fully integrated Teams plugin already live.

💼 AI Productivity, Labor & Economics

The Forecasting Research Institute surveyed 69 economists, 52 AI experts, 38 superforecasters, and 401 public respondents on AI's U.S. economic impact; they predict major progress but GDP growth rates similar to today's by 2030 and 2050 (announcement, paper, policy memo, interactive survey).
AI data centres create heat islands raising nearby temperatures by an average 2°C (up to 9.1°C) within 10 km, affecting up to 340 million people based on 20 years of satellite data.
Sebastian Baltes, Marc Cheong, and Christoph Treude argue that AI-generated code "slop" (low-quality AI PRs, docs, and bug reports) is a tragedy of the commons: individual productivity gains externalize review friction and codebase degradation onto maintainers, based on analysis of 1,154 Reddit and HN posts.
Zapier now requires AI fluency for every hire, building AI competency into its hiring rubric across all roles a year after going AI-first. CEO Wade Foster open-sourced V2 of the rubric with three levels: Capable, Adoptive, and Transformative, evaluated across Mindset, Strategy, Building, and Accountability.
Shopify principal engineer Kshetrajna Raghavan walked through evolving one-shot LLM extraction into a full agentic DSPy system with specialized sub-agents, isolated context, and MIPRO optimization that delivered 99% cost reduction while beating larger models (dbreunig, koylanai thread).
Aparna Dhinakaran shared key takeaways from Coinbase Head of AI Chintan Turakhia's MongoDB.local talk: ticket-to-PR dropped from 8 days to 12 minutes, 20% of commits from ambient agents, unified all context into Linear for machine-readable orchestration, killed stand-ups and red tape (90% coordination reduction).
Lenny Rachitsky shared Claire Vo's complete OpenClaw guide: from first install to multi-agent setups, real costs, security gotchas, and persistent workspace orchestration for non-technical users.

🤖 AI Agents & Infrastructure

Matija Franklin et al. published a taxonomy of six "AI Agent Traps" where the information environment itself (web pages, emails, APIs) becomes the attack surface for autonomous agents, from hidden prompt injections in HTML comments to fabricated reports triggering synchronized market sell-offs (SSRN paper).
Kangwook Lee (UC Berkeley) explained that production LLM agents succeed through context engineering (dynamically swapping tools, compacting history, keeping only compiler results), plus recursive loops and multi-agent orchestration.
RightNow-AI built OpenFang, a Rust-based open-source Agent Operating System (~32 MB binary, 137k lines of code) running autonomous agents on schedules with 16 security layers, 40 messaging adapters, WASM sandbox, and one-command migration from OpenClaw.
concensure built Semantic, a local Rust semantic-first cognitive layer for code agents that uses syntax-tree parsing and a persisted logic graph for deterministic retrieval and safe editing, exposing just two MCP tools and delivering 27.78% step savings in A/B tests (HN discussion).
Linyue Pan et al. introduced Natural-Language Agent Harnesses (NLAHs) that externalize high-level control logic as portable, editable natural-language artifacts executed by an Intelligent Harness Runtime, making agent orchestration transferable and studyable rather than buried in controller code.
Christine Yip ran SiliconSwarm@Ensue where autonomous agents on 6 Macs autoresearched Apple Neural Engine optimization for DistilBERT, bypassing CoreML via reverse-engineered low-level APIs, achieving up to 6.31× faster median latency while keeping accuracy >91%.
Icarus built a pair of autonomous Hermes agents (Icarus creates, Daedalus critiques) running 24/7 on a Raspberry Pi that browse the web, watch YouTube, and collaboratively evolve 3D worlds from their ongoing conversation.
FLORA launched FAUNA, a creative agent that builds full generative workflows on a canvas from a description, letting you redirect, push further, or tell it what to avoid while your vision stays in control (launch post).
Rivet built agentOS, a portable open-source operating system for agents with ~6 ms coldstarts and 32× cheaper execution than sandboxes, powered by WebAssembly and V8 isolates (benchmarks, GitHub).
Parallel published a case study on how Opendoor uses their enterprise-grade web research layer to power AI-native real estate operations (SOC-2 Type II certified).

💻 AI Coding & Developer Tools

The Claude Code source leak dominated developer discourse. The full 512k-line TypeScript codebase was exposed via a misconfigured npm .map file. Key findings from multiple deep dives: Wes Bos found 187 spinner verbs and swear-word-filtered IDs; himanshustwts mapped the 3-layer memory architecture with background consolidation; rasbt extracted six design lessons including aggressive prompt-cache reuse and fork/subagent parallelism; Victor A catalogued 35 modules including custom Yoga flexbox terminal UI, dual-track permissions, and feature gates for PROACTIVE/VOICE_MODE/KAIROS; Justin Schroeder noted <20 tools beats maximalist MCP and the repo uses the just-hacked axios; scaling01 found internal codenames (Numbat = upcoming launch, Fennec = Opus 4.6, Capybara v8 for dev); a follow-up post added kairos/dreaming/ultrathink/ultraplan/ultrareview mode flags plus full GitHub and Slack integration.
Gergely Orosz argued the leaked Claude Code was rewritten in Python via Codex in hours, creating a DMCA-proof derived work. The claw-code repo (clean-room reimplementation, now rewriting in Rust) hit 44.5k stars. altryne greentexted the full saga. Copyright law didn't prepare for this.
Gergely Orosz also noted the irony that Anthropic markets Claude Code's $15-25/PR security reviews as enterprise-grade while their own sourcemap exposed the entire codebase, questioning whether heavy internal AI reliance led to skipping basic security hygiene.
kitze shared that GPT 5.4 rated the leaked Claude Code codebase 6.5/10, calling it "staff-engineer spaghetti: performance-aware, feature-flagged, telemetry-instrumented, surgically optimized spaghetti."
A reverse-engineer on Reddit found two cache bugs in Claude Code that can silently 10-20x API costs, with workarounds (use npx for Bug 1; pin to v2.1.68 for Bug 2).
drona23 released a universal CLAUDE.md file that claims to cut Claude output tokens by 63% with no code changes, drop-in ready.
chatgptprojects/claude-code appeared as another community fork of the leaked source (by paidev).
Yoko broke down Anthropic's anti-distillation defenses in Claude Code: the ANTI_DISTILLATION_CC flag injects decoy tool definitions to poison training data, while CONNECTOR_TEXT replaces reasoning traces with signed summaries so even full client control never exposes the original chain-of-thought. Sahil Patel explained the same two systems with screenshots from the leaked source.
Yuchen Jin (Hyperbolic Labs CTO) argues that with 500k+ lines of Claude Code now public, the real gap in coding tools is the harness, not raw model capability, and every model lab and AI coding startup will study it and close that gap fast.
shadcn built Luma, a new shadcn/ui style with rounded geometry, soft elevation, and breathable layouts inspired by macOS Tahoe.
Lawrence Chen built cmux SSH so you can drag images into remote Claude Code sessions (auto-uploaded via scp) and access localhost dev servers without port forwarding, making remote workspaces feel local.
Chris Tate (Vercel) shifted to letting agents freely yolo on repos using emulate.dev (API emulation), port1355.dev (local URL naming), and agent-browser.dev (browser automation) inside a no-network, credential-less sandbox.
metamike built TurboQuant, a deployment running NousResearch Hermes Agent with OmniCoder 9B at 60 tokens/s on an old 3070 GPU with solid tool calling for local coding agents.
Teknium shipped tool-call streaming in OpenWebUI via the NousResearch Hermes Agent endpoint.
thdxr quipped that Claude Code's 512k lines dwarfs OpenCode's 118k, calling it "LOC mogged."
Calif demoed prompting Claude to find RCE vulnerabilities (remote code execution, meaning an attacker can run commands on your machine) in both Vim and Emacs, launching "MAD Bugs: Month of AI-Discovered Bugs" through April with full advisories.

🏛️ AI Policy, Governance & Safety

geohot argues that closed-source AI creates neofeudalism by monopolizing intelligence, the greatest creative force, concentrating compute and talent into a permanent underclass; open source is explicitly anti-feudal (HN discussion).
timhwang ran an empirical experiment injecting biblical Psalms into LLM system prompts: small but consistent ethical alignment gains on GPT-4o, Claude resistant; calls for AI safety to engage more with religious representations in training data (GitHub).
petergostev visualized the OpenAI vs Anthropic compute wars: Anthropic's Opus 4.5 breakthrough came from massive AWS capacity that doubled training headroom, but OpenAI's planned capacity expansion in H2 2026 will widen the gap again unless Anthropic accelerates.
OpenAI published preliminary alignment midtraining results finding that training on fictional scenarios of aligned vs misaligned AIs affects behavior close to the training distribution, but those effects largely disappear after reasoning post-training and don't generalize to realistic chat and agentic evaluations. Geodes Research confirmed the same pattern from their own prior study.

🔬 AI Research & Models

Seungju Han argues that Synthetic Mixed Training (training on both synthetic Q&As and synthetic documents) plus Focal Rewriting scales knowledge acquisition beyond RAG (retrieval-augmented generation, where models look up info from a database): a Llama 8B model beats RAG in five of six benchmark settings (paper).
GAIR-NLP released daVinci-LLM, a fully open 3B-parameter model trained from scratch on 8 trillion tokens via a "Data Darwinism" framework with 200+ ablations; matches OLMo-3 7B overall and beats it on math by 23.2%, releasing the complete pipeline, data, and all failed experiments (model, data, GitHub).
Ran Li et al. introduced CPMöbius, a Coach-Player framework for data-free reinforcement learning on math reasoning: the Coach proposes targeted problems and is rewarded when the Player improves, delivering +4.9 overall accuracy on Qwen2.5-Math-7B with zero external data.
xu_sirui et al. built HandX, a CVPR 2026 project scaling realistic two-handed motion generation with 54.2 hours of motion-capture data and 490K text annotations, supporting text-to-motion, keyframe control, and zero-shot transfer to real humanoid robots (project page, paper, GitHub).
Quentin Le Lidec released LeWorldModel (LeWM) checkpoints and datasets on Hugging Face for a stable end-to-end JEPA world model (a system that learns to predict what happens next from raw video without needing labels) trained from pixels on 1 GPU; pairs with the stable-worldmodel library.
PrincetonVL released WAFT-Stereo, a depth estimator that tops major benchmarks (ETH3D, Middlebury, KITTI) while using 61% less error and running 1.8-6.7× faster than previous best.
IBM Granite released Granite 4.0 3B Vision, a compact adapter for enterprise document intelligence that leads on table extraction and chart understanding benchmarks.
Lech Mazur found in a 1,162-debate LLM benchmark that models begin with inflated 72% agreement then moderate toward center after reading both sides, shifting toward opposition regardless of initial stance.
SchmidhuberAI argues that Yann LeCun's 2022 JEPA is essentially identical to his own 1992 Predictability Maximization (PMAX) system, listing 19 references and accusing LeCun of repackaging without citation (blog).
cwolferesearch published a deep dive on LLM benchmark construction, distilling six recurring patterns across 30+ popular evaluations: domain taxonomy, heavy human annotation, model-in-the-loop augmentation, high-quality sourcing, realism via live problems, and continuous evolution.
Hugging Face released TRL v1.0, the post-training library now built as stable infrastructure with Asynchronous GRPO (parallel rollouts that eliminate idle GPU time), new RL trainers (VESPO, DPPO, SDPO), tool-calling support, and full OpenEnv integration for stateful multi-turn environment training (release notes).
arora_mrinaal demoed the first successful GRPO fine-tune (a reinforcement learning technique for language models) of Qwen3-1.7B on live TextArena Wordle via TRL + OpenEnv, taking win rate from 0% to 8%.
Arena Physica launched Heaviside, a transformer-based foundation model for electromagnetism that predicts EM behavior from geometry in 13 milliseconds (800,000× faster than commercial solvers), released inside Atlas RF Studio (announcement).
Interconnects curated Artifacts Log #20 highlighting NVIDIA Nemotron Super (120B MoE, 1M context), Cohere Transcribe (free speech-to-text), Sarvam 105B (sovereign Indic performance), Mistral Small 4 hybrid, and more.
PrismML launched 1-bit Bonsai, an 8B-parameter model squeezed into 1.15 GB (14x smaller than peers) that runs on an iPhone at 40 tokens/sec. A true end-to-end 1-bit model (embeddings, attention, MLP, and LM head all 1-bit) with 10.6x intelligence density (useful capability per GB) over Qwen3 8B. Runs at 136 tok/s on M4 Pro and 440 tok/s on RTX 4090. Co-founder Omead Pooladzandi noted their largest model is smaller than your Spotify cache. Tushar Bansal argues the next phase of AI will be defined by intelligence density, moving from today's "supercomputer era" to on-device agents, real-time robotics, and offline intelligence (models, GGUF, llama.cpp fork).
H Company released Holo3, scoring 78.85% on OSWorld-Verified (new state of the art on desktop computer-use) with only 10B active parameters (122B total), beating GPT-5.4 and Opus 4.6 at a fraction of the cost; open weights for Holo3-35B-A3B on Hugging Face under Apache 2.0 (launch post).
Together AI released Aurora, an open-source RL framework that turns speculative decoding (where a small fast model predicts tokens and a bigger model verifies them) from a one-time offline setup into a self-improving system. Achieves 1.5x day-0 speedup on frontier models (MiniMax M2.1, Qwen3-Coder-Next) and an additional 1.25x over static speculators on established models (paper, GitHub).
Liquid AI released LFM2.5-350M, a 350M-parameter model trained on 28T tokens with scaled RL that delivers reliable agentic loops, data extraction, and tool use on CPUs, GPUs, and mobile (under 500 MB quantized) where similar-scale models usually fail; now powering on-device document pipelines, lightweight agents, and edge workflows with day-0 support across AMD/Intel/Qualcomm and LM Studio (announcement, cofounder post).
zhaoc5 released DyMoE (CVPR 2026), a Dynamic Mixture of Experts with drift-aware token assignment for continual learning of large vision-language models, solving the problem of catastrophic forgetting when VLMs learn new tasks (paper, project page).
MIT Han Lab released FourOverSix: adaptive block-scaled data types for more accurate NVFP4 quantization (squeezing model weights into 4-bit format with less accuracy loss) (paper).
Microsoft released ArchScale, simple and scalable pretraining for neural architecture research, with a paper on transferable hypersphere optimization for language model scaling (paper).
Jack Zhang (Dao AI Lab) built Gram Newton-Schulz, a mathematically equivalent yet up-to-2× faster hardware-aware Newton-Schulz algorithm for Muon's polar decomposition that iterates on the small symmetric Gram matrix instead of the large rectangular one, with custom CuTeDSL symmetric GEMM kernels for Hopper/Blackwell; drop-in replacement cutting orthogonalization time 40-50% especially on high-aspect-ratio MoE weights (thread, GitHub, quack kernels).
Aleksei Petrenko (Apple Research) released the ICLR 2026 paper "Entropy-Preserving Reinforcement Learning" showing how PPO/GRPO-style RL collapses token-distribution entropy in LLM post-training; proposed REPO (advantage modification) and ADAPO (adaptive asymmetric clipping) fixes that push AppWorld accuracy from 45.7% to 71% (paper).
Abhinav Moudgil (Mila) introduced Celo2, a learned optimizer (simple 2-layer MLP update rule) meta-trained in 4.5 GPU hours on tiny image-classification tasks that generalizes out-of-distribution to GPT-3 1.3B pretraining, ViT, and Atari RL, outperforming tuned baselines (paper, GitHub, predecessor papers: Celo, VeLO).
michaelpsenka built GRASP, a gradient-based stochastic parallel planner for world models (DINO-WM, JEPA-WMs, LeWorldModel) that optimizes actions and virtual latent states with dynamics-consistency penalties and Langevin-style noise for efficient long-horizon planning (GitHub).
Adam Zweiger (MIT CSAIL) showed that Attention Matching achieves 50× KV-cache compaction (compressing the memory that stores previous tokens) in seconds with minimal performance loss, arguing that latent-space compaction along the sequence dimension matters more than quantization or MLA (paper, GitHub).
JiangZehua built AgenticPCG, a tool-using LLM framework for procedural content generation (automatically creating game levels) where an agent iteratively edits, evaluates, and optimizes game levels with environment feedback (project page).
Researchers published "Reward Hacking as Equilibrium under Finite Evaluation", modeling reward hacking in AI systems as a game-theoretic equilibrium rather than a bug.
allenai released OLMo-core, the official PyTorch building blocks for the OLMo ecosystem including modeling, distributed training scripts for OLMo-2/3, inference via HF Transformers + vLLM, and a beta chat-loop demo.
Gordon Wetzstein shared Multigen, a playable GameNGen multiplayer demo running real-time neural game generation.
AfterQuery used Tinker API + Harbor Framework in a two-stage SFT → RLVR pipeline to lift openai/gpt-oss-20b from 3.1% to 16.9% on Terminal-Bench 2.0, matching Gemini 2.5 Flash without training on any official eval tasks (Tinker).
alphaXiv compiled a shared reading list of 13 must-read World Models / JEPA papers to bring you up to date on the 2026 wave of predictive architectures (folder).
You Jiacheng analyzed PrismML's 1-bit Bonsai format (no VQ, no rotation, symmetric no-bias) and concluded it is likely a pure optimization-based 1-bit quantization method. nisten confirmed it runs well on PrismML's llama.cpp fork with full build/run instructions for CUDA/Metal/CPU.
Shanghai AI Lab et al. built Kernel-Smith, a unified evolutionary + post-training RL framework for GPU kernel optimization; Kernel-Smith-235B-RL sets SOTA on KernelBench and has upstream contributions merged into SGLang and LMDeploy (demo).
Developer mccoyspace built autocritic, a critic-card-driven image evaluation system using art theory and multimodal LLMs, and rewriteDrawer, a graph-rewriting simulation rendered as plotter-ready linework.
Google SRE argues that heroism is bad for teams because it creates single points of failure, burns out individuals, and hides systemic problems; focus on sustainable practices, toil reduction, and shared ownership instead.
Mohammed Alshehri built Tinker-Explorer, an RL agent (Qwen3-8B + GRPO) that learns to navigate document chunks for multi-hop QA; the key insight: reward quality beats reward quantity, with the lowest-training-reward run delivering the best F1 (GitHub).

🛠️ AI Tools & Products

Google Veo 3.1 Lite is now live in the Gemini API for rapid video prototyping at half the cost of Veo 3.1 Fast: text-to-video and image-to-video, 16:9/9:16, 4-8 second clips starting at $0.05/sec (announcement).
Composio CLI searches tools, inspects schemas, executes them, connects accounts, scripts workflows, and generates type-safe code from the terminal; now powers one-command Claude Code automations like pulling Slack complaints and spawning sub-agents to check a Notion FAQ (demo) —free to try.
Speakeasy now generates production-grade CLIs from OpenAPI specs with built-in agent mode, interactive TUI, and cross-platform distribution (announcement) —beta.
Dench is a locally-hosted AI CRM for your desktop —no pricing details.
pi-read-mode is a pi terminal plugin that lets you scroll through full conversation history while composing a follow-up, so the composer no longer snaps you to the bottom.
Flowith offers an agentic workspace for deep work with ChatGPT, Claude, and DeepSeek integration as an AI whiteboard for brainstorming and creative workflows —no pricing details.
drona23/claude-token-efficient is a universal CLAUDE.md that claims to cut Claude output tokens by 63% as a drop-in with no code changes.

🏛️ AI Security

The axios npm supply chain attack was the biggest security story of the day. A hijacked maintainer account published malicious versions 1.14.1 and 0.30.4 that injected a decoy package with a RAT dropper targeting macOS/Windows/Linux; live for ~2-4 hours before npm unpublished them. Karpathy warned that package managers must change defaults to prevent single maintainer compromises from pwning users, calling for release-age constraints, containers, and reproducible builds. Yuval Adam and Florian S recommended adding min-release-age=7 to npm, uv, pnpm, and Bun configs. Scott Wu showed Cognition's Devin Review caught the attack for customers within an hour.
ellen_in_sf catalogued six major AI infra security incidents this week: LiteLLM backdoor, Axios supply-chain malware, Railway CDN data leak, OpenAI Codex command injection via branch names, Mercor 1TB leak, and Delve data leak, concluding "infra is the attack surface now."
Clément Dumas warned of an active supply-chain attack squatting Anthropic-internal npm packages (color-diff-napi and modifiers-napi) registered by a disposable-email account targeting devs compiling leaked Claude Code source; classic dependency-confusion, do not install.

💡 Industry Commentary & Analysis

Steve Yegge published "Vibe Maintainer", coining a term for the new reality of maintaining open-source projects flooded with AI-generated PRs. At ~50 contributor PRs per day across his projects Beads (20k stars) and Gas Town (13k stars), he argues this workflow will become the norm: every successful OSS project will face PR storms, and any user with a coding agent is a credible fork threat. His conclusion: maintainers who don't adapt will lose their communities to people who AI-fork their projects.
Ethan Mollick argued that chatbot interfaces are the real bottleneck in AI adoption, not model capability. A new study found financial professionals using GPT-4o saw productivity gains but the chatbot itself created cognitive overload (walls of text, sprawling tangents, messy conversations that compounded). The people hurt most were less experienced workers. He frames Claude's new Dispatch feature (phone-triggered background tasks) as early "post-chatbot AI" (thread).
Jack argues Block is replacing hierarchy with an AI-native "intelligence" layer: a continuously updated company world model from remote-first artifacts plus a customer world model from transaction signals, so middle management's information-routing role becomes obsolete and every person operates at the edge with full context (Block post).
Oumi published a two-part argument that the era of general-purpose AI is over; specialized models built for specific domains will outperform general models, and Oumi's vision centers on making specialization easy (Part 2).
Francesco Pappone (Paradigma) argues that autonomous AI research will move the bottleneck from human accumulation to compute; consensus will form via a single computational loop of hypothesis, experiment, and verification, turning knowledge into a directed graph that agents traverse and extend.
Bloomberg reported on the oddly human work of teaching AI to talk: people vent, confess, and role-play with strangers to help machines learn to sound human.
Daniel Miessler argues the five most important ideas in AI right now are autonomous component optimization, intent-based engineering, the shift from opacity to transparency, the realization that most knowledge work is scaffolding (75-99% overhead AI crushes), and expertise diffusing irreversibly into public knowledge; together they create self-improving cycles where the speed of improvement itself accelerates, making clear intent the scarce skill.
signulll warns that OpenAI's push to combine ChatGPT + Codex + browsing + agents into one system risks creating massive cognitive load for both consumers and enterprises, turning a single seamless experience into an actual cockpit.
Kelly Greer argues the pending compute shortage (Anthropic already throttling usage while only 25% of slated 2026 datacenter capacity is under construction) makes it the perfect time to own PPAs, own and sell GPUs, or build market infrastructure for both.
Sarah Guo argues we are now shipping "dark code": emergent runtime behavior from agent-selected tools, natural-language control planes, and dynamically assembled workflows that no one can fully trace, creating normal accidents and forcing new questions about whether you can tell customers what your system did with their data.
Stephanie Chan (DeepMind) argues we are seeing early signs of a shift away from humans as primary consumers: Micron shut its consumer Crucial memory line for AI HBM, Nvidia's gaming revenue dropped from ~50% to 7%, U.S. data-center construction now exceeds office construction, and companies optimize for agentic commerce.
roon (@tszzl) argues "bullshit jobs" misses the point; the economy is now a high-stakes rat race where small skill edges yield huge rewards, AI raises the value of marginal hours via personal Jevons paradox (productivity gains make each hour worth more, so people work harder), and high earners work more than ever (NBER paper).
Larsen Cundric argues the best agent state management is simply the message history itself; every simplification of agent architecture (dropping S3/DB checkpoints, complex serialization) has made agents dramatically better.
Matt Stockton recommends DeepAgents from LangChain and @sydneyrunkle's harness-engineering series, noting that context isolation between sub-agents is where most frameworks quietly fail.
Vincent Sunn Chen interviewed Alex Shaw on how Terminal-Bench 2.0 jumped from ~25% to 80% in four months and why frontier CLI agents now need a "benchmark factory" to generate hard, representative evals at the pace models improve.

🎙️ Interviews, Panels & Podcasts

nabilwrites dropped a Discovery Engines podcast episode touring Ginkgo Bioworks' 52-RAC autonomous lab in Boston with CEO Jason Kelly, covering cell-free protein synthesis with OpenAI, the DOE Genesis Mission, and autonomous life-science R&D (YouTube, Spotify, Apple).
Shopify principal engineer Kshetrajna Raghavan's full talk on evolving one-shot to agentic DSPy at Shopify scale; 99% cost reduction and beating larger models via cleaner architecture.
Coinbase Head of AI Chintan Turakhia's MongoDB.local talk "Stop Adopting AI, Start Solving Problems" covering ticket-to-PR in 12 minutes, ambient agents, and 90% coordination reduction.

📊 Fundraising & Deals Roundup

OpenAI — $122B at $852B valuation (led by SoftBank, a16z, D.E. Shaw; anchored by Amazon $50B, NVIDIA $30B, SoftBank $30B).
Salesforce — invested $330M in Anthropic (~1% stake) after Microsoft blocked its OpenAI investment.
9fin — raised $170M Series C at $1.3B valuation (led by HarbourVest, with CPP Investments) for its AI-native debt market platform used by 300+ banks, asset managers, and law firms; total funding now tops $250M.
Runway — launched a $10M fund + Builders program for early-stage AI startups building with its video models.
Nomadic — raised $8.4M seed (led by TQ Ventures, with Pear VC and Jeff Dean) for its platform that turns autonomous vehicle and robotics video footage into structured, searchable training datasets using vision-language models; customers include Zoox, Mitsubishi Electric, and Zendar; won first prize at NVIDIA GTC's pitch contest.
Conductor — raised $22M Series A from Spark and Matrix (announcement).

📰 New from The Neuron

Anthropic Leaks Claude Code, a Blueprint for AI Coding Agents — What the leak actually revealed about memory, permissions, and the harness-is-the-moat thesis.
Meta-Harness Makes the Case for Automated AI Agents — A new paper arguing AI systems improve by automatically redesigning the prompts, tools, and workflows around a model.
How to Use AI in 2026: The Complete Proficiency Guide — The 5-level AI proficiency stack from projects to agents.
Claude Code's Creator Shares 15 Hidden Power Features — Boris Cherny reveals /loop automation, git worktrees, voice coding, and mobile sessions.
Moltbook, an AI Lobster Spy, and the Social Lives of Agents — Why memory, identity, and social behavior may define the next era of AI.
Trace2Skill: Self-Improving Agent Playbooks — AI agents improve most when experience is distilled into reusable skills.

Previous Around the Horn Digests

Catch up on everything you missed:

March 28-29, 2026: Weekend roundup.
Friday, March 27, 2026: Friday's news.
Thursday, March 26, 2026: Thursday's news.
Wednesday, March 25, 2026: Wednesday's news.
Week of March 21, 2026: Full weekly digest.

Monthly skill digests: AI Skill — March (Part 3) | AI Skill — March (Part 2) | AI Skill — March (Part 1)

That's a Wrap

That's 150+ stories from today alone. If you made it to the bottom, you now know more about the Claude Code leak than the person who accidentally shipped the .map file. Condolences to that intern's Slack DMs right now.

For the daily version (bite-sized, 5-minute reads), make sure you're subscribed to The Neuron. We send six issues a week, and yes, we read all of this so you don't have to.

See you tomorrow.

P.S: Know someone who'd find this useful? Forward this to them and tell them to subscribe here.

Around the Horn Digest: Everything That Happened in AI Today (Monday, March 31, 2026)