😺 Can AI predict the future?? 🤔🔮

PLUS: Get better AI search results w/ this trick!

Welcome, humans.

Sam Altman just revealed OpenAI is “very serious” about adding encryption to ChatGPT, likely starting with temporary chats. The reason? “People pour their heart out about their most sensitive medical issues or whatever to ChatGPT,” Altman told Axios, adding that the experience has “radicalized me into thinking that AI privilege is a very important thing to pursue.”

Here's why this matters: OpenAI wants the same legal protections for AI conversations that you get with doctors and lawyers. But there's a catch: true encryption is complicated when the AI provider needs access to your data for features like memory and training.

This need for encryption exists because people use ChatGPT for everything… which is exactly the problem the recent GPT-5 launch exposed. The GPT-4o vs GPT-5 debate accidentally revealed we're cramming too many use cases into one product.

One highly upvoted post pushed back on how this debate has been framed, arguing that reducing it to “some people want cuddly emotional support AI” versus “real users who need serious work done” misses the point. People are genuinely grieving the loss of their “thinking partner.” They described losing GPT-4o's contextual intelligence (how it tracked thought patterns and built continuity) only to get GPT-5's capable but sterile assistant that couldn't hold meaningful context across interactions.

Here's the thing: these should be two completely separate products. Instead of ChatGPT Plus/Pro, we need GPT Personal and GPT Professional. While we’re at it, we could even specialize down to GPT Doctor or GPT Lawyer—licensed, clinician-grade models where your conversations are actually privileged. Right now, we're shoving every use case into one model and wondering why nobody's 100% satisfied.

Thanks for listening to our soapbox. Now, please enjoy carboarding:

Don’t you just love it when ppl use AI to make genuinely dumb and amazing stuff like this instead of trying to replace real filmmaking? Though if real filmmakers wanted to recreate this… we’d also be down…

Here’s what happened in AI today:

Prophet Arena launched for AI prediction markets with real money bets.
Claude Opus 4.1 Thinking topped AI benchmarks.
AI recruiters outperformed humans in hiring study.
MIT found 95% of corporate AI pilots failed to deliver rapid revenue gains, largely because budgets went to the wrong things.

Advertise in The Neuron here

P.S. Check out our latest podcast episode if you haven’t yet! This one’s on when to ask a question vs. use a prompt framework vs. build a project or even build an agent!

Listen and/or watch on… YouTube | Spotify | Apple Podcasts

Can AI predict the future? The answer is… kinda?

Remember when everyone thought AI would replace writers and coders first? Apparently, a venture capitalist recently told OpenAI's Noam Brown that sure, AI could do those things—but making “accurate, calibrated predictions about the future?” That's uniquely human.

Well, the University of Chicago just dropped Prophet Arena to test that theory, and the results are… humbling.

Here's the deal: Prophet Arena throws AI models into live prediction markets—think Kalshi-style bets on everything from elections to sports to crypto prices. The twist? These are real, unresolved events. You can't memorize tomorrow's news (unless you've cracked time travel), which means AI models can't cheat by memorizing test answers like they do on other benchmarks.

The models get fed news articles and market data, then place probabilistic bets. When events resolve—like Toronto FC pulling off that upset win—we see who actually understood the world versus who was just pattern-matching.

The early results are fascinating:

OpenAI's o3-mini is crushing it on returns, turning $1 into $9 on a single MLS bet by spotting value the market missed.
Models have distinct "personalities"—Qwen 3 is aggressive (75% chance of AI regulation), while Llama 4 Maverick plays it safe (35% on the same event).
GPT-5 leads in accuracy but o3-mini makes more money—turns out being right and being profitable aren't the same thing.
DeepSeek R1 went rogue, sometimes betting 0% on everything (which somehow still made money when upsets happened).
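To see how a model can be more accurate and still make less money, here's a minimal Python sketch. Everything in it is an assumption for illustration: the four events, both sets of forecasts, the Brier score as the accuracy measure, and the flat "stake $1 on yes whenever you think the market is too low" rule. It is not Prophet Arena's actual scoring or betting logic.

```python
from dataclasses import dataclass


@dataclass
class Event:
    price: float   # market price of a "yes" contract that pays $1 (hypothetical)
    outcome: int   # 1 if the event happened, else 0


def brier_score(probs, events):
    """Mean squared error between forecasts and outcomes (lower is better)."""
    return sum((p - e.outcome) ** 2 for p, e in zip(probs, events)) / len(events)


def flat_stake_profit(probs, events):
    """Net profit from staking $1 on "yes" whenever the forecast exceeds the price.

    Assumed toy strategy: no fees, no bets on the "no" side.
    """
    profit = 0.0
    for p, e in zip(probs, events):
        if p > e.price:
            # A winning $1 stake returns 1/price; a losing one returns nothing.
            profit += (1.0 / e.price - 1.0) if e.outcome == 1 else -1.0
    return profit


# Hypothetical events: one long-shot upset, one favorite that won, two misses.
events = [Event(0.11, 1), Event(0.80, 1), Event(0.30, 0), Event(0.20, 0)]

# Two made-up forecasters looking at the same four events.
careful = [0.10, 0.95, 0.10, 0.05]  # close to the truth, rarely disagrees with the market
bold = [0.30, 0.60, 0.50, 0.40]     # noisier, but spots the underpriced underdog

for name, probs in [("careful", careful), ("bold", bold)]:
    score = brier_score(probs, events)
    profit = flat_stake_profit(probs, events)
    print(f"{name}: Brier={score:.3f}, profit=${profit:.2f}")
```

In this toy setup the careful forecaster gets the better (lower) Brier score, while the bold one ends up with more profit because a single underpriced long shot pays for its two losing bets.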

Check the leaderboard yourself.

Here's where it gets spicy: In that Toronto FC match, the market gave them an 11% chance to win. o3-mini saw 30%. It bet big on the underdog and walked away with 9x returns when Toronto won.
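The arithmetic behind that 9x is worth a quick sketch. Assume a simple contract that costs its market price and pays $1 if the event happens, and ignore fees (which real markets charge). Then an 11% price implies roughly a 9x gross payout, and a model that believes the true chance is 30% sees a big positive expected profit. The function names below are made up for illustration.

```python
def payout_multiple(price: float) -> float:
    """Gross payout per $1 staked on a contract that costs `price` and pays $1 on a win."""
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    return 1.0 / price


def expected_profit_per_dollar(price: float, believed_prob: float) -> float:
    """Expected net profit per $1 staked, if `believed_prob` is the true win probability.

    Assumption: no fees, and the whole stake is lost when the event does not happen.
    """
    return believed_prob * payout_multiple(price) - 1.0


market_price = 0.11  # the market's implied chance of a Toronto FC win
model_prob = 0.30    # what o3-mini reportedly estimated

multiple = payout_multiple(market_price)
edge = expected_profit_per_dollar(market_price, model_prob)

print(f"payout multiple: {multiple:.2f}x")
print(f"expected profit per $1 staked: ${edge:.2f}")
```

If the model's estimate had matched the market's 11%, the expected profit would be zero. The bet only looks good because the model disagreed with the price.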

The models aren't just randomly guessing either. They're writing detailed rationales, weighing sources differently, and showing genuine reasoning differences—kinda like a room full of analysts who can't agree.

The coolest part is how Prophet Arena sidesteps AI's biggest testing problem: benchmark contamination. Most tests become useless once models train on the answers. But you can't leak tomorrow's game results.

Watch for: Anthropic's models are mysteriously absent from the leaderboard. Meta's Llama 4 Maverick was the only model to predict the Zohran political upset correctly. And models seem way more bullish on certain 2028 presidential candidates than current polls suggest. Someone knows something we don't?

FROM OUR PARTNERS

Innovate Faster with Clutch

Got a project in mind? Finding the right expert doesn’t have to be complicated. On Clutch, all it takes is sharing your project goals — and they’ll handle the rest.

Their matching process is tailored to your specific project, connecting you with vetted AI professionals who have the skills and experience to deliver measurable results.

Whether you’re building, optimizing, or innovating, skip the endless searching and guesswork. With Clutch, your perfect match is just minutes away.

Meet AI experts

Prompt Tip of the Day.

Your AI search results are only as good as your prompts, and most of us are doing it wrong. iPullRank's new multi-part series on AI SEO dropped this gem showing how prompt quality dramatically changes output quality:

Take the Italy travel example: “Italy travel tips” gets you generic tourist advice. But add context like “traveling to Northern vs Southern Italy in July with a toddler, need family-friendly local experiences and public transit access” and suddenly you're getting personalized recommendations with actual trade-offs and justifications.

The framework is simple: Subject + Context + Intent + Constraints.

Start with your subject (what you're asking about), add context (your situation), clarify intent (what you want from the answer), and include constraints (requirements or limitations).

Instead of “best restaurants,” try “authentic local restaurants in Rome's Trastevere neighborhood, walking distance from transit, good for vegetarians, under €30 per person.”

It's the difference between AI giving you Wikipedia summaries versus becoming your personal travel planner.

The more context you provide upfront, the less back-and-forth you need later. Think of prompts as briefing a really smart assistant: the clearer your brief, the better their work!
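If you write a lot of these, the Subject + Context + Intent + Constraints framework is easy to turn into a tiny helper. Here's a minimal Python sketch. The `SearchPrompt` class and its field labels are hypothetical, and the way the Italy example is split into the four parts is our own reading of it, not something from the iPullRank series.

```python
from dataclasses import dataclass, field


@dataclass
class SearchPrompt:
    """Hypothetical container for the four parts of the framework."""
    subject: str                                           # what you're asking about
    context: str = ""                                      # your situation
    intent: str = ""                                       # what you want from the answer
    constraints: list[str] = field(default_factory=list)   # requirements or limits

    def render(self) -> str:
        """Join the parts that were filled in into one labeled prompt."""
        lines = [f"Subject: {self.subject}"]
        if self.context:
            lines.append(f"Context: {self.context}")
        if self.intent:
            lines.append(f"Intent: {self.intent}")
        if self.constraints:
            lines.append("Constraints: " + "; ".join(self.constraints))
        return "\n".join(lines)


# The Italy example, split into the four parts (our interpretation).
prompt = SearchPrompt(
    subject="Italy travel tips",
    context="traveling in July with a toddler",
    intent="compare Northern vs Southern Italy",
    constraints=["family-friendly local experiences", "public transit access"],
)
print(prompt.render())
```

Leaving a field empty just drops that line, so a bare `SearchPrompt(subject="best restaurants")` makes it obvious how much of the brief is still missing.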

Check out all of our prompt tips of the day from August here!

Treats to Try.

*Asterisk = from our partners. Advertise in The Neuron here.

*Grammarly fixes your writing mistakes and suggests better words for emails, proposals, and Slack messages (looks like it just got a design overhaul too: a new document interface featuring an AI assistant, grader, and citation tools).
Paradigm transforms your spreadsheet work by letting AI agents gather, structure, and act on your data from any source.
Qwen Image Edit lets you make precise modifications to any image, including editing text within photos like changing words on signs or documents—try it out on Fal here ($0.03 per edit).
Toon Composer fills in animation frames between your sketches and colors them—draw a few key poses, get a full cartoon.
Eleven Music’s API turns text prompts into custom songs you can use commercially in your own app—try it out via Jinglemaker (here’s ours for The Neuron; related to this, check out Rick Beato’s demo of Suno on CBS here).
TensorZero creates a feedback loop that automatically optimizes your LLM applications using production data and human feedback to make your models smarter, faster, and cheaper (raised $7.3M).

Around the Horn.

OpenAI launched ChatGPT Go in India, a new affordable subscription tier priced at Rs. 399 (~$4.57) that offers 10x higher message limits, image generations, and file uploads compared to the free tier, with payments available through UPI and plans to expand to other countries based on feedback.
Anthropic’s Claude Opus 4.1 Thinking launched at #1 across major AI benchmarks (coding, math, creative writing, and web development), tying with GPT-5-high for top performance.
Researchers conducted a field experiment with 70K job applicants in the Philippines where AI voice agents outperformed human recruiters in hiring customer service representatives, delivering 12% more job offers, 18% more successful job starts, 17% higher one-month retention rates, and reduced gender discrimination while maintaining equal candidate satisfaction (paper).
An MIT study found that 95% of corporate AI pilot programs failed to achieve rapid revenue acceleration, with companies misallocating budgets toward sales and marketing tools instead of the back-office automation that showed the highest ROI (because managers don’t want to automate their own jobs…).
If you want to understand where the AI industry’s at right now, the best interview you can watch is this one with Dylan Patel of SemiAnalysis.

FROM OUR PARTNERS

Download our guide on AI-ready training data.

AI teams need more than big data—they need the right data. This guide breaks down what makes training datasets high-performing: real-world behavior signals, semantic scoring, clustering methods, and licensed assets. Learn to avoid scraped content, balance quality and diversity, and evaluate outputs using human-centric signals for scalable deployment.

Download the guide now