Welcome, humans.

Boris Cherny from the Claude Code team has a useful theory for what tech jobs become once engineering, product, design, and data science start melting into each other. They re-sort into five new archetypes:

The Prototyper who invents ideas.
The Builder who ships them.
The Sweeper who cleans them up.
The Grower who improves product-market fit.
And The Maintainer who keeps mature systems secure, reliable, fast, and efficient.

Now, you might not be an engineer, but you can start to see how this could reshape your role’s hierarchy, too:

In marketing, sales, ops, and basically every other function, the work starts to split by stage: one role prototypes the campaign angle, outbound play, or broken workflow fix; another builds the landing page, sales sequence, or automation; yet another sweeps up the messy lifecycle emails and CRM management; another grows the thing that works; and another maintains the system to keep the lights on so the ice cream doesn’t melt.

And anyone could do any of these roles at any stage of the project, given their skillset or interest and what stage the work is in.

So look at it like this: AI will probably make stage-of-work more important than title or position. The valuable question becomes less “What department are you in?” and more “Are you good at inventing, shipping, cleaning, growing, or maintaining?”

Counterpoint: Kun Chen pushed back that archetypes can become career cages, and that people should change roles as the project changes, adapting (we’re paraphrasing here) the mindset behind their job to match what the project actually requires. Boris agreed.

So you can look at this advice as less “pick your Hogwarts house and stick with it for life” and more “know which hat the project needs right now.” Unfortunately, most of us non-engineers using AI are the sixth archetype: Person Who Asks Claude To Fix The Thing They Asked Claude To Make. I’m not mad about it… I really do like making things with Claude as a partner :)

Here’s what happened in AI today:

😿 Consulting clients are pushing firms toward outcome-based pricing as AI makes hourly work harder to justify.
📰 Ford’s general counsel said in-house legal teams are adopting AI faster than many outside law firms.
📰 Meta advanced Brain2Qwerty, its non-invasive brain-to-text research pipeline.
🍪 Cursor launched an iOS app for launching and reviewing coding agents from your phone.
🎓 Today’s AI Skill shows you how to turn AI-assisted work into outcome-based proposals.

…and a whole lot more that you can read about here.

Hey: Want to reach 700,000+ AI-hungry readers? Advertise with us!

😿 Did AI Kill The Billable Hour?

So the old professional-services bargain was simple: smart people spent hours on hard problems, then billed clients for those hours.

It turns out AI has begun to make that bargain weird… especially for professional consulting firms.

See, the WSJ just reported that consulting firms are trying to move away from hourly billing as AI makes some work faster, cheaper, and harder to meter the old way. Business Insider reported a similar shift: clients increasingly want firms to put “skin in the game” through fees tied to results.

Here’s what happened:

Deloitte reportedly showed consultants a chart suggesting traditional labor-based consulting could shrink sharply as a share of the market by 2035.
AI agents are expected to become a much larger part of professional services.
Firms are testing fixed-fee pricing, where clients pay a set amount for a defined project.
They are also testing outcome-based pricing, where pay depends on agreed results.
McKinsey says more than 30% of its global fees already come from pricing tied directly to client outcomes.

Why this matters: Consulting has a math problem. If AI lets a team finish a 40-hour project in 10 hours, clients will eventually ask why they are still paying for 40. But replacing time is not as simple as “pay for output.” Pay only for time, and the incentive can drift toward delay. Pay only for output, and the incentive can drift toward rushed work.

There’s a darker incentive here, too. If consultants get paid based on “results,” they’ll naturally chase the results that are easiest to measure. And if the goal is saving money, the easiest measurable savings often come from cutting headcount.

That could make AI consulting a weird accelerant for AI layoffs. Big companies already want to prove they’re lean enough to compete with AI-native startups. Outcome-based consulting gives them a clean package: strategy, implementation, and a spreadsheet showing savings.

See, this is bigger than consultants. Billable hours punish fast workers because efficiency reduces the value you can accrue. Salaries can do the same thing in softer form: finish your work early, and you often get more work, more meetings, or suspicion that you’re not working enough. When is enough ever really enough, anyway?

Our take: AI probably kills selling time as a proxy for value over the long run. But the replacement, paying for outcomes, gets messy if it’s not measured correctly.

According to the WSJ, a good outcome model needs four pieces: a stable base, a clear target, quality guardrails, and shared upside. Imagine a company hires a firm to automate customer support. The bad version says, “we’ll pay you for every ticket closed,” which rewards speed and tempts everyone to bury quality. The better version says: we’ll pay a base fee to keep the team accountable, then add upside only if resolution time improves while customer satisfaction, escalation rate, and error rate stay healthy.
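Here is a minimal sketch of that better version in Python, for anyone who thinks better in spreadsheets and code. Every name, threshold, and dollar amount below is our own made-up illustration, not a figure from the WSJ or any firm. The idea it shows: the base fee is always paid, and the upside is paid only if resolution time improves while every quality guardrail stays healthy.

```python
def outcome_fee(base_fee, upside_fee, before, after, guardrails):
    """Return the total fee: base always, upside only for verified improvement.

    All names and thresholds are hypothetical, for illustration only.
    `before` and `after` are dicts of metrics measured pre- and post-project.
    `guardrails` maps a metric name to ("min" or "max", limit).
    """
    # The target: resolution time must actually improve (lower is better).
    improved = after["resolution_hours"] < before["resolution_hours"]

    # The guardrails: every quality metric must stay within its healthy limit.
    healthy = True
    for metric, (kind, limit) in guardrails.items():
        value = after[metric]
        if kind == "min" and value < limit:
            healthy = False
        if kind == "max" and value > limit:
            healthy = False

    return base_fee + (upside_fee if improved and healthy else 0)


guardrails = {
    "csat": ("min", 4.2),              # customer satisfaction must stay high
    "escalation_rate": ("max", 0.10),  # escalations must stay low
    "error_rate": ("max", 0.02),       # errors must stay low
}
before = {"resolution_hours": 12.0}
good = {"resolution_hours": 6.0, "csat": 4.5,
        "escalation_rate": 0.08, "error_rate": 0.01}
rushed = {"resolution_hours": 3.0, "csat": 3.6,
          "escalation_rate": 0.20, "error_rate": 0.05}

print(outcome_fee(50_000, 30_000, before, good, guardrails))    # 80000
print(outcome_fee(50_000, 30_000, before, rushed, guardrails))  # 50000
```

Notice that the "rushed" project is the fastest one and still earns no upside, because speed bought with hidden damage does not count.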

That is the sweet spot: not pay for time, or pay for task volume, but pay for verified improvement without hidden damage.

For consultants, that might mean retainers plus success fees. For employees, it should mean salary plus a real share of the value created: team bonuses, profit-sharing, equity, clearer promotion paths, or protected time back when AI makes the system meaningfully more productive. If someone builds an agent workflow that saves 200 hours a month, the company should not simply absorb the savings and hand them a bigger backlog of tasks to do (spoiler: this is currently what’s happening; but the current alternative is downsizing).

In our opinion, the next pay fight is attribution. Who gets credit when the model, the manager, the employee, and the vendor all helped create the outcome? The answer will not be perfect, but pretending time spent still measures value is not the right mode.

The actionable move: start defining outcomes in bundles, not single metrics. Think cost savings plus increased quality. Speed to deliver plus accuracy of result. Higher revenue per employee plus higher employee retention. Oftentimes, these metrics pull against each other (as they say in the film industry: speed, quality, or money… pick two). But with AI applied appropriately and incentives aligned properly, you might get closer to picking all three.

Without proper alignment of pay and incentives, “AI productivity” becomes a very clean way to say: fewer people, higher quotas, same paycheck.

FROM OUR PARTNERS

Heads up: This is how banking* works now

You've seen what AI can do when it's built into the right system. Mercury Command brings that intelligence to your finances.

Ask for what you need in plain language, and Mercury Command helps execute the work—payments, forecasting, categorization, and invoices—across all of Mercury. Every answer is generated from your Mercury data, with full account context, clear reasoning, and a traceable record. You stay in control and approve every action before it happens.

No dashboards. No exports. No hunting for answers. Just AI-powered financial command.

Try Mercury Command →

*Mercury is a fintech company, not an FDIC-insured bank. Banking services provided through Choice Financial Group and Column N.A., Members FDIC.

🎓 AI Skill of the Day: Use AI To Rewrite Hourly Work As A Fixed-Scope Offer

If AI makes your work faster, your pricing should get sharper. The easiest first move is turning an hourly task into a fixed-scope package.

Try this structure:

Name the business outcome.
Define the deliverable.
Add a success metric.
Set revision limits.
State what AI speeds up and what human judgment still owns.

Copy/paste this:

Rewrite this hourly service as a fixed-scope offer.

Hourly service:
[paste service]

Typical client:
[paste client type]

What AI speeds up:
[paste tasks]

What still requires human judgment:
[paste tasks]

Give me:
1. Offer name
2. One-sentence promise
3. Deliverables
4. Timeline
5. Success metric
6. Revision policy
7. Fixed-fee positioning

The client does not care that your workflow got faster. They care whether the outcome got clearer, cheaper, or less risky.

Total AI beginner? Start here (goes with this video).

Have a specific skill you want to learn? Request it here.

🍪 Treats to Try

*Asterisk = from our partners (only the first one!). Advertise to 700K+ readers here!

*Your AI roadmap needs a test course. The Dell Pro Max with GB10 helps teams experiment before making bigger bets.
Cursor for iOS helps you launch cloud coding agents, control desktop agents, review diffs, and merge PRs from your phone.
ClinePass gives you discounted, key-free access to GLM, Kimi, DeepSeek, MiMo, and other open coding models inside Cline. $9.99/month.
Devin Fusion routes coding-agent work between a frontier model and a smaller sidekick model to reduce costs while preserving review quality.
Draft captures context from meetings, Slack, and GitHub, then injects it into new agent sessions so teams stop re-explaining projects.
Halo connects iPhone apps like mail, calendar, reminders, music, and health into one private assistant for briefings and automations.
OpenClaw, the AI agent tool for emails, calendars, home automation, and more, is now on iOS and Android, and it is free to try.

New from The Neuron: We Turn a Spreadsheet Into a Business App w/ Pave by Quickbase

So we got to test out Pave by Quickbase and used it to turn a mock CRM spreadsheet into a live business app with dashboards, user roles, notifications, and publishing. Our recommendation for a best first use case? Transform the spreadsheet you hate to update into a business dashboard that’s easy to read at a glance. Check our test here.

📰 Around the Horn

Countdown until someone does a cat version in 3… 2… 1…

Claude became generally available in Microsoft Foundry, giving enterprises another way to use Anthropic models on Azure.
Meta shared Brain2Qwerty v2, a non-invasive brain-to-text research pipeline for real-time sentence decoding.
Arena said it reached a $100M annual revenue run rate eight months after launching its real-world AI evaluation product.
Artificial Analysis launched AA-Briefcase, a private benchmark for long-horizon agentic knowledge work across spreadsheets, presentations, and memos.
Nvidia’s CHORD project showed a way to transfer human dexterous manipulation skills to robot policies using contact-focused demonstrations.
Tidal said it will label and restrict monetization for substantially AI-generated music.

FROM OUR PARTNERS

Production-grade infrastructure for conversational AI

Voice AI experiences often break under high concurrency, packet loss, and poor connection. Agora's Conversational AI platform runs on SDRTN® — the same ultra-low latency network carrying 80B+ minutes monthly across 200+ countries. Build AI agents or add voice to any application with fully managed, real-time infrastructure.

Get started with 300 free minutes

🔧 Tuesday Tech Tip: How AI Learned To Pay Attention

SemiAnalysis put together a great thread breaking down the “oral history” of the transformer. ICYMI, the Transformer is the basic design behind modern chatbots. Its big idea is attention: before an AI predicts the next word, it checks which earlier words are most relevant.

Example: in “The trophy didn’t fit in the suitcase because it was too big,” attention helps the model connect “it” to “trophy,” not “suitcase.”
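If you want to see the mechanics, here is a toy sketch of attention in plain Python. Big caveat: real models learn their vectors from data, while the numbers below are hand-picked by us so that the query for “it” points toward “trophy.” The function names are ours too. What it shows is the general recipe: score each earlier word against the query, turn the scores into weights that sum to 1, and see which word gets the most weight.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    dim = len(query)
    # Relevance score of each earlier word: dot product, scaled by sqrt(dim).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
              for key in keys]
    weights = softmax(scores)
    # Output is the weighted blend of the value vectors.
    output = [sum(w * v[i] for w, v in zip(weights, values))
              for i in range(len(values[0]))]
    return weights, output


# Hand-picked toy vectors (assumption: a real model would learn these).
words = ["trophy", "suitcase", "big"]
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values = keys
query_for_it = [2.0, 0.0]

weights, _ = attention(query_for_it, keys, values)
most_relevant = words[weights.index(max(weights))]
print(most_relevant)  # trophy
```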

The original 2017 Transformer used Multi-Head Attention (MHA). A “head” is one way of looking for patterns. One head might track grammar. Another might track names. Another might track long-range references. Multiple heads matter because language has many relationships happening at once.

Then researchers made attention faster and cheaper:

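MQA (Multi-Query Attention): all heads share one set of keys and values, so there are fewer repeated memory lookups and responses run faster.
GQA (Grouped-Query Attention): groups of heads share keys and values, keeping much of MHA’s quality with lower serving cost.
SWA (Sliding Window Attention): looks through a sliding window of nearby text instead of checking every word against every other word.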
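A rough back-of-the-envelope sketch of why these three help, in Python. The model shape below (32 heads, 32 layers, and so on) is a hypothetical we made up for illustration, and the helper names are ours. It counts how many key/value numbers the model has to keep around when heads share them, and how many word-to-word comparisons a sliding window skips.

```python
def kv_cache_values(num_kv_heads, head_dim, seq_len, num_layers):
    """Count stored numbers: one key and one value vector per token,
    per key/value head, per layer (hence the factor of 2)."""
    return 2 * num_kv_heads * head_dim * seq_len * num_layers


# Hypothetical model shape, chosen only to make the ratios easy to read.
HEAD_DIM, SEQ_LEN, LAYERS = 128, 4096, 32

mha = kv_cache_values(32, HEAD_DIM, SEQ_LEN, LAYERS)  # every head has its own
gqa = kv_cache_values(8, HEAD_DIM, SEQ_LEN, LAYERS)   # 4 heads share each set
mqa = kv_cache_values(1, HEAD_DIM, SEQ_LEN, LAYERS)   # all heads share one set

print(mha // gqa)  # 4
print(mha // mqa)  # 32


def full_causal_pairs(seq_len):
    """Each token looks at itself and every earlier token."""
    return seq_len * (seq_len + 1) // 2


def sliding_window_pairs(seq_len, window):
    """Each token looks at itself and at most window - 1 earlier tokens."""
    return sum(min(position + 1, window) for position in range(seq_len))


print(full_causal_pairs(8))        # 36
print(sliding_window_pairs(8, 3))  # 21
```

The gap between the last two numbers widens quickly as the text gets longer, which is the whole appeal of the window.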

Then came FlashAttention, a huge systems breakthrough. It did not change what attention meant. It changed how efficiently GPUs stored and moved the data during training and answering. That made long-context models much more practical.
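To illustrate the “same meaning, different bookkeeping” point, here is a small Python sketch of the rescaling trick that blockwise attention relies on. This is not FlashAttention itself, which is a GPU kernel; it is a plain-Python toy with scalar values, and the function names are ours. It processes the scores one small block at a time, keeping only a running max, a running sum, and a running output, and lands on the same answer as doing everything at once.

```python
import math


def attention_all_at_once(scores, values):
    """Standard softmax-weighted average, computed over the full list."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    return sum(e * v for e, v in zip(exps, values)) / sum(exps)


def attention_blockwise(scores, values, block_size):
    """Same result, but only one block of scores is looked at per step."""
    running_max = float("-inf")
    running_sum = 0.0
    running_out = 0.0
    for start in range(0, len(scores), block_size):
        block_scores = scores[start:start + block_size]
        block_values = values[start:start + block_size]
        new_max = max(running_max, max(block_scores))
        # Rescale earlier totals to the new max (factor is 0.0 on block one).
        scale = math.exp(running_max - new_max)
        running_sum *= scale
        running_out *= scale
        for score, value in zip(block_scores, block_values):
            weight = math.exp(score - new_max)
            running_sum += weight
            running_out += weight * value
        running_max = new_max
    return running_out / running_sum


scores = [0.3, 2.1, -1.0, 4.2, 0.0, 1.7, 3.3]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
same = abs(attention_all_at_once(scores, values)
           - attention_blockwise(scores, values, block_size=2)) < 1e-9
print(same)  # True
```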

DeepSeek later pushed MLA, or Multi-Head Latent Attention. Simple version: it compresses attention information so models can remember useful context with less memory. DeepSeek-V3/R1 helped make MLA a common pattern in open-weight models.

Agents created a new problem: they need to read huge contexts. So labs pushed:

Linear attention: tries to make attention cost grow more gently as context gets longer.
Sparse attention: skips less useful token pairs and focuses compute on the important ones.
Examples include Gated Delta Networks, Qwen 3.5, Kimi Delta Attention, DeepSeek Sparse Attention, MiniMax Sparse Attention, GLM-5.2’s IndexShare, and SWA-GQA hybrids from Cohere and Xiaomi.
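For the curious, here is a bare-bones Python sketch of the linear attention idea. It is the simplest possible version (no softmax, no normalization), written by us for illustration; the real designs listed above add more machinery on top. The point it shows: instead of comparing each new token against every earlier token, the model keeps one fixed-size running summary and updates it once per token. The second function does the same math the slow, compare-everything way so you can check that the two agree.

```python
def linear_attention(queries, keys, values):
    """Unnormalized linear attention with a fixed-size running state."""
    key_dim, value_dim = len(keys[0]), len(values[0])
    # Running sum of outer(key, value); its size never grows with context.
    state = [[0.0] * value_dim for _ in range(key_dim)]
    outputs = []
    for q, k, v in zip(queries, keys, values):
        for i in range(key_dim):
            for j in range(value_dim):
                state[i][j] += k[i] * v[j]
        outputs.append([sum(q[i] * state[i][j] for i in range(key_dim))
                        for j in range(value_dim)])
    return outputs


def pairwise_reference(queries, keys, values):
    """Same result, computed by scoring every token against all earlier ones."""
    outputs = []
    for t, q in enumerate(queries):
        out = [0.0] * len(values[0])
        for s in range(t + 1):
            score = sum(a * b for a, b in zip(q, keys[s]))
            for j in range(len(out)):
                out[j] += score * values[s][j]
        outputs.append(out)
    return outputs


# Toy inputs, chosen only to keep the arithmetic easy to follow.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

fast = linear_attention(queries, keys, values)
slow = pairwise_reference(queries, keys, values)
print(fast == slow)  # True
```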

Finally, running models for millions of users created the KV cache problem. The model stores earlier “key/value” information so it doesn’t reread everything from scratch for every new word. Radix Attention and vLLM’s Paged Attention made that stored memory easier to manage at scale. And that pretty much brings us to where we are today, with much more room to grow and evolve.
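A quick Python sketch of why the KV cache exists. This is a counting exercise we wrote for illustration, not how any serving system is actually implemented, and the function name is ours. It tallies how many times the model has to compute key/value information while generating text, with and without storing the earlier results.

```python
def count_kv_computations(num_tokens, use_cache):
    """Return (total key/value computations, entries held at the end)."""
    cache = []  # stored key/value entries for earlier tokens
    computations = 0
    for step in range(num_tokens):
        if use_cache:
            # Only the newest token needs its key/value computed.
            cache.append(("key/value for token", step))
            computations += 1
        else:
            # Start over: recompute key/value for every token seen so far.
            cache = [("key/value for token", t) for t in range(step + 1)]
            computations += step + 1
    return computations, len(cache)


print(count_kv_computations(100, use_cache=True))   # (100, 100)
print(count_kv_computations(100, use_cache=False))  # (5050, 100)
```

The catch is visible in the second number of each pair: the cache saves work but still grows with every token, and across millions of users that stored memory is exactly what needs managing.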

Now, one historical note: human fact checker and AI’s infamous Reply-Guy-in-Residence Jürgen Schmidhuber pointed out that parts of this idea go back before 2017. His 1991 ULTRA work used a linear Transformer-like setup, with “FROM/TO” playing a role similar to today’s “key/value.” So there’s that.

In sum: Attention is how AI decides what to look at, and it’s evolved a LOT since 2017. Every major upgrade since then has tried to make that mechanism smarter, cheaper, longer-range, or easier to serve to millions of people. Here’s to more innovation!