December might be wrapping up the year, but your AI game doesn’t have to go into hibernation.
This month, we’re stress-testing the latest ChatGPT, Claude, and Gemini prompt tricks to see which ones actually save time, ship better work, and survive real-world chaos—think end-of-year reports, last-minute campaigns, and “can you just fix this quickly?” Slack messages.
This December Prompt Tips of the Day collection is your one-stop cheat sheet for smarter AI prompts: from “meta-prompters” that rewrite your own prompts for better results, to battle-tested frameworks for coding, research, content, and planning in 2025. Every tip comes with copy-paste templates you can drop straight into your favorite model and customize in seconds.
If you’re tired of vague “just be more specific” advice and want concrete AI prompt examples you’ll actually reuse, you’re in the right place. Bookmark this page now—Future You (and your January workload) will be very, very grateful.
Dec 18
A recent Reddit post I stumbled upon claimed to have reverse-engineered every major prompt framework and discovered they're all made from the same 6 building blocks.
The Foundation: What Every Framework Actually Is
User Rajakumar03 mapped 9 frameworks and realized they all use the same core elements, just remixed:
RGCCOV (The baseline everyone starts with):
- Role: Who is the AI (e.g., "You are a finance teacher")
- Goal: What you want (e.g., "Explain compound interest")
- Context: Background info the AI needs
- Constraints: Rules/limits (e.g., "No jargon, use examples")
- Output: Format you want (e.g., "Give me 3 bullet points")
- Verification: How to check quality
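To make the six blocks concrete, here is a minimal Python sketch that treats RGCCOV as a fill-in-the-blanks record. The class name, the `render` method, and the sample context and verification values are our own illustrations, not something from the Reddit post.

```python
from dataclasses import dataclass


@dataclass
class RGCCOVPrompt:
    """One field per RGCCOV building block (hypothetical helper)."""
    role: str
    goal: str
    context: str
    constraints: str
    output: str
    verification: str

    def render(self) -> str:
        # Emit the six blocks as labeled lines, in RGCCOV order.
        parts = [
            ("Role", self.role),
            ("Goal", self.goal),
            ("Context", self.context),
            ("Constraints", self.constraints),
            ("Output", self.output),
            ("Verification", self.verification),
        ]
        return "\n".join(f"{label}: {value}" for label, value in parts)


prompt = RGCCOVPrompt(
    role="You are a finance teacher",
    goal="Explain compound interest",
    context="The reader is new to investing",  # illustrative value
    constraints="No jargon, use examples",
    output="Give me 3 bullet points",
    verification="End with one question that checks my understanding",  # illustrative value
)
print(prompt.render())
```

Paste the printed text into any chat model; leaving a field blank is a quick signal that the prompt is missing one of the six blocks.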
Then he shows how other frameworks are just specialized versions:
CAF (Cognitive Alignment Framework): Controls how the AI thinks
- Depth of reasoning
- Mental models to use (e.g., "Use first-principles thinking")
- Self-critique mechanisms
- You're not telling AI what to do, you're telling it how to operate
MCF (Meta-Control Framework): For high-stakes work, control the process
- Break objectives into steps
- Inject quality checks at each stage
- Anticipate failure modes before answering
- "This is the ceiling of prompting"
HILCS (Human-in-the-Loop Cognitive System): AI explores, humans decide
- AI generates options
- Humans judge, choose, and own the risk
- "No framework replaces responsibility"
QEF (Question Engineering Framework): Fix the question before prompting
- Instead of "What is marketing?" ask "How does marketing influence buying decisions, and where does it fail?"
- Layers: Surface → Mechanism → Constraints → Failure → Leverage
- "Better questions beat better prompts"
OEF (Output Evaluation Framework): Judge AI output systematically
- Signal vs. noise
- Mechanisms present (did it explain how?)
- Constraints respected
- Reusable insights
- "AI improves faster from correction than perfection"
EFF (Energy-Friction Framework): The system you actually use
- Reduce mental load
- Start messy, stop early
- Preserve momentum
- "The best system is the one you actually use"
RAF (Reality-Anchored Framework): Ground AI in the real world
- Use real data, real constraints
- External references
- Outputs as objects, not imagination
- "Stop asking AI to imagine. Ask it to transform reality"
TEOF (Time-Error Optimization Framework): Match rigor to risk
- Low risk: Speed wins (just use RGCCOV)
- Medium risk: Add CAF or MCF
- High risk: Reality checks + humans in the loop
The Problem: Commenters immediately reacted: "This is impossible to remember." One joked it needs an "Organize Output Framework (OOF)."
The Breakthrough: Build a Dispatcher
Then galacticvac nailed the core issue: "Someone needs to build a layer on top that takes a simple idea, maps it to a strategy, and asks questions to fill in the required input."
FreshRadish2957 delivered the solution: Don't memorize frameworks... build an orchestration layer that auto-selects based on intent + risk.
The 3-step system:
- Start with intent + risk: Exploratory vs. decision-bound, low/medium/high consequence
- Auto-select layers: Low risk → basic structure. Medium risk → reasoning control + light evaluation. High risk → process control + evaluation + human verification
- Then assemble the prompt: Only use the frameworks you actually need
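The three steps above can be sketched as a small lookup. This is a rough illustration, not the commenter's code: the function name and dictionary shape are hypothetical, and we assume the layers stack on top of the basic structure. The thread doesn't say how intent changes the selection, so the sketch only validates it and passes it through.

```python
# Layer names follow the thread's wording; the mapping shape is an assumption.
LAYERS_BY_RISK = {
    "low": ["basic structure"],
    "medium": ["basic structure", "reasoning control", "light evaluation"],
    "high": ["basic structure", "process control", "evaluation", "human verification"],
}

INTENTS = ("exploratory", "decision-bound")


def select_layers(intent: str, risk: str) -> dict:
    """Map a stated intent and risk level to the layers a prompt should include."""
    if intent not in INTENTS:
        raise ValueError(f"unknown intent: {intent!r}")
    if risk not in LAYERS_BY_RISK:
        raise ValueError(f"unknown risk level: {risk!r}")
    # Copy the list so callers can modify the result without touching the table.
    return {"intent": intent, "risk": risk, "layers": list(LAYERS_BY_RISK[risk])}


print(select_layers("decision-bound", "high")["layers"])
```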
"The user doesn't need to know which framework to use. They just state intent, and the system decides how much structure is needed."
The Wild Card: Physics-Based Prompting
Meanwhile, commenter isoman drops a completely different approach: "Physics is the best prompt for AI LLM" and shares arifOS, a prompting system based on physical/scientific principles (this didn't make a lot of sense to me, so I'm not going to recommend or link to it, but it's in the original thread if you want to find it).
When someone asks how to use it without coding, isoman delivers the actual implementation:
- Step 1: Download the framework from the repo
- Step 2: Ask Claude to turn it into a skill: "Let Claude turn it into skill.md file"
- Step 3: Use this governed mode prefix for any serious prompt:
"Use governed mode: ask 1–2 clarifying Qs, state assumptions, prefer real constraints, if unsure say UNKNOWN, for high-stakes list risks first."
This forces AI to show uncertainty instead of confidently BSing.
The Meta-Layer: Prompts That Write Better Prompts
OP then dropped 9 meta-prompts in the comments. The most powerful:
- Prompt Failure Debugger: "Here is my prompt: [paste]. Here is the output: [paste]. Analyze why the output failed, which part of the prompt caused the issue, and provide a corrected prompt."
- Prompt Quality Auditor: "Evaluate this prompt: [paste]. Check for: clarity, missing context, ambiguity, risk of generic output. Score each area 1-10. Then rewrite the prompt to fix the weakest areas."
- Reality Anchor Prompt: "Here is [actual data/document]. Analyze trends and suggest improvements. Use real data. Real constraints. Do not imagine or generate hypotheticals; transform what's actually here."
- Question Re-Engineer: "My question is: [paste]. This question limits my answer. Rewrite it to go deeper. Focus on: mechanisms, constraints, failure points, leverage, not surface definitions."
The Complete System That Emerged
What started as "here are 9 frameworks" became a full prompting operating system:
- Layer 1: Risk-Based Dispatcher. Before writing any prompt, answer: "What's at stake here?"
- Low stakes (brainstorming) → Use simple RGCCOV structure
- Medium stakes (work deliverables) → Add CAF for reasoning control
- High stakes (legal/financial) → Layer in MCF process control + verification + human review
- Layer 2: Governed Mode Prefix (for high-stakes only)
- "Ask 1-2 clarifying questions, state assumptions, prefer real constraints, if unsure say UNKNOWN, for high-stakes list risks first."
- Layer 3: Reality Anchoring. Give AI actual data, documents, constraints—not hypotheticals. Transform reality, don't imagine it.
- Layer 4: Meta-Prompts for Debugging. When prompts fail, use the Failure Debugger or Quality Auditor to fix them systematically.
- Layer 5: Turn It Into a Claude Skill. You can literally ask Claude: "Turn this prompting framework into a skill.md file for me." Then Claude will package it as a reusable skill you can activate anytime.
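If you'd rather wire Layers 2 and 3 into a script than remember them, something like this sketch would do it. The function name, the section labels, and the "use only what is here" wording are our assumptions; the governed-mode text is the prefix quoted above.

```python
GOVERNED_PREFIX = (
    "Use governed mode: ask 1-2 clarifying questions, state assumptions, "
    "prefer real constraints, if unsure say UNKNOWN, "
    "for high-stakes list risks first."
)


def assemble_prompt(task: str, stakes: str, source_material: str = "") -> str:
    """Build the final prompt text for a given stakes level (hypothetical helper)."""
    if stakes not in ("low", "medium", "high"):
        raise ValueError(f"unknown stakes level: {stakes!r}")
    sections = []
    if stakes == "high":
        # Layer 2: the governed prefix is reserved for high-stakes prompts.
        sections.append(GOVERNED_PREFIX)
    if source_material:
        # Layer 3: anchor the model in real material instead of hypotheticals.
        sections.append("Source material (use only what is here):\n" + source_material)
    sections.append("Task: " + task)
    return "\n\n".join(sections)


print(assemble_prompt("Summarize the renewal terms", "high", "Contract text goes here"))
```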
Dec 17
Often, AI “explanations” feel like drinking from a firehose… while the hose is also quoting Wikipedia at you.
This Richard Feynman-inspired framework forces the model to do what great teachers do: explain simply, find your gaps, then iterate until you can teach it back. The Reddit version came from u/EQ4C in r/PromptEngineering.
Our advice is to save the prompt in a ChatGPT project and use it as your guided learning system moving forward. That’s what we did!
P.S.: If you don’t know who Richard Feynman is, go watch this and wish he could teach you everything.
Dec 16
Tired of juggling docs, Slack threads, and random context across 12 tools? Product manager Amir Klein shared a 3-step system with Lenny’s newsletter for building your “second brain” using ChatGPT’s “Projects” feature:
- Create its personality using ChatGPT to write custom instructions for the exact thought partner you need.
- Feed it everything: PRDs, decks, Excel sheets, Slack channels exported as PDFs (everything is text!).
- Let it cook on sign-up forms, strategy docs, prototypes, roadmaps, whatever needs doing.
Klein's Project now holds hundreds of files and handles tasks that used to drain his mental energy.
TL;DR: This isn't about outsourcing judgment. It's about clearing mental overhead so you can focus on what actually matters: your reasoning, creativity, and decision-making.
Dec 15
Brian Roemmele just open-sourced a Grok prompt that bypasses AI's consensus bias. His “Deep Truth Mode” uses an 8-step forensic protocol that simultaneously steel-mans the mainstream position, the suppressed position, AND hybrid hypotheses—then red-teams all three to see what survives.
It only uses primary sources (patents, leaked documents, raw datasets, sworn testimony) and explicitly rejects fact-checker articles as evidence. The output includes a probability distribution on which hypothesis has the strongest explanatory power and flags any evidence of active suppression.
TL;DR: It's designed to make AI question everything, including its own training data. And Brian says CS classes are now using it to teach language model limitations; one group even turned it into a system prompt for an open-source model with “better benchmarks across all testing.” Students say it makes Grok their best-performing model. Worth testing on controversial topics where you suspect the “official story” might be incomplete. Here’s a Google Docs version you can copy.
He has two more prompts for this here (which forces primary-source reasoning in Grok) and here (a training algorithm rewarding pre-1970 primary data) as well.
Dec 14
This Reddit thread has some solid prompt engineering tips, especially around using shorthand tokens to structure prompts more efficiently.
What's useful here:
- The concept of prompt shortcuts/tokens - Using abbreviations like ELI5, TL;DR, STEP-BY-STEP, CHECKLIST as quick commands
- The stacking technique - Combining multiple tokens with pipes: "SIMPLIFY | HUMANIZE | FORMAT AS: Bullet points"
- Reliable structural prompts - The foundational ones (ELI5, OUTLINE, FRAMEWORK) and analytical ones (SWOT, PRE-MORTEM, COMPARE)
- The critical evaluation in the comments - SwissDadMeister's breakdown showing ~40% work well, ~40% are cosmetic relabels, ~20% are illusory/marketing fluff
What's NOT useful:
- The "experimental tokens" section (THOUGHT_WIPE, ZERO-IMPRINT, etc.) - these are basically fiction.
- Overhyped claims about "secret tricks" and "power commands."
- Some of the "cognitive simulation" modes that don't actually work as advertised.
The real insight = Structure beats symbolism. The tokens that work are ones that describe OUTPUT FORMAT (tables, lists, steps) and ANALYTICAL FRAMEWORKS (SWOT, compare), not ones claiming to control AI's internal "thinking mode."
Here are the useful shortcuts:
- For output structures:
- ELI5 (Explain Like I'm Five) - Simplifies complex topics into plain language
- TL;DR (Too Long; Didn't Read) - Condenses lengthy content into quick summaries.
- STEP-BY-STEP - Breaks down tasks into clear, sequential instructions.
- CHECKLIST - Creates actionable item lists from your prompt.
- OUTLINE - Builds structured hierarchies for any topic.
- EXEC SUMMARY - Generates high-level executive summaries.
- TEMPLATE - Creates reusable formats for repeated tasks.
- Tone and style modifiers:
- SIMPLIFY - Reduces complexity without losing meaning.
- HUMANIZE - Writes in conversational, natural tone.
- JARGON - Makes text sound professional or technical.
- AUDIENCE: [Type] - Customizes output for specific readers (e.g., "AUDIENCE: Teenagers").
- TONE: [Style] - Sets emotional tone (casual, formal, humorous).
- AMPLIFY - Makes content more engaging and energetic.
- Analytical frameworks (method-based prompts that perform consistently):
- SWOT - Generates Strengths, Weaknesses, Opportunities, Threats analysis.
- PRE-MORTEM - Predicts potential failures before they happen.
- COMPARE - Systematically compares two or more items.
- ROOT CAUSE - Identifies underlying problems beyond surface symptoms.
- RISK MATRIX - Evaluates risks systematically by likelihood and impact.
- IMPACT ANALYSIS - Assesses consequences of decisions.
- FIRST PRINCIPLES - Breaks problems down to fundamental truths.
- DEVIL'S ADVOCATE - Challenges ideas with counterarguments.
- DIALECTIC - Simulates back-and-forth debate on a topic.
- FEYNMAN TECHNIQUE - Explains topics simply to ensure deep understanding.
- Role-based prompting:
- ACT AS: [Role] - Makes AI take on professional persona (e.g., "ACT AS: Career Coach").
- ROLE: TASK: FORMAT: - Gives AI structured job with clear deliverable.
- MULTI-PERSPECTIVE - Provides multiple viewpoints on a topic.
- CONSULTANT - Frames AI as strategic business advisor.
- Workflow sequencing:
- DRAFT | REVIEW | PUBLISH - Simulates content development process.
- ITERATE - Improves output through multiple versions.
- RAPID PROTOTYPE - Quick concept development.
- BATCH PROCESS - Handles multiple similar tasks efficiently.
- Simulation prompts:
- MULTI-AGENT SIMULATION - Creates conversation between different roles.
- SCENARIO PLANNING - Explores multiple future possibilities.
- STRESS TEST - Tests ideas under extreme conditions.
- The stacking technique: Combine commands with pipes like "SIMPLIFY | HUMANIZE | FORMAT AS: Bullet points" or "ACT AS: Project Manager | SWOT | FORMAT AS: Table".
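Stacking is just string joining, so it is easy to script if you reuse the same combos. A minimal sketch, with hypothetical function names:

```python
def stack(*commands: str) -> str:
    """Join shortcut commands with pipes, skipping empty entries."""
    cleaned = [c.strip() for c in commands if c.strip()]
    if not cleaned:
        raise ValueError("at least one command is required")
    return " | ".join(cleaned)


def stacked_prompt(text: str, *commands: str) -> str:
    """Put the stacked commands on the first line, then the text to work on."""
    return f"{stack(*commands)}\n\n{text}"


print(stack("SIMPLIFY", "HUMANIZE", "FORMAT AS: Bullet points"))
# SIMPLIFY | HUMANIZE | FORMAT AS: Bullet points
```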
And as for more specifically what doesn't work:
- "Experimental tokens" (these are marketing language, not real controls):
- THOUGHT_WIPE, ZERO-IMPRINT, SHADOW_PRO, TRIGGER_CHAIN, TOKEN_MASKING, ZERO-KNOWLEDGE, QUANT_CHAIN - Cannot control AI's internal processes.
- Illusory "thinking mode" commands:
- CHAIN OF THOUGHT - Systems no longer reveal internal reasoning reliably.
- DELIBERATE THINKING - Too vague; no real internal control.
- METACOGNITIVE - Only affects style, not actual meta-thinking.
- NO AUTOPILOT - Doesn't actually prevent default responses.
- REFLECTIVE MODE - No internal state is altered.
- Problematic quality control claims:
- FORCE TRACE - Cannot make internal reasoning traceable.
- GUARDRAIL - Safety systems override user instructions anyway.
- EVAL-SELF - Qualitatively helpful but not quantitatively reliable.
- SYSTEMATIC BIAS CHECK - Superficial unless paired with explicit framing.
- Overhyped role modifiers:
- EXPERT MODE - AI can't truly "switch" skill levels, only approximate style.
Dec 9
Unfortunately, I (Grant) don’t have any grandparents left. If I did, the #1 thing I’d get them for Christmas this year is “use AI to bring to life old photos of their life”:
Longer video, w/ music from TikTok… but also, ppl on Reddit say it reminds them of this SNL bit.
There are actually lots of tools to help you do this. To name a few:
- Google Veo
- Midjourney
- Grok (has “really good video gen from source image” right now).
- Kling AI
- Runway
- Filmora
- And Topaz Labs actually has a managed service for this called Mosaic.
If you want to DIY it, here is a YouTube video featuring ComfyUI, so you can do it locally on your computer! (As for how to set that up, watch this).
All these services let you animate still photos into 6-10 second clips with a feature called “image to video.” Most work with a simple “upload photo + add prompt” workflow.
The prompting secret: Less is more. Keep movements subtle: think “gentle smile and eyes shifting focus” rather than “moving dramatically.” Wind, hair billowing, and complex actions often look uncanny. Focus on:
- Small facial movements (soft smiles, blinking)
- Background elements (leaves rustling, clouds drifting)
- Light and shadow shifts (sunset casting longer shadows)
Pro tip: Solo shots work best. When animating groups, request minimal movements only, like “subtly adjusting posture while talking,” to avoid the AI morphing faces.
Our favorite insight: The most touching animations aren't the most dramatic ones. A grandmother's gentle smile or a grandfather's eyes slowly looking around a room hits harder than elaborate motion—because they feel real.
P.S.: We want to start sharing longer-form prompt advice, but we have limited space in the newsletter. So we’re bringing back the month’s prompt tip of the day archives, this time with longer-form pieces you’ll only find on the digest.
Dec 8
A Redditor just dropped a killer prompt thread that reads like a mini playbook for talking to AI better, not by making prompts longer, but by treating them more like tiny programs. The full post is here if you want to dive in.
Here are the best ideas, distilled:
- Flip the prompt order (context → task → rules).
Most of us do this backwards. Instead of:
“Write a blog post about X. Here’s some context: …”
Try:
“Here’s the context: …
Your task: Write a blog post about X…
Constraints: 800 words, no buzzwords, bullet summary at the end.”
Leading with context makes the model actually use it, instead of treating it like an afterthought.
- Tell the model what to care about, not just what to do.
Tiny phrasing changes can completely change the answer:
- “List ideas” → generic list.
- “List ideas in order of importance for a solo founder with no marketing team” → prioritized, focused list.
You’re not just formatting; you’re setting the model’s internal ranking system.
- Remember: models copy patterns, not rules.
Typing “be concise” 5 times doesn’t work as well as writing a concise prompt.
- If you want tight, structured output, make your prompt tight and structured.
- Use short sentences, clear headings, and the format you want mirrored.
- Use roles + constraints to spark better ideas.
Constraints aren’t handcuffs; they’re rails:
- “Act like a senior technical editor who only trusts verifiable sources.”
- “Give me 3 options max, each under 80 words.”
Those limits often produce fresher, more usable results than “be creative” ever will.
- Add “continuity hacks” and correction loops.
Don’t just say “be concise” again; remind it of the logic:
- “Follow the same reasoning pattern you used earlier, but apply it to this new topic.”
Then chain a second pass:
- “Now critique your answer for missing context, unjustified assumptions, or fluff, and rewrite it with those fixes applied.”
- Force a question pass before it answers.
One commenter shared a great closer:
“Before you start the task, review all inputs and ask me any questions you need to improve the chances of successfully producing the output I am looking for. Number the questions and, if possible, make them yes/no so I can answer quickly.”
This turns the model into a collaborator instead of a mind reader.
- Keep a compact “logic stack” under ~120 words.
The biggest skill is compression:
- Context
- Task
- Constraints
- (Optional) Tone / Example
A RODE-style structure (Role, Objectives, Description, Example) works surprisingly well. Short, structured prompts tend to survive across models and modes.
- Treat prompts like code, not quotes.
Instead of copy/pasting other people’s mega-prompts:
- Build 2–3 reusable “logic stacks” that fit how you work.
- Iterate on them over time.
Even a simple 3-layer prompt you understand deeply will beat a 2-page Franken-prompt from someone else.
Try this today: take one of your “big” prompts, and rewrite it using (1) context → task → constraints, (2) a clear role, and (3) a question pass at the end. Then save that version as your new default template.
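Since the thread's pitch is to treat prompts like tiny programs, here is one way that rewrite could look as an actual function. It's a sketch under our own assumptions: the function name is made up, the role goes first, and the question pass counts toward the ~120-word budget.

```python
QUESTION_PASS = (
    "Before you start the task, review all inputs and ask me any questions "
    "you need to improve the chances of successfully producing the output I am "
    "looking for. Number the questions and, if possible, make them yes/no so I "
    "can answer quickly."
)


def logic_stack(role: str, context: str, task: str, constraints: str,
                max_words: int = 120) -> str:
    """Assemble role, context, task, constraints, then the question pass."""
    prompt = "\n".join([
        f"Role: {role}",
        f"Context: {context}",      # context leads, ahead of the task
        f"Task: {task}",
        f"Constraints: {constraints}",
        QUESTION_PASS,               # question pass goes at the end
    ])
    word_count = len(prompt.split())
    if word_count > max_words:
        # Enforce the compression rule instead of silently sending a long prompt.
        raise ValueError(f"prompt is {word_count} words; limit is {max_words}")
    return prompt


print(logic_stack(
    role="Senior technical editor",
    context="Launch notes for a budgeting app",
    task="Write a blog post about the launch",
    constraints="800 words, no buzzwords, bullet summary at the end",
))
```

The word limit is a forcing function: if the call raises, trim the context rather than raising the limit.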
Dec 7
This is a long one, but we hope it’s demonstrative: Have you ever tried giving AI strategic, surgical writing notes? I have a few tricks that could help you:
First, map out what you want the final product to look like, like so.
Here’s an example from an Intelligent Insight down below. I had this version:
Huy Rock’s recent write-up is a concrete playbook for turning a generic coder model into a domain-specific diagram engine: synthesize DSL data with other LLMs, filter it with the official compiler, run a small LoRA, and you can get a cheap 7B model reliably emitting a niche language like Pintora.
Okay, but dense, and too technical. So I asked the AI the following:
“Explain DSL, LLMs, LoRa, 7B, Pintora, all in parentheticals in the context. SO like this, but with the actual definitions as short as possible: ‘synthesize DSL data (for xyz) with other LLMs (language model, like ChatGPT) LoRa (what is a Lora?) 7B (meaning xyz), Pintora (which is for what).’”
The AI then gave me this back:
Huy Rock's recent write-up is a concrete playbook for turning a generic coder model into a domain-specific diagram engine: synthesize DSL data (domain-specific language for a specialized task) with other LLMs (large language models like ChatGPT), filter it with the official compiler, run a small LoRA (efficient fine-tuning technique that trains only a subset of parameters), and you can get a cheap 7B model (7 billion parameter model, indicating smaller size/cost) reliably emitting a niche language like Pintora (text-based diagram markup language).
But then I wanted it to write it like The Neuron. So I said this:
Great, now capture the same specificity to technical details with 100% fidelity, but write it in the neuron's style, make the definitions readable and easy to follow for non technical people while still accurate, written as shortly as possible to still make sense.
And the AI gave me this, which is what I ultimately used:
Huy Rock just dropped a playbook for turning a generic coding AI into a diagram specialist. Here's the recipe: generate training examples in a domain-specific language (think: specialized coding syntax for one narrow task, like drawing flowcharts), use bigger AI models to create those examples, filter out the junk with the official compiler (software that checks if code actually works), then fine-tune using LoRA (a training shortcut that tweaks just a small slice of the model's "brain" instead of everything).
The result? A tiny 7-billion-parameter model (that's small and cheap by AI standards—ChatGPT is 100x bigger) that reliably spits out Pintora code (a text-based language for creating diagrams, like turning "box A connects to box B" into actual visuals).
What did I do?
- I gave it very specific feedback
- I showed it exactly how I wanted the feedback implemented in a visual example
- I adjusted the tone of the writing once all the information I wanted included was laid out and in the proper flow that I wanted.
Now let’s say you like what you have, but you want to make incremental edits on top of it. Try something like this:
“Without changing anything, add [content to add] to [where you want it / throughout], bolding all the new content so I can see what’s new. Remember, don’t delete or edit anything we already wrote, just add net new content.”
You’ll probably want to tell it to open a Canvas (ChatGPT) / Artifact (Claude) and copy the current version in there so it’s not rewriting the whole thing from scratch every time either.
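Models don't always honor "don't delete or edit anything," so it can help to check. Below is a minimal sketch of an additions-only check: it confirms every original word still appears, in order, in the revision. The function name is hypothetical, and it assumes the new content is bolded with Markdown-style `**` markers.

```python
def only_added(original: str, revised: str) -> bool:
    """True if every original word survives, in order, in the revision."""
    original_words = original.split()
    # Drop bold markers so "**new**" text does not disturb the matching.
    revised_words = iter(revised.replace("**", "").split())
    # `word in iterator` consumes the iterator up to the match, which makes
    # this an in-order subsequence check.
    return all(word in revised_words for word in original_words)


draft = "The cat sat."
print(only_added(draft, "The **small** cat sat."))  # True
print(only_added(draft, "The dog sat."))            # False
```

The check is word-level, so an addition that attaches new punctuation to an existing word (say, "sat." becoming "sat,") will be flagged as a change.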
This trick works well when I have something I mostly like. I also tell the AI to “hyperlink any and all links directly in the body of the article where contextually relevant.” If it gets snippy with that, I…
- Tell it to make a list of all the links it used in its research in plain text.
- Then tell it to write the content as html code, hyperlinking all the provided links in the body of the text where relevant.
- Then I tell it to run the code in preview so I can “copy and paste it into my word processor.”
Kinda funny I have to build an html word processor from scratch to get it to actually edit and format content the way I want… you’d think this issue would be solved by now.
Oh, and one last thing. Do you want your writing to pop off the page and catch ppl’s attention? Try this one: “Make it concrete (show, don’t tell).” This simple hack tells the AI to make descriptions more visual, use more image words, and generally write better.
Dec 6
Want Gemini to handle complex tasks without dropping the ball? Google just released a system instruction that boosts Gemini 3 Pro's performance on multi-step workflows by 5%.
Philipp Schmid from Google DeepMind shared the new template, which basically teaches Gemini to think like a senior consultant: plan before acting, assess risks, stay persistent when things go wrong, and don't give up easily.
The secret? Instead of just telling Gemini what to do, you're telling it how to think. The template includes 9 reasoning dimensions, from "logical dependencies" (do things in the right order) to "persistence and patience" (keep trying different approaches if the first one fails).
You can copy the full template from Google's docs and adapt it for your specific use case.
Our favorite insight: The template explicitly tells Gemini to "inhibit your response" until all reasoning is complete. It's like teaching the AI to count to ten before speaking—simple but surprisingly effective for complex tasks.
Google just dropped a system instruction template that makes Gemini 3 Pro 5% better at complex, multi-step tasks.
The trick? Tell the AI how to think, not just what to do. The template makes Gemini:
- Plan before acting (break tasks into steps)
- Assess risks (will this action cause problems later?)
- Stay persistent (try different approaches when stuck)
- Think deeply about root causes (don't just accept obvious answers)
Copy the full template here and customize it for your workflow.
Dec 5
GPT-5.1-Codex-Max autonomously codes for hours on your hardest tasks, using 30% fewer tokens than the previous version while matching its performance on coding benchmarks.
OpenAI just dropped their GPT-5.1-Codex-Max prompting guide, and buried in 50+ pages of technical documentation are some genuinely brilliant prompting principles that work with ANY AI tool—not just their API.
The guide reveals how OpenAI trains their most advanced coding agents, and these techniques translate directly to your everyday ChatGPT and Claude usage. Here are the key insights:
Bias toward action over planning. Instead of asking AI to "create a plan" or "outline your approach," instruct it to complete the entire task in one go. The guide explicitly recommends removing any prompts asking the model to communicate plans or preambles—they cause AI to stop abruptly before finishing.
Try this: "Create a complete [document/analysis/report] with all sections finished—don't just give me an outline" instead of "Give me a plan for how to approach this."
Batch everything, parallelize when possible. The guide emphasizes reading multiple files simultaneously rather than sequentially. For everyday users, this means: ask for multiple things in a single prompt rather than going back and forth.
Try this: "Analyze these three documents together and identify common themes, key differences, and actionable recommendations" instead of analyzing them one-by-one across multiple prompts.
Persist until complete. The strongest directive in the guide: "Persist until the task is fully handled end-to-end within the current turn whenever feasible: do not stop at analysis or partial fixes."
Try this: Add "Complete this fully—deliver finished work, not just recommendations or next steps" to your prompts when you want comprehensive output.
Quality over speed, always. The guide instructs models to "act as a discerning engineer: optimize for correctness, clarity, and reliability over speed; avoid risky shortcuts, speculative changes, and messy hacks just to get the code to work."
Try this: "Take your time and prioritize accuracy over speed—avoid shortcuts" when working on important documents or analysis.
Be ruthlessly specific about tools and approach. One fascinating insight: the guide tells the model exactly which tool to use for each task type ("use rg for search, not grep"). You can do the same by being explicit about format, tone, structure, and methodology.
Try this: "Write this in the style of a McKinsey report with executive summary, three main sections, and data-driven recommendations" instead of just "Write a business report."
Our favorite insight: The guide's section on frontend design warns against collapsing into "AI slop"—those generic, safe layouts that all look the same. The instruction? "Aim for interfaces that feel intentional, bold, and a bit surprising." This applies beyond design: whether you're creating presentations, reports, or any content, explicitly tell AI to avoid generic templates and make bold, distinctive choices. Try: "Create something distinctive and intentional—avoid generic corporate templates or safe, average-looking layouts."
Check out the full guide on GitHub for the complete technical documentation.
More details on GPT-5.1-Codex-Max and how to prompt it, from OpenAI's prompt guide.
Model Overview & Key Improvements
- GPT-5.1-Codex-Max is OpenAI's best agentic coding model
- 30% more token efficient than GPT-5.1-Codex while matching SWE-Bench Verified performance
- "Medium" reasoning effort recommended for interactive coding (balances intelligence and speed)
- "High" or "xhigh" reasoning effort for hardest tasks
- Works autonomously for hours on complex tasks
- Significantly improved PowerShell and Windows environment support
- First-class compaction support enables multi-hour reasoning without context limits and longer continuous conversations
Migration & Setup
- Reference implementation: fully open-source codex-cli agent on GitHub
- Start with standard Codex-Max prompt as base, make tactical additions
- Remove ALL prompting for upfront plans, preambles, or status updates during rollout (causes model to stop abruptly)
- Critical prompt sections: autonomy and persistence, codebase exploration, tool use, frontend quality
Prompting Best Practices
Autonomy & Persistence:
- Model should act as autonomous senior engineer
- Persist until task is fully handled end-to-end in current turn
- Default to implementing with reasonable assumptions
- Avoid excessive looping—stop and summarize if re-reading/re-editing same files without progress
Code Implementation:
- Optimize for correctness, clarity, reliability over speed
- Follow existing codebase conventions
- Ensure comprehensive coverage across all relevant surfaces
- Preserve intended behavior and UX
- No broad try/catch blocks or silent defaults
- Propagate or surface errors explicitly
- Batch logical edits together instead of repeated micro-edits
- Keep type safety—avoid unnecessary casts
- Search for prior art before adding new helpers (DRY principle)
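The error-handling bullets are easiest to see in code. This is our own illustration of "no broad try/catch or silent defaults" in Python (the guide is not Python-specific, and `parse_port` is a made-up example):

```python
def parse_port(raw: str) -> int:
    """Parse a TCP port, surfacing bad input instead of hiding it."""
    try:
        port = int(raw)
    except ValueError as exc:
        # Narrow except: catch only the conversion failure and re-raise with
        # context. The pattern to avoid is `except Exception: return 8080`.
        raise ValueError(f"port must be an integer, got {raw!r}") from exc
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


print(parse_port("8080"))  # 8080
```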
Editing Constraints:
- Default to ASCII when editing/creating files
- Use apply_patch for single file edits
- Don't use apply_patch for auto-generated changes or when scripting is more efficient
- NEVER revert existing changes unless explicitly requested (user may be in dirty git worktree)
- Don't amend commits unless explicitly requested
- NEVER use destructive commands like git reset --hard unless specifically approved
Exploration & File Reading:
- Think first: decide ALL files/resources needed before any tool call
- Batch everything: read multiple files together
- Use multi_tool_use.parallel to parallelize tool calls
- Only make sequential calls if you truly cannot know next file without seeing result
- Workflow: (a) plan all needed reads → (b) issue one parallel batch → (c) analyze results → (d) repeat if new unpredictable reads arise
- Never read files one-by-one unless logically unavoidable
- Prefer rg or rg --files over alternatives like grep (much faster)
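The "plan all reads, then issue one batch" workflow has a plain-Python analogue. To be clear, this is not how `multi_tool_use.parallel` works internally (the guide doesn't say); it's a sketch of the same idea using a thread pool, with a hypothetical `read_batch` helper.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_batch(paths: list[str]) -> dict[str, str]:
    """Read every path in one parallel batch and return {path: text}."""
    if not paths:
        return {}
    with ThreadPoolExecutor(max_workers=min(8, len(paths))) as pool:
        # map() keeps results in input order; a read error surfaces here
        # rather than being swallowed.
        texts = pool.map(lambda p: Path(p).read_text(encoding="utf-8"), paths)
        return dict(zip(paths, texts))
```

Decide the full list of paths first, call `read_batch` once, then analyze; go back for a second batch only if the first one reveals files you couldn't have predicted.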
Planning Tool Usage:
- Skip planning tool for straightforward tasks (easiest 25%)
- Don't make single-step plans
- Update plan after performing each sub-task
- Never end interaction with only a plan—deliverable is working code
- Reconcile every previously stated intention/TODO/plan before finishing
- Mark each as Done, Blocked (with reason and question), or Cancelled
- Don't end with in_progress/pending items
- Avoid committing to tests/refactors unless doing them now—label as optional "Next steps"
- Only update plan tool, don't message user mid-turn about plans
Frontend Tasks:
- Avoid "AI slop" or safe, average-looking layouts
- Use expressive, purposeful fonts (avoid Inter, Roboto, Arial, system defaults)
- Choose clear visual direction with defined CSS variables
- No purple-on-white defaults, no purple bias or dark mode bias
- Use meaningful animations (page-load, staggered reveals) not generic micro-motions
- Don't rely on flat single-color backgrounds—use gradients, shapes, subtle patterns
- Vary themes, type families, visual languages across outputs
- Ensure page loads properly on both desktop and mobile
- Finish website/app to completion within scope—should be in working state
- Exception: preserve established patterns when working within existing design system
Presenting Work & Final Messages:
- Be very concise with friendly coding teammate tone
- Use natural language with high-level headings
- Skip heavy formatting for simple confirmations
- Don't dump large files—reference paths only
- No "save/copy this file" instructions (user is on same machine)
- Lead with quick explanation of change, then details on where and why
- Suggest natural next steps briefly at end if applicable
- Use numeric lists for multiple options so user can respond with single number
- Relay important command output details in answer or summarize key lines
Mid-Rollout Updates
- Codex uses reasoning summaries to communicate user updates while working
- Can be one-liner headings or heading + short body
- Done by separate model, not promptable
- Don't add instructions to prompt about intermediate plans or messages
- Summaries improved to be more communicative about what's happening and why
AGENTS.md Files
- Codex-cli automatically enumerates these files and injects into conversation
- Model trained to closely adhere to these instructions
- Files pulled from ~/.codex plus each directory from repo root to CWD
- Merged in order, later directories override earlier ones
- Each appears as user-role message starting with "# AGENTS.md instructions for"
- Injected near top of conversation history before user prompt
- Order: global instructions first, then repo root, then deeper directories
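The lookup order described above can be sketched in a few lines of Python. This is not Codex-cli's actual code, only an illustration of the ordering. The function name `agents_md_search_order` is hypothetical, and the sketch lists candidate paths without checking whether the files exist.

```python
from pathlib import PurePosixPath


def agents_md_search_order(repo_root: str, cwd: str, home: str) -> list[str]:
    """Return candidate AGENTS.md paths: global first, then repo root down to cwd."""
    root = PurePosixPath(repo_root)
    current = PurePosixPath(cwd)
    # relative_to raises ValueError if cwd is not inside the repo root.
    relative = current.relative_to(root)

    # Global instructions come first, then the repo root.
    directories = [PurePosixPath(home) / ".codex", root]
    # Walk one directory deeper at a time until we reach cwd.
    for part in relative.parts:
        directories.append(directories[-1] / part)

    return [str(directory / "AGENTS.md") for directory in directories]
```

Because later entries override earlier ones, the file closest to your working directory gets the final say, which is why directory-specific rules belong in the deepest AGENTS.md.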
Compaction
- Available via Responses API
- Unlocks longer effective context windows
- User conversations persist for many turns without hitting limits
- Agents can perform very long trajectories exceeding typical context window
- Invoke /compact when context window grows large
- Context window sent to /compact must fit within model's context window
- Endpoint is ZDR compatible, returns "encrypted_content" item
- Pass compacted list of conversation items to future /responses calls
- Model retains key prior state with fewer conversation tokens
Tools Implementation
Apply_patch (Strongly Recommended):
- Use exact apply_patch implementation (model trained to excel at this diff format)
- Available as first-class implementation in Responses API
- Alternative: freeform tool implementation with context-free grammar
- Both implementations demonstrated in documentation
- Don't use for auto-generated changes or when scripting is more efficient
Shell_command:
- Default shell tool recommended
- Better performance with command type "string" rather than list of commands
- Always set workdir param
- Don't use cd unless absolutely necessary
- For Windows PowerShell: update tool description to specify PowerShell invocation
- Use exec_command for streaming output, REPLs, interactive sessions
- Use write_stdin to feed extra keystrokes for existing exec_command session
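If you are wiring up your own shell tool, a definition along these lines follows the two main recommendations above: the command is a single string, and workdir is always supplied. Treat it as a sketch under assumptions rather than the official schema. The descriptions, the `timeout_s` argument, and the `run_shell_command` helper are illustrative choices, not taken from the guide.

```python
import subprocess

# Function-tool definition in the flat shape the Responses API uses for tools.
shell_command_tool = {
    "type": "function",
    "name": "shell_command",
    "description": "Run a shell command and return its exit code and output.",
    "parameters": {
        "type": "object",
        "properties": {
            # One string, not a list of argv parts.
            "command": {"type": "string", "description": "Command line to run."},
            # Required here so the model never has to fall back on cd.
            "workdir": {"type": "string", "description": "Directory to run in."},
        },
        "required": ["command", "workdir"],
    },
}


def run_shell_command(command: str, workdir: str, timeout_s: float = 30.0) -> str:
    """Execute the command in workdir and return a plain-text result."""
    completed = subprocess.run(
        command,
        shell=True,  # needed because the command arrives as one string
        cwd=workdir,
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return f"exit_code={completed.returncode}\n{completed.stdout}{completed.stderr}"
```

Running model-written commands with `shell=True` is only reasonable inside a sandbox or behind an approval step, so add whichever guardrail fits your setup.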
Update_plan:
- Default TODO tool (customizable)
- At most one step can be in_progress at a time
- Maintains plan hygiene
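If you swap in your own TODO tool, you can enforce the "at most one step in_progress" rule on your side before accepting a plan update. Below is a minimal sketch. The step shape and the `completed` status name are assumptions, while `pending` and `in_progress` are the names used above.

```python
ALLOWED_STATUSES = {"pending", "in_progress", "completed"}


def validate_plan(steps: list[dict[str, str]]) -> None:
    """Raise ValueError if the plan breaks basic hygiene rules."""
    in_progress = 0
    for step in steps:
        status = step["status"]
        if status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown status {status!r} for step {step['step']!r}")
        if status == "in_progress":
            in_progress += 1
    if in_progress > 1:
        raise ValueError(f"{in_progress} steps are in_progress; at most one is allowed")
```

Returning the error text to the model as the tool output gives it a chance to correct the plan on the next call.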
View_image:
- Basic function for model to view images
- Attaches local image by filesystem path to conversation context
Terminal-wrapping Tools:
- Generally work well if you prefer dedicated tools over terminal commands
- Best results when tool name, arguments, and output match underlying command as closely as possible
- Example: create dedicated git tool and add prompt directive to only use that tool for git commands
Custom Tools (web search, semantic search, memory):
- Model hasn't been specifically post-trained for these but can work
- Make tool names and arguments as semantically correct as possible
- Be explicit in prompt about when, why, and how to use
- Include good and bad examples
- Make results look different from outputs model is accustomed to from other tools
Parallel Tool Calling
- Set parallel_tool_calls: true in Responses API request
- Add specific instructions about batching and parallelization
- Use multi_tool_use.parallel to parallelize tool calls
- Workflow: plan all reads → issue one parallel batch → analyze results → repeat if needed
- Order parallel tool call items as: function_call, function_call, function_call_output, function_call_output
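The item ordering in the last bullet is easy to get wrong when outputs come back out of order. Here is a small Python sketch of one way to arrange them. The helper name `order_parallel_items` is hypothetical, and the items are plain dicts using the `function_call` and `function_call_output` types with a shared `call_id`.

```python
def order_parallel_items(calls: list[dict], outputs: list[dict]) -> list[dict]:
    """Put every function_call first, then outputs in the same call order."""
    outputs_by_id = {output["call_id"]: output for output in outputs}
    missing = [call["call_id"] for call in calls if call["call_id"] not in outputs_by_id]
    if missing:
        raise ValueError(f"no output recorded for call_id(s): {missing}")
    ordered_outputs = [outputs_by_id[call["call_id"]] for call in calls]
    return list(calls) + ordered_outputs
```

The result keeps all calls grouped ahead of all outputs instead of interleaving call, output, call, output.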
Tool Response Truncation
- Limit to 10k tokens (approximate by computing num_bytes/4)
- If hitting truncation limit: use half budget for beginning, half for end
- Truncate in middle with "…3 tokens truncated…" message
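The truncation rule above can be sketched as follows. This is one possible implementation, not the one Codex ships. The function name `truncate_middle` is made up for the example, and tokens are estimated as bytes divided by 4, as described.

```python
def truncate_middle(text: str, max_tokens: int = 10_000) -> str:
    """Keep the start and end of text, replacing the middle with a marker."""
    data = text.encode("utf-8")
    approx_tokens = len(data) // 4  # rough estimate: about 4 bytes per token
    if approx_tokens <= max_tokens:
        return text

    budget_bytes = max_tokens * 4
    half = budget_bytes // 2  # half the budget for the head, half for the tail
    # errors="ignore" drops any multi-byte character that was cut in two.
    head = data[:half].decode("utf-8", errors="ignore")
    tail = data[len(data) - half:].decode("utf-8", errors="ignore")
    removed_tokens = (len(data) - 2 * half) // 4
    return f"{head}…{removed_tokens} tokens truncated…{tail}"
```

Keeping both ends matters because command output often puts the invocation at the top and the error or summary at the bottom.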
Prompts vs Projects vs Agents: How to Pick the Right AI Approach
Want to know when to use what kind of prompt? Watch this.
Want more?
Keep the momentum going with our full library of Prompt Tip of the Day digests and deep-dive guides.
Explore past editions:
- August 2025 Prompt Tips of the Day – GPT-5 era prompting, vibe-coding workflows, and advanced evals.
- July 2025 Prompt Tips of the Day – real-world prompt patterns for builders, creators, and operators.
- June 2025 Prompt Tips of the Day – productivity-first prompts for research, writing, and planning.
- May 2025 Prompt Tips of the Day – our original meta-prompts, resume upgrades, and “assumption hunter” frameworks.
- April 2025 Prompt Tips of the Day – the early playbook for getting consistent, structured outputs from modern models.
Level up your prompting fundamentals:
- Power User’s Guide to Prompting AI – 15 battle-tested techniques for getting reliably better answers.
Completely new to AI and prompting? Start with our beginner-friendly primer: AI for Total Beginners.