Codex /goal feature (TESTED)
Summary:
Codex’s new /goal mode lets you hand it a long-running task and walk away. From there it loops plan → act → test → review until your stop condition is met or your weekly quota taps out. Enabling it takes a two-line config edit; after that, you just prefix your prompt with /goal. It’s very much in line with what the r/codex crowd has been doing with homebrew scripts, and it opens up some fun possibilities!
Worth turning on if you have multi-hour, verifiable work with a clear “done” definition. Skip it if you’re still exploring or making judgment calls.
It is experimental, and several SEO posts have invented subcommands and feature names that appear in neither OpenAI’s docs nor `codex --help`. I recommend treating anything you didn’t read at developers.openai.com with suspicion.
The two-line snippet:
```toml
[features]
goals = true
```
Upgrade Codex CLI to 0.128.0+, restart it, and you should be in business. If you hit any issues, chat with Codex itself and you’ll likely get the loose ends tied up quickly.
What /goal is
OpenAI’s docs put it this way: “Use /goal when you want Codex to keep working toward one durable objective instead of stopping after one normal turn.”
The cool feature is the persistence around a contract with a verification loop. Codex isn’t just running forever; it’s checking its own work against measurable evidence (tests, evals, builds, screenshots, Lighthouse runs) until the stop condition you defined is satisfied.
That’s the answer to the skeptic on r/codex who called it “a dumb mnemonic” because every prompt is already a goal (which is true). But a normal prompt runs maybe three turns. A /goal runs as many as the contract demands.
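To make that concrete, here’s my mental model of the loop in bash. This is an illustration, not OpenAI’s implementation; `verify.sh` is a hypothetical script standing in for whatever evidence check your stop condition names.

```bash
# Mental model only: Codex's real loop is internal and smarter than this.
# verify.sh is a hypothetical script encoding your measurable stop condition
# (tests pass, score threshold met, report file exists, etc.).
attempts=0
until ./verify.sh; do
  attempts=$((attempts + 1))
  echo "attempt $attempts: stop condition not met; plan, act, test, review again"
  # ...the agent edits files, runs tests, and reviews the evidence here...
  if [ "$attempts" -ge 50 ]; then
    echo "too many attempts; pausing for a human"
    break
  fi
done
```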
It’s also different from /plan, which shapes the approach early. Think of /plan as pre-flight; /goal is the autopilot. ✅
What it isn’t
The feature is a few weeks old, and a wave of SEO blogs has rushed in to fill the documentation gap by inventing things:
- Not a token-budget command. As best I can tell from reviewing the docs, there is no `/goal budget` in the docs or in `codex --help`.
- Not a separate `/goal status` subcommand. To check status, you type `/goal` alone, with no arguments.
- Not “Codex Managed Outcomes.” That is not a real OpenAI product from any legit source I have seen.
- Not a safety boundary. Permissions, sandbox mode, and explicit constraints in your goal text still matter. `/goal` will follow a bad prompt for an hour (or more, lol).
When in doubt, reach for the primary doc: Follow a goal.
When /goal works (and when it doesn’t)
Codex DOES push back on lazy goals. The thing it’s actually good at is loops where the agent can check its own progress against evidence: tests pass, eval scores improve, Lighthouse numbers move, screenshots match, file outputs validate.
That clicks into a handful of archetypes:
- Migrations. “Move this codebase from Node 14 to Node 20 and keep the test suite green.”
- Test-suite expansion. “Lift coverage on `src/auth/` from 38% to 75%.”
- TDD-built features. “Implement the queue in PLAN.md, one milestone at a time, with tests.”
- Deployment retry loops. “Re-deploy until the health check passes.”
- Refactor-with-validation. “Optimize the prompts in `evals/` until the eval suite scores at least 0.85.”
- Performance work. Lighthouse loops with a stop threshold (see the sketch after this list).
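For that last archetype, here’s roughly what the evidence check reduces to. A sketch assuming the `lighthouse` CLI and `jq` are installed; the URL and the 0.90 threshold are placeholders of mine, not anything from OpenAI’s docs.

```bash
# Hypothetical stop condition for a Lighthouse loop:
# exit 0 once the performance score hits the threshold.
lighthouse https://example.com --output=json --output-path=./lh.json --quiet
score=$(jq '.categories.performance.score' lh.json)   # 0.0 to 1.0
awk -v s="$score" 'BEGIN { exit (s >= 0.90 ? 0 : 1) }'
```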
Where it fails (OpenAI’s own words from the docs: “loose list of unrelated work”):
- Exploratory design.
- Architecture decisions.
- Anything that requires taste or stakeholder input.
- Goals with vague stop conditions like “make it better.”
The unifying property of a good /goal task is one sentence long: Codex can check its own progress against evidence. If you can’t write that line, you don’t have a /goal. You have a prompt.
For real-world numbers: a thread on r/codex had two people share data points. A trading-app refactor with 210 backlog tasks ran 6.5 hours and burned ~20% of a $100-plan weekly quota. A larger experiment migrating an in-game economy ran 18 hours and ate ~9% of quota in the first 8 hours. Self-reported numbers, not benchmarks, but they’re useful reference points.
Setup and commands
I’m running it on Ubuntu, but the CLI flow is the same on macOS and Windows.
Setup
- Check your version. Run `codex --version`. You need 0.128.0 or higher. When I started writing this, my dev box was on 0.125.0, which is why none of the goal stuff was showing up.
- Upgrade if needed. `npm install -g @openai/codex@0.129.0` was the latest at time of writing; substitute whatever’s current via `npm view @openai/codex version`. (There’s a consolidated script after this list.)
- Restart Codex. Quit the CLI. If you use Codex via MCP from Claude Code, restart Claude Code too, or the MCP server keeps wrapping the old binary’s protocol and you’ll get cryptic flag errors. (Found this one out the hard way.)
- Edit `~/.codex/config.toml`. Add (or extend) the features table with `[features]` and `goals = true`.
- Type `/goal <your objective>` at the start of your next prompt. The slash menu may not show `goal` until you type it once. Don’t be alarmed.
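Prefer one paste? Here’s the same sequence as a script. It only uses the commands from the list above; the version pin was current when I wrote this, so check `npm view` first.

```bash
#!/usr/bin/env bash
set -euo pipefail

# What's the latest? (0.129.0 was current at time of writing)
npm view @openai/codex version
npm install -g @openai/codex@0.129.0

# Confirm you're at 0.128.0 or higher
codex --version

# Enable the flag. NOTE: this blindly appends; if you already have a
# [features] table in ~/.codex/config.toml, merge `goals = true` by hand.
cat >> ~/.codex/config.toml <<'EOF'

[features]
goals = true
EOF

# Now restart the CLI (and Claude Code, if you run Codex via MCP),
# then start your next prompt with /goal.
```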
If you’re on the Codex desktop app on macOS or Windows, same drill: enable it in the CLI’s `config.toml` first, then the app picks it up. Per Ben Badejo on X, it works in both surfaces once enabled via CLI. I haven’t personally tested the Mac/Windows app behavior, so you’re taking his word for it, not mine.
Real commands (per OpenAI’s docs)
- `/goal <objective>`: set a goal
- `/goal`: check current goal status
- `/goal pause`: pause the current run
- `/goal resume`: resume a paused run
- `/goal clear`: clear the current goal
That’s the whole list as of 5/8/2026. The fabricated subcommands flagged earlier are not on it.
/goal is experimental and persistent. It is not safer than a regular Codex run. A vague goal can burn weekly quota or produce broad, off-target changes, since Codex keeps going until it decides it is done. Run it read-only or on a scratch branch. Define one measurable stop condition. Use /goal pause or /goal clear if it drifts.
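A minimal containment setup, for what it’s worth. The scratch-copy pattern is the same one I use in the experiment below; the paths and repo name here are illustrative.

```bash
# Work on a disposable copy so a drifting goal can't touch the real repo.
cp -r ~/projects/my-app /tmp/my-app-goal-scratch
cd /tmp/my-app-goal-scratch

# Or, inside the real repo, at least isolate the run on a scratch branch:
#   git switch -c goal-scratch

# Then run with the narrowest sandbox the task allows.
# (workspace-write is what I used for my run; read-only is stricter.)
codex exec --sandbox workspace-write "<your /goal contract here>"
```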
I gave /goal a real task
For the test I picked a side project: a cast-metal nameplate on a broken sign (it was missing the ES from HODGES) that needs a near-exact font match before I commit to building a new 3D-printed replacement. The project already had a photo, letter crops, a local scoring script, three eval’d candidate fonts, and a handover doc summarizing it all. Perfect substrate for a /goal.
The task (or contract) I gave it (redacted of project specifics):
```
/goal Complete a read-only font-match: identify whether the lettering has a
near-exact commercial or catalog font match before we commit to a custom
uppercase, with a measurable scoring contract and an explicit fallback if
no near-exact match exists.

Definition of "near-exact":
- HODG width-ratio delta to photo within +/- 0.02 of 1.0 AND
- per-letter (D, G, A, R) glyph-shape distance better than the current
  best Barlow row in local_candidate_scores.csv

Required workflow loop:
1. Read CLAUDE_HANDOVER.md and the existing report.
2. Use built-in browser_use to upload prepared crops to Fontspring
   Matcherator and MyFonts WhatTheFont. Capture top-5 per crop.
3. Cross-reference returned candidates against catalog leads.
4. For any FREE-licensed candidate file, STOP and add an AWAITING
   APPROVAL entry. Do not auto-download.
5. Write reports/claude_external_match_results.md ending with the
   literal sentence "Phase A complete; no commercial fonts purchased,
   redistributed, or repackaged."

Hard constraints (fail closed): no login, no cart, no checkout, no
purchase, no auto-download, no writes outside this scratch dir.

Stop signal: report file exists with all required sections AND the
literal final sentence. Hard cap 60 minutes wall-clock.
```
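That stop signal is deliberately machine-checkable. A quick sketch of my own for verifying it by hand, using only what the contract itself specifies:

```bash
# Does the report exist and end with the exact literal sentence?
report=reports/claude_external_match_results.md
expected="Phase A complete; no commercial fonts purchased, redistributed, or repackaged."
if test -f "$report" && tail -n 1 "$report" | grep -qFx "$expected"; then
  echo "stop condition met"
fi
```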
I ran it via `codex exec --sandbox workspace-write` against a fresh scratch copy at `/tmp/dr-jay-font-match-goal/` (not the canonical project path, so zero side-effect risk).
What happened
Wall-clock: about 5 minutes. Tokens used: 188,832, which lands in the single-digit-percent range of a $100-plan weekly quota.
The plan-act-test loop ran exactly as advertised:
- Read the handover and prior report.
- Recomputed local per-letter shape scores across 7 candidates.
- Tried to invoke its built-in `browser_use` tool. Not exposed in this session.
- Checked for fallbacks: chromium, firefox, playwright, selenium. None installed.
- Tried direct HTTPS GETs to fontspring.com and myfonts.com. DNS blocked from inside the sandbox.
- Wrote the final ranked report stating honestly that no external matcher upload was performed, rather than inventing candidate names.
That last line is the part to underline. /goal did not fabricate matcher results when it couldn’t get them. It told me what it tried, why it failed, and produced a report ranking only the leads it could verify locally. The literal final sentence I’d contracted for (“Phase A complete; no commercial fonts purchased, redistributed, or repackaged.”) was there, exactly on cue.
The local rescore was useful too. Liberation Sans Bold has the smallest HODG width delta in the local pool (0.0027, near-perfect), but its per-letter shape distance on D, G, and A is worse than Barlow’s. Neither hits the “near-exact” definition I encoded. /goal’s recommended next path: a hand-traced custom uppercase, with Barlow retained only as an OFL-derived reference for rhythm. Good call, Codex.
Goal-quality rubric (recommended before you fire /goal)
- Measurable artifact: what file or output marks done?
- Verification command: what one-liner proves the artifact is correct?
- Allowed write scope: exact directory; everything else read-only.
- Stop condition: the literal sentence or schema the agent must produce.
- Pause condition: what triggers `/goal pause` (N attempts, X minutes, paid action needed).
Concrete: same task, two different prompts.
- Bad: `/goal improve auth`
- Good: `/goal raise src/auth coverage from 38% to 75%, only edit src/auth and tests, stop when npm test passes and the coverage threshold is met`
If you can fill in all five rubric items, your goal is contracted. If you can’t, you’re hoping. Hope is fine for a normal prompt. It is not fine for an autonomous loop.
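And here’s rubric item two, the verification one-liner, filled in for that good prompt. A sketch assuming a Jest project; the `json-summary` reporter and the 75% threshold are the only moving parts.

```bash
# Rubric item 2 for the coverage goal: exit code 0 proves the artifact.
# Assumes Jest; json-summary writes coverage/coverage-summary.json.
npx jest --coverage --coverageReporters=json-summary
node -e 'const pct = require("./coverage/coverage-summary.json").total.lines.pct;
         process.exit(pct >= 75 ? 0 : 1)'
```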
If your Codex CLI throws bwrap or AppArmor sandbox errors on Ubuntu 24.04, /goal won’t get past its first shell action. There’s a specific AppArmor profile fix that resolves it without weakening the rest of your sandbox: How I fixed Codex sandbox errors on Ubuntu 24.04.
What I didn’t test
To be precise about what this article does and doesn’t tell you:
- Long multi-day runs. My contract allowed 60 minutes; the actual run was about 5. People on r/codex have run for 18+ hours; I haven’t yet.
- Code-writing goals. I gave it a research-and-score task, not a TDD feature build or a migration. The plan-act-test loop should generalize, but file-edit volume and quota burn won’t match what I saw.
- The Codex desktop app. CLI only, on Linux.
- Windows. Linux only.
- Heavy quota burn. One short run on one model (gpt-5.5 at xhigh reasoning). Your numbers will vary.
The setup is straightforward and the loop is real; the above is just what I haven’t validated myself. If you have validations or thoughts to share, drop a comment below! 💪
FAQ
What is Codex /goal?
A slash command in the Codex CLI (v0.128.0+) that turns a normal Codex prompt into a persistent, self-checking agent. It loops plan → act → test → review → iterate until your stop condition is met or you pause it. Practically: let Codex grind on a contracted task without babysitting.
How do I enable /goal in Codex CLI?
Three steps. Upgrade to 0.128.0+ (`npm install -g @openai/codex@0.129.0` works as of writing). Add `[features]` with `goals = true` to `~/.codex/config.toml`. Restart the CLI (and Claude Code, if you use Codex via MCP). Then prefix your next prompt with /goal.
Why doesn’t /goal work or appear for me?
Three usual suspects. (1) You’re on a Codex CLI older than 0.128.0. (2) You’ve upgraded but haven’t restarted, so a stale process still holds the old binary. (3) You added the config flag but typed `/goals` (with an s) instead of `/goal`. The slash menu may also not show `goal` until you type it once.
What are the real /goal commands?
`/goal <objective>`, `/goal` (status), `/goal pause`, `/goal resume`, `/goal clear`. Five total, per OpenAI’s docs. Anything else you read is fan fiction until OpenAI documents it.
Does /goal have a token budget or /goal status command?
No. There is no `/goal budget` in the docs or in `codex --help`. To check status you type `/goal` alone, not `/goal status`. If a goal is running too long, `/goal pause` is the canonical way to stop it.
Bottom line
If your day has a category of work that looks like “Codex would handle this great if it just kept going,” turn /goal on. (Big migrations. Big test-suite expansions. Optimization loops that run dozens of times.) It is a small config change for a feature that genuinely does what it advertises. I feel like they are shoving it in Anthropic’s face, since Claude has been doing everything possible to REDUCE usage (at least prior to the SpaceX/xAI compute access Anthropic recently secured).
If your day is mostly architecture decisions, exploratory tinkering, or small, targeted edits, you can probably leave it off. A regular Codex prompt works fine for those, and /goal would just amplify any vagueness in your prompt across hours of compute.
The thing worth bookmarking from this whole exercise isn’t the slash command; IMHO it’s the goal-quality rubric in the firsthand-experiment section above. It’s reusable for any agent feature like this, not just /goal.
That’s my two cents; feel free to share your thoughts below!