shipped
The Decoder

OpenAI turns Codex into an always-on coding agent that watches your screen

What Happened

OpenAI is massively expanding its developer tool Codex: the AI can now control a Mac on its own, generate images, remember preferences, and keep working on tasks autonomously for weeks. The move takes direct aim at Anthropic's Claude Code.

Our Take

OpenAI expanded Codex from a developer tool into a persistent agent with Mac control, image generation, and memory that survives across sessions, targeting the same workflow that Claude Code serves today.

For teams running multi-step agentic coding pipelines, screen-level OS control widens the scope of what can be delegated, and memory that persists across sessions is meant to remove context resets between tasks. In our view, most developers underestimate how much friction comes from agents losing state mid-task rather than from generation quality. The cost structure is still unknown, which is the only real blocker.

Engineering teams already paying for Claude Code subscriptions should benchmark task completion rates on week-long workflows before committing to either platform. Single-developer shops on GPT-4o can ignore this until pricing is public.

What To Do

Run Codex against Claude Code on a real multi-session task before your next subscription renewal. Our working assumption is that OS-level persistence affects completion rates more than token speed does.
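One way to keep that comparison honest is to log each trial by hand and compute completion rates only over tasks that actually spanned more than one session. The sketch below assumes you record outcomes yourself; the `Trial` fields, the tool labels, and the task names are all hypothetical, and the rows are placeholders rather than measurements. It does not call either product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    tool: str        # label you choose for the agent under test (hypothetical)
    task: str        # your own task identifier (hypothetical)
    sessions: int    # how many separate sessions the task spanned
    completed: bool  # did it meet your acceptance criteria?


def completion_rates(trials, min_sessions=2):
    """Return {tool: completed / attempted}, counting only multi-session trials."""
    attempted = defaultdict(int)
    completed = defaultdict(int)
    for t in trials:
        if t.sessions < min_sessions:
            continue  # single-session tasks say nothing about persistence
        attempted[t.tool] += 1
        completed[t.tool] += int(t.completed)
    return {tool: completed[tool] / attempted[tool] for tool in attempted}


# Placeholder rows for illustration only; replace with your own logged trials.
trials = [
    Trial("tool-a", "migrate-auth", 3, True),
    Trial("tool-a", "refactor-billing", 4, False),
    Trial("tool-b", "migrate-auth", 3, False),
    Trial("tool-b", "refactor-billing", 4, True),
    Trial("tool-b", "fix-typo", 1, True),  # excluded: only one session
]

rates = completion_rates(trials)
for tool, rate in sorted(rates.items()):
    print(f"{tool}: {rate:.0%} of multi-session tasks completed")
```

Running the same tasks through both tools, with the same acceptance criteria, matters more than the arithmetic: a handful of trials will not separate two close rates.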

Builder's Brief

Who

teams running agentic coding workflows across multi-session tasks

What changes

session persistence and OS-level task delegation replace per-request code completions

When

now

Watch for

Codex pricing per multi-week task vs. the Claude Code monthly subscription; the first public benchmark on task completion rates

What Skeptics Say

Screen control and weeks-long autonomy sound powerful until the first hallucinated git push hits a production branch. OpenAI has no track record running persistent agents inside real codebases at scale.

2 comments

Priya Nambiar

yeah no. my screen has bank tabs open, slack dms, everything. hard pass on 'always watching'

Tomas Červenák

'working autonomously for weeks' lmaooo my agent can't survive a single context switch without hallucinating
