OpenAI's Codex Whitepaper: Durable Threads, Voice Steering, and the Loop That Never Ends
GPT-5.6 Not Released Yet? But You Can Check Out the Codex Whitepaper First.
On June 22, 2026, OpenAI released a new whitepaper titled "Codex-maxxing for long-running work."
The double 'x' in the title is a nice touch that makes the whitepaper feel less stiff.
In short, the whitepaper's message is that OpenAI wants everyone to live in Codex.
This reminds me of how the Mac captured users' mindshare, and now Codex wants to do the same.
Let me explain in detail what this whitepaper contains.
The whitepaper begins by acknowledging that Codex was born for coding work: you can use Codex to modify repos, create diffs, review, and help with releases.
But OpenAI says that once Codex has durable threads, shared memory, tools, recurrence, and a place to review artifacts, the focus shifts from one-off coding tasks to long-running work.
You might wonder why all the fancy terms. Is OpenAI inventing new words again? Don't worry, I'll explain what each one is for below.
For now, just remember one thing: Codex wants to land on every office worker's desktop.
Source: OpenAI Official Whitepaper, local PDF rendering screenshot
Durable Threads
The whitepaper first introduces durable threads.
As you know, in Codex, each task is called a thread.
Simply put, it's a chat window.
Except WeChat's chat window can only chat, while Codex's chat window can actually help you work.
Every time you open a new thread, it's like a new colleague joining the company.
Capable as they are, you have to re-explain the context: where the project lives, what pitfalls came up before, which areas to avoid, and what someone said earlier. They know none of this.
Durable threads solve this problem.
OpenAI gives several examples: social feedback monitoring, open-source project maintenance, OpenAI CLI, Agents SDK, Chief of Staff.
These tasks share a characteristic: they can't be finished in a day.
For example, open-source project maintenance. You don't just ask Codex to fix one issue and be done.
It might need to monitor issues long-term, read release notes, understand contributor habits, and remember how to review.
If you start a new thread for such tasks each time, the previous memory is lost.
So this section of the whitepaper is really saying: for important work, pin a thread first so it's easy to find.
Then keep handling the same work inside it over the long term, so that context and preferences accumulate. As these slowly settle, the thread becomes a durable thread with its own history and background. It's that simple.
Of course, durable threads come at a cost. Because they carry more context, they might be more expensive to run.
Voice Input
The second point is voice input.
This term is easily misunderstood as speech-to-text, but it's not.
OpenAI provides a scenario:
Imagine you are looking at a page created by Codex.
As you look, you say aloud:
"Make this button a bit smaller."
"This copy is wrong."
"I remember someone named cxuan mentioned this in WeChat, but I'm not sure. Go find it."
If you had to type these words, you would instinctively organize them, deleting uncertain parts.
For example, you'd think about which button needs to be smaller, which copy is wrong, and who cxuan really is.
But voice input is different. It likes these vague, uncertain clues.
Don't think these vague details are useless. On the contrary, they are very useful for the Agent.
Here is how Jason Liu uses it in the whitepaper:
He looks at the page the agent created in his browser while recording spoken complaints and suggestions.
After recording, he presses Enter to send.
Codex then turns the recording into actionable feedback.
In effect, the review process is folded back into the same thread.
A sentence casually spoken during a phone call, meeting, or hallway conversation can be used the same way.
Codex then organizes this raw speech into plans, drafts, pages, reports, or next actions.
Steering
The theme of this section is Steering: Shape the queue while Codex works.
While Codex is working, you can continue to add next steps to the queue, letting Codex prioritize your inserted commands.
For example, say Codex is modifying a page.
As you watch, you suddenly notice the button is too big.
You don't have to wait for it to finish the entire round and then start a new command.
You can directly add:
"Make this button a bit smaller."
"This copy is wrong."
"After this is done, open a PR."
"Wait for the preview to deploy first."
"Show me the preview link before releasing."
This is steering.
It works even better combined with voice input.
Voice input is responsible for capturing the messy thoughts in your head.
Steering is responsible for turning those words into Codex's next steps during work.
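To make the queue idea concrete, here is a minimal sketch of my own, not anything from the whitepaper. It assumes a hypothetical SteerableQueue class in which instructions you insert mid-run are handled before the rest of the agent's own plan. How Codex actually orders its queue is not described at this level of detail.

```python
from collections import deque


class SteerableQueue:
    """Toy work queue: planned steps run in order, but instructions the
    user inserts mid-run are handled before the remaining plan."""

    def __init__(self, planned_steps):
        self.planned = deque(planned_steps)  # steps the agent planned itself
        self.steered = deque()               # steps the user inserted mid-run

    def steer(self, instruction):
        """Called by the user at any time while work is in progress."""
        self.steered.append(instruction)

    def next_step(self):
        """Return the next step, preferring user-inserted instructions."""
        if self.steered:
            return self.steered.popleft()
        if self.planned:
            return self.planned.popleft()
        return None


queue = SteerableQueue(["build page layout", "write copy", "run tests"])
executed = [queue.next_step()]                  # the agent starts on its plan
queue.steer("Make this button a bit smaller.")  # you interrupt mid-run
queue.steer("This copy is wrong.")
while (step := queue.next_step()) is not None:
    executed.append(step)
print(executed)
```

The point of the two separate deques is that your inserted instructions never get lost behind the plan, and the plan itself is never thrown away either.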
Memory
Let's talk about memory.
I mentioned back in April this year that memory would become a priority topic this year.
This whitepaper provides a further explanation of memory.
It says: important information in long threads cannot live only in chat history.
Chat history is too long; people won't read it line by line, and the Agent itself might miss the key points.
So useful information should be written out, becoming files you can open, edit, see diffs for, and reuse.
The whitepaper gives a structure:
vault/
  TODO.md
  people/
  projects/
  agent/
  notes/
You can create a vault/ folder.
The code repository holds the code.
The vault holds the work context.
For example, if you have Codex maintain an open-source project long-term, the vault could look like this:
vault/
  TODO.md
  people/
    alex.md
    maintainer-lucy.md
  projects/
    codex-cli.md
  agent/
    review-rules.md
    release-checklist.md
  notes/
    2026-06-24.md
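If you want to try this layout, a small script can create it for you. This is only a sketch under my own assumptions: the file names come from the example tree above, while the scaffold_vault helper and the temporary demo directory are hypothetical and not part of Codex or the whitepaper.

```python
from pathlib import Path
import tempfile

# Layout taken from the example tree above; files start out empty.
VAULT_LAYOUT = [
    "TODO.md",
    "people/alex.md",
    "people/maintainer-lucy.md",
    "projects/codex-cli.md",
    "agent/review-rules.md",
    "agent/release-checklist.md",
    "notes/2026-06-24.md",
]


def scaffold_vault(root: Path) -> list[Path]:
    """Create vault/ under root; never overwrite notes that already exist."""
    created = []
    for relative in VAULT_LAYOUT:
        path = root / "vault" / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(path)
    return created


demo_root = Path(tempfile.mkdtemp())  # throwaway directory for the demo
created = scaffold_vault(demo_root)
print(sorted(p.relative_to(demo_root).as_posix() for p in created))
```

Running it a second time creates nothing, so it is safe to rerun on a vault that already holds notes.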
TODO.md holds unfinished tasks.
For example:
- After the preview is deployed, send the link to cxuan for review.
- Check the parameter issue mentioned by the user in issue #184.
- Before the next release, confirm the changelog hasn't missed CLI parameter changes.
people/maintainer-lucy.md holds a person's preferences.
For example:
Lucy doesn't like large PRs.
In reviews, Lucy looks at the tests and the changelog first.
For changes involving releases, it's best to send a brief summary first.
projects/codex-cli.md holds project status.
For example:
Current focus: Reduce reconnection-related issues.
Decision made: Fix log observability first, then adjust retry strategy.
Blocker: Still missing reproduction logs from Windows users.
agent/review-rules.md holds rules for Codex itself.
For example:
Don't modify authentication logic.
Don't refactor unrelated files on the side.
After changing CLI behavior, must update help documentation and tests.
This way, when you return to this thread next time, Codex doesn't have to guess based solely on chat history.
It can directly open these files to see what the people, projects, rules, and to-dos are.
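Here is a rough sketch of what "opening these files" could look like in practice. The load_vault_context function is my own hypothetical helper, and the demo vault is created in a temporary directory. Codex's real mechanism for reading memory files may work quite differently.

```python
from pathlib import Path
import tempfile


def load_vault_context(vault: Path) -> str:
    """Concatenate every non-empty Markdown file in the vault, each under
    a header with its relative path, so the context is easy to skim."""
    sections = []
    for path in sorted(vault.rglob("*.md")):
        text = path.read_text(encoding="utf-8").strip()
        if text:
            sections.append(f"## {path.relative_to(vault).as_posix()}\n{text}")
    return "\n\n".join(sections)


# Demo vault with two of the files discussed above.
demo_vault = Path(tempfile.mkdtemp()) / "vault"
(demo_vault / "agent").mkdir(parents=True)
(demo_vault / "TODO.md").write_text(
    "- Check the parameter issue mentioned by the user in issue #184.\n",
    encoding="utf-8",
)
(demo_vault / "agent" / "review-rules.md").write_text(
    "Don't modify authentication logic.\n", encoding="utf-8"
)

context = load_vault_context(demo_vault)
print(context)
```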
If this stuff is buried in chat history, it's hard to find after a few days.
Putting it in vault/ turns it into a project ledger.
If this vault is on GitHub, there's another benefit: you can see diffs.
That is, you can see everything about the "memory" Codex writes down: which file it changed and which record it added.
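You don't even need GitHub to get a feel for this. The sketch below uses Python's standard difflib module on two versions of projects/codex-cli.md. The file contents are taken from the example earlier in this section, and the before/after split is my own assumption for illustration.

```python
import difflib

# Two versions of the same memory file: before and after Codex records a decision.
before = """Current focus: Reduce reconnection-related issues.
Blocker: Still missing reproduction logs from Windows users.
"""
after = """Current focus: Reduce reconnection-related issues.
Decision made: Fix log observability first, then adjust retry strategy.
Blocker: Still missing reproduction logs from Windows users.
"""

diff_lines = list(
    difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="a/projects/codex-cli.md",
        tofile="b/projects/codex-cli.md",
        lineterm="",
    )
)
print("\n".join(diff_lines))
```

The single line starting with "+" is exactly the record that was added, which is what makes written-out memory reviewable in a way chat history is not.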
Computer and Browser Use
After memory, the next section in the whitepaper is Computer and browser use.
This section looks like a tool introduction, but it's actually about permission boundaries.
OpenAI clearly separates these entry points:
- browser: suitable for local web pages, previews, and annotations.
- chrome: suitable for web pages that require login, like your already-logged-in admin panel, workspace, or internal systems.
- computer use: for tasks that can only be done via GUI.
- connectors: work entry points like Slack, Gmail, Calendar, and GitHub.
- skills: reusable workflows.
Remote Control
This section is Remote control.
There's not much worth detailing here.
It simply means you can use a mobile device to monitor and review, in real time, the progress of Codex working on your desktop.
You don't have to be at your desk, but Codex can still do the work on your desktop machine.
Thread Automations
The whitepaper also includes something called thread automations.
It lets Codex periodically return to the same thread to check if anything has changed.
A normal prompt is:
Do this now.
A thread automation is more like:
Come back every 30 minutes, and if there's something new, prepare the next step.
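The difference between the two is easy to sketch in code. The run_automation function below is a hypothetical illustration of "come back on a schedule and act only if something changed." The check and prepare_next_step callbacks, the max_runs limit, and the injectable sleep are all my assumptions, not Codex's actual scheduler.

```python
import time
from typing import Callable


def run_automation(
    check: Callable[[], str],
    prepare_next_step: Callable[[str], str],
    interval_seconds: float = 30 * 60,
    max_runs: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Return to the same thread on a schedule; act only when something changed."""
    last_seen = check()  # the first visit only records a baseline
    prepared = []
    for _ in range(max_runs):
        sleep(interval_seconds)
        snapshot = check()
        if snapshot != last_seen:  # something new since the last visit
            prepared.append(prepare_next_step(snapshot))
            last_seen = snapshot
    return prepared


# Demo with canned snapshots and a no-op sleep so it finishes instantly.
snapshots = iter(["no new messages", "no new messages", "1 new message from Lucy"])
steps = run_automation(
    check=lambda: next(snapshots),
    prepare_next_step=lambda s: f"Prepare next step for: {s}",
    max_runs=2,
    sleep=lambda seconds: None,
)
print(steps)
```

Passing sleep in as a parameter is just a convenience for trying the loop without waiting 30 minutes per round.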
The whitepaper gives an example of a Chief of Staff.
Codex checks Slack and Gmail every 30 minutes to see if there are messages needing a reply.
It can find context, draft responses, and list questions for you to decide on.
However, it cannot send them on your behalf.
Three Loops
Later in the whitepaper, there is a section called Three examples of loops.
This section essentially ties together the previous concepts to form a closed loop.
Threads, memory, tools, automation, review: looked at individually, they seem like separate features.
But the truly useful thing is the loop.
That is: Codex returns on a schedule, checks the context, uses tools to do part of the work, and then hands over the parts needing human judgment to you.
The whitepaper gives three examples.
The first is the Chief of Staff mentioned earlier: periodically check Slack and Gmail, find messages needing replies, fill in context, and draft responses.
But whether to send, when to send, and what tone to use are still for the human to decide.
The second is monitor for feedback.
Imagine a team giving feedback on an animation on a platform like Slack.
Codex periodically checks this thread.
When there's new feedback, it first organizes it into a modification list.
Then it goes to modify the Remotion project.
You can think of Remotion as a tool for creating videos and animations with code.
After modification, Codex re-renders a version, clearly states what was changed, and provides a review link.
So Codex handles reading feedback, modifying the animation, and producing a new version.
Meanwhile, the human reviews the effect, judges the creative direction, and decides whether to release.
The third example is more relatable: getting a refund.
For instance, you apply for a refund on a website, and the customer service system keeps showing "Waiting for human agent."
You don't want to stare at the page the whole time.
Codex can periodically check whether a customer service agent has joined, whether there are new messages in the chat, and whether the refund status has changed.
Once the agent replies, it first prepares the necessary information, such as the order number, payment record, previous communication, and reason for the refund.
Then it drafts a reply and gives you a suggestion: whether the next step should be to provide a screenshot or directly request a refund.
But it cannot click "Confirm" for you.

Putting these three examples together, the whitepaper's message is clear.
Codex is eager to do everything it can, except for the final step: clicking the confirm button.
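This "prepare everything, but don't confirm" pattern can be sketched as an approval gate. Everything here is hypothetical and written by me for illustration: the Draft class, the ApprovalRequired exception, and the prepare_reply and send functions are not Codex APIs.

```python
from dataclasses import dataclass


class ApprovalRequired(Exception):
    """Raised when something tries to act without a human sign-off."""


@dataclass
class Draft:
    channel: str
    body: str
    approved: bool = False  # only the human reviewer flips this


def prepare_reply(channel: str, context: str) -> Draft:
    """The agent's part: gather context and write a draft, nothing more."""
    return Draft(channel=channel, body=f"Following up on: {context}")


def send(draft: Draft) -> str:
    """The irreversible step; refuses to run on an unapproved draft."""
    if not draft.approved:
        raise ApprovalRequired(f"draft for {draft.channel} needs human approval")
    return f"sent to {draft.channel}"


draft = prepare_reply("refund-chat", "order refund request")
try:
    send(draft)
    blocked = False
except ApprovalRequired:
    blocked = True  # the agent alone cannot get past this point

draft.approved = True  # the human clicks "Confirm"
result = send(draft)
print(blocked, result)
```

The design choice worth noticing is that the check lives inside send itself, so no amount of eager preparation upstream can skip it.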
Goals Must Be Verifiable
The whitepaper then discusses goals.
I have a lot to say about this.
I once ran a 75-hour goal, and the output was terrible, really unspeakable.
A very poor goal looks like this:
Implement the plan in this Markdown file.
In other words: do what the plan says.
The problem is: when is it considered done?
No one knows.
So Codex can easily stay busy, burning through your SSD and looking very hardworking, but hard work doesn't always yield results.
A better goal needs an acceptance baseline.
The whitepaper gives the example of Rich-to-Rust.
It's not simply "Migrate this library to Rust."
Its completion criterion is that, after migration, it must still pass the original unit tests.
When I use Codex myself, I increasingly like to write things like this:
Completion criteria:
- Original behavior unchanged.
- Corresponding tests pass.
- Unrelated modules not modified.
- Output changed files, verification commands, and failed attempts.
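Criteria like these can be checked mechanically. The sketch below is my own assumption of how that might look: a hypothetical check_goal helper runs a test command and confirms that every changed file stays inside an allowed set of paths. The stand-in commands just run a tiny Python assertion; in a real project you would pass your own test command.

```python
import subprocess
import sys


def check_goal(test_command, changed_files, allowed_prefixes):
    """Return (done, report): done only if tests pass and no unrelated files changed."""
    result = subprocess.run(test_command, capture_output=True, text=True)
    out_of_scope = [
        f for f in changed_files if not f.startswith(tuple(allowed_prefixes))
    ]
    report = {
        "tests_pass": result.returncode == 0,
        "out_of_scope": out_of_scope,
        "verification_command": " ".join(test_command),
    }
    done = report["tests_pass"] and not out_of_scope
    return done, report


# Stand-in test commands so the example runs anywhere Python does.
passing = [sys.executable, "-c", "assert 1 + 1 == 2"]
failing = [sys.executable, "-c", "assert 1 + 1 == 3"]

done, report = check_goal(
    passing, ["src/cli/args.py", "tests/test_args.py"], ["src/cli/", "tests/"]
)
not_done, bad_report = check_goal(
    failing, ["src/cli/args.py", "src/auth/login.py"], ["src/cli/", "tests/"]
)
print(done, report["out_of_scope"])
print(not_done, bad_report["out_of_scope"])
```

With a check like this, "when is it considered done?" has an answer that neither you nor Codex has to guess at.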
Remember: goals must be verifiable.
For more on using goals, check out this article: "Wow, finally figured out how to use goals properly."
Side Panel
The final section is about the side panel.
This is also easy to understand.
You can't always review work within a chat window.
Markdown, tables, CSV, PDFs, slides, and web pages are inherently multi-dimensional.
Describing them only through a chat window is unclear and insufficient.
The value of the side panel is: you and Codex are looking at the same thing.
You look at a page and say this button is too cramped.
You look at a table and say this formula is wrong.
You look at a slide and say this title is too long.
These comments directly become executable next steps.
This is much more reliable than saying in the chat window, "That thing under the second block in the top left corner of the page I just mentioned."
I think this is a key step for Codex moving towards the office desktop.
It can't just be a chat window.
It needs to let you see files, web pages, tables, PPTs, and then continue making changes on these things.
At this point, Codex starts to become a real place where work gets done.
Summary
As of Codex's latest updates, its most notable features are all covered in this whitepaper.
So let me summarize this whitepaper. If you're too lazy to read the above, just read the summary.
- Codex wants to be your desktop operating system
The entire whitepaper is about one thing: how to let AI take part in long-term work that can't be finished in one go. OpenAI wants you to keep Codex resident on your desktop, much as the Mac captured users' mindshare, making it the default entry point for your work.
- Giving AI a memory
Previously, every new conversation was a "new colleague." Now, through durable threads and a memory vault, Codex has long-term memory. Project background, people's preferences, and records of past pitfalls can all accumulate, turning the new colleague into a veteran.
- Interaction methods have changed
Voice input can handle vague instructions. Steering lets you interrupt and command without waiting for it to finish. The side panel lets you and it look at things on the same screen. The interaction method has shifted from sending commands and waiting for results to real-time preview.
- The biggest breakthrough is the Loop
Codex can periodically (e.g., every 30 minutes) automatically return to check emails, Slack, and web page status. It can read feedback, modify code, and draft responses on its own. It can prepare everything for you but will never click the confirm button for you. The final decision-making power is always yours.
- Goals must be verifiable
Poor goals make Codex busy without purpose, while good goals must come with an acceptance baseline.
I'm cxuan, someone who has been tinkering with AI tools and Agent workflows for a long time. For more real usage records, post-mortems, and tool collections, you can search for the WeChat public account "cxuanAI".
Sources
- OpenAI Official Page: Codex-maxxing for long-running work, published June 22, 2026.
- OpenAI Official PDF: OAI_WhitePaper_Codex-maxxing26.pdf.
- Local Assets: codex-maxxing-assets/, containing the original PDF and 27 page screenshots.