“Software engineering is changing, and by the end of 2025 it’s going to look fundamentally different.” Greg Brockman’s opening line at OpenAI’s launch event set the tone for what followed. OpenAI released Codex, a cloud‑native software agent designed to work alongside developers.
Codex is not a single product but a family of agents powered by codex‑1, OpenAI’s latest coding model. Codex CLI arrived a few weeks ago as a lightweight companion that runs inside your terminal. Today the spotlight shifts to its bigger sibling: a remote agent that lives entirely in ChatGPT. You can spin up several “mini‑computers” and tackle multiple tasks while you’re off grabbing coffee. This article is an overview of Codex on ChatGPT; project-based articles on the topic will follow soon.
OpenAI started working toward AI-assisted coding back in 2021, when the original Codex model launched and powered tools like GitHub Copilot. Back then, it functioned more like an autocomplete tool for developers.
Since then, a lot has changed. Thanks to major advances in reinforcement learning, Codex has grown far more capable.
Now, in a world where vibe coding is becoming the new normal, you can describe what you want in natural language and Codex figures out how to build it. The newest model, codex‑1, is built on OpenAI’s o3 architecture and fine-tuned on real-world pull requests. It doesn’t just generate code; it follows best practices like linting, writing tests, and keeping a consistent style, making it genuinely useful for real development work.
Also Read: A Guide to Master the Art of Vibe Coding
Codex is currently available to ChatGPT Pro, Enterprise, and Team users. Plus and EDU users are expected to gain access soon. During the research preview, usage is subject to generous limits, but these may evolve based on demand. Future plans include an API for Codex, integration into CI pipelines, and unification between the CLI and ChatGPT versions to allow seamless handoffs between local and cloud development.
Time needed: 5 minutes
Follow these simple steps to start using Codex:
Open ChatGPT and look at the left navigation rail; you’ll see a new “Codex (beta)” icon. Click it to reveal the agent dashboard.
Click “Set up MFA to continue,” scan the QR code with your preferred authentication app (like Google Authenticator or Authy), then enter the code to verify. That’s it, you’re all set.
A single OAuth click authorises Codex to read/write on your repos. You can restrict it to specific organisations or personal projects.
Pick the repository and branch you’d like Codex to work on. The agent clones that branch into its own sandbox.
Add environment variables, secrets, or setup commands just like you would in a CI job. Linters and formatters come preinstalled, but you can override the versions if needed.
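As a sketch, the environment setup can mirror a CI job’s bootstrap script. The variable names, commands, and version pins below are illustrative assumptions, not values prescribed by Codex:

```shell
# Hypothetical setup commands for a Codex environment (illustrative only).
# Secrets and environment variables, as you would set them in CI:
export DATABASE_URL="postgres://localhost:5432/app_test"
export NODE_ENV="test"

# Install project dependencies:
pip install -r requirements.txt

# Linters come preinstalled, but you can pin a specific version if needed:
pip install ruff==0.4.4
```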
Ask: “Explain the architecture.”
Code: “Find and fix the flaky test in test_api.py.”
Suggest: Let Codex scan the repo and propose maintenance chores.
Or just type a custom instruction in natural language.
Press “Launch”. Each job spins up its own micro‑VM; you can queue dozens in parallel and continue chatting elsewhere in ChatGPT.
Green check‑marks indicate passing tests. Click a task card to see the diff, the model’s explanation, and the full work‑log.
Hit “Open PR” to push the branch back to GitHub or reply to the task with follow‑up instructions if changes are needed.
In this section, I am sharing different examples demonstrating how this new software development agent can simplify your work.
OpenAI engineer Nacho Soto shows how Codex helps him start new tasks faster by setting up project scaffolding, like Swift packages. With simple prompts, he was able to offload the setup and focus on building features while Codex handles the rest in the background.
Codex supports more than just code generation. It also fits into review workflows, where developers check AI-generated pull requests, spot issues like formatting problems, and prompt Codex to make fixes.
Engineer Max Johnson explains how Codex helps fix small bugs and code quality issues without breaking his focus. Rather than switching contexts, he hands off these tasks to Codex and reviews the results later to improve the codebase.
Calvin shares how Codex helps with urgent tasks during on-call shifts. By sending stack traces to Codex, he can quickly get diagnostics or fixes. It also helps fine-tune alerts and handle routine ops work, cutting down on manual effort.
Prompt: “Please fix the following issue in the matplotlib/matplotlib repository. Please resolve the issue in the problem below by editing and testing code files in your current code execution session. The repository is cloned in the /testbed folder. You must fully solve the problem for your answer to be considered correct.”
Problem statement: [Bug]: Windows correction is not correct in `mlab._spectral_helper`
### Bug summary
Windows correction is not correct in `mlab._spectral_helper`:
https://github.com/matplotlib/matplotlib/blob/3418bada1c1f44da1f73916c5603e3ae79fe58c1/lib/matplotlib/mlab.py#L423-L430
The `np.abs` is not needed, and gives wrong results for windows with negative values, such as `flattop`.
For reference, the SciPy implementation can be found here:
https://github.com/scipy/scipy/blob/d9f75db82fdffef06187c9d8d2f0f5b36c7a791b/scipy/signal/_spectral_py.py#L1854-L1859
### Code for reproduction
```python
import numpy as np
from scipy import signal
window = signal.windows.flattop(512)
print(np.abs(window).sum()**2-window.sum()**2)
```
### Actual outcome
4372.942556173262
### Expected outcome
0
### Additional information
_No response_
### Operating system
_No response_
### Matplotlib Version
latest
### Matplotlib Backend
_No response_
### Python version
_No response_
### Jupyter version
_No response_
### Installation
None
Observation:
The fix generated by Codex is more accurate and complete than the one from o3. It correctly removes the unnecessary use of `np.abs()` in window normalization within `mlab._spectral_helper`, which had caused incorrect results for windows with negative values like `flattop`. Codex replaces the faulty logic with a proper mathematical expression, using `(window**2).sum()` instead of `(np.abs(window)**2).sum()`, which matches best practices seen in SciPy’s implementation. It also includes a unit test to confirm the behavior, making the fix both verifiable and reliable. In comparison, the o3 output seems incomplete and doesn’t clearly resolve the core issue, making Codex the stronger option.
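To see why `np.abs` matters here, the issue’s reproduction can be extended into a small self-contained sketch. It uses pure NumPy; the flattop coefficients below are the standard five-term values (the same ones SciPy uses), copied in so the snippet needs no SciPy install:

```python
import numpy as np

def flattop(n):
    """Symmetric flattop window from the standard 5-term cosine coefficients."""
    a = [0.21557895, 0.41663158, 0.277263158, 0.083578947, 0.006947368]
    k = np.arange(n)
    x = 2 * np.pi * k / (n - 1)
    return (a[0] - a[1] * np.cos(x) + a[2] * np.cos(2 * x)
            - a[3] * np.cos(3 * x) + a[4] * np.cos(4 * x))

window = flattop(512)

# A flattop window dips below zero, so taking np.abs() before summing
# changes the normalization factor:
assert (window < 0).any()

buggy = np.abs(window).sum() ** 2   # old normalization, inflated for flattop
fixed = window.sum() ** 2           # corrected normalization

print(buggy - fixed)  # nonzero: the discrepancy the bug report demonstrates
```

For windows that are everywhere non-negative (Hann, Hamming), the two expressions agree, which is why the bug went unnoticed until a negative-valued window was used.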
Codex‑1 outperforms previous models both in standardized benchmarks and internal OpenAI workflows. In OpenAI’s reported results, it achieves higher accuracy on the SWE-bench Verified benchmark across all attempt counts and leads on OpenAI’s internal software engineering tasks. This highlights codex‑1’s real-world reliability, especially for developers integrating it into daily workflows.
Every time you press Run in the Codex sidebar, the system creates a micro‑VM sandbox: its own file‑system, CPU, RAM, and locked‑down network policy. Your repository is cloned, environment variables are injected, and common developer tools (linters, formatters, test runners) come pre‑installed. That isolation delivers two immediate benefits: your code and secrets never leave the sandbox, and dozens of tasks can run in parallel without interfering with one another.
An optional AGENTS.md file acts like a README for robots: you describe the project layout, how to run tests, preferred commit style, even a request to print ASCII cats between steps. The richer the instructions, the more smoothly Codex behaves.
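As a minimal sketch, an AGENTS.md might look like the following. The layout, commands, and commit convention here are illustrative assumptions about one hypothetical project, not a format mandated by OpenAI:

```markdown
# AGENTS.md

## Project layout
- `src/` — application code
- `tests/` — pytest suite

## Running tests
Run `pytest -q` from the repository root before proposing a diff.

## Commit style
Use Conventional Commits, e.g. `fix: correct window normalization`.
```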
“I just landed a multi‑file refactor that never touched my laptop.”
– OpenAI Engineer
Stories like that hint at a future where coding resembles high‑level orchestration: you provide intent, and the agent grinds through the details. Engineers focus on intent and validation while Codex handles execution. For many, this signals the beginning of a new development workflow, where collaboration between humans and agents becomes the standard rather than the exception.
How are you planning to use Codex? Let me know in the comment section below!