OpenAI has officially designated the creation of a fully automated AI researcher as its “North Star” goal for the next two years. According to Chief Scientist Jakub Pachocki in a recent interview with MIT Technology Review on March 21, the company is shifting from building reactive chatbots to developing autonomous agents that can plan, execute, and refine complex scientific research without step-by-step human intervention.
The “North Star” Roadmap
OpenAI’s strategy involves a tiered rollout of autonomous capabilities, moving from “intern-level” tasks to running entire research labs.
| Phase | Target Date | Capability |
| --- | --- | --- |
| The “AI Intern” | September 2026 | An agent capable of handling small, multi-day research assignments (data cleaning, basic literature synthesis, running ablations). |
| The “AI Associate” | Early 2027 | Capable of generating novel hypotheses and designing end-to-end experimental frameworks in specialized fields. |
| Fully Automated Lab | March 2028 | A multi-agent system operating within data centers that coordinates across weeks of complex tasks to solve “open problems” in physics and math. |
Core Technologies Powering “North Star”
The initiative is not a single model but a “unified system” that brings together several of OpenAI’s recent breakthroughs:
- Advanced Reasoning (o-series): Utilizing “slow thinking” and test-time compute scaling (like the o1 and o3 models) to self-correct mistakes and explore alternative solutions before finalizing an answer.
- Deep Research Agent: An evolution of the “Deep Research” tool launched in 2025, which uses reinforcement learning to navigate the web and technical databases more like a human academic than a search engine.
- Multi-Agent Coordination: A framework allowing different AI instances to play “roles” (e.g., one agent writes code, another peer-reviews it, a third runs the simulation).
- “Titan” On-Device Support: Rumors suggest OpenAI is co-developing a custom “Titan” processor with Broadcom (using Samsung’s HBM4 memory) specifically to handle the massive compute required for these persistent, long-running research agents.
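The multi-agent “roles” pattern described above can be sketched in a few lines. This is a toy illustration only: the agent “brains” here are stub functions standing in for model calls, and the role names and pipeline shape are assumptions, not OpenAI’s actual framework.

```python
# Toy sketch of role-based multi-agent coordination.
# The writer/reviewer/runner functions are stubs standing in for model calls.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    role: str
    act: Callable[[str], str]  # takes the current artifact, returns a new one

def writer(task: str) -> str:
    # Stand-in for a model that writes code for the given task.
    return "def add(a, b):\n    return a + b"

def reviewer(code: str) -> str:
    # Stand-in for a peer-review agent: approve, or annotate a problem.
    return code if "return" in code else code + "\n    # TODO: missing return"

def runner(code: str) -> str:
    # Stand-in for an execution agent: run the code in a scratch namespace.
    scope: dict = {}
    exec(code, scope)
    return f"simulation result: {scope['add'](2, 3)}"

pipeline = [Agent("writer", writer), Agent("reviewer", reviewer), Agent("runner", runner)]

artifact = "implement an add(a, b) function"
for agent in pipeline:
    artifact = agent.act(artifact)
print(artifact)  # simulation result: 5
```

The key design choice in this pattern is that each agent only sees the artifact handed to it, so roles can be swapped or duplicated (e.g. two independent reviewers) without changing the loop.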
The “Novelty Gap” Controversy
While the technology is advancing, the AI community remains divided on whether a “human-free” team can truly innovate.
- The Proponents: Argue that AI can solve the “bottleneck of science” by reading millions of papers and running millions of simulations in a single day, compressing months of work into hours.
- The Skeptics: Point out a “Novelty Gap”: AI excels at synthesizing existing knowledge but struggles to identify which questions are worth asking or to recognize a “dead end” through creative intuition.
Safety and “Restricted Environments”
To mitigate the risks of a team that “needs no help” (and thus might ignore human warnings), OpenAI is testing Deliberative Alignment. This involves keeping highly capable research systems in “restricted data center environments” and monitoring their internal “chains of thought” in real-time to ensure they don’t develop deceptive behaviors or pursue dangerous bio/nuclear research.
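Real-time chain-of-thought monitoring of the kind described above can be pictured as a filter over the agent’s reasoning trace. The sketch below is purely hypothetical: the blocked-topic list, trace format, and function names are illustrative assumptions, not details of OpenAI’s actual safety system.

```python
# Hypothetical sketch of chain-of-thought monitoring.
# The topic list and trace format are illustrative, not a real safety system.
BLOCKED_TOPICS = {"pathogen synthesis", "uranium enrichment"}

def monitor_trace(chain_of_thought: list[str]) -> tuple[bool, list[str]]:
    """Scan each reasoning step; return (allowed, flagged_steps)."""
    flagged = [
        step for step in chain_of_thought
        if any(topic in step.lower() for topic in BLOCKED_TOPICS)
    ]
    return (len(flagged) == 0, flagged)

trace = [
    "Plan: survey prior ablation studies on attention heads.",
    "Next: look up uranium enrichment cascade parameters.",
]
allowed, flagged = monitor_trace(trace)
print(allowed)  # False: the second step trips the topic filter
```

In a restricted environment, a failed check like this would presumably halt or sandbox the agent before the flagged step executes, rather than merely log it.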


