OpenAI's new AI agent enables ChatGPT to conduct in-depth research
What's the story
OpenAI has introduced a new AI agent for its ChatGPT chatbot, called deep research.
The tool "independently discovers, reasons about, and consolidates insights from across the web." It can also adapt and respond to real-time information, if needed.
Unlike conventional text generation, this capability offers a summary of its work in a sidebar for user reference, with citations and a synopsis of the methodology used.
Last month, OpenAI launched Operator, an AI agent that can browse the web for you.
Functionality
User interaction and response time
Deep research is powered by a version of OpenAI's o3 model optimized for web browsing and Python analysis.
Users can invoke the feature by asking a question in text and, optionally, attaching images or other files such as PDFs and spreadsheets.
The tool then takes five to 30 minutes to generate a response, which appears in the chat window.
OpenAI plans to improve this feature by adding embedded images and charts in future updates.
Challenges
OpenAI acknowledges limitations and future improvements
Despite its advanced capabilities, the deep research feature does have some limitations.
It can sometimes hallucinate or fabricate facts, struggle to differentiate between authoritative information and rumors, and fail to accurately gauge the certainty of a response.
It may even make formatting errors in reports and citations.
However, OpenAI is confident that these issues will be resolved over time with increased usage.
Applications
A tool for professionals and consumers
The deep research feature would be particularly useful for professionals working in knowledge-intensive domains like finance, science, policy, and engineering.
It can also help consumers making big purchases that usually require thorough research, such as cars, appliances, and furniture.
To use ChatGPT deep research, users just have to choose "Deep research" in the composer and type a query, with the option to attach files or spreadsheets.
Performance
Deep research model outperforms others in AI benchmark
The model powering deep research has achieved a new record for accuracy on an AI benchmark called "Humanity's Last Exam," which evaluates responses to expert-level questions.
With browsing and Python tools enabled, the OpenAI deep research model scored 26.6% accuracy, far ahead of GPT-4o's 3.3% and the next-best result of 13%, which came from OpenAI's o3-mini (high) model assessed solely on text.
Deep research is rolling out to Pro users first. It will then expand to Plus and Team users, followed by Enterprise.
Twitter Post
Take a look at OpenAI's announcement
Today we are launching our next agent capable of doing work for you independently—deep research.
Give it a prompt and ChatGPT will find, analyze & synthesize hundreds of online sources to create a comprehensive report in tens of minutes vs what would take a human many hours. pic.twitter.com/03PPi4cdqi
— OpenAI (@OpenAI) February 3, 2025