OpenAI announces new AI agent for “deep research”

The new AI agent will help ChatGPT users to conduct complex, in-depth research.

Image: Deep research will help users compile data from multiple sources

The new feature will enable users to compile research from multiple sources, but OpenAI has a warning for early adopters.

OpenAI said in a blog post yesterday that the new ChatGPT capability is for “people who do intensive knowledge work in areas like finance, science, policy, and engineering and need thorough, precise, and reliable research.”

“Every output is fully documented, with clear citations and a summary of its thinking, making it easy to reference and verify the information. It is particularly effective at finding niche, non-intuitive information that would require browsing numerous websites. Deep research frees up valuable time by allowing you to offload and expedite complex, time-intensive web research with one query.”

OpenAI said the feature is available now to ChatGPT Pro users, limited to 100 queries per month, with support for Plus and Team users coming next, followed by Enterprise.

However, the feature will not be available to ChatGPT users in the UK, Switzerland, and the EU, probably due to limited datacentre capacity. The blog post acknowledges that deep research is very compute-intensive, and outsourcing your research on any given subject to OpenAI is going to be significantly more carbon-intensive than doing it yourself.

OpenAI said that the output of every deep research query will be fully documented with citations so that users can verify the information. Presumably this is an attempt to get ahead of criticism should deep research make things up, a problem still experienced by users of ChatGPT search. Hallucinations sneaking into long, detailed research reports are potentially far more damaging than hallucinations in shorter answers and summaries.

OpenAI admits as much in yesterday’s post.

“Deep research unlocks significant new capabilities, but it’s still early and has limitations. It can sometimes hallucinate facts in responses or make incorrect inferences, though at a notably lower rate than existing ChatGPT models, according to internal evaluations. It may struggle with distinguishing authoritative information from rumors, and currently shows weakness in confidence calibration, often failing to convey uncertainty accurately. At launch, there may be minor formatting errors in reports and citations, and tasks may take longer to kick off.

“We expect all these issues to quickly improve with more usage and time.”

Early adopters beware.