OpenAI unveils new AI model that can mimic human thought processes
Model capable of complex problem solving
OpenAI has announced the release of o1, the first AI model in a new series designed to excel at complex reasoning tasks.
The highly anticipated model, codenamed "Strawberry," demonstrates "exceptional capabilities in fields like science, coding, and math," according to the company.
"We've developed a new series of AI models designed to spend more time thinking before they respond," OpenAI explained.
"They can reason through complex tasks and solve harder problems than previous models in science, coding, and math."
Unlike traditional large language models (LLMs), which generate answers directly from patterns in vast amounts of training data, o1 is trained with reinforcement learning to reason step-by-step.
This allows the model to work through problems in stages, mimicking human thought processes: it considers a problem carefully and refines its strategy before committing to a solution.
"We have noticed that this model hallucinates less," OpenAI's research lead, Jerry Tworek, told The Verge.
However, the issue persists. "We can't say we solved hallucinations," Tworek added.
In rigorous testing, the model outperformed PhD students on challenging benchmarks in physics, chemistry, and biology. It also achieved remarkable results in math and coding.
On the American Invitational Mathematics Examination (AIME), a qualifying exam on the path to the International Mathematical Olympiad (IMO), o1 solved 83% of problems, significantly outperforming GPT-4o.
While o1 is still in its early stages, its reasoning abilities hold immense promise for various applications, OpenAI says.
Healthcare researchers can leverage it to analyse cell sequencing data, physicists can generate intricate mathematical formulas, and developers can streamline complex workflows.
o1 also has notable limitations.
It cannot browse the internet or process images, and it may still generate incorrect or misleading information. OpenAI says it is actively working to address these issues and improve o1's capabilities.
o1 is considerably more expensive than GPT-4o, with developer pricing starting at $15 per 1 million input tokens and $60 per 1 million output tokens, according to The Verge.
In contrast, GPT-4o is priced at $5 per 1 million input tokens and $15 per 1 million output tokens.
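The pricing gap compounds per request. A minimal sketch of the arithmetic, using the per-million-token prices quoted above and a hypothetical workload of 2,000 input and 1,000 output tokens per request:

```python
# Sketch: per-request cost under the quoted token pricing (USD per 1M tokens,
# as reported by The Verge). The workload figures below are illustrative only.
PRICING = {
    "o1": {"input": 15.00, "output": 60.00},
    "gpt-4o": {"input": 5.00, "output": 15.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single request for the given model."""
    p = PRICING[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical request: 2,000 input tokens, 1,000 output tokens.
o1_cost = request_cost("o1", 2_000, 1_000)        # $0.03 input + $0.06 output = $0.09
gpt4o_cost = request_cost("gpt-4o", 2_000, 1_000)  # $0.01 input + $0.015 output = $0.025
```

On this assumed workload, o1 comes out roughly 3.6 times more expensive per request; the exact ratio depends on the input/output mix, since output tokens carry a 4x premium versus GPT-4o's 3x on input.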
OpenAI says it has prioritised safety in the development of o1. In testing, the company found the model significantly more resistant to adversarial attempts to manipulate its behaviour.
To further advance AI safety, OpenAI has formalised agreements with the US and the UK AI Safety Institutes. These partnerships will enable independent research and evaluation of future models, ensuring their safe and responsible deployment.
OpenAI is currently working on GPT-5, a larger model that will incorporate both scaling and reasoning techniques.
In July, Google announced AlphaProof, a project that integrates language models with reinforcement learning to tackle complex math problems. According to the company, AlphaProof has developed the ability to reason through mathematical challenges by analysing correct solutions.