OpenAI will launch an AI product it claims is capable of reasoning, enabling it to solve hard problems in maths, coding and science, a critical step towards achieving humanlike cognition in machines.
The AI models, known as o1, are touted as a sign of the progression of technological capabilities over the past few years as companies race to create ever more sophisticated AI systems. In particular, there is a fresh scramble under way among tech groups, including Google DeepMind, OpenAI and Anthropic, to create software that can act independently as so-called agents — personalised bots that are supposed to help people work, create or communicate better and interface with the digital world.
According to OpenAI, the models will be integrated into ChatGPT Plus starting on Thursday. They are designed to be useful for scientists and developers, rather than general users. The company said the o1 models far outperformed existing models such as GPT-4o in a qualifying exam for the International Mathematics Olympiad, scoring 83 per cent compared with 13 per cent for GPT-4o.
Mira Murati, the company’s chief technology officer, said the models also opened up avenues in understanding how AI works. “We get visibility into the model’s thinking . . . we can observe its thought process, step by step,” she told the Financial Times.
The new models use a technique called reinforcement learning to approach problems. They take a longer time to analyse queries, which makes them more costly than GPT models, but are more consistent and sophisticated in their responses.
“What it’s doing during that time is . . . exploring different strategies for answering your query,” said Mark Chen, the lead researcher on the project. “If it realises it’s made mistakes, it can go and correct those things.”
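The behaviour Chen describes can be illustrated with a toy sketch. This is not OpenAI's implementation, which has not been published; it is a minimal, hypothetical example of the general idea of spending extra inference time exploring candidate strategies and discarding those that fail a check. All function names and the example problem are invented for illustration.

```python
# Toy illustration (not OpenAI's method): spend extra time at inference
# exploring candidate strategies, verify each answer, and move on from
# strategies that turn out to be mistaken.

def solve_with_reflection(problem, strategies, check):
    """Try each candidate strategy in turn; return the first answer
    that passes verification, mimicking 'explore, detect mistakes,
    correct' before committing to a response."""
    for strategy in strategies:
        answer = strategy(problem)
        if check(problem, answer):
            return answer  # verified answer survives
    return None  # no strategy produced a checkable answer

# Hypothetical example: solve x + 3 == 10 with two candidate
# strategies, one buggy and one correct.
buggy = lambda p: p["rhs"] + p["addend"]    # mistake: adds instead
correct = lambda p: p["rhs"] - p["addend"]  # subtracts correctly
check = lambda p, a: a + p["addend"] == p["rhs"]

problem = {"addend": 3, "rhs": 10}
print(solve_with_reflection(problem, [buggy, correct], check))  # 7
```

The extra verification loop is also why, as the article notes, such models cost more to run: compute is spent on exploration before any answer is returned.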
For applications such as online search — which OpenAI is experimenting with via its SearchGPT tool — Murati said this set of models could open up “a new search paradigm”, enabling better research and information retrieval.
Teaching computer software to reason step by step and to plan ahead marks an important milestone towards inventing artificial general intelligence — machines with humanlike cognitive capabilities — according to experts in the field.
If AI systems were to demonstrate genuine reasoning, it would enable “consistency of facts, arguments and conclusions made by the AI, [and] advances in agency and autonomy of AI, probably the main obstacles to AGI”, said Yoshua Bengio, a computer scientist at the University of Montreal who has won the prestigious Turing Award.
There has been steady progress in this area with models such as GPT, Google’s Gemini and Anthropic’s Claude exhibiting some nascent reasoning capabilities, according to Bengio. But the scientific consensus is that AI systems fall short of true general-purpose reasoning.
“The right way to assess the advances is to have independent evaluations by scientists and academics, without conflicts of interest,” he added.
Gary Marcus, cognitive science professor at New York University, and author of Taming Silicon Valley, warned: “We have seen claims about reasoning over and over that have fallen apart upon careful, patient inspection by the scientific community, so I would view any new claims with scepticism.”
Bengio also pointed out that software with more advanced capabilities posed an increased risk of misuse in the hands of bad actors. OpenAI said it had “bolstered” its safety tests to match the advances, including providing the independent UK and US AI safety institutes early access to a research version of this model.
In the coming years, advances in this area will drive AI progress forward, according to technologists.
According to Aidan Gomez, chief executive of AI start-up Cohere and one of the Google researchers who helped build the transformer technology that underpins chatbots such as ChatGPT, teaching models to work through problems has shown “dramatic” improvements in their capabilities.
Speaking at an FT event on Saturday, he said: “It’s also considerably more expensive, because you’re spending a lot of compute planning and thinking and reasoning before actually giving an answer. So models are becoming more expensive in that dimension, but dramatically better at problem solving.”