Strawberry AI Is Here: OpenAI Introduces o1 Advanced Reasoning Models


OpenAI says o1 models take some time to think before generating a response, but they can “reason through complex tasks” and solve harder problems in math, science, and coding. In addition, OpenAI says the new reasoning models perform on par with PhD students on challenging science topics.
To give you a benchmark, the OpenAI o1 model solved 83% of problems on a qualifying exam for the International Mathematics Olympiad (IMO), whereas GPT-4o could solve only 13%. And in Codeforces programming competitions, the new o1 model reached the 89th percentile, whereas GPT-4o stood at the 11th percentile.
On the MMLU benchmark, OpenAI o1 scored 92.3, and on the MATH benchmark, it scored 94.8. OpenAI says that in tasks requiring heavy reasoning, o1 closely matches the performance of human experts, which is pretty significant.
The o1 models have been trained to use a chain-of-thought technique through reinforcement learning. The model breaks a hard problem down into simpler steps and tries different strategies for each step until it reaches the correct conclusion. By the way, o1 models currently support only textual input. You can’t use the model to browse the web or analyze files and images.
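To make the chain-of-thought idea concrete, here is a toy Python sketch of breaking a problem into simpler steps and solving them one at a time. This is only an illustration of the general concept, not OpenAI’s actual training method or API; the `solve_step_by_step` helper is hypothetical.

```python
# Toy illustration of chain-of-thought reasoning: instead of jumping straight
# to an answer, the solver records each intermediate step on the way there.
# This is NOT OpenAI's implementation; it is a hypothetical sketch.

def solve_step_by_step(a: int, b: int, c: int) -> tuple[list[str], int]:
    """Compute a * b + c, recording each intermediate reasoning step."""
    steps = []
    product = a * b
    steps.append(f"Step 1: multiply {a} by {b} to get {product}")
    total = product + c
    steps.append(f"Step 2: add {c} to {product} to get {total}")
    return steps, total

steps, answer = solve_step_by_step(12, 7, 5)
for step in steps:
    print(step)
print("Answer:", answer)  # → Answer: 89
```

The point is that each step is simple enough to check on its own, which is roughly why stepwise reasoning helps models catch and correct mistakes on harder problems.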