OpenAI’s o3 Model Scores Lower Than Expected on Independent Benchmark

Monday, 21 April, 2025

OpenAI's o3 AI model, which the company initially claimed could solve over 25% of problems on the challenging FrontierMath benchmark, has underperformed in independent evaluations. Epoch AI, the institute behind FrontierMath, reported that o3 achieved around 10% accuracy, well below OpenAI's earlier figure. The discrepancy is attributed to differences in testing conditions: OpenAI's internal assessments may have used more computational resources and a different subset of problems.

Read full story at TechCrunch

Tags: openai, benchmark, FrontierMath
