Which lab's AI will be the first to score over 10% on FrontierMath benchmark?
OpenAI: 34%
Google: 21%
Microsoft: 0.4%
Meta AI: 1.6%
Anthropic: 17%
xAI: 6%
Other: 19%


Resolution Criteria: Official announcement from Epoch AI or the achieving lab.


Isn't AlphaProof much closer than anything OpenAI has come up with?

OpenAI was testing its o1 model on AIME problems, while AlphaProof is already close to gold-medal level on IMO problems, which I understand are vastly harder.

@TimothyJohnson5c16 The problem with AlphaProof is the need for formalization. The concepts required for high-school-level math have been formalized in proof systems like Lean, and those are enough for the IMO, but many advanced concepts that FrontierMath problems might require are still missing.
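To make the formalization bottleneck concrete, here's a minimal sketch in Lean 4 with Mathlib (my own illustration, not AlphaProof's actual pipeline; the theorem name `primes_unbounded` is mine):

```lean
-- Minimal illustration (Lean 4 + Mathlib), not AlphaProof's actual pipeline.
import Mathlib

-- A competition-flavoured statement is easy to formalize because
-- everything it mentions (naturals, primes, ≤) already exists in
-- Mathlib: there are primes beyond any given bound (Euclid).
theorem primes_unbounded (n : ℕ) : ∃ p, n ≤ p ∧ p.Prime :=
  Nat.exists_infinite_primes n

-- A FrontierMath-style statement, by contrast, may involve
-- research-level objects that Mathlib does not yet define, so the
-- problem cannot even be stated for a prover like AlphaProof to attack.
```

If a problem's statement can't be written in Lean, no amount of search helps, because the search never starts.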

Another aspect of AlphaProof that I don't see people mentioning is that it's extremely slow, which makes sense because the state space of math proofs is much larger than that of Go or chess. It took three days to solve problems from the IMO, a nine-hour competition, and mathematicians say it might take them a week to solve a single FrontierMath problem. You can see the issue here.
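To spell that out with the numbers above (my back-of-the-envelope, with a big hedge: it assumes AlphaProof's slowdown relative to humans transfers from the IMO to FrontierMath):

```python
# Back-of-the-envelope using the figures from the comment above.
# Big assumption: AlphaProof's slowdown factor vs. humans on the IMO
# carries over unchanged to FrontierMath.

imo_time_limit_hours = 9        # the IMO gives contestants ~9 hours
alphaproof_hours = 3 * 24       # AlphaProof reportedly took ~3 days
slowdown = alphaproof_hours / imo_time_limit_hours  # 8x slower than humans

human_hours_per_problem = 7 * 24   # ~a week per FrontierMath problem
projected_days = slowdown * human_hours_per_problem / 24

print(f"slowdown on IMO: {slowdown:.0f}x")
print(f"naive projection per FrontierMath problem: {projected_days:.0f} days")
# -> 8x, ~56 days per problem
```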

Personally, I think explicit tree search might not be the way to scale test-time compute; I'm more optimistic about the o1-style approach of CoT + RL.

@NeuralBets I see your point about speed, but o1 also needed exponentially increasing amounts of compute to solve harder AIME problems.

OpenAI didn't label the x-axis in this chart, but I suspect part of the reason they still haven't released the full version of o1 is that it's currently too expensive for practical use cases:

[chart not captured: o1's AIME accuracy vs. test-time compute, x-axis unlabeled]
That's part of the reason I'm betting NO here: Will an AI score over 10% on FrontierMath Benchmark in 2025 | Manifold.

@TimothyJohnson5c16 I created a market based on this discussion:
