Will the first AI to get IMO gold tree-search the formal proof space?

Plus

Ṁ2852

2030

55%

chance

ALL

This question resolves when an AI first gets IMO Gold. It resolves YES if this AI is explicitly coded to do the following:

Receive as input (or construct using natural language processing), problems which take the form of mathematical propositions encoded in a formal language.
Construct a tree of logical transformations of that initial proposition (e.g. in the form of tactic applications), and iteratively extend this tree with the goal of identifying a subtree which comprises a complete formal proof of the proposition.

Note that an AI which is not coded to do this explicitly, but instead identifies this strategy emergently does not count.

For example, an AI as described in the paper "HyperTree Proof Search for Neural Theorem Proving" would satisfy these conditions.

Close date updated to 2030-12-31 3:59 pm

This question is managed and resolved by Manifold.

#Technical AI Timelines

#IMO Grand Challenge

Get

1,000

and

3.00

7 Comments

22 Holders

87 Trades

Sort by:

what if it's a model which operates in the same style as o1? Where it runs a single, long, chain of thought (and can backtrack / plan in-context) instead of constructing an explicit tree?

I take it "Note that an AI which is not coded to do this explicitly, but instead identifies this strategy emergently does not count" excludes this class of model, right?

This is not how the recent AlphaGeometry paper works, so updating towards no

@jonsimon It seems to me that an IMO bot that worked this way would definitely count as YES, per the criteria. Is there something I'm missing about AlphaGeometry that makes you think it doesn't count here?

To explicitly lay this out

Receive as input (or construct using natural language processing), problems which take take the form of mathematical propositions encoded in a formal language.

This seems pretty much like what is described in this figure:

Construct a tree of logical transformations of that initial proposition (e.g. in the form of tactic applications), and iteratively extend this tree with the goal of identifying a subtree which comprises a complete formal proof of the proposition.

It seems like this matches the "symbolic deduction through infinite branching points".

Does stuff like tree of thoughts count?

@acertain I believe tree-of-thoughts is informal, so no.

Seems likely, proofs require a lot of multi-step organization and planning. So far the approaches that have worked best for these have used RL-guided monte carlo tree searches (e.g. AlphaZero, AlphaTensor, etc).

Related questions

Related questions