Which of these models have an ELO Rating in the LMARENA (formerly known as LMSYS) by the end of January 2025? | Manifold

Premium

6

Ṁ28k

Feb 1

Gemini 2 (flagship): 70%

DeepSeek's r1: 52%

OpenAI's o1 Pro: 32%

OpenAI's o1: Resolved YES

Each answer resolves to YES if the corresponding model has a score on the LMARENA leaderboard on or before January 31, 2025.

Gemini 2.0 (flagship) resolves to YES if Google DeepMind implies that the model is their best Gemini 2.0 version, whatever that is called.

This question is managed and resolved by Manifold.

#️ Technology

#Technical AI Timelines

bought Ṁ9,000 YES

@MP resolves Yes

opened a Ṁ103 NO at 90% order

o1 has a rating now!

bought Ṁ500 YES

Another question: You say the Gemini 2.0 resolves according to the "best" model. Does this include the "Thinking Mode"? So if Gemini 2.0 has an Elo, but a "Gemini 2.0 Thinking Mode" (analogous to "Gemini 2.0 Flash Thinking Mode") has been announced but does not have an Elo yet, will the Gemini 2 question resolve Yes or No?

bought Ṁ350 YES

For r1, would "r1-lite" count? Would "r1-preview" count?

For Gemini, would "Gemini-2.0-Exp" count, or does it have to be "Gemini-2.0" without the "-Exp" marker?

@MP Friendly ping! Would be lovely to get a clarification on the resolution criteria.

Related questions

Will an LLM break 1400 ELO on LMSys before February?

Who will have the highest ranking model on web.lmarena.ai by March 2025?

Who will have the highest ranking model on web.lmarena.ai by end of June 2025?

Who will have the highest ranking model on web.lmarena.ai by EOY 2025?

What Elo will a Four Player Chess (Free For All, Rapid) engine achieve by 2025-07-01?

What organization will top the LLM leaderboards on LMArena at end of 2025? 🤖📊

Who will ever rank #1 in LMSYS Chatbot Arena Leaderboard in 2025?

Who will ever rank Top 10 in LMSYS Chatbot Arena Leaderboard in 2025?

Will Mistral's next model make it to the top 10 models in LLM Arena by the end of 2025?

Which chess prodigy will have the highest Elo rating at the end of 2024?