Will an LLM with less than 10B parameters beat GPT4 by EOY 2025?
Ṁ1812 · resolved Jan 10

Resolved YES

How much juice is left in 10B parameters?

Target: the original GPT-4-0314 (Elo 1188), as judged by the LMSYS Chatbot Arena leaderboard. Current sub-10B SoTA: Llama 3 8B Instruct (Elo 1147).
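For context on that gap, here is a minimal sketch of the standard Elo expected-score formula, the rating model Arena scores are based on (illustrative Python, not LMSYS's actual code): a 41-point deficit corresponds to roughly a 44% head-to-head preference rate.

```python
# Standard Elo expected-score formula (illustrative sketch, not LMSYS's code).
def expected_score(r_a: float, r_b: float) -> float:
    """Probability that a model rated r_a is preferred over one rated r_b."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

# Llama 3 8B Instruct (1147) vs. GPT-4-0314 (1188):
print(f"{expected_score(1147, 1188):.3f}")  # ~0.441, i.e. ~44% preference rate
```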

bought Ṁ500 YES

@mods Resolves YES. The creator's account is deleted, but Gemini 1.5 Flash-8B easily clears the bar (Elo higher than 1188) with an Elo of 1212; see https://lmarena.ai/

bought Ṁ400 YES

The market creator is banned, but the YES resolution criterion appears to have been met: Gemma-2-9B-it is at Elo 1188, above GPT-4-0314's current rating. However, one could argue it needs to be 1189 or higher, since GPT-4-0314 sat at 1188 when the market was created.

bought Ṁ10 YES

Open-source models have an inherent advantage: they don't need to conform to censorship/"safety" policies, which makes them more useful and earns them a higher Elo. Having to conform to such policies is a major handicap for the larger closed-source models.

@singer There is a without-refusals board on LMSYS. The disparity between the main board and that one shows how much advantage not having censorship gives you.
