Is Nick Cammarata right that an LLM will be able to mechanistically audit its own circuits and explain ghiblification within 2 years?
34% chance · 6 traders · Ṁ109 · closes 2027

Background

This question is based on the following Nick Cammarata tweet thread:

https://x.com/nickcammarata/status/1905321653401518517?s=46

“I’m at like 80% that in two years we’ll be able to get 4o to mechanistically audit its own circuits and explain what the core ideas behind ghiblification are“

Resolution Criteria

Resolves to YES if there exists publicly available evidence (e.g., a peer-reviewed paper, a detailed research report, or a credible demonstration verified by domain experts) demonstrating that a large language model (LLM) has successfully:

Mechanistically audited its own circuits, meaning it has identified, described, and explained the functional roles of specific internal computations or neuron groups within its own neural architecture at a detailed, circuit-level granularity, and is able to explain concepts like “ghiblification”, meaning it has clearly articulated how it internally represents, processes, or produces specific outputs.


Can you specify in a bit more detail what level of detail the explanation must reach to resolve YES? Is it enough to give a generic description (“a neural network of this and this architecture needs to be trained on this data”)? Or does the LLM need to be able to write a program, with no training data, that can do the job (the other extreme)?

@Irigi Can you elaborate and help me answer this? What makes more sense in your opinion? I don't think I understand what you mean by “give a generic description (‘a neural network of this and this architecture needs to be trained on this data’)”.

@paw I don't know how to fix it; it is more a question of what your definition of an explanation is. For example, I could explain gravitation on many levels, e.g.

1] The force that makes things fall down to Earth
2] The attractive force between two massive bodies, proportional to the masses of both.
3] (give vector equations for Newton's theory of gravitation)
4] (give full equations of Einstein's general relativity)

Each of those is an explanation of gravitation at some level. Coming up with explanation 1 is relatively easy; describing level 4 precisely is quite hard (both to figure out initially and then to explain even if you already know it).
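For concreteness, levels 3 and 4 in this analogy correspond to well-known equations:

```latex
% Level 3: Newton's law of universal gravitation (vector form)
\mathbf{F}_{12} = -\,G\,\frac{m_1 m_2}{r^2}\,\hat{\mathbf{r}}_{12}

% Level 4: Einstein's field equations of general relativity
G_{\mu\nu} + \Lambda g_{\mu\nu} = \frac{8\pi G}{c^4}\,T_{\mu\nu}
```

The gap between stating level 1 in words and writing down level 4 correctly illustrates how wide a range "explain X" can cover.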

Unless you specify how detailed an explanation you expect, I think it is impossible to bet precisely. You should at least give some examples of what would resolve as YES and what as NO. I am sure that current LLMs can already explain ghiblification at some level.

@Irigi Thanks. With the help of GPT-4.5, I have updated the resolution criteria to the following:

Resolves to YES if there exists publicly available evidence (e.g., a peer-reviewed paper, a detailed research report, or a credible demonstration verified by domain experts) demonstrating that a large language model (LLM) has successfully:

Mechanistically audited its own circuits, meaning it has identified, described, and explained the functional roles of specific internal computations or neuron groups within its own neural architecture at a detailed, circuit-level granularity, and is able to explain concepts like “ghiblification”, meaning it has clearly articulated how it internally represents, processes, or produces specific outputs.
