In the Gemini 1.5 report, they mention multiple times that, compared to the previous version (Gemini 1.0), 1.5 uses MoE, which boosts efficiency, etc.
Would this be enough for you to resolve the market? @ahalekelly
@Sss19971997 It definitely seems implied in https://arxiv.org/pdf/2403.05530, e.g.
Gemini 1.5 Pro is a sparse mixture-of-expert (MoE) Transformer-based model that builds on Gemini 1.0’s (Gemini-Team et al., 2023) research advances and multimodal capabilities. Gemini 1.5 Pro also builds on a much longer history of MoE research at Google [...]
However, I haven't found quotes that address the question quite as clearly as you make it sound. Are there any quotes that imply this more strongly?
Pretty sure that Gemini 1.0 is not MoE, but 1.5 is. That would neatly explain the comparative performance: a "53% increase in parameters" yet "using less compute" (see the sketch below the quotes).
Gemini 1.5 boasts a remarkable 53% increase in parameters compared to Gemini 1, empowering it to handle more complex language tasks and store a wealth of information. (source: random linkedin post)
It shows dramatic improvements across a number of dimensions and 1.5 Pro achieves comparable quality to 1.0 Ultra, while using less compute. (source: Google)
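For what it's worth, the parameters-vs-compute point is exactly what sparse MoE buys you: the router only activates a few experts per token, so the total parameter count can grow with the number of experts while per-token FLOPs stay roughly flat. Here's a minimal numpy sketch of top-k routing; every size below (d_model, num_experts, top_k, etc.) is a made-up illustrative value, not anything disclosed in the Gemini reports.

```python
import numpy as np

# Minimal sketch of a sparse mixture-of-experts (MoE) feed-forward layer with
# top-k routing. All sizes are hypothetical, chosen only for illustration.

rng = np.random.default_rng(0)

d_model = 64        # hidden size (hypothetical)
d_ff = 256          # expert feed-forward width (hypothetical)
num_experts = 8     # total experts: parameter count scales with this
top_k = 2           # experts active per token: per-token compute scales with this

# Each expert is an independent two-layer MLP, so total parameters grow
# linearly with num_experts.
experts_w1 = rng.normal(size=(num_experts, d_model, d_ff)) * 0.02
experts_w2 = rng.normal(size=(num_experts, d_ff, d_model)) * 0.02
router_w = rng.normal(size=(d_model, num_experts)) * 0.02


def moe_layer(x):
    """x: (num_tokens, d_model) -> (num_tokens, d_model)."""
    # The router scores every token against every expert, then keeps top_k.
    logits = x @ router_w                                     # (tokens, experts)
    top_idx = np.argsort(logits, axis=-1)[:, -top_k:]         # (tokens, top_k)
    top_logits = np.take_along_axis(logits, top_idx, axis=-1)
    gates = np.exp(top_logits - top_logits.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)                # softmax over chosen k

    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        for slot in range(top_k):
            e = top_idx[t, slot]
            h = np.maximum(x[t] @ experts_w1[e], 0.0)          # only the chosen
            out[t] += gates[t, slot] * (h @ experts_w2[e])     # experts run
    return out


tokens = rng.normal(size=(4, d_model))
_ = moe_layer(tokens)

# Total parameters grow with num_experts, but each token only touches top_k
# experts, so per-token FLOPs stay roughly constant as experts are added.
total_params = experts_w1.size + experts_w2.size + router_w.size
active_params_per_token = top_k * (d_model * d_ff + d_ff * d_model) + router_w.size
print(f"total expert+router params: {total_params:,}")
print(f"params touched per token:   {active_params_per_token:,}")
```

Doubling num_experts in this sketch doubles total_params but leaves active_params_per_token unchanged, which is the sense in which a MoE could have "53% more parameters" while "using less compute" per token.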
No previous Google model was a MoE. There was no hint of MoE in the paper. The fact that they trained differently-sized versions of the model (Nano/Pro/Ultra) feels like evidence against: that's hard to do with experts.
How will this be resolved? Does Google have to officially confirm it, or is a plausible leak enough?
@Coagulopath yeah all the speculation I can find is saying it’s not MoE.
If there’s no official statement, then this resolves based on the most plausible leaks/speculation one year after launch (Dec 6, 2024); mods can re-resolve if Google confirms it after that point.
I will not bet on this market.