Session: Bigger models or more data? The new scaling laws for LLMs - AI Conf 2024


Bigger models or more data? The new scaling laws for LLMs

(xtream)
Language: English
Time: 11:30 - 12:15


The influential Chinchilla paper changed the way we train LLMs. Its authors - including the current Mistral CEO - outlined scaling laws for maximising model performance under a fixed compute budget by balancing the number of parameters against the number of training tokens.
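
As a rough, back-of-the-envelope illustration of what that recipe implies (a sketch, not from the talk: it assumes the commonly cited approximation C ≈ 6·N·D for training FLOPs and the roughly 20-tokens-per-parameter rule of thumb associated with the paper):

```python
# Illustrative sketch of the Chinchilla compute-optimal trade-off.
# Assumptions: training FLOPs C ≈ 6 * N * D, and ~20 training tokens
# per parameter at the compute-optimal point.

def chinchilla_optimal(compute_flops: float, tokens_per_param: float = 20.0):
    """Split a compute budget into (parameters, tokens) under C ≈ 6 * N * D."""
    # With D = tokens_per_param * N, solving 6 * N * D = C gives N = sqrt(C / (6 * ratio)).
    n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

if __name__ == "__main__":
    # Example budget of ~6e23 FLOPs, roughly the scale of the original Chinchilla-70B run.
    n, d = chinchilla_optimal(6e23)
    print(f"Compute-optimal: ~{n / 1e9:.0f}B parameters, ~{d / 1e12:.1f}T tokens")
```

Under this rule, doubling the compute budget grows both the parameter count and the token count by roughly sqrt(2), rather than pouring everything into a bigger model.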

Today, these heuristics are in jeopardy. LLaMA-3, for one, was trained on far more tokens than the Chinchilla recipe would prescribe - and that is a big part of why it is so good. How much data do we actually need to train LLMs? This talk will shed light on the latest trends in model training and perhaps suggest newer scaling laws.
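
To make the contrast concrete, here is a minimal comparison; the parameter and token counts below are assumptions taken from Meta's public LLaMA-3 announcement, not from this page:

```python
# Rough comparison of LLaMA-3-8B's reported training data with the Chinchilla heuristic.
# Assumed figures: ~8e9 parameters trained on ~15e12 tokens (Meta's public announcement).

llama3_params = 8e9
llama3_tokens = 15e12

tokens_per_param = llama3_tokens / llama3_params
print(f"LLaMA-3-8B: ~{tokens_per_param:.0f} tokens per parameter")  # ~1875
print("Chinchilla rule of thumb: ~20 tokens per parameter")         # ~100x less
```

That is roughly two orders of magnitude more data per parameter than the Chinchilla-optimal point - exactly the tension between model size and data volume that the talk examines.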