Luca Baggi
(xtream)
Language: English
Time: 15:30 - 16:15
With o1, OpenAI ushered in a new era of LLMs: models with reasoning capabilities. This new breed of models broadened the concept of scaling laws, shifting the focus from **train-time** to **test-time** (or inference-time) compute. How do these models work? What might their architectures look like, and what data is used to train them? And finally, perhaps most importantly: how expensive can they get, and what can we use them for?