Recent Posts

Optimal Formats and the Cube Root of the PDF

9 minute read

Your boss emails you a point in 128-billion-dimensional space. “Llama 3.1 8B,” the message reads. “A not-so-large language model in bfloat16. But it’s too bi...

February Papers: Learning to Scale

17 minute read

Welcome to Papers of the Month! This time around, our monthly selection of ML papers revolves around the central theme of scale – and learning how to scale e...