Recent Posts

Scale-preserving nonlinearities for u-μP

5 minute read

My colleagues and I always get excited when, every once in a while, deep learning research throws up a fun little maths problem. Our recent work on u-μP does...

July Papers: All About Scaling

17 minute read

Scaling continues to be a super hot topic of research, and our selection of papers for this month all tackle different angles of how to scale models efficient...