January Papers: Great Teachers & Beyond Chinchilla
For the research community, 2023 was dominated by large transformers and the associated challenges with training, tuning and deploying them. This trend has c...
The last month saw impressive developments in the space of efficient transformers and applied ML, from materials discovery to chip design.
TL;DR: Scaled dot product attention isn’t properly scaled, and that’s a good thing!
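For reference, "scaled" in scaled dot product attention refers to the 1/sqrt(d_head) factor applied to the query-key scores before the softmax. A minimal NumPy sketch of the standard formulation (not the paper's proposed variant; the function name and shapes here are illustrative assumptions):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # Standard formulation: softmax(Q K^T / sqrt(d_head)) V.
    # The TL;DR's claim concerns this 1/sqrt(d_head) scale factor.
    d_head = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d_head)
    scores -= scores.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v
```

With identical scores across keys the softmax weights are uniform, so the output is just the mean of the value rows.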