November Papers: Perspectives on efficiency
November brings us back to a favourite topic of ours: efficiency. We reviewed three papers that look at LLM efficiency from different angles:
- First up, How to Scale Second-Order Optimization looks at how to optimally tune second-order optimizers such as Muon.
- Intelligence per Watt examines our favourite metric for large language models, energy efficiency, and how to take advantage of edge AI inference.
- Finally, Int vs FP contributes to a long-running debate in quantization: integer versus (block) floating-point formats.
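To make the integer-vs-block-floating-point tradeoff concrete, here is a toy NumPy sketch (ours, not from any of the papers above): it compares symmetric per-tensor int8 against a generic block floating-point scheme on a tensor with one outlier, the classic hard case for a single shared scale. The block size and mantissa width are illustrative assumptions, not any specific hardware format.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy weight tensor with one large outlier, which forces a per-tensor
# int8 scale to waste most of its range.
w = rng.normal(0, 0.02, size=256).astype(np.float32)
w[0] = 1.0  # outlier

def quant_int8_per_tensor(x):
    """Symmetric per-tensor int8: one scale shared by the whole tensor."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127)
    return q * scale

def quant_block_fp(x, block=32, mant_bits=4):
    """Generic block floating point (illustrative, not a specific format):
    each block of `block` values shares one exponent, and each value keeps
    a short `mant_bits`-bit mantissa relative to that exponent."""
    out = np.empty_like(x)
    for i in range(0, x.size, block):
        blk = x[i:i + block]
        # Shared exponent chosen from the block's largest magnitude.
        exp = np.floor(np.log2(np.abs(blk).max() + 1e-30))
        scale = 2.0 ** (exp - mant_bits)
        out[i:i + block] = np.round(blk / scale) * scale
    return out

err_int = np.abs(w - quant_int8_per_tensor(w)).mean()
err_bfp = np.abs(w - quant_block_fp(w)).mean()
print(f"int8 per-tensor MAE: {err_int:.6f}")
print(f"block-FP MAE:        {err_bfp:.6f}")
```

The outlier inflates the single int8 scale, so the small weights round to only a handful of levels, while the block scheme confines the damage to the outlier's block: its per-element error comes out noticeably lower here.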