Sylvain Viguier

Director, Applied AI

November Papers: Perspectives on efficiency

November is back to a favourite topic of ours: efficiency. We reviewed three of our favourite papers, each looking at LLM efficiency from a different angle:

  • First up, How to Scale Second-Order Optimization looks at optimally tuning second-order optimizers such as Muon.
  • Intelligence per Watt discusses our favourite metric for large language models, energy efficiency, and how to take advantage of edge AI inference.
  • Finally, Int vs FP weighs in on a long-standing debate in quantisation: integer versus floating-point (block) formats (see the sketch after this list).
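
To make the integer-versus-floating-point comparison concrete, here is a minimal numpy sketch of block-wise quantisation with a shared per-block scale, snapping values either to an INT4 grid or to a simple FP4 (E2M1-style) grid. The grids, block size and Gaussian weights are our own illustrative assumptions, not the exact formats or data studied in the paper.

```python
import numpy as np

# Representable magnitudes for a simple FP4 (E2M1-style) format and for INT4.
# These grids are illustrative assumptions, not the exact formats from the paper.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
INT4_GRID = np.arange(0, 8, dtype=np.float64)  # symmetric INT4 magnitudes 0..7

def block_quantize(x, grid, block_size=32):
    """Quantize x block-wise: each block shares one scale, values snap to `grid`."""
    out = np.empty_like(x, dtype=np.float64)
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        scale = np.max(np.abs(block)) / grid[-1] + 1e-12  # map block max to grid max
        scaled = np.abs(block) / scale
        # round each magnitude to the nearest representable value, restore the sign
        idx = np.abs(scaled[:, None] - grid[None, :]).argmin(axis=1)
        out[start:start + block_size] = np.sign(block) * grid[idx] * scale
    return out

rng = np.random.default_rng(0)
w = rng.normal(size=4096)  # stand-in for a weight tensor
for name, grid in [("int4", INT4_GRID), ("fp4", FP4_GRID)]:
    err = np.mean((w - block_quantize(w, grid)) ** 2)
    print(f"{name} block-quantization MSE: {err:.5f}")
```

The two grids spend their levels differently across the block's dynamic range; the paper makes this kind of comparison at much larger scale, on real weight and activation distributions.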

November Papers: An LLM Feast

This month we've got an all-LLM menu of papers for you, with summaries of four great works exploring many different aspects of crafting systems for LLM training and inference.

We start with the surprising result that removing a single weight out of billions can completely ruin a model's ability to generate coherent text. These weights, dubbed "super weights", must be preserved when quantising models to lower precision.
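
As a rough illustration of what "preserving" can mean in practice, the sketch below holds the (already identified) super-weight coordinates out of the quantisation scale and restores them in full precision afterwards. The indices, shapes and 8-bit round-to-nearest scheme are placeholders of ours, not the paper's exact recipe.

```python
import numpy as np

def quantize_preserving_super_weights(w, super_idx, bits=8):
    """Symmetric round-to-nearest quantization of a weight matrix, holding out the
    entries in `super_idx`: they are excluded from the scale and kept at full precision."""
    mask = np.zeros(w.shape, dtype=bool)
    rows, cols = zip(*super_idx)
    mask[rows, cols] = True
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w[~mask])) / qmax          # scale ignores the super weights
    q = np.round(w / scale).clip(-qmax, qmax) * scale
    q[mask] = w[mask]                                # restore the super weights exactly
    return q

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(256, 256))
w[10, 42] = 3.0                                      # an outsized "super weight" (placeholder index)
q = quantize_preserving_super_weights(w, super_idx=[(10, 42)])
print("super weight preserved:", q[10, 42] == w[10, 42])
```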

Also, we discuss how researchers at Meta explored context parallelism, where the hidden states of the tokens are split across multiple processors and attention is computed using collective operations. They experimented with multiple strategies and found that different strategies are best suited to different phases of inference.
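
The sketch below simulates one such strategy in plain numpy on a single process: each "device" owns a contiguous shard of the sequence, keys and values are gathered from every shard (standing in for an all-gather collective), and each device then attends only for its local queries. Shard counts and shapes are placeholders, and causal masking and communication/compute overlap, which real implementations need, are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def context_parallel_attention(q_shards, k_shards, v_shards):
    """Each 'device' holds a contiguous shard of the sequence. Keys/values are
    gathered from all shards (the collective step), then each device computes
    attention for its local queries only. Causal masking is omitted for brevity."""
    k_all = np.concatenate(k_shards, axis=0)      # stand-in for an all-gather collective
    v_all = np.concatenate(v_shards, axis=0)
    outputs = []
    for q in q_shards:                            # one iteration per simulated device
        scores = q @ k_all.T / np.sqrt(q.shape[-1])
        outputs.append(softmax(scores) @ v_all)
    return np.concatenate(outputs, axis=0)

rng = np.random.default_rng(0)
seq, d, shards = 128, 16, 4
q, k, v = (rng.normal(size=(seq, d)) for _ in range(3))
split = lambda x: np.array_split(x, shards, axis=0)
out = context_parallel_attention(split(q), split(k), split(v))

# Sanity check: matches single-device attention on the full sequence.
ref = softmax(q @ k.T / np.sqrt(d)) @ v
print("max abs difference vs. unsharded attention:", np.abs(out - ref).max())
```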

Next, we cover an extension of scaling laws to account for numerical precision. The authors find, among other things, that neither 16-bit precision (as in current practice) nor very narrow bit widths (e.g. 4-bit precision) seem to be optimal.
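
To see how such a law can place the optimum away from both extremes, here is a toy, compute-matched sweep: lower precision buys more training tokens at fixed compute, but also shrinks an "effective" parameter count. Every constant and the exact functional form below are invented for illustration and are not the paper's fitted values.

```python
import numpy as np

# All constants and the functional form below are illustrative assumptions,
# not the paper's fitted law.
A, B, E = 400.0, 410.0, 1.7           # Chinchilla-style loss coefficients
alpha, beta, gamma = 0.34, 0.28, 3.0  # capacity/data exponents, precision "decay" scale

def toy_loss(N, D, P):
    # Low precision shrinks the effective parameter count; the cube loosely stands in
    # for quantising weights, activations and the KV cache together (our assumption).
    N_eff = N * (1.0 - np.exp(-P / gamma)) ** 3
    return E + A / N_eff**alpha + B / D**beta

compute, N = 1e20, 1e9                # fixed budget, fixed parameter count
for P in (4, 6, 8, 12, 16):
    D = compute / (6 * N * (P / 16))  # cheaper arithmetic buys more training tokens
    print(f"P={P:2d} bits  tokens={D:.2e}  toy loss={toy_loss(N, D, P):.3f}")
```

With these made-up constants the sweep bottoms out at an intermediate bit width, with both 4-bit and 16-bit training giving a worse loss for the same compute.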

Finally, we have a paper about the critical batch size in LLM training, the point at which increasing the global batch size is no longer helpful. The authors investigate how this value scales with the size of the model and the amount of training data, finding that the amount of training data has a much bigger effect.
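
A related hands-on proxy (from McCandlish et al.'s "An Empirical Model of Large-Batch Training", not from this paper) estimates where returns diminish via the gradient noise scale: the trace of the per-example gradient covariance divided by the squared norm of the mean gradient. Below is a minimal sketch on synthetic per-example gradients, with invented signal and noise levels.

```python
import numpy as np

def gradient_noise_scale(per_example_grads):
    """Simple gradient-noise-scale estimate: trace of the per-example gradient
    covariance divided by the squared norm of the mean gradient."""
    mean_grad = per_example_grads.mean(axis=0)
    trace_cov = per_example_grads.var(axis=0, ddof=1).sum()
    return trace_cov / (mean_grad @ mean_grad)

# Synthetic per-example gradients: a shared signal plus per-example noise.
rng = np.random.default_rng(0)
n_examples, n_params = 4096, 512
signal = rng.normal(size=n_params) * 0.05
grads = signal + rng.normal(size=(n_examples, n_params))
print(f"estimated critical batch size ~ {gradient_noise_scale(grads):.0f}")
```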

We hope you enjoy this month's papers as much as we did! If you have thoughts or questions, please reach out to us at @GCResearchTeam.