December Papers: MoE, Fact-storing and Byteifying Language Models
Despite the holiday season and the busy NeurIPS period, December closed the year with a set of insightful papers. Our team reviewed the following three papers:
- First up, SonicMoE tackles the inefficiencies of fine-grained, sparse MoEs, using hardware-aware optimizations to restore efficiency.
- Next, Constructing Efficient Fact-Storing MLPs for Transformers shows how MLP layers can be explicitly constructed as key–value stores to achieve high facts-per-parameter efficiency (a minimal sketch of the key–value view follows this list).
- Finally, Bolmo presents a method for "byteifying" existing subword-level language models, improving character-level understanding while matching the performance of the original subword models.
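
The key–value view of MLPs referenced above is simple enough to sketch: if each row of the first weight matrix holds a key and each column of the second holds the corresponding value, a single forward pass acts as a fact lookup. Below is a minimal NumPy sketch of this idea; the weight layout, threshold, and dimensions are illustrative assumptions, not the paper's actual construction.

```python
import numpy as np

# Sketch: storing (key, value) pairs in a two-layer MLP.
# Row i of W_in holds key k_i; column i of W_out holds value v_i.
# A thresholded ReLU selects the matching key, and the second layer
# reads out the corresponding value.

rng = np.random.default_rng(0)
d, n_facts = 64, 8

keys = rng.standard_normal((n_facts, d))
keys /= np.linalg.norm(keys, axis=1, keepdims=True)   # unit-norm keys
values = rng.standard_normal((n_facts, d))

W_in = keys                    # (n_facts, d): each row detects one key
W_out = values.T               # (d, n_facts): each column stores one value
threshold = 0.8                # assumed margin separating matches from non-matches

def mlp_lookup(x):
    h = np.maximum(W_in @ x - threshold, 0.0)   # ReLU "key match" scores
    return W_out @ h                            # weighted sum of stored values

# Querying with a stored key recovers (a scaled copy of) its value.
out = mlp_lookup(keys[3])
cos = out @ values[3] / (np.linalg.norm(out) * np.linalg.norm(values[3]))
print(f"cosine similarity with stored value: {cos:.3f}")  # ~1.0
```

Here the lookup works because random unit keys in 64 dimensions are nearly orthogonal, so the threshold zeroes out all but the matching key; the interesting question the paper addresses is how many such facts can be packed per parameter.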