Johanna Vielhaben

Research Scientist

December Papers: MoE, Fact-storing and Byteifying Language Models

Despite the holiday season and the busy NeurIPS period, December closed the year with a set of insightful papers. Our team reviewed the following three papers:

  • First up, SonicMoE tackles the inefficiencies that fine-grained, sparse MoEs introduce on modern accelerators, using hardware-aware optimizations to restore efficiency (a sketch of the underlying routing pattern follows this list).
  • Finally, Bolmo presents a method for "byteifying" existing subword-level language models, improving character-level understanding while matching the performance of the subword-level originals (see the byte-tokenization sketch below).
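
For context on why fine-grained sparsity strains hardware, here is a minimal sketch of the standard top-k MoE routing pattern that such models build on. This is purely illustrative and not SonicMoE's actual kernels; the function name, shapes, and `k` are assumptions for the example:

```python
import torch

def topk_moe_forward(x, gate, experts, k=2):
    """Route each token to its top-k experts and mix their outputs.

    x: (num_tokens, d_model); gate: linear scorer over experts.
    Fine-grained MoEs use many small experts with a larger k, which is
    exactly what makes naive per-expert loops like this one inefficient
    on real hardware.
    """
    scores = torch.softmax(gate(x), dim=-1)             # (tokens, num_experts)
    weights, idx = scores.topk(k, dim=-1)               # top-k experts per token
    weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize mixture
    out = torch.zeros_like(x)
    for e, expert in enumerate(experts):
        mask = (idx == e).any(dim=-1)                   # tokens routed to expert e
        if mask.any():
            w = weights[mask][idx[mask] == e].unsqueeze(-1)
            out[mask] += w * expert(x[mask])            # weighted expert output
    return out

# Toy usage: 8 small experts over a 64-dim residual stream.
d, n_exp = 64, 8
experts = torch.nn.ModuleList(torch.nn.Linear(d, d) for _ in range(n_exp))
gate = torch.nn.Linear(d, n_exp)
y = topk_moe_forward(torch.randn(10, d), gate, experts, k=2)
```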
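
And to make "byteifying" concrete, a minimal sketch of the byte-level view a model gains over a subword tokenizer. Again illustrative only, not Bolmo's actual tokenizer:

```python
def byte_tokenize(text: str) -> list[int]:
    """Byte-level tokenization: every string maps to UTF-8 byte IDs (0-255),
    so the model sees individual characters instead of opaque subword IDs."""
    return list(text.encode("utf-8"))

# A subword tokenizer might emit a single ID for "strawberry", hiding its
# spelling; at the byte level the model sees every letter, all three 'r's included.
print(byte_tokenize("strawberry"))  # [115, 116, 114, 97, 119, 98, 101, 114, 114, 121]
```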