Posts by Category

papers-of-the-month

November Papers: An LLM Feast

18 minute read

This month we’ve got an all-LLM menu of papers for you, with summaries of four great works exploring many different aspects of crafting systems for LLM train...

September Papers: Proper Conditioning

15 minute read

We’re pleased to share four papers from different domains: LLM self-correction, FP8 training, generative crystals and optimisation. They are united, somewhat...

July Papers: All About Scaling

17 minute read

Scaling continues to be a super hot topic of research and our selection of papers for this month all tackle different angles of how to scale models efficient...

June Papers: Mamba-2 & Matmul-free Models

14 minute read

Improving transformers is no longer “just one area” of machine learning research. This is illustrated by the breadth of papers we got excited about this month,...

posts

Graphcore Research is hiring!

2 minute read

We are pleased to announce that we have open positions for Research Scientists and Engineers to join our team.

Scale-preserving nonlinearities for u-μP

5 minute read

My colleagues and I always get excited when, every once in a while, deep learning research throws up a fun little maths problem. Our recent work on u-μP does...

A transformer walk-through, with Gemma

36 minute read

Transformer-based LLMs seem mysterious, but they don’t need to be. In this post, we’ll walk through a modern transformer LLM, Google’s Gemma, providing bare-bon...
