Graphcore Research

Our mission is to advance AI research and characterise the computational requirements of machine intelligence.

Posts by Category

papers-of-the-month 19
posts 9

papers-of-the-month

June Papers: Gradient Norms, LLM Reasoning and Video Generation

13 minute read

This June not only brought us very hot and sunny days (at least here in the UK), but also an excellent selection of new and exciting ML research! Out of the ...

May Papers: Parallel scaling, Evolving code, Understanding LLM reasoning

15 minute read

Hurtling past the NeurIPS submission deadline into the summer months, we switch from huddling around server rooms to keep warm to babysitting experiments whi...

April Papers: Motion Prompting, Mamba Reasoning and Modeling Rewards

13 minute read

April has been a busy month for the AI research community, with ICLR (the first of the “big three” AI conferences of the year) taking place in Singapore. We’...

March Papers: De-Norming, Skill-Scaling, Over-Training and Drug-Generating

21 minute read

We’ve enjoyed March, bringing improving weather and many excellent ML papers to keep us busy. As usual, we’re here to share summaries of four of our favourit...

February Papers: Learning to Scale

17 minute read

Welcome to Papers of the Month! This time around, our monthly selection of ML papers revolves around the central theme of scale – and learning how to scale e...

January Papers: More Like “Reas-anuary Papers”

21 minute read

New year, new Papers of the Month! Kicking off 2025, it’s apparent that reasoning and test-time compute are the hot topics on the block, with much research i...

December Papers: Spend Your FLOPs Wisely

22 minute read

Welcome to Papers of the Month — Graphcore Research’s effort to bring you our pick of the most interesting ML papers. In December we noted a collection of pa...

November Papers: An LLM Feast

18 minute read

This month we’ve got an all-LLM menu of papers for you, with summaries of four great works exploring many different aspects of crafting systems for LLM train...

October Papers: Improving image generation & making LLMs think

12 minute read

This month brought us some exciting developments in improving image-generating models, as well as some interesting insights into how to make large language m...

September Papers: Proper Conditioning

15 minute read

We’re pleased to share four papers from different domains: LLM self-correction, FP8 training, generative crystals and optimisation. They are united, somewhat...

August Papers: Hallucinations, Quantisations and Test-Time Computations

13 minute read

If there’s one thing you can count on from Graphcore Research, it’s tireless enthusiasm for effective compute utilisation! Our favourite papers from August i...

July Papers: All About Scaling

17 minute read

Scaling continues to be a super hot topic of research and our selection of papers for this month all tackle different angles of how to scale models efficient...

June Papers: Mamba-2 & Matmul-free Models

14 minute read

Improving transformers is now not “just one area” of machine learning research. This is illustrated by the breadth of papers we got excited about this month,...

May Papers: xLSTM, Schedule-Free Optimizers, and Multi-token prediction

12 minute read

May is always an eventful time of year for ML researchers, with final ICML paper decisions and ICLR taking place in early May, and NeurIPS submission deadlin...

April Papers: TriForce, QuaRot & Mixture-of-Depths

13 minute read

For our April selection of AI research papers, there is a clear common thread: efficient LLM inference. But as it happens, ML researchers are showing there a...

March Papers: Low-Rank Galore & 1.58-Bit Weights

17 minute read

March was a fruitful month for AI research, with plenty of papers for us to choose from. A trend in the work we’ve selected is the pushing of previously publ...

February Papers: Longer RoPEs & Better Quantisation

17 minute read

Improving LLM inference is a key research topic at the moment, and something we’re particularly interested in at Graphcore because of its hardware implicatio...

January Papers: Great Teachers & Beyond Chinchilla

15 minute read

For the research community, 2023 was dominated by large transformers and the associated challenges with training, tuning and deploying them. This trend has c...

December Papers: FP8 Training & Simpler Transformers

10 minute read

The last month saw impressive developments in the space of efficient transformers and applied ML, from materials discovery to chip design.

posts

Optimal Formats and the Cube Root of the PDF

Llama 3.2 Vision — A Deep Dive

Graphcore Research is hiring!

Speeding up LLM inference using SparQ Attention & llama.cpp

Scale-preserving nonlinearities for u-μP

Our ICML 2024 roundup: sparsity, speculative sampling and schnitzel

Sparser llamas run faster — speed up LLM inference with SparQ Attention

A transformer walk-through, with Gemma

Almost-scaled dot-product attention