Foundational ML Papers Every Engineer Should Read

A curated list of 13 impactful machine learning papers and resources that shaped modern AI.

13 items

Attention Is All You Need

Transformer architecture paper (Vaswani et al.)

arxiv.org

ImageNet-21k Pretraining for the Masses

Transfer learning at scale

arxiv.org

Language Models are Unsupervised Multitask Learners

GPT-2 whitepaper

d4mucfpksywv.cloudfront.net

Batch Normalization: Accelerating Deep Network Training

Normalization techniques

arxiv.org

ResNet: Deep Residual Learning for Image Recognition

Residual networks

arxiv.org

BERT: Pre-training of Deep Bidirectional Transformers

Language model pretraining

arxiv.org

The Lottery Ticket Hypothesis

Neural network pruning

arxiv.org

Scaling Laws for Neural Language Models

Understanding model scaling

arxiv.org

An Image is Worth 16x16 Words: Transformers for Image Recognition

Vision Transformers (ViT)

arxiv.org

Chinchilla: Training Compute-Optimal Large Language Models

Compute-optimal scaling laws

arxiv.org

Deep Learning by Goodfellow, Bengio & Courville

The definitive deep learning textbook

www.deeplearningbook.org

Dropout: A Simple Way to Prevent Neural Networks from Overfitting

Regularization technique

jmlr.org

Adam: A Method for Stochastic Optimization

Widely used adaptive optimizer

arxiv.org