Foundational ML Papers Every Engineer Should Read

A curated list of 13 foundational machine learning papers and references that shaped modern AI.

13 items

Attention Is All You Need

Transformer architecture paper (Vaswani et al.)

arxiv.org

ImageNet-21k Pretraining for the Masses

Transfer learning at scale

arxiv.org

Language Models are Unsupervised Multitask Learners

GPT-2 whitepaper

d4mucfpksywv.cloudfront.net

Batch Normalization: Accelerating Deep Network Training

Normalization techniques

arxiv.org

ResNet: Deep Residual Learning for Image Recognition

Residual networks

arxiv.org

BERT: Pre-training of Deep Bidirectional Transformers

Language model pretraining

arxiv.org

The Lottery Ticket Hypothesis

Neural network pruning

arxiv.org

Scaling Laws for Neural Language Models

Understanding model scaling

arxiv.org

An Image is Worth 16x16 Words: Transformers for Image Recognition

Vision Transformers (ViT)

arxiv.org

Chinchilla: Training Compute-Optimal Large Language Models

Optimal scaling laws

arxiv.org

Deep Learning by Goodfellow, Bengio & Courville

The definitive deep learning textbook

www.deeplearningbook.org

Dropout: A Simple Way to Prevent Neural Networks from Overfitting

Regularization technique

jmlr.org

Adam: A Method for Stochastic Optimization
