
How to Read a Research Paper (Badly)

research · learning · AI

i decided to start reading AI research papers. how hard could it be?

hard. very hard. i have not understood most of what i've read.

but i'm getting better? maybe?

the problem

academic papers are not written for first-year CS students. they're written for experts who already understand the field, the notation, and the last 50 papers in the area.

i am not that person. i lack context.

my first attempts

paper #1: "attention is all you need"

the famous transformer paper. everyone references it. how hard could it be?

page 1: okay, transformers are for sequence tasks. makes sense.
page 2: equations intensify.
page 3: what is a positional encoding and why are we using sine waves?
page 4: i'm lost. completely lost.
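(i did eventually crack the sine-wave thing on a later pass: the positional encoding is just a fixed table of sines and cosines added to the token embeddings, computed straight from the paper's formula. a minimal numpy sketch, assuming an even d_model:)

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """sinusoidal encoding from "attention is all you need":
    PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    assumes d_model is even.
    """
    pos = np.arange(seq_len)[:, None]        # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]    # (1, d_model // 2)
    angles = pos / 10000 ** (i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)             # even dimensions get sine
    pe[:, 1::2] = np.cos(angles)             # odd dimensions get cosine
    return pe
```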

paper #2: "deep residual learning for image recognition" (resnets)

skip connections. seems simple enough.

introduction: okay, i get the problem.
methods: wait, what's happening with these identity mappings?
experiments: at least i can read the charts.
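(the identity mapping turned out to be the whole trick: the block adds its unchanged input back onto its output. a minimal pytorch sketch, leaving out the batch norm and the projection shortcut the paper uses when shapes change:)

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """output = relu(F(x) + x), where x passes through untouched."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        identity = x                      # the skip connection: keep the input as-is
        out = self.relu(self.conv1(x))
        out = self.conv2(out)
        return self.relu(out + identity)  # add it back before the final relu

# e.g. ResidualBlock(64)(torch.randn(1, 64, 32, 32)) keeps the same shape
```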

what i'm learning about reading papers

1. read the abstract and conclusion first
get the main point before diving into details. often you only need to understand the core idea, not every equation.

2. it's okay to not understand everything
seriously. even experts skip papers or sections. you don't have to fully grok every proof.

3. related work sections are gold
they cite other papers that explain prerequisites. build your reading queue from there.

4. youtube explanations help
there's no shame in watching someone explain a paper before reading it. the goal is understanding, not suffering.

5. read actively
take notes. draw diagrams. re-read paragraphs. ask "why?" constantly.

my current approach

  1. read title + abstract (what is this about?)
  2. look at figures and tables (what are the results?)
  3. read intro + conclusion (what problem, what solution?)
  4. skim methods (how does it work, roughly?)
  5. deep dive if i need more detail

this takes hours. for one paper. it's slow.

but it's worth it

reading "attention is all you need" (multiple times, with youtube help) gave me actual understanding of transformers. not perfect understanding, but enough to implement one. enough to read follow-up papers.

that knowledge compounds. paper 2 is easier than paper 1. paper 10 is easier than paper 2.

the goal

i want to be fluent in this literature. not overnight, but eventually. i want to read a new AI paper and mostly understand it on first pass.

we're not there yet. but we're working on it.


currently reading papers with four tabs open: the pdf, wikipedia, youtube, and a lot of coffee.