azeebneuron/GPTFromScratch

This repository documents my journey of implementing a GPT model from the ground up.
Every step is explained in Jupyter notebooks with code, notes, and experiments, making it a resource for anyone curious about how GPTs really work under the hood.

Current Progress

  • micrograd (autograd engine from scratch)
  • makemore Part 1 (building a character-level language model)
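To give a flavor of the first item, here is a minimal sketch of a scalar autograd engine in the spirit of micrograd. It is not the code from the notebooks: the class name `Value`, its private fields, and the choice of supported operations (`+`, `*`, `tanh`) are assumptions made for illustration. Each value records its parents and a small closure that applies the chain rule, and `backward()` walks the graph in reverse topological order.

```python
import math


class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, _parents=()):
        self.data = float(data)
        self.grad = 0.0                   # d(output)/d(self), filled in by backward()
        self._parents = _parents          # Values this one was computed from
        self._backward = lambda: None     # local chain-rule step, set by each op

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # Addition passes the upstream gradient through unchanged.
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # Product rule: each input's gradient is scaled by the other input.
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))

        def _backward():
            # d/dx tanh(x) = 1 - tanh(x)^2
            self.grad += (1.0 - t * t) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Order nodes so every node comes after the nodes it depends on.
        topo, visited = [], set()

        def build(v):
            if v not in visited:
                visited.add(v)
                for p in v._parents:
                    build(p)
                topo.append(v)

        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()


# Example: c = a * b + a, so dc/da = b + 1 and dc/db = a.
a = Value(2.0)
b = Value(-3.0)
c = a * b + a
c.backward()
print(c.data, a.grad, b.grad)
```

Gradients are accumulated with `+=` rather than assigned, so a value that is used more than once (like `a` above) receives the sum of the contributions from every path.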

What’s Next

  • Tokenizer and transformer implementation
  • Profiling and benchmarking
  • Custom Triton kernels (e.g., FlashAttention-2)
  • Distributed and memory-efficient training
  • Scaling experiments
  • Data preprocessing and filtering from raw sources
  • Alignment methods: supervised fine-tuning, reinforcement learning, and direct preference optimization (DPO)

Acknowledgments

This project follows Andrej Karpathy’s neural net series for inspiration and guidance.
Can’t thank him enough for making this stuff feel fun instead of intimidating.

About

Building a GPT model from the ground up, implementing self-attention, multi-head attention, and transformer blocks. Developed a custom autograd engine for neural networks, applying it to binary classification using micrograd.
