azeebneuron/Go-GPT

Go-GPT

A minimal GPT implementation in pure Go with no external dependencies.

Inspired by Karpathy's minGPT — the most atomic way to train and run inference for a GPT, ported from Python to idiomatic Go.

What's inside

  • Autograd engine: reverse-mode automatic differentiation over scalar values
  • Character-level tokenizer: maps characters to token IDs with BOS support
  • Transformer model: GPT-2-style architecture with multi-head attention, using RMSNorm and ReLU in place of LayerNorm and GELU
  • Adam optimizer: with bias correction and linear learning rate decay
  • Training + Inference: trains on a names dataset, then generates new hallucinated names
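
To make the first item above concrete, here is a minimal sketch of reverse-mode differentiation over scalars. The `Value` type name comes from the project structure, but its fields and the `Add`, `Mul`, `ReLU`, and `Backward` helpers shown here are assumptions for illustration and may not match the code in autograd/.

```go
package main

import "fmt"

// Value is one scalar node in the computation graph (hypothetical layout).
type Value struct {
	Data float64
	Grad float64
	// children are the inputs of the operation that produced this node.
	children []*Value
	// localGrads[i] is the partial derivative of this node w.r.t. children[i].
	localGrads []float64
}

func NewValue(x float64) *Value { return &Value{Data: x} }

func Add(a, b *Value) *Value {
	return &Value{Data: a.Data + b.Data, children: []*Value{a, b}, localGrads: []float64{1, 1}}
}

func Mul(a, b *Value) *Value {
	return &Value{Data: a.Data * b.Data, children: []*Value{a, b}, localGrads: []float64{b.Data, a.Data}}
}

func ReLU(a *Value) *Value {
	out, grad := 0.0, 0.0
	if a.Data > 0 {
		out, grad = a.Data, 1
	}
	return &Value{Data: out, children: []*Value{a}, localGrads: []float64{grad}}
}

// Backward fills in Grad for every node reachable from v using the chain rule.
func (v *Value) Backward() {
	var topo []*Value
	visited := map[*Value]bool{}
	var build func(n *Value)
	build = func(n *Value) {
		if visited[n] {
			return
		}
		visited[n] = true
		for _, c := range n.children {
			build(c)
		}
		topo = append(topo, n) // post-order: inputs come before outputs
	}
	build(v)

	v.Grad = 1
	// Walk outputs before inputs so each node's Grad is complete before it is used.
	for i := len(topo) - 1; i >= 0; i-- {
		n := topo[i]
		for j, c := range n.children {
			c.Grad += n.localGrads[j] * n.Grad
		}
	}
}

func main() {
	a, b := NewValue(2), NewValue(3)
	// y = relu(a*b + a) + a*a, so dy/da = b + 1 + 2a and dy/db = a.
	y := Add(ReLU(Add(Mul(a, b), a)), Mul(a, a))
	y.Backward()
	fmt.Println(y.Data, a.Grad, b.Grad) // prints "12 8 2"
}
```

Gradients accumulate with `+=` because a node such as `a` can feed several operations, and each use contributes its own term to the total derivative.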

Project Structure

├── autograd/        # Scalar autograd engine (Value type + backprop)
├── tokenizer/       # Character-level tokenizer
├── model/
│   ├── layers.go    # Linear, Softmax, RMSNorm primitives
│   ├── gpt.go       # GPT forward pass with KV cache
│   └── state.go     # Model config and parameter initialization
├── training/        # Adam optimizer
├── data/            # Dataset loader (auto-downloads names.txt)
└── main.go          # Training loop + inference
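
The training/ package is described as an Adam optimizer with bias correction and linear learning rate decay. The sketch below shows one plausible shape for such an optimizer over a flat slice of `float64` parameters. The `Adam` struct, its field names, and the hyperparameter values (beta1 = 0.9, beta2 = 0.999, eps = 1e-8, base rate 0.1) are assumptions chosen for illustration, not values taken from the repository.

```go
package main

import (
	"fmt"
	"math"
)

// Adam keeps per-parameter moment estimates (hypothetical layout).
type Adam struct {
	LR, Beta1, Beta2, Eps float64
	TotalSteps            int
	m, v                  []float64
	step                  int
}

// NewAdam uses commonly cited default betas; these are assumptions.
func NewAdam(n int, lr float64, totalSteps int) *Adam {
	return &Adam{
		LR: lr, Beta1: 0.9, Beta2: 0.999, Eps: 1e-8,
		TotalSteps: totalSteps,
		m:          make([]float64, n),
		v:          make([]float64, n),
	}
}

// Step updates params in place from grads.
func (o *Adam) Step(params, grads []float64) {
	// Linear decay: full LR at step 0, approaching 0 at TotalSteps.
	lr := o.LR * (1 - float64(o.step)/float64(o.TotalSteps))
	o.step++
	t := float64(o.step)
	for i, g := range grads {
		o.m[i] = o.Beta1*o.m[i] + (1-o.Beta1)*g
		o.v[i] = o.Beta2*o.v[i] + (1-o.Beta2)*g*g
		// Bias correction compensates for m and v starting at zero.
		mHat := o.m[i] / (1 - math.Pow(o.Beta1, t))
		vHat := o.v[i] / (1 - math.Pow(o.Beta2, t))
		params[i] -= lr * mHat / (math.Sqrt(vHat) + o.Eps)
	}
}

func main() {
	// Toy problem: minimize f(x) = (x-3)^2, whose gradient is 2(x-3).
	params := []float64{0}
	opt := NewAdam(1, 0.1, 1000)
	for i := 0; i < 1000; i++ {
		grads := []float64{2 * (params[0] - 3)}
		opt.Step(params, grads)
	}
	fmt.Printf("x = %.3f\n", params[0])
}
```

With bias correction, the very first update has magnitude close to the learning rate regardless of the gradient's scale, because the corrected first moment equals the gradient and the corrected second moment equals its square.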

Usage

go build -o go-gpt .
./go-gpt

The program will:

  1. Download the names dataset (32K names) if not present
  2. Train a tiny GPT (1 layer, 16-dim embeddings, 4 heads) for 1000 steps
  3. Generate 20 new hallucinated names
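
Step 3 amounts to repeatedly turning the model's output logits into probabilities and drawing the next character from them. The following sketch shows that sampling step in isolation. The `Softmax` and `Sample` signatures, the example logits, and the fixed seed are assumptions for illustration; the repository's own inference loop may differ.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// Softmax converts logits into probabilities. Subtracting the maximum
// logit first keeps math.Exp from overflowing on large inputs.
func Softmax(logits []float64) []float64 {
	maxLogit := math.Inf(-1)
	for _, l := range logits {
		if l > maxLogit {
			maxLogit = l
		}
	}
	probs := make([]float64, len(logits))
	var sum float64
	for i, l := range logits {
		probs[i] = math.Exp(l - maxLogit)
		sum += probs[i]
	}
	for i := range probs {
		probs[i] /= sum
	}
	return probs
}

// Sample picks an index from probs given a uniform draw u in [0, 1).
func Sample(probs []float64, u float64) int {
	var cum float64
	for i, p := range probs {
		cum += p
		if u < cum {
			return i
		}
	}
	return len(probs) - 1 // guards against rounding error in the running sum
}

func main() {
	rng := rand.New(rand.NewSource(42)) // fixed seed, for repeatable runs
	probs := Softmax([]float64{2.0, 1.0, 0.1})
	counts := make([]int, len(probs))
	for i := 0; i < 1000; i++ {
		counts[Sample(probs, rng.Float64())]++
	}
	fmt.Println(probs, counts)
}
```

In a generation loop, the sampled token ID would be fed back into the model as the next input until an end marker is drawn or the context length is reached.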

Model Details

| Parameter       | Value          |
|-----------------|----------------|
| Layers          | 1              |
| Embedding dim   | 16             |
| Attention heads | 4              |
| Context length  | 16             |
| Vocab size      | 27 (a-z + BOS) |

This is deliberately tiny — the goal is clarity, not performance. Everything runs on scalar autograd, so training is slow but the code is readable.
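
The vocabulary size of 27 follows from 26 lowercase letters plus one BOS token. A character-level tokenizer with that vocabulary could look like the sketch below. The ID assignment (0 to 25 for a to z, 26 for BOS) and the choice to use BOS as both the start and the end marker are assumptions; the code in tokenizer/ may order IDs differently.

```go
package main

import "fmt"

// Hypothetical ID layout: 'a'..'z' map to 0..25 and BOS takes the last slot.
const (
	VocabSize = 27
	BOS       = VocabSize - 1
)

// Encode wraps a lowercase name in BOS tokens. Using BOS on both ends is an
// assumption: with no separate end token, BOS also has to mark the end.
func Encode(name string) ([]int, error) {
	ids := []int{BOS}
	for _, r := range name {
		if r < 'a' || r > 'z' {
			return nil, fmt.Errorf("unsupported character %q", r)
		}
		ids = append(ids, int(r-'a'))
	}
	return append(ids, BOS), nil
}

// Decode maps letter IDs back to characters and skips BOS and out-of-range IDs.
func Decode(ids []int) string {
	out := make([]rune, 0, len(ids))
	for _, id := range ids {
		if id >= 0 && id < BOS {
			out = append(out, rune('a'+id))
		}
	}
	return string(out)
}

func main() {
	ids, err := Encode("emma")
	if err != nil {
		panic(err)
	}
	fmt.Println(ids)         // prints "[26 4 12 12 0 26]"
	fmt.Println(Decode(ids)) // prints "emma"
}
```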

Differences from GPT-2

  • LayerNorm → RMSNorm
  • GELU → ReLU
  • No biases
  • Character-level tokenization (no BPE)
  • Scalar autograd (no tensor ops)
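
To illustrate the first difference in the list, the sketch below compares RMSNorm with LayerNorm on plain `float64` slices. Both functions omit learned gain and bias parameters, and the epsilon value is an assumption; the version in model/layers.go operates on autograd values and may differ in detail.

```go
package main

import (
	"fmt"
	"math"
)

const eps = 1e-5 // assumed stabilizer; the repository may use another value

// RMSNorm divides x by its root mean square. The mean is not subtracted.
func RMSNorm(x []float64) []float64 {
	var sumSq float64
	for _, v := range x {
		sumSq += v * v
	}
	scale := 1 / math.Sqrt(sumSq/float64(len(x))+eps)
	out := make([]float64, len(x))
	for i, v := range x {
		out[i] = v * scale
	}
	return out
}

// LayerNorm subtracts the mean and divides by the standard deviation.
func LayerNorm(x []float64) []float64 {
	n := float64(len(x))
	var mean float64
	for _, v := range x {
		mean += v
	}
	mean /= n
	var variance float64
	for _, v := range x {
		d := v - mean
		variance += d * d
	}
	variance /= n
	scale := 1 / math.Sqrt(variance+eps)
	out := make([]float64, len(x))
	for i, v := range x {
		out[i] = (v - mean) * scale
	}
	return out
}

func main() {
	x := []float64{1, 2, 3, 4}
	fmt.Println("RMSNorm:  ", RMSNorm(x))   // all positive, mean square near 1
	fmt.Println("LayerNorm:", LayerNorm(x)) // centered on 0, variance near 1
}
```

RMSNorm needs one pass fewer over the input and has no mean term to differentiate, which keeps the computation graph smaller when every operation is a scalar autograd node.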
