Go-GPT

A minimal GPT implementation in pure Go with no external dependencies.

Inspired by Karpathy's minGPT — the most atomic way to train and run inference for a GPT, ported from Python to idiomatic Go.

What's inside

Autograd engine — reverse-mode automatic differentiation over scalar values
Character-level tokenizer — maps characters to token IDs with BOS support
Transformer model — GPT-2-style architecture with multi-head attention, using RMSNorm and ReLU in place of LayerNorm and GELU
Adam optimizer — with bias correction and linear learning rate decay
Training + Inference — trains on a names dataset, generates new hallucinated names
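To make the autograd item above concrete, here is a minimal sketch of reverse-mode differentiation over scalars. The `Value` name matches the type mentioned for `autograd/`, but the fields, the `Add` and `Mul` helpers, and the `Backward` method are assumptions made for illustration and may differ from the repository's code.

```go
package main

import "fmt"

// Value is a hypothetical scalar node in a computation graph.
type Value struct {
	Data       float64
	Grad       float64
	children   []*Value
	localGrads []float64 // localGrads[i] is d(this)/d(children[i])
}

// NewValue creates a leaf node holding x.
func NewValue(x float64) *Value { return &Value{Data: x} }

// Add returns a node for a + b; both local derivatives are 1.
func Add(a, b *Value) *Value {
	return &Value{Data: a.Data + b.Data, children: []*Value{a, b}, localGrads: []float64{1, 1}}
}

// Mul returns a node for a * b; the local derivatives are b and a.
func Mul(a, b *Value) *Value {
	return &Value{Data: a.Data * b.Data, children: []*Value{a, b}, localGrads: []float64{b.Data, a.Data}}
}

// Backward orders the graph topologically, then applies the chain rule
// from the output back to the leaves, accumulating into Grad.
func (v *Value) Backward() {
	var topo []*Value
	visited := map[*Value]bool{}
	var build func(n *Value)
	build = func(n *Value) {
		if visited[n] {
			return
		}
		visited[n] = true
		for _, c := range n.children {
			build(c)
		}
		topo = append(topo, n)
	}
	build(v)

	v.Grad = 1
	for i := len(topo) - 1; i >= 0; i-- {
		n := topo[i]
		for j, c := range n.children {
			c.Grad += n.localGrads[j] * n.Grad
		}
	}
}

func main() {
	a, b := NewValue(2), NewValue(3)
	// y = a*b + a, so dy/da = b + 1 = 4 and dy/db = a = 2.
	y := Add(Mul(a, b), a)
	y.Backward()
	fmt.Println(y.Data, a.Grad, b.Grad) // prints: 8 4 2
}
```

Because `a` feeds into the result twice, its gradient is accumulated with `+=` rather than assigned, which is the detail that most often goes wrong in a scalar engine.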

Project Structure

├── autograd/        # Scalar autograd engine (Value type + backprop)
├── tokenizer/       # Character-level tokenizer
├── model/
│   ├── layers.go    # Linear, Softmax, RMSNorm primitives
│   ├── gpt.go       # GPT forward pass with KV cache
│   └── state.go     # Model config and parameter initialization
├── training/        # Adam optimizer
├── data/            # Dataset loader (auto-downloads names.txt)
└── main.go          # Training loop + inference

Usage

go build -o go-gpt .
./go-gpt

The program will:

Download the names dataset (32K names) if not present
Train a tiny GPT (1 layer, 16-dim embeddings, 4 heads) for 1000 steps
Generate 20 new hallucinated names
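The training step relies on Adam with bias correction and linear learning rate decay. The sketch below shows one way those two pieces fit together, operating on plain `float64` slices rather than autograd values. The `Adam` struct, its `Step` signature, and the hyperparameters (0.9, 0.999, 1e-8, base rate 0.1) are illustrative assumptions, not values taken from `training/`.

```go
package main

import (
	"fmt"
	"math"
)

// Adam holds per-parameter moment estimates. All names and defaults
// here are hypothetical.
type Adam struct {
	LR, Beta1, Beta2, Eps float64
	m, v                  []float64
}

// NewAdam allocates optimizer state for n parameters.
func NewAdam(n int, lr float64) *Adam {
	return &Adam{
		LR: lr, Beta1: 0.9, Beta2: 0.999, Eps: 1e-8,
		m: make([]float64, n), v: make([]float64, n),
	}
}

// Step updates params in place. step is zero-based and totalSteps is
// the horizon over which the learning rate decays linearly to zero.
func (o *Adam) Step(params, grads []float64, step, totalSteps int) {
	lr := o.LR * (1 - float64(step)/float64(totalSteps))
	t := float64(step + 1)
	for i, g := range grads {
		o.m[i] = o.Beta1*o.m[i] + (1-o.Beta1)*g
		o.v[i] = o.Beta2*o.v[i] + (1-o.Beta2)*g*g
		// Bias correction compensates for the zero-initialized moments.
		mHat := o.m[i] / (1 - math.Pow(o.Beta1, t))
		vHat := o.v[i] / (1 - math.Pow(o.Beta2, t))
		params[i] -= lr * mHat / (math.Sqrt(vHat) + o.Eps)
	}
}

func main() {
	// Toy problem: minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3).
	x := []float64{0}
	opt := NewAdam(1, 0.1)
	const steps = 1000
	for s := 0; s < steps; s++ {
		grad := []float64{2 * (x[0] - 3)}
		opt.Step(x, grad, s, steps)
	}
	fmt.Printf("x = %.3f\n", x[0])
}
```

On the very first step the corrected moments reduce to the raw gradient and its square, so the update has magnitude close to the learning rate regardless of the gradient's scale.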

Model Details

Parameter	Value
Layers	1
Embedding dim	16
Attention heads	4
Context length	16
Vocab size	27 (a-z + BOS)

This is deliberately tiny — the goal is clarity, not performance. Everything runs on scalar autograd, so training is slow but the code is readable.
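The vocabulary size of 27 in the table follows from 26 lowercase letters plus one BOS token. A minimal sketch of such a character-level tokenizer is shown below. Placing BOS at ID 0, prepending it in `Encode`, and the `Tokenizer` API itself are assumptions for illustration; the actual layout in `tokenizer/` may differ.

```go
package main

import "fmt"

// BOS is the beginning-of-sequence token ID. Using 0 is an assumption.
const BOS = 0

// Tokenizer is a hypothetical character-level tokenizer for a-z.
type Tokenizer struct {
	toID   map[rune]int
	toRune []rune
}

// NewTokenizer builds the vocabulary: ID 0 for BOS, IDs 1..26 for a..z.
func NewTokenizer() *Tokenizer {
	t := &Tokenizer{toID: map[rune]int{}, toRune: []rune{0}} // slot 0 reserved for BOS
	for r := 'a'; r <= 'z'; r++ {
		t.toID[r] = len(t.toRune)
		t.toRune = append(t.toRune, r)
	}
	return t
}

// VocabSize returns the number of token IDs, including BOS.
func (t *Tokenizer) VocabSize() int { return len(t.toRune) }

// Encode returns BOS followed by one ID per character. Characters
// outside the vocabulary produce an error.
func (t *Tokenizer) Encode(s string) ([]int, error) {
	ids := []int{BOS}
	for _, r := range s {
		id, ok := t.toID[r]
		if !ok {
			return nil, fmt.Errorf("character %q is not in the vocabulary", r)
		}
		ids = append(ids, id)
	}
	return ids, nil
}

// Decode maps IDs back to characters, skipping BOS and out-of-range IDs.
func (t *Tokenizer) Decode(ids []int) string {
	out := make([]rune, 0, len(ids))
	for _, id := range ids {
		if id <= BOS || id >= len(t.toRune) {
			continue
		}
		out = append(out, t.toRune[id])
	}
	return string(out)
}

func main() {
	tok := NewTokenizer()
	ids, err := tok.Encode("emma")
	if err != nil {
		panic(err)
	}
	fmt.Println(tok.VocabSize()) // prints: 27
	fmt.Println(ids)             // prints: [0 5 13 13 1]
	fmt.Println(tok.Decode(ids)) // prints: emma
}
```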

Differences from GPT-2

LayerNorm → RMSNorm
GELU → ReLU
No biases
Character-level tokenization (no BPE)
Scalar autograd (no tensor ops)
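The first two substitutions in the list above are simple to state in code. The sketch below applies RMSNorm and ReLU to plain `float64` slices. It assumes RMSNorm without a learned gain and an `eps` of 1e-5; both are illustrative choices, and the versions in `model/layers.go` operate on autograd values and may differ.

```go
package main

import (
	"fmt"
	"math"
)

// RMSNorm scales x so that its root mean square is approximately 1.
// Unlike LayerNorm, it does not subtract the mean. No learned gain is
// applied here, which is an assumption of this sketch.
func RMSNorm(x []float64, eps float64) []float64 {
	out := make([]float64, len(x))
	if len(x) == 0 {
		return out
	}
	var sumSq float64
	for _, v := range x {
		sumSq += v * v
	}
	scale := 1 / math.Sqrt(sumSq/float64(len(x))+eps)
	for i, v := range x {
		out[i] = v * scale
	}
	return out
}

// ReLU replaces negative entries with zero and keeps the rest.
func ReLU(x []float64) []float64 {
	out := make([]float64, len(x))
	for i, v := range x {
		if v > 0 {
			out[i] = v
		}
	}
	return out
}

func main() {
	x := []float64{3, -4}
	fmt.Println(RMSNorm(x, 1e-5)) // signs are preserved; mean square is close to 1
	fmt.Println(ReLU(x))          // prints: [3 0]
}
```

Both functions need only one pass of additions and multiplications plus a single square root, which keeps the number of nodes in a scalar autograd graph lower than LayerNorm and GELU would.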
