pu.sh

A zero-package-dependency coding agent under 50KB. Pronounced exactly how you think.

Finally, a slop cannon small enough to fit in your pocket.

```sh
curl -sL pu.dev/pu.sh -o pu.sh && chmod +x pu.sh
./pu.sh
```

That's the entire install. No npm. No pip. No Docker. No Node. One shell file, common Unix tools, curl, awk, and an API key.

What

```sh
# Zero package dependencies. Literally.
curl -sL pu.dev/pu.sh > pu.sh && chmod +x pu.sh

# First run walks you through provider, key, model, and effort.
./pu.sh

# One-shot task.
./pu.sh "find bugs in pu.sh"

# Interactive multi-turn session.
./pu.sh
> write a REST API server in Go
> now add rate limiting
> write tests for it

# Pipe agents together because we're adults.
./pu.sh "write the code" | ./pu.sh --pipe "review it for security bugs"

# Env-only setup still works.
OPENAI_API_KEY=sk-... AGENT_PROVIDER=openai AGENT_MODEL=gpt-5.5 ./pu.sh "your task"
ANTHROPIC_API_KEY=sk-ant-... AGENT_PROVIDER=anthropic AGENT_MODEL=claude-opus-4-7 ./pu.sh "your task"
```

Why

We ran 30+ experiments to answer a question: what's the most portable agentic harness that can run anywhere?

The answer is a shell script. The agent loop itself — send prompt, parse response, execute tool, append to history, repeat — is tiny. Everything else is developer experience and hardening.
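
How tiny? A minimal sketch of that loop fits in a screenful of POSIX sh. This is illustrative only — `call_model` and `parse_tool_call` are stand-ins here for the real curl + awk plumbing:

```sh
#!/bin/sh
# Minimal sketch of the loop: send prompt, parse response, execute tool,
# append to history, repeat. call_model/parse_tool_call are stubs standing
# in for the real curl + awk calls.
call_model() {
  # Stub: ask for one tool call on the first turn, then finish.
  if [ -z "$1" ]; then echo "TOOL: echo hello"; else echo "done"; fi
}
parse_tool_call() { printf '%s\n' "$1" | awk -F': ' '/^TOOL:/ {print $2}'; }

history=""
while :; do
  response=$(call_model "$history")
  tool=$(parse_tool_call "$response")
  if [ -z "$tool" ]; then printf '%s\n' "$response"; break; fi
  result=$($tool 2>&1)            # execute the requested tool
  history="$history
$response
$result"                          # append and go around again
done
```

Everything the real script adds on top of this skeleton is the hardening and developer experience described below.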

Here's the thing nobody tells you: the node_modules folder of a typical coding agent weighs more than the entire Doom source code. Three times over. pu.sh weighs less than many README files.

Features

| What | How |
| --- | --- |
| 7 tools | `bash` `read` `write` `edit` `grep` `find` `ls` — Pi-shaped surface area |
| Interactive REPL | Multi-turn with memory; `/model` `/effort` `/login` `/logout` `/flush` `/compact` `/export` `/skill:name` `/quit` |
| First-run login | API-key wizard for Anthropic/OpenAI, optional private `~/.pu.env` save |
| Dual provider | Anthropic Messages API + OpenAI Responses API |
| OpenAI tool loop | Preserves `reasoning`, `function_call`, and `function_call_output` items across turns |
| Reasoning effort | `AGENT_EFFORT=none\|low\|medium\|high\|xhigh` |
| File editing | Surgical oldText → newText replacement; rejects empty or non-unique matches |
| Safer file writes/edits | Preserves trailing newlines, uses temp files, keeps executable mode on edits |
| Context files | Auto-loads `AGENTS.md` / `CLAUDE.md` from cwd upward, plus global Pi agent context if present |
| Auto-compaction | Summarizes older turns when the approximate context budget is exceeded; `/compact [focus]` compacts manually |
| Context/status line | Shows cwd, git branch, token counts, context usage, provider, model, effort |
| `!command` | `!ls -la` runs a shell command inline from the REPL |
| Prompt templates | `/name` expands `.pi/prompts/name.md` or `~/.pi/agent/prompts/name.md` |
| Skills | `/skill:name` loads `SKILL.md` from local or user skill directories |
| Session export | `/export` writes markdown from `.pu-events.jsonl` |
| Pipe mode | `--pipe` for clean stdout, composable with other tools/agents |
| Checkpoint/resume | Writes `.pu-history.json` by default; override with `AGENT_HISTORY=file.json` |
| Confirmation mode | `AGENT_CONFIRM=1` asks before every tool execution; safely denies when no TTY |
| Event log | Every step logged to `.pu-events.jsonl` as structured JSONL |
| Regression tests | `bash eval/test_real.sh` runs 105 no-API behavioral tests |

What it can't do

Let's be honest: closing the remaining gap to a production harness needs a real runtime.

  • No TUI (it's a shell script, not a lifestyle)
  • No streaming display (curl waits for the full response like a patient person)
  • No image input
  • No OAuth/browser login; API keys only
  • No native Windows support
  • No keyboard shortcuts, path completion, themes, or raw-terminal editor
  • No package manager or TypeScript plugin SDK
  • No full model registry/pricing database
  • No general JSON parser; it uses targeted awk parsing for provider shapes
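
That last point deserves a taste. Here is an illustrative example of what "targeted awk parsing" means — not pu.sh's actual parser, and it breaks on embedded escaped quotes, which is exactly why such parsers are scoped to the shapes providers actually emit:

```sh
# Pull the value of a known "text" field out of one fixed JSON shape.
# Illustrative only; does not handle escaped quotes inside the value.
json='{"type":"message","text":"hello world","stop_reason":"end_turn"}'
text=$(printf '%s' "$json" | awk -F'"text":"' 'NF > 1 { split($2, a, "\""); print a[1] }')
echo "$text"   # hello world
```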

pu.sh is the same slop cannon but small enough to inspect end to end and know exactly where the slop is coming from.

The Size

```text
pu.sh              < 50 KB            █  (sh + curl + awk + common Unix tools)
Claude Code         209 MB            ██████████████████████████
Goose CLI           237 MB            █████████████████████████████
Pi + Node           281 MB            ███████████████████████████████████
SWE-agent Docker    1.8 GB            ██████████████████████████████████████████████████████████████...
```

Measured locally on macOS arm64. Current generated pu.sh is 49,276 bytes (48.12 KiB) by wc -c; the headline stays under 50 KB. Larger tools include their runtime/package footprints as described in final_report.md.

Configuration

All configuration is via env vars. The optional `~/.pu.env` is created by `/login` or on first run, saved with 0600-style permissions, and parsed with a tiny allowlist loader.

| Variable | Default | What |
| --- | --- | --- |
| `AGENT_PROVIDER` | auto from key/model, else `anthropic` | `anthropic` or `openai` |
| `AGENT_MODEL` | `claude-opus-4-7` or `gpt-5.5` | Model id |
| `ANTHROPIC_API_KEY` | (unset) | Anthropic API key |
| `OPENAI_API_KEY` | (unset) | OpenAI API key |
| `AGENT_EFFORT` | `medium` | `none\|low\|medium\|high\|xhigh` |
| `AGENT_REASONING_SUMMARY` | `auto` | OpenAI reasoning summary request: `auto\|concise\|detailed\|off` |
| `AGENT_THINKING` | (unset) | Legacy/Anthropic thinking hint; falls back into effort behavior |
| `AGENT_MAX_STEPS` | `100` | Max API/tool-loop steps before stopping |
| `AGENT_MAX_TOKENS` | `4096` | Base visible-output budget; raised for higher effort |
| `AGENT_CONTEXT_LIMIT` | `400000` (OpenAI-ish) / `272000` (Opus-ish) | Approximate context budget in bytes/chars |
| `AGENT_RESERVE` | `16000` | Reserved context budget before compaction |
| `AGENT_KEEP_RECENT` | `80000` | Approx. bytes/chars of recent transcript kept after compaction |
| `AGENT_TOOL_TRUNC` | `100000` | Max non-read tool output before truncation |
| `AGENT_READ_MAX` | `1000000` | Require offset/limit for larger file reads |
| `AGENT_LOG_TRUNC` | `20000` | Max event-log payload before trace-only truncation |
| `AGENT_CONFIRM` | `0` | `1` = ask before each tool call |
| `AGENT_LOG` | `.pu-events.jsonl` | Event/debug JSONL log file |
| `AGENT_HISTORY` | `.pu-history.json` | Checkpoint file for automatic resume |
| `AGENT_SYSTEM` | built-in | Custom system prompt |
| `AGENT_PRICE_IN_PER_MTOK` / `AGENT_PRICE_OUT_PER_MTOK` | `0` | Optional cost display with `--cost` |
| `AGENT_DEBUG_API` | (unset) | Directory to capture per-call input/response JSON for debugging |
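
The allowlist loading of `~/.pu.env` can be sketched like this (illustrative, not pu.sh's actual loader): only known variable names are accepted, and nothing in the file is ever eval'd.

```sh
#!/bin/sh
# Allowlist .env loader sketch: skip any line whose key isn't known.
load_env() {
  while IFS='=' read -r key val; do
    case $key in
      ANTHROPIC_API_KEY|OPENAI_API_KEY|AGENT_PROVIDER|AGENT_MODEL)
        export "$key=$val" ;;
      *) ;;  # unknown keys (and any injected commands) are ignored
    esac
  done < "$1"
}

# Demo: one allowed key, one hostile line.
tmp=$(mktemp)
printf 'AGENT_MODEL=demo-model\nPATH=/evil/bin\n' > "$tmp"
load_env "$tmp"
echo "$AGENT_MODEL"   # demo-model
rm -f "$tmp"
```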

Commands

| Command | What |
| --- | --- |
| `/model [id]` | Show or switch model; guesses provider from `gpt-*`/`o*`/`claude-*` |
| `/effort [level]` | Show or set reasoning effort (`none`, `low`, `medium`, `high`, `xhigh`, etc.) |
| `/reasoning [mode]` | Show or set OpenAI reasoning summaries (`auto`, `concise`, `detailed`, `off`) |
| `/login` | Run the API-key setup wizard |
| `/logout` | Remove `~/.pu.env` and unset in-process keys |
| `/flush` | Reset the session: clear memory/history, remove history metadata, and truncate the event log |
| `/compact [focus]` | Summarize older context, optionally with focus text |
| `/export [file]` | Export the event log to markdown |
| `/skill:name` | Load `name/SKILL.md` into the system prompt |
| `/quit` | Exit |
| `!cmd` | Run a shell command directly |
| `/template` | If `.pi/prompts/template.md` exists, run it as a prompt |

How it works

```text
┌─────────────────────────────────────────┐
│  You type a thing                       │
│  ↓                                      │
│  curl sends it to Claude/GPT            │
│  ↓                                      │
│  Model asks for a tool                  │
│  ↓                                      │
│  Shell runs read/write/edit/bash/etc.   │
│  ↓                                      │
│  Result goes back to model              │
│  ↓                                      │
│  Model says done                        │
└─────────────────────────────────────────┘
```

Zero package dependencies. Under 50KB. 7 tools. 2 providers. 1 file.

OpenAI uses /v1/responses with Responses-style tools and max_output_tokens. Anthropic uses /v1/messages. The parser is targeted awk, not a general JSON implementation.
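
For flavor, the Anthropic side reduces to a single request like the following (simplified sketch; the real call also sends tools and a system prompt). The endpoint, headers, and body shape follow the documented Messages API; requires `ANTHROPIC_API_KEY` in the environment.

```sh
curl -s https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model":"claude-opus-4-7","max_tokens":1024,
       "messages":[{"role":"user","content":"say hi"}]}'
```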


pu.sh writes .pu-history.json for resumable model memory and .pu-events.jsonl for event replay/export. Long sessions auto-compact by summarizing older transcript entries and keeping a bounded recent tail.
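
The bounded-tail half of compaction is simple enough to sketch (illustrative; `keep_recent` is a made-up helper, and the real script also writes a summary of what it drops):

```sh
#!/bin/sh
# Keep only the last max_bytes of a transcript file.
keep_recent() {  # usage: keep_recent file max_bytes
  size=$(wc -c < "$1")
  if [ "$size" -gt "$2" ]; then
    tail -c "$2" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
  fi
}

# Demo: 100 bytes in, 10 bytes kept.
tmp=$(mktemp)
i=0; while [ $i -lt 10 ]; do printf '0123456789' >> "$tmp"; i=$((i+1)); done
keep_recent "$tmp" 10
kept=$(wc -c < "$tmp" | tr -d ' ')
echo "kept $kept bytes"
rm -f "$tmp"
```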

For details, see How pu works.

Testing

```sh
# No API calls, no cost. Current expected result: PASS: 105 FAIL: 0.
bash eval/test_real.sh

# Shell syntax.
sh -n pu.sh
```

The current regression suite covers:

  • JSON escaping and targeted JSON extraction
  • Anthropic and OpenAI response parsing
  • OpenAI Responses request shape and reasoning gating
  • OpenAI tool continuation with `reasoning` + `function_call_output`
  • API-error reporting, curl transport failures, model-error hints, and non-retryable auth errors
  • First-run key sanitization and safe allowlist `~/.pu.env` loading
  • Default history save/resume of final assistant responses
  • Context compaction invariants
  • Tool truncation
  • Edit uniqueness/mode preservation and actionable edit-failure guidance
  • grep/find noisy-directory exclusions and the `/effort` command
  • Trailing-newline preservation for `write`/`edit`
  • `read` with `limit:0`
  • Spinner quietness on non-TTY stderr

The bugs we found so you don't have to

  1. `set -e` is a serial killer. `[ -f file ] && do_thing` returns 1 when the file doesn't exist, and under `set -e` that nonzero status can silently kill your script. We use `set -u`, not `set -e`.
  2. macOS sed ≠ GNU sed. The classic multiline `sed` trick breaks on BSD sed. Use `awk`.
  3. `jq` was a dependency. We wrote targeted `awk` JSON extraction to keep the install at zero dependencies.
  4. Heredocs don't survive JSON reliably. The system prompt steers models to the `write` tool instead.
  5. OpenAI tool calling is not Chat Completions anymore. For reasoning + tools, pu.sh uses the Responses API, carries `reasoning` items forward, and sends `function_call_output` items.
  6. Shell command substitution eats trailing newlines. `write` and `edit` use sentinel capture to preserve the final `\n`.
  7. Generic status spam is worse than silence. If the model doesn't provide a real pre-tool preamble, pu.sh just prints the actual tool call instead of "Inspecting with tools..." forever.
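
Pitfalls 1 and 6 are easy to reproduce in a few lines (illustrative demo, not pu.sh code):

```sh
#!/bin/sh
# Pitfall 1: a failed `[ -f ... ] &&` guard as the last command of a
# function makes the function return 1, and set -e then kills the caller.
guard() { [ -f /no/such/file ] && cat /no/such/file; }
status=$( ( set -e; guard; echo unreachable ) >/dev/null 2>&1; echo $? )
echo "set -e subshell exited with: $status"   # 1, "unreachable" never ran

# Pitfall 6: $(...) strips trailing newlines; a sentinel preserves them.
tmp=$(mktemp); printf 'line\n\n' > "$tmp"     # 6 bytes on disk
naive=$(cat "$tmp")                           # trailing \n\n lost -> 4 bytes
kept=$(cat "$tmp"; printf x); kept=${kept%x}  # sentinel keeps them -> 6 bytes
echo "naive=$(printf '%s' "$naive" | wc -c | tr -d ' ') kept=$(printf '%s' "$kept" | wc -c | tr -d ' ')"
rm -f "$tmp"
```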

Prior art & credits

pu.sh is a derived work, and we want to be loud about it. The system prompt structure, 7-tool surface (bash read write edit grep find ls), exact-text editing model, context-file convention, and skill/template ideas are inspired by Pi. Huge thanks and respect to the Pi team.

We compare against Pi feature-by-feature. Pi wins on extensibility, TUI, providers, safety, and production polish. pu.sh wins on portability and inspectability.

FAQ

Is this production-ready? It's called pu.sh. It's an under-50KB slop cannon that talks to LLM APIs via curl. You tell me.

Should I use this instead of Pi/Claude Code/Cursor? For daily coding, probably not. Use a real tool. For CI/CD, containers, edge boxes, quick scripts, or understanding how agents actually work — ./pu.sh and see what happens.

How do I pronounce it? However makes your coworkers the most uncomfortable.

Did you really name a coding agent after feces? It's pu.sh. As in push. As in ./pu.sh "deploy to prod". The fact that it sounds like something else is entirely coincidental and we are very serious engineers.

Did an AI write this? An AI and a human ran experiments, argued with shell, broke OpenAI schemas, fixed them, and learned once again that the real production incident was set -e all along.

License

MIT — see LICENSE. It's under 50KB. Go nuts.
