A zero-package-dependency coding agent under 50KB. Pronounced exactly how you think.
Finally, a slop cannon small enough to fit in your pocket.
curl -sL pu.dev/pu.sh -o pu.sh && chmod +x pu.sh
./pu.sh

That's the entire install. No npm. No pip. No Docker. No Node. One shell file, common Unix tools, curl, awk, and an API key.
# Zero package dependencies. Literally.
curl -sL pu.dev/pu.sh > pu.sh && chmod +x pu.sh
# First run walks you through provider, key, model, and effort.
./pu.sh
# One-shot task.
./pu.sh "find bugs in pu.sh"
# Interactive multi-turn session.
./pu.sh
> write a REST API server in Go
> now add rate limiting
> write tests for it
# Pipe agents together because we're adults.
./pu.sh "write the code" | ./pu.sh --pipe "review it for security bugs"
# Env-only setup still works.
OPENAI_API_KEY=sk-... AGENT_PROVIDER=openai AGENT_MODEL=gpt-5.5 ./pu.sh "your task"
ANTHROPIC_API_KEY=sk-ant-... AGENT_PROVIDER=anthropic AGENT_MODEL=claude-opus-4-7 ./pu.sh "your task"

We ran 30+ experiments to answer a question: what's the most portable agentic harness that can run anywhere?
The answer is a shell script. The agent loop itself — send prompt, parse response, execute tool, append to history, repeat — is tiny. Everything else is developer experience and hardening.
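That loop, stubbed out so it runs without an API key, looks roughly like this (a sketch only: `send_prompt` and `run_tool` are hypothetical stand-ins, not pu.sh's actual function names):

```shell
# Sketch of the agent loop shape. The provider call and tool dispatch are
# stubbed: send_prompt stands in for the curl call, run_tool for the
# bash/read/write/edit/... dispatcher.
send_prompt() {
  # Pretend the model asks for a tool twice, then finishes.
  if [ "$1" -lt 2 ]; then echo "tool:echo step $1"; else echo ""; fi
}
run_tool() { eval "${1#tool:}"; }      # execute the requested tool

step=0
while [ "$step" -lt "${AGENT_MAX_STEPS:-100}" ]; do
  call=$(send_prompt "$step")
  [ -z "$call" ] && break              # no tool call: the model says done
  run_tool "$call"                     # result would be appended to history
  step=$((step + 1))
done
```

In the real script, each iteration also appends the response and tool result to the transcript before the next `curl`.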
Here's the thing nobody tells you: the node_modules folder of a typical coding agent weighs more than the entire Doom source code. Three times over. pu.sh weighs less than many README files.
| What | How |
|---|---|
| 7 tools | bash read write edit grep find ls — Pi-shaped surface area |
| Interactive REPL | Multi-turn with memory; /model /effort /login /logout /flush /compact /export /skill:name /quit |
| First-run login | API-key wizard for Anthropic/OpenAI, optional private ~/.pu.env save |
| Dual provider | Anthropic Messages API + OpenAI Responses API |
| OpenAI tool loop | Preserves reasoning, function_call, and function_call_output items across turns |
| Reasoning effort | `AGENT_EFFORT=none/low/medium/high/xhigh` |
| File editing | Surgical oldText → newText replacement; rejects empty or non-unique matches |
| Safer file writes/edits | Preserves trailing newlines, uses temp files, keeps executable mode on edits |
| Context files | Auto-loads AGENTS.md / CLAUDE.md from cwd upward, plus global Pi agent context if present |
| Auto-compaction | Summarizes older turns when approximate context budget is exceeded; /compact [focus] manually compacts |
| Context/status line | Shows cwd, git branch, token counts, context usage, provider, model, effort |
| !command | !ls -la runs shell inline from the REPL |
| Prompt templates | /name expands .pi/prompts/name.md or ~/.pi/agent/prompts/name.md |
| Skills | /skill:name loads SKILL.md from local or user skill directories |
| Session export | /export writes markdown from .pu-events.jsonl |
| Pipe mode | --pipe for clean stdout, composable with other tools/agents |
| Checkpoint/resume | Writes .pu-history.json by default; override with AGENT_HISTORY=file.json |
| Confirmation mode | AGENT_CONFIRM=1 asks before every tool execution; safely denies when no TTY |
| Event log | Every step logged to .pu-events.jsonl as structured JSONL |
| Regression tests | bash eval/test_real.sh runs 105 no-API behavioral tests |
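The "rejects empty or non-unique matches" rule from the File editing row comes down to counting occurrences before replacing. A rough sketch (not pu.sh's actual code; `grep -c` counts matching lines, so this is only exact for single-line oldText):

```shell
# Hypothetical sketch of the edit tool's uniqueness gate: refuse to edit
# unless oldText is non-empty and matches exactly once.
old="needle"
file="edit-demo.txt"
printf 'hay\nneedle\nhay\n' > "$file"

matches=$(grep -c -F -- "$old" "$file")
if [ -z "$old" ] || [ "$matches" -ne 1 ]; then
  verdict="rejected ($matches matches)"   # ambiguous or empty: bail out
else
  verdict="allowed"                       # exactly one match: safe to replace
fi
rm -f "$file"
echo "$verdict"
```

The point of the gate: a non-unique match means the model's oldText is ambiguous, and silently replacing the first occurrence would corrupt the file.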
Let's be honest about what's missing. Closing the remaining gap to a production harness would need a real runtime:
- No TUI (it's a shell script, not a lifestyle)
- No streaming display (curl waits for the full response like a patient person)
- No image input
- No OAuth/browser login; API keys only
- No native Windows support
- No keyboard shortcuts, path completion, themes, or raw-terminal editor
- No package manager or TypeScript plugin SDK
- No full model registry/pricing database
- No general JSON parser; it uses targeted awk parsing for provider shapes
pu.sh is the same slop cannon but small enough to inspect end to end and know exactly where the slop is coming from.
pu.sh < 50 KB █ (sh + curl + awk + common Unix tools)
Claude Code 209 MB ██████████████████████████
Goose CLI 237 MB █████████████████████████████
Pi + Node 281 MB ███████████████████████████████████
SWE-agent Docker 1.8 GB ██████████████████████████████████████████████████████████████...
Measured locally on macOS arm64. Current generated pu.sh is 49,276 bytes (48.12 KiB) by wc -c; the headline stays under 50 KB. Larger tools include their runtime/package footprints as described in final_report.md.
All configuration is env vars. The optional ~/.pu.env is created by /login or the first-run wizard with 0600-style permissions and parsed with a tiny allowlist loader.
| Variable | Default | What |
|---|---|---|
| AGENT_PROVIDER | auto from key/model, else anthropic | anthropic or openai |
| AGENT_MODEL | claude-opus-4-7 or gpt-5.5 | Model id |
| ANTHROPIC_API_KEY | — | Anthropic API key |
| OPENAI_API_KEY | — | OpenAI API key |
| AGENT_EFFORT | medium | none/low/medium/high/xhigh |
| AGENT_REASONING_SUMMARY | auto | OpenAI reasoning summary request: auto/concise/detailed/off |
| AGENT_THINKING | — | Legacy/Anthropic thinking hint; falls back into effort behavior |
| AGENT_MAX_STEPS | 100 | Max API/tool-loop steps before stopping |
| AGENT_MAX_TOKENS | 4096 | Base visible-output budget; raised for higher effort |
| AGENT_CONTEXT_LIMIT | 400000 OpenAI-ish / 272000 Opus-ish | Approximate context budget in bytes/chars |
| AGENT_RESERVE | 16000 | Reserved context budget before compaction |
| AGENT_KEEP_RECENT | 80000 | Approx bytes/chars of recent transcript to keep after compaction |
| AGENT_TOOL_TRUNC | 100000 | Max non-read tool output before truncation |
| AGENT_READ_MAX | 1000000 | Require offset/limit for larger file reads |
| AGENT_LOG_TRUNC | 20000 | Max event-log payload before trace-only truncation |
| AGENT_CONFIRM | 0 | 1 = ask before each tool call |
| AGENT_LOG | .pu-events.jsonl | Event/debug JSONL log file |
| AGENT_HISTORY | .pu-history.json | Checkpoint file for automatic resume |
| AGENT_SYSTEM | built-in | Custom system prompt |
| AGENT_PRICE_IN_PER_MTOK / AGENT_PRICE_OUT_PER_MTOK | 0 | Optional cost display with --cost |
| AGENT_DEBUG_API | — | Directory to capture per-call input/response JSON for debugging |
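The allowlist loading mentioned above can be sketched like this (hypothetical names and file handling, not pu.sh's real loader): only known variables are exported, and anything else in the file is silently dropped.

```shell
# Hypothetical sketch of allowlist env loading: a hostile or sloppy
# ~/.pu.env can't override PATH or inject arbitrary variables.
envfile="pu-env-demo"
printf 'AGENT_MODEL=gpt-5.5\nPATH=/evil\n' > "$envfile"

allow=" AGENT_PROVIDER AGENT_MODEL AGENT_EFFORT ANTHROPIC_API_KEY OPENAI_API_KEY "
while IFS='=' read -r key val; do
  case "$allow" in
    *" $key "*) export "$key=$val" ;;   # allowlisted: take it
    *) ;;                               # everything else (like PATH) is ignored
  esac
done < "$envfile"
rm -f "$envfile"
echo "AGENT_MODEL=$AGENT_MODEL"
```

Parsing line-by-line with `read` instead of sourcing the file means the env file can never execute code.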
| Command | What |
|---|---|
| /model [id] | Show or switch model; guesses provider from gpt-*/o*/claude-* |
| /effort [level] | Show or set reasoning effort (none, low, medium, high, xhigh, etc.) |
| /reasoning [mode] | Show or set OpenAI reasoning summaries (auto, concise, detailed, off) |
| /login | Run API-key setup wizard |
| /logout | Remove ~/.pu.env and unset in-process keys |
| /flush | Reset the session: clear memory/history, remove history metadata, and truncate event log |
| /compact [focus] | Summarize older context, optionally with focus text |
| /export [file] | Export event log to markdown |
| /skill:name | Load name/SKILL.md into the system prompt |
| /quit | Exit |
| !cmd | Run a shell command directly |
| /template | If .pi/prompts/template.md exists, run it as a prompt |
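The /template lookup order can be sketched as follows (illustrative, not pu.sh's actual code): local project prompts win over user-level ones.

```shell
# Hypothetical sketch of /name prompt-template expansion: check the
# project's .pi/prompts first, then the user directory, use the first hit.
name="review"
base=$(mktemp -d)                      # sandbox so the demo touches no real .pi
mkdir -p "$base/.pi/prompts"
printf 'Review this code for bugs.\n' > "$base/.pi/prompts/$name.md"

prompt=""
for dir in "$base/.pi/prompts" "$HOME/.pi/agent/prompts"; do
  if [ -f "$dir/$name.md" ]; then
    prompt=$(cat "$dir/$name.md")      # first match wins
    break
  fi
done
rm -rf "$base"
echo "$prompt"
```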
┌─────────────────────────────────────────┐
│ You type a thing │
│ ↓ │
│ curl sends it to Claude/GPT │
│ ↓ │
│ Model asks for a tool │
│ ↓ │
│ Shell runs read/write/edit/bash/etc. │
│ ↓ │
│ Result goes back to model │
│ ↓ │
│ Model says done │
└─────────────────────────────────────────┘
Zero package dependencies. Under 50KB. 7 tools. 2 providers. 1 file.
OpenAI uses /v1/responses with Responses-style tools and max_output_tokens. Anthropic uses /v1/messages. The parser is targeted awk, not a general JSON implementation.
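For the Anthropic side, building the request body with printf might look like this sketch (printed rather than POSTed so it runs without a key; the endpoint, headers, and fields shown are the standard Anthropic Messages API ones, but this is not pu.sh's exact code):

```shell
# Sketch of constructing a minimal /v1/messages request body with printf.
model="claude-opus-4-7"
task="say hi"
body=$(printf '{"model":"%s","max_tokens":4096,"messages":[{"role":"user","content":"%s"}]}' \
  "$model" "$task")
echo "$body"

# Actually sending it requires ANTHROPIC_API_KEY:
#   curl -s https://api.anthropic.com/v1/messages \
#     -H "x-api-key: $ANTHROPIC_API_KEY" \
#     -H "anthropic-version: 2023-06-01" \
#     -H "content-type: application/json" \
#     -d "$body"
```

Real user input would need JSON escaping before interpolation; that escaping is one of the things the regression suite tests.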
The short version:
prompt → provider → tool call → shell tool → tool result → repeat
pu.sh writes .pu-history.json for resumable model memory and .pu-events.jsonl for event replay/export. Long sessions auto-compact by summarizing older transcript entries and keeping a bounded recent tail.
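The targeted-awk technique used for replay and export (illustrative, not pu.sh's exact program) pulls one known field per JSONL line with no general JSON parser:

```shell
# Extract the "type" field from each line of a JSONL event log: split on
# the literal key, then take everything up to the closing quote.
log="events-demo.jsonl"
printf '%s\n%s\n' \
  '{"type":"tool_call","name":"bash"}' \
  '{"type":"assistant","text":"done"}' > "$log"

types=$(awk -F'"type":"' 'NF > 1 { split($2, a, "\""); print a[1] }' "$log")
rm -f "$log"
echo "$types"
```

This only works because the event shapes are known and controlled; that's the trade the "no general JSON parser" line is describing.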
For details, see How pu works.
# No API calls, no cost. Current expected result: PASS: 105 FAIL: 0.
bash eval/test_real.sh
# Shell syntax.
sh -n pu.sh

The current regression suite covers:
- JSON escaping and targeted JSON extraction
- Anthropic and OpenAI response parsing
- OpenAI Responses request shape and reasoning gating
- OpenAI tool continuation with `reasoning` + `function_call_output`
- API-error reporting, curl transport failures, model-error hints, and non-retryable auth errors
- first-run key sanitization and safe allowlist `~/.pu.env` loading
- default history save/resume of final assistant responses
- context compaction invariants
- tool truncation
- edit uniqueness/mode preservation and actionable edit-failure guidance
- `grep`/`find` noisy-directory exclusions and the `/effort` command
- trailing-newline preservation for `write`/`edit`
- `read` with `limit:0`
- spinner quietness on non-TTY stderr
- `set -e` is a serial killer. `[ -f file ] && do_thing` returns 1 when the file doesn't exist; `set -e` treats that as fatal and silently kills your script. We use `set -u`, not `set -e`.
- macOS sed ≠ GNU sed. The classic multiline sed trick breaks on BSD sed. Use `awk`.
- jq was a dependency. We wrote targeted `awk` JSON extraction to keep the install at zero.
- Heredocs don't survive JSON reliably. The system prompt steers models to the `write` tool instead.
- OpenAI tool calling is not Chat Completions anymore. For reasoning + tools, pu.sh uses the Responses API, carries `reasoning` items forward, and sends `function_call_output` items.
- Shell command substitution eats trailing newlines. `write` and `edit` use sentinel capture to preserve the final `\n`.
- Generic status spam is worse than silence. If the model doesn't provide a real pre-tool preamble, pu.sh just prints the actual tool call instead of `Inspecting with tools...` forever.
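The sentinel-capture trick for trailing newlines, sketched (the general technique, not pu.sh's exact code):

```shell
# Command substitution strips trailing newlines, so append a marker byte
# and remove it after capture to preserve them.
f=$(mktemp)
printf 'line\n\n' > "$f"          # content ends with two newlines (6 bytes)

plain=$(cat "$f")                 # substitution eats both trailing \n
guarded=$(cat "$f"; printf x)     # sentinel x protects the newlines
guarded=${guarded%x}              # strip the sentinel; newlines survive
rm -f "$f"

echo "plain=${#plain} guarded=${#guarded}"   # prints: plain=4 guarded=6
```

Without the sentinel, an edit round-trip would silently drop the file's final newline, which is exactly the kind of diff noise the regression suite guards against.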
pu.sh is a derived work, and we want to be loud about it. The system prompt structure, 7-tool surface (bash read write edit grep find ls), exact-text editing model, context-file convention, and skill/template ideas are inspired by Pi. Huge thanks and respect to the Pi team.
We compare against Pi feature-by-feature. Pi wins on extensibility, TUI, providers, safety, and production polish. pu.sh wins on portability and inspectability.
Is this production-ready?
It's called pu.sh. It's an under-50KB slop cannon that talks to LLM APIs via curl. You tell me.
Should I use this instead of Pi/Claude Code/Cursor?
For daily coding, probably not. Use a real tool. For CI/CD, containers, edge boxes, quick scripts, or understanding how agents actually work — ./pu.sh and see what happens.
How do I pronounce it?
However makes your coworkers the most uncomfortable.
Did you really name a coding agent after feces?
It's pu.sh. As in push. As in ./pu.sh "deploy to prod". The fact that it sounds like something else is entirely coincidental and we are very serious engineers.
Did an AI write this?
An AI and a human ran experiments, argued with shell, broke OpenAI schemas, fixed them, and learned once again that the real production incident was set -e all along.
MIT — see LICENSE. It's under 50KB. Go nuts.
