SNES/SFC ROM hacking toolkit — v0.8.1
A Python library that consolidates the tooling scattered across multiple ROM-hacking projects into a single installable package: address math, ROM header handling, tile/palette/sprite codecs, compression (LZSS + RLE), script dumping/insertion, pointer-table heuristics, Mesen2-Diz debugger IPC, asar patching, and Godot/Tiled/C++/Python export emitters.
Built with automation and streamlined romhacking in mind — primarily targeting SNES/SFC work, wiring static analysis, live debugger IPC, patch builds, and asset export into a single scriptable pipeline instead of a pile of one-off tools.
v0.8 is the first published version of the post-rewrite scope. The 0.1 line (address-only) still works through compatibility shims; 1.0 will land after examples + CLI + test suite.
pip install retrotool
Requires Python 3.12+ (uses stdlib tomllib).
retrotool/
├── core/ # primitives — ROM, addressing, binary, cache
├── project/ # TOML-based project + data-definition files
├── graphics/ # tile/palette/sprite/tilemap codecs
├── compression/ # LZSS (3 presets), RLE, registry, detector
├── script/ # .tbl codec, extractor, inserter, DTE, validator
├── debugger/ # Mesen2-Diz IPC client + automation
├── heuristics/ # pointer/text/gfx scanners + region mapper
├── asm/ # codegen, asar patcher, freespace, templates
├── extraction/ # Level/Entity/Behavior models + Pipeline
├── export/
│ ├── godot/ # .tscn / .tres / TileSet / SpriteFrames
│ ├── tiled/ # .tmx / .tsx
│ ├── cpp/ # struct headers
│ └── python/ # dataclass modules
├── ai/ # prompt templates + workflow steps
├── snes.py # back-compat shim → core.address/pointer
└── script.py # back-compat shim (via script/ package __init__)
Each submodule is importable on its own: from retrotool.compression import LZSSCodec does not
pull in the debugger, exporters, or any of graphics. This keeps CLI and GUI integrations cheap.
Address math and ROM loading. The only truly foundational module — everything else builds on it.
- SFCAddress, SFCAddressType: conversion between PC, LoROM1, LoROM2, HiROM, ExLoROM, ExHiROM.
- SFCPointer: 24-bit pointer with per-byte access and flexible constructors.
- Rom: file loader that strips SMC headers and scores candidate internal headers (LoROM/HiROM/ExHiROM) by checksum-complement XOR + map-mode sanity + printable title.
- BuildCache: SHA-256 keyed filesystem cache used by the asar patcher (and available to consumers that want to skip expensive regeneration steps).
- binary.*: integer_or_hex, hex_fmt, low/high/bank byte helpers, LE u8/u16/u24 read+write.
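The PC/LoROM/HiROM conversions listed above come down to a few shifts and masks. The following is a minimal sketch of that arithmetic, not the library's SFCAddress implementation: the function names are hypothetical, the file is assumed to be headerless, and "LoROM1" is assumed to mean the $00-based bank range because that is what the quick-start example prints.

```python
def pc_to_lorom(pc: int) -> int:
    """Map a headerless file offset to a LoROM address in the $00-based banks."""
    bank = pc >> 15                   # each LoROM bank exposes 0x8000 bytes of ROM
    offset = 0x8000 | (pc & 0x7FFF)   # ROM sits in the upper half of each bank
    return (bank << 16) | offset


def lorom_to_pc(addr: int) -> int:
    """Inverse mapping; bit 7 of the bank is dropped so $80+ mirrors also work."""
    bank = (addr >> 16) & 0x7F
    return (bank << 15) | (addr & 0x7FFF)


def pc_to_hirom(pc: int) -> int:
    """Map a headerless file offset to the $C0-based HiROM range."""
    return 0xC00000 | (pc & 0x3FFFFF)


def hirom_to_pc(addr: int) -> int:
    """Inverse mapping for HiROM: keep the low 22 bits."""
    return addr & 0x3FFFFF


print(f"{pc_to_lorom(0x5F800):#08x}")  # LoROM view of file offset 0x5F800
print(f"{pc_to_hirom(0x5F800):#08x}")  # HiROM view of the same offset
```

ExLoROM and ExHiROM add extra bank folding for ROMs above 4 MiB and are left out of this sketch.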
Project definition files. A project is a project.toml plus a tree of per-table .toml
data definitions that describe where things live in the ROM.
- ProjectConfig: parsed root document (rom info, build config, debugger config).
- DataDef: a single table/block with encoding, pointer table, data block, optional relocation, display constraints.
- load_project(path) / load_datadef(path) / load_datadefs(project).
- parse_snes_addr("$1B:8000") / parse_size("2M"): literals used in the TOML files.
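To make the two TOML literal forms concrete, here is a rough sketch of how they could be parsed. The function names carry a _sketch suffix to mark them as hypothetical stand-ins rather than the library's parse_snes_addr and parse_size, and the reading of "2M" as 2 × 1024 × 1024 bytes (not megabits) is an assumption.

```python
import re

_ADDR_RE = re.compile(r"\$([0-9A-Fa-f]{2}):([0-9A-Fa-f]{4})")
_SIZE_RE = re.compile(r"(\d+)\s*([KM]?)")
_UNITS = {"": 1, "K": 1024, "M": 1024 * 1024}  # assumption: byte units, not bits


def parse_snes_addr_sketch(text: str) -> int:
    """Turn a '$BB:OOOO' literal into a 24-bit integer (bank in the top byte)."""
    match = _ADDR_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a $BB:OOOO address: {text!r}")
    bank, offset = (int(group, 16) for group in match.groups())
    return (bank << 16) | offset


def parse_size_sketch(text: str) -> int:
    """Turn a size literal such as '512K' or '2M' into a byte count."""
    match = _SIZE_RE.fullmatch(text.strip().upper())
    if match is None:
        raise ValueError(f"not a size literal: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


print(hex(parse_snes_addr_sketch("$1B:8000")))
print(hex(parse_size_sketch("2M")))
```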
All pixel-level SNES formats.
- Palette: BGR555 ↔ RGB888 (bit-replicated), with transparent-index support.
- Tile / decode_tile / encode_tile: 1BPP-IL, 2BPP, 4BPP, 8BPP planar codec, with flipped(h=, v=) and grid compositor.
- TilemapEntry: 16-bit SNES tilemap word (tile 10b, palette 3b, priority, H-flip, V-flip).
- SpriteFrame / render_frame: compose frames from positioned 8×8 tiles; pack_atlas returns an Atlas with per-frame AtlasEntry origin metadata.
Not yet in v0.8: font.py (1BPP-IL VWF + 2BPP 16x16 glyph pipelines), animation.py.
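Two of the formats above are small enough to show as bit arithmetic. This sketch illustrates bit-replicated BGR555 to RGB888 expansion and the field layout of a 16-bit tilemap word; the function names are hypothetical and are not the Palette or TilemapEntry API.

```python
def bgr555_to_rgb888(color: int) -> tuple[int, int, int]:
    """Expand a 15-bit SNES color to 8 bits per channel by bit replication."""
    def expand(c5: int) -> int:
        # Copy the top 3 bits into the low bits so 0x1F maps to 0xFF, not 0xF8.
        return (c5 << 3) | (c5 >> 2)

    red = color & 0x1F
    green = (color >> 5) & 0x1F
    blue = (color >> 10) & 0x1F
    return expand(red), expand(green), expand(blue)


def rgb888_to_bgr555(red: int, green: int, blue: int) -> int:
    """Truncate each channel to 5 bits and pack as 0bbbbbgggggrrrrr."""
    return (red >> 3) | ((green >> 3) << 5) | ((blue >> 3) << 10)


def decode_tilemap_word(word: int) -> dict:
    """Split a tilemap word laid out as vhopppcc cccccccc."""
    return {
        "tile": word & 0x03FF,            # 10-bit character index
        "palette": (word >> 10) & 0x07,   # 3-bit palette number
        "priority": bool(word & 0x2000),
        "hflip": bool(word & 0x4000),
        "vflip": bool(word & 0x8000),
    }


print(bgr555_to_rgb888(0x7FFF))
print(decode_tilemap_word(0xC401))
```

Because the expansion only appends copies of the top bits, truncating back to 5 bits recovers the original color exactly.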
Unified codec framework. Parameterized LZSS covers all three variants shipped to date.
- LZSSCodec(params): greedy longest-match compressor + table-driven decompressor.
  - Presets: PARAMS_RBSHURA (fill 0x00, u16_le header), PARAMS_ZAMN (fill 0x20, u16_le_chain15 header), PARAMS_LEGACY (fill 0x00, no header).
  - decompress_chain(data, offset, resolve_next): handles ZAMN's bit-15-chained blocks.
- RLECodec(params, size=-1): ctrl-byte RLE (run_flag=0x80, length_mask=0x7F).
- registry.get(name, params): schemes lzss, lzss-rbshura, lzss-zamn, lzss-legacy, rle.
- scan_lzss(data, presets, ...): brute-force candidate scanner with size/ratio filters.
Not yet in v0.8: Huffman, Nintendo LZ77.
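As an illustration of what a ctrl-byte RLE with run_flag=0x80 and length_mask=0x7F can look like, here is a self-contained sketch. The exact semantics are assumptions, not RLECodec's documented behavior: a control byte with bit 7 set is taken to mean "repeat the next byte N times", and a clear bit 7 to mean "copy the next N bytes verbatim", with N stored directly in the low 7 bits.

```python
RUN_FLAG = 0x80      # bit 7 set: run packet (assumed semantics)
LENGTH_MASK = 0x7F   # low 7 bits: packet length, 1..127


def rle_decompress(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        ctrl = data[i]
        i += 1
        length = ctrl & LENGTH_MASK
        if ctrl & RUN_FLAG:
            out += bytes([data[i]]) * length   # run: one value repeated
            i += 1
        else:
            out += data[i:i + length]          # literal: raw copy
            i += length
    return bytes(out)


def rle_compress(data: bytes) -> bytes:
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        # Measure the run starting at i, capped at the 7-bit length limit.
        run = 1
        while i + run < n and run < LENGTH_MASK and data[i + run] == data[i]:
            run += 1
        if run >= 3:
            out += bytes([RUN_FLAG | run, data[i]])
            i += run
            continue
        # Otherwise gather literals until a run of three begins or the cap hits.
        start = i
        while i < n and i - start < LENGTH_MASK:
            if i + 2 < n and data[i] == data[i + 1] == data[i + 2]:
                break
            i += 1
        out += bytes([i - start]) + data[start:i]
    return bytes(out)


sample = b"AAAAAAAABCD" + b"\x00" * 300 + b"xyz"
assert rle_decompress(rle_compress(sample)) == sample
```

Runs shorter than three bytes are emitted as literals because a run packet costs two bytes either way.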
Text extraction + insertion using .tbl files.
- Table(path): loads a .tbl (HH=char lines, ** variable substitution, %% double substitution). Provides interpret_binary_data (bytes → text, longest-match decode) and encode_text (text → bytes, with [HH] hex-literal escape).
- extract_script(rom, datadef, table, address_type): reads the pointer table described in the DataDef, walks each string to its terminator, and returns ScriptEntry[] with both the raw bytes and the decoded text.
- compile_script(texts, datadef, table, ...): compiles strings back to bytes and emits a pointer table targeting a relocation address.
- find_digraphs / build_dte_table / apply_dte / savings_estimate: DTE overflow helpers for tight text budgets.
- round_trip(texts, table): validator that encode→decode→compares every string.
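The longest-match decode mentioned for Table can be pictured with a small stand-alone sketch. It handles only plain HH=char lines (the ** and %% forms are skipped), and the function names and the choice to render unmapped bytes as [HH] are assumptions for illustration, not the library's behavior.

```python
def parse_tbl(text: str) -> dict[bytes, str]:
    """Parse 'HH=char' lines into a mapping from byte sequences to text."""
    table: dict[bytes, str] = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if not key:
            continue
        try:
            table[bytes.fromhex(key)] = value
        except ValueError:
            continue  # not a hex key; ignored in this sketch
    return table


def decode(data: bytes, table: dict[bytes, str]) -> str:
    """Decode bytes by always trying the longest table key first."""
    max_len = max((len(key) for key in table), default=1)
    out = []
    i = 0
    while i < len(data):
        for size in range(min(max_len, len(data) - i), 0, -1):
            chunk = data[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            out.append(f"[{data[i]:02X}]")  # unmapped byte shown as hex literal
            i += 1
    return "".join(out)


tbl = parse_tbl("41=A\n42=B\n4142=ab\nFF=<end>")
print(decode(bytes([0x41, 0x42, 0x42, 0x07, 0xFF]), tbl))
```

Trying longer keys first is what lets a two-byte entry such as 4142 win over the single-byte entries for 41 and 42.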
Mesen2-Diz IPC client. Transport is a newline-delimited JSON protocol over Unix domain
sockets at /tmp/CoreFxPipe_{pipeName} (Windows support is currently a stub).
- MesenClient(pipe_name=...): low-level call(command, **params) plus wrappers: read_memory, write_memory, get_cpu_state, pause, resume, step, add_breakpoint, remove_breakpoint, evaluate, take_screenshot, get_rom_info, get_status.
- derive_pipe_name("Rushing Beat Shura.sfc") → "Mesen2Diz_RushingBeatShurasfc".
- paused(client): context manager that pauses emulation during a block.
- run_until_breakpoint(client, addr, ...): install one-shot breakpoint, resume, poll.
- MemoryRegion + watch(...): tick/diff loop over a ROM or RAM range.
Untested against a live Mesen process in v0.8 — wire format is implemented per the documented protocol.
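The transport described above (one JSON document per line over a stream socket) can be sketched without a running emulator. Everything here is illustrative: LineJsonChannel and derive_pipe_name_sketch are hypothetical names, the "command" key is an assumed message shape rather than the documented Mesen2-Diz schema, and the pipe-name rule is inferred only from the single example in this README. A socketpair stands in for the real socket at /tmp/CoreFxPipe_{pipeName}.

```python
import json
import re
import socket


def derive_pipe_name_sketch(rom_filename: str) -> str:
    """Assumed rule: prefix plus the filename with non-alphanumerics removed."""
    return "Mesen2Diz_" + re.sub(r"[^A-Za-z0-9]", "", rom_filename)


class LineJsonChannel:
    """Newline-delimited JSON framing over an already-connected stream socket."""

    def __init__(self, sock: socket.socket) -> None:
        self._sock = sock
        self._reader = sock.makefile("rb")  # buffered, so readline() is cheap

    def send(self, payload: dict) -> None:
        # json.dumps escapes embedded newlines, so one message is one line.
        self._sock.sendall(json.dumps(payload).encode("utf-8") + b"\n")

    def recv(self) -> dict:
        line = self._reader.readline()
        if not line:
            raise ConnectionError("peer closed the connection")
        return json.loads(line)

    def close(self) -> None:
        self._reader.close()
        self._sock.close()


# Demo with an in-process socket pair instead of a live debugger.
left, right = socket.socketpair()
client, server = LineJsonChannel(left), LineJsonChannel(right)
client.send({"command": "pause"})
client.send({"command": "step", "count": 2})
first, second = server.recv(), server.recv()
client.close()
server.close()
print(first, second)
```

Reading through a buffered file object keeps two back-to-back messages separate even when they arrive in a single recv.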
Static ROM-scanning heuristics.
- scan_pointer_tables(rom, entry_size=2, bank=?, valid_range=?, ...): slides across the ROM looking for runs of pointers whose targets all resolve into a valid range. Reports entry count, target-range bounds, and monotonic fraction.
- scan_text(rom, min_length=16, ...): printable-byte runs separated by terminators.
- scan_graphics(rom, bpp=4, window_tiles=32, ...): entropy + plane-pair correlation. Intended as a first-pass filter; confirm by rendering.
- shannon_entropy(data): byte-distribution entropy (0–8).
- Region / merge_regions / fill_gaps: region-map builder that combines results from multiple scanners.
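Two of the statistics named above are easy to state precisely. The sketch below computes byte-distribution Shannon entropy in bits per byte and one plausible definition of "monotonic fraction" (the share of adjacent pointer pairs that do not decrease). That definition, and the helper names other than shannon_entropy, are assumptions rather than the library's own.

```python
import math
import struct
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Entropy of the byte histogram: 0.0 for constant data, 8.0 for uniform."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def read_u16_table(data: bytes, offset: int, count: int) -> list[int]:
    """Read `count` little-endian 16-bit entries starting at `offset`."""
    return list(struct.unpack_from(f"<{count}H", data, offset))


def monotonic_fraction(values: list[int]) -> float:
    """Share of adjacent pairs where the second value is >= the first."""
    if len(values) < 2:
        return 1.0
    rising = sum(1 for a, b in zip(values, values[1:]) if b >= a)
    return rising / (len(values) - 1)


pointers = read_u16_table(bytes.fromhex("0080108020800580"), 0, 4)
print([hex(p) for p in pointers], monotonic_fraction(pointers))
print(shannon_entropy(b"\x00\xff" * 8))
```

Pointer tables that index sequentially stored strings tend to score near 1.0 on the second measure, which is why it is useful for ranking candidates.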
Assembly patching + codegen.
- AsmBuilder: fluent builder: .label().instr().db().comment().render().
- AsarPatch(asm_file, includes=..., defines=...) + apply_patch(rom, patch, out, cache=): shells out to the asar binary, skips work when cache key matches.
- FreeSpace(regions): .allocate(length, align, tag) with coalescing and used/free bookkeeping; use it to lay out data/code placements before emitting org directives.
- templates.hook_jsl / redirect_pointer_table / freespace_block / data_block: string templates for common patterns.
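The idea behind allocate(length, align, tag) can be shown with a first-fit allocator over a list of free regions. This is a simplified sketch with a hypothetical class name; it models regions as (start, length) pairs and omits release and coalescing, which the real FreeSpace is described as handling.

```python
from dataclasses import dataclass, field


@dataclass
class FreeSpaceSketch:
    regions: list[tuple[int, int]]                      # free (start, length) pairs
    used: list[tuple[int, int, str]] = field(default_factory=list)

    def allocate(self, length: int, align: int = 1, tag: str = "") -> int:
        """First-fit: return the first aligned address with room for `length`."""
        for index, (start, size) in enumerate(self.regions):
            aligned = -(-start // align) * align        # round start up to align
            pad = aligned - start
            if pad + length > size:
                continue
            # Split the region into whatever remains before and after the block.
            pieces = []
            if pad:
                pieces.append((start, pad))
            rest = size - pad - length
            if rest:
                pieces.append((aligned + length, rest))
            self.regions[index:index + 1] = pieces
            self.used.append((aligned, length, tag))
            return aligned
        raise MemoryError(f"no free region fits {length} bytes (align {align})")


space = FreeSpaceSketch([(0x1F8003, 0x100)])
hook = space.allocate(0x20, align=0x10, tag="hook")
small = space.allocate(4, tag="flags")
print(hex(hook), hex(small), [(hex(s), n) for s, n in space.regions])
```

Keeping the alignment padding as its own free fragment lets later small allocations reuse it, as the second call shows.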
Dataclass models for the things a disassembly typically produces. Pipeline is a dependency-ordered runner so extraction can be staged.
- Level: layers + collision + triggers + spawns + palette zones.
- EntityDef / EntityRegistry: entity catalog.
- Behavior / BehaviorState: state-machine skeleton to annotate from disasm.
- Pipeline / PipelineStage: topologically-ordered runner over a shared context dict.
Text emitters for common downstream formats. Pure stdlib — no Godot/Tiled install required.
- export.godot.GdScene / GdResource: .tscn / .tres text generator with Godot's inline ExtResource("…"), SubResource("…"), Vector2(x, y) literal syntax.
- export.godot.build_tileset: TileSetAtlasSource resources + physics-layer specs.
- export.godot.build_sprite_frames: SpriteFrames resource from Animation[].
- export.godot.scaffold_project: project.godot boilerplate.
- export.tiled.build_tmx: Tiled .tmx with CSV layer data + object group for triggers and spawns.
- export.tiled.build_tsx: Tiled .tsx tileset.
- export.cpp.render_header: namespaced header with u8/u16/u24/u32 → uint*_t types.
- export.python.render_module: @dataclass module.
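Because these emitters only produce text, the core of a scene writer is string formatting. The sketch below shows roughly how a minimal Godot 4 text scene could be assembled, including the Vector2(x, y) literal form. The helper names are hypothetical, only direct children of the root are handled, and optional header fields such as load_steps and uid are left out.

```python
def gd_value(value) -> str:
    """Format a Python value using Godot's text-resource literal syntax."""
    if isinstance(value, bool):                      # check bool before numbers
        return "true" if value else "false"
    if isinstance(value, tuple) and len(value) == 2:
        return f"Vector2({value[0]}, {value[1]})"
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return str(value)


def render_scene(root_name: str, root_type: str, children: list[tuple]) -> str:
    """Render a root node plus direct children as .tscn text."""
    lines = [
        "[gd_scene format=3]",
        "",
        f'[node name="{root_name}" type="{root_type}"]',
    ]
    for name, node_type, properties in children:
        lines.append("")
        lines.append(f'[node name="{name}" type="{node_type}" parent="."]')
        lines.extend(f"{key} = {gd_value(val)}" for key, val in properties.items())
    return "\n".join(lines) + "\n"


scene_text = render_scene(
    "Stage1",
    "Node2D",
    [
        ("TileMap", "TileMapLayer", {}),
        ("Player", "CharacterBody2D", {"position": (16, 32), "visible": True}),
    ],
)
print(scene_text)
```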
LLM-assisted reverse-engineering scaffolding.
- prompts.*: templates for compression identification, text-table location, level-format discovery, asar hook generation.
- workflows.*: ordered WorkflowStep[] sequences.
- ipc_prompt.IpcPlan: structured Mesen-IPC command sequence an LLM can fill in and a MesenClient can consume directly.
- build_context(project, ...): project → prompt prelude.
from retrotool import SFCAddress, SFCAddressType
addr = SFCAddress(0x5F800, SFCAddressType.PC)
print(addr.all()) # show all applicable conversions
print(addr.hirom_address) # '0xC5F800'
print(addr.lorom1_address) # '0x0BF800'

from retrotool import Rom
rom = Rom.load("lm3.sfc")
print(rom.header.title, rom.header.mapping_name) # e.g. 'LITTLE MASTER III' 'lorom'
print(f"{rom.header.rom_size_bytes:#x}")
some_bytes = rom.read_snes(0x81_8000, length=16) # reads by SNES addr via detected mapping

project.toml:
data_dirs = ["scripts"]
[rom]
name = "Little Master III"
file = "lm3.sfc"
mapping = "lorom"
size = "2M"
expanded_size = "4M"
[rom.vectors]
reset = "$80:FFFC"
nmi = "$80:FFEA"
[build]
assembler = "asar"
output_dir = "out/"
cache_dir = ".cache/"
[debugger]
type = "mesen-diz"

scripts/main_dialog.toml:
[table]
name = "main-dialog"
type = "pointer"
[encoding]
table_file = "tables/eng.tbl"
terminator = 0x00
[pointers]
address = "$1B:8000"
count = 512
size = 2
bank_override = "$1B"
[data]
start = "$1B:8400"
[relocation]
target = "$C1:8000"
pointer_size = 3

from retrotool import load_project
from retrotool.project import load_datadefs
proj = load_project("path/to/project.toml")
for d in load_datadefs(proj):
    print(d.name, hex(d.pointers.address), d.pointers.count)

from pathlib import Path
from retrotool import Table, extract_script, load_project
from retrotool.project import load_datadefs
from retrotool.core import SFCAddressType
proj = load_project("examples/lm3")
rom = Path(proj.rom_path).read_bytes()
datadefs = load_datadefs(proj)
main = next(d for d in datadefs if d.name == "main-dialog")
tbl = Table(proj.root / main.encoding.table_file)
script = extract_script(rom, main, tbl, SFCAddressType.LOROM1)
for entry in script.entries[:5]:
    print(entry.id, entry.text)

from retrotool.compression import LZSSCodec, PARAMS_ZAMN, PARAMS_RBSHURA
codec = LZSSCodec(PARAMS_RBSHURA)
blob = b"Hello, World! " * 10
packed = codec.compress(blob).data
assert codec.decompress(packed).data == blob
# ZAMN chain handling:
zamn = LZSSCodec(PARAMS_ZAMN)
def resolve(data, ptr_off):
    # read the 4-byte LoROM pointer at ptr_off, return its PC offset in `data`
    ...
all_bytes = zamn.decompress_chain(rom_data, first_block_offset, resolve).data

from retrotool.graphics import Palette, decode_tiles, tile_to_rgba
palette = Palette.from_bytes(rom_data, offset=0x14_2000, count=16)
tiles = decode_tiles(rom_data, offset=0x14_4000, count=64, bpp=4)
first_rgba = tile_to_rgba(tiles[0], palette) # 8*8*4 bytes

from retrotool.debugger import MesenClient, derive_pipe_name, paused
with MesenClient(derive_pipe_name("Rushing Beat Shura (J).sfc")) as mesen:
    with paused(mesen):
        regs = mesen.get_cpu_state()
        print(regs["pc"], regs["a"])
        data = mesen.read_memory("SnesWorkRam", 0x7E_1000, 128)
    bp = mesen.add_breakpoint(0xC0_8000, memory_type="SnesPrgRom", break_on="exec")
    mesen.resume()
    # ... poll get_status, then:
    mesen.remove_breakpoint(bp)

from retrotool.heuristics import (
    scan_pointer_tables, scan_text, scan_graphics,
    Region, merge_regions, fill_gaps,
)
rom = open("lm3.sfc", "rb").read()
ptrs = scan_pointer_tables(rom, entry_size=2, bank=0x1B,
                           valid_range=(0xD_8000, 0xE_8000), min_entries=16)
texts = scan_text(rom, min_length=16)
gfx = scan_graphics(rom, bpp=4, window_tiles=32)
regions = (
    [Region(p.offset, p.count * p.entry_size, "pointer_table", p.monotonic_fraction) for p in ptrs]
    + [Region(t.offset, t.length, "text", t.printable_ratio) for t in texts]
    + [Region(g.offset, g.length, "graphics", g.plane_correlation) for g in gfx]
)
classified = fill_gaps(merge_regions(regions, gap_tolerance=4), len(rom))

from pathlib import Path
from retrotool import BuildCache
from retrotool.asm import AsarPatch, apply_patch
cache = BuildCache(".cache")
result = apply_patch(
    rom=Path("lm3.sfc"),
    patch=AsarPatch(asm_file=Path("patches/main.asm"),
                    includes=[Path("patches/lib.asm")],
                    defines={"VERSION": "english"}),
    out=Path("out/lm3.patched.sfc"),
    cache=cache,
)
print("cache hit" if result.cache_hit else "rebuilt", result.ok)

from retrotool.export.godot import GdScene, GdNode, build_tileset, TileAtlas
from retrotool.export.tiled import build_tmx, build_tsx
scene = GdScene(
    root_name="Stage1", root_type="Node2D",
    nodes=[
        GdNode("TileMap", "TileMapLayer"),
        GdNode("Player", "CharacterBody2D", properties={"position": (16, 32)}),
    ],
)
open("stage1.tscn", "w").write(scene.render())
tileset = build_tileset([TileAtlas("res://tiles.png", (8, 8), 32, 256)])
open("tileset.tres", "w").write(tileset.render())
# level is a retrotool.extraction.Level
open("stage1.tmx", "w").write(build_tmx(level, tileset_source="tiles.tsx"))
open("tiles.tsx", "w").write(build_tsx("tiles", "tiles.png", 256, 256))

v0.1 import paths still work:
from retrotool.snes import SFCAddress, SFCAddressType, SFCPointer, lorom_to_hirom
from retrotool.script import Table

These re-export from the new modules; existing scripts don't need updating to load under 0.8.
See project-plan.md for the full 16-phase plan and per-phase status.
Short version:
- 0.8.1 (current) — 12 library modules scaffolded, core paths smoke-tested.
- 0.9 — CLI (retrotool … subcommands) + example projects (lm3, rbshura, zamn, minimal) + pytest suite.
- 1.0 — GUI shell with project explorer, game-specific script editor, graphics extractor, built-in hex editor polling the debugger, pointer-table inspector, asar build panel; and runtime-guided heuristics that combine static scans with live Mesen state (write-breakpoint pointer discovery, DMA-trace data localization, glyph-correlation text discovery, LZSS fingerprinting via ring-buffer detection).
See LICENSE.