Standalone validator and export tool for Spur-format Smalltalk image files. Walks the entire heap and checks structural integrity, class table consistency, method validity, and more. Reports errors (fatal corruption) and warnings (suspicious but non-fatal conditions).
Also supports exporting image contents in diff-friendly formats: per-object SHA-256 manifests (for change detection), full object catalogs, class hierarchy trees, and object reference graphs.
Format Description Dialect
68021 Spur 64-bit Pharo 5+, Squeak 5.3+, Cuis 6+
68533 Spur 64-bit + Sista V1 Pharo 9+ (Sista bytecodes)
6521 Spur 32-bit Squeak 5.x (32-bit), Cuis 5.x
6505 Spur 32-bit + Sista V1 (if used)
Also supports the iOS-variant immediate tag encoding used by iospharo at runtime — see iOS Tag Variant below.
Requires CMake 3.16+ and a C++17 compiler.
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build .
validate_smalltalk_image [options] <image-file>
Options:
--ios-tags Use iOS tag variant (see below)
--reachability Run reachability analysis (finds unreachable objects)
--json Output JSON instead of human-readable text
--verbose Print findings as they are discovered
--max-errors N Stop after N errors (default: unlimited)
--help Show help
Export options (mutually exclusive):
--export-shasum Per-object SHA-256 manifest (for diff)
--export-catalog Full object metadata catalog
--export-hierarchy Class inheritance tree
--export-graph Object reference graph
Export modifiers:
--csv CSV output (with --export-catalog)
--filter-class NAME Restrict to instances of named class
--graph-root OFFSET Hex offset of subgraph root (with --export-graph)
--graph-depth N Max traversal depth (with --export-graph)
Exit codes:
0 No errors (or export succeeded)
1 Errors found (image has structural problems)
2 Usage error
Export output goes to stdout; progress and validation messages go to stderr. This lets you pipe or redirect exports cleanly:
validate_smalltalk_image --export-shasum image.image > shasums.txt
Check Description
heap.objectOverflow Object extends past end of heap
alignment.misaligned Object header not 8-byte aligned
classIndex.outOfRange Class index exceeds 22-bit maximum
forwarded.inSavedImage Forwarded object (classIndex=8) found in saved image
oop.invalidPointer Pointer slot contains invalid Oop (not immediate, not in heap)
oop.misalignedPointer Pointer slot contains unaligned heap address
oop.notObjectStart Pointer targets mid-object, not a header
classTable.missingClass Object references class index not in class table
classTable.badPage Class table page pointer invalid
classHierarchy.nonPointer Class object has non-pointer format
classHierarchy.tooFew Class has fewer than 3 slots
classHierarchy.superCycle Cycle detected in superclass chain
classHierarchy.badSuper Superclass is not a valid Oop
classHierarchy.badMethodDict Method dictionary is not a valid pointer
specialObjects.outOfRange Special objects Oop outside heap
specialObjects.notStart Special objects Oop not an object start
specialObjects.badFormat Special objects array is not pointer format
specialObjects.tooSmall Special objects array has < 60 entries
specialObjects.badEntry Critical special object is not a heap pointer
method.badHeader Method header (slot 0) is not a SmallInteger
method.literalOverflow numLiterals exceeds available slots
method.literalPastBody Literal frame extends past method body
slot.pastHeap Object slot data extends past heap end
Check Description
format.reserved Object uses reserved format (6, 7, or 8)
format.zeroWithSlots ZeroSized format but has > 0 slots
format.bytePadding Byte object with 0 slots but non-zero unused count
freeChunk.nonZeroFormat Free chunk (classIndex=0) with non-zero format
overflow.smallCount Overflow header with slot count < 255
overflow.veryLarge Object larger than 100 MB
classTable.unexpectedFormat hiddenRootsObj has unexpected format
classHierarchy.deepChain Superclass chain depth >= 200
classHierarchy.badInstSpec Instance specification is not a SmallInteger
classHierarchy.badName Class name is not a byte or IndexableFixed object
classHierarchy.mdBadFormat Method dictionary has non-pointer format
classHierarchy.mdTooSmall Method dictionary has < 2 slots
method.argsVsTemps numArgs > numTemps in method header
method.badPrimBytecode hasPrimitive set but first byte is not 248
method.primOutOfRange Primitive index > 8191
header.veryLargeImage Image > 4 GB
image.truncated File is shorter than header claims
reachability.unreachable Objects not reachable from roots (with --reachability)
specialObjects.nilMismatch nil slot doesn't point to first heap object
specialObjects.falseMismatch false slot doesn't point to second heap object
specialObjects.trueMismatch true slot doesn't point to third heap object
Statistic Description
Total objects Count of non-free heap objects
Total size Sum of all non-free object sizes
Free chunks / bytes Count and size of free space
Heap utilization Percentage of heap that is live objects
Pointer/Byte/Word objects Breakdown by format category
Compiled methods Count of CompiledMethod objects
Weak objects Count of weak/ephemeron objects
Overflow headers Objects with > 254 slots
Pinned / Immutable Count of flagged objects
Max slot count / object size Largest object in the heap
Class table entries Number of classes in the class table
Top 20 classes Most-instantiated classes by count
Reachable / unreachable Object reachability (with --reachability)
One line per non-free object: heap offset, identity hash, class name, and
SHA-256 of the object's content bytes. Designed for diff:
validate_smalltalk_image --export-shasum before.image > before.txt
# ... run tests in Pharo VM ...
validate_smalltalk_image --export-shasum after.image > after.txt
diff before.txt after.txt
Output format:
0000000000000000 0f4d32 UndefinedObject b73a37ba...
0000000000000010 063372 False 59dcf714...
The content hash covers the stable header (classIndex, format, identity hash, slot count — excluding mutable GC flags) plus all body bytes. Objects with changed content produce different hashes. New or deleted objects appear as added or removed lines.
Use --filter-class NAME to restrict output to instances of a specific class.
Full metadata for every non-free object. Default output is aligned text;
use --json for JSON or --csv for CSV.
validate_smalltalk_image --export-catalog image.image
validate_smalltalk_image --export-catalog --json image.image
validate_smalltalk_image --export-catalog --csv image.image
Fields: offset, classIndex, className, format, slots, bytes, identityHash, contentHash, flags (immutable, pinned, overflow).
Prints the full class inheritance tree with method counts and instance
variable counts. Default is indented text; use --json for a flat JSON array.
validate_smalltalk_image --export-hierarchy image.image
Sample output:
ProtoObject (classIndex=3129, methods=51, instVars=0)
Object (classIndex=3161, methods=446, instVars=0)
Collection (classIndex=135, methods=112, instVars=0)
...
Adjacency list of object references. Each pointer object gets one line showing the offsets of all objects it references. Non-pointer objects (bytes, words) are omitted since they have no outgoing references.
validate_smalltalk_image --export-graph image.image
Use --graph-root OFFSET and --graph-depth N to export a subgraph rooted
at a specific object:
validate_smalltalk_image --export-graph --graph-root 1a340 --graph-depth 3 image.image
On a 52 MB Pharo 13 image (740K objects):
--export-shasum: ~0.7s
--export-catalog: ~0.7s
--export-hierarchy: ~0.4s
--export-graph: ~0.4s (full), varies for subgraphs
The --ios-tags flag is for validating the in-memory representation used
by iospharo, not a separate image file
format. Saved .image files are always standard Spur format, interchangeable
with any Pharo VM.
Standard Spur 64-bit uses single-bit tag positions to distinguish immediates:
Tag bits Value Meaning
000 0x0 Object pointer (heap address)
001 0x1 SmallInteger
010 0x2 Character
100 0x4 SmallFloat
On iOS, ASLR can place the heap at arbitrary addresses, so a fast "is this an immediate?" check cannot rely on specific high-address patterns. iospharo shifts the tag encoding so that all immediates have bit 0 set:
Tag bits Value Meaning
000 0x0 Object pointer (heap address)
001 0x1 SmallInteger
011 0x3 Character
101 0x5 SmallFloat
This makes the immediate check a single bit test: (oop & 1) != 0.
iospharo does not define a new image format. The conversion happens at load/save time:
- Load:
ImageLoaderreads a standard Spur.imageand converts standard tags (0x2, 0x4) to iOS tags (0x3, 0x5) as objects enter memory. - Save:
ImageWriterconverts iOS tags back to standard tags before writing. The resulting.imageis byte-identical in format to one saved by the reference Pharo VM.
A freshly downloaded Pharo image has standard tags. A running iospharo process has iOS tags in memory. A saved image from iospharo has standard tags again.
Use --ios-tags only if you are validating a raw memory dump from a
running iospharo process (e.g., captured via a debugger). For normal .image
files — whether saved by iospharo or any other Pharo VM — use the default
(standard) tag mode.
The validator loads the image file and processes it in phases:
- Parse and validate the Spur image header
- Linear heap walk: scan every object from start to end
- Per-object structural checks: format, alignment, overflow headers
- Pointer slot validation: every Oop in every pointer object is checked against the set of known object starts
- Class table: validate hiddenRootsObj pages and cross-reference all objects' class indices
- Special objects array: check critical entries (nil, false, true, SmallInteger class, etc.)
- Class hierarchy: superclass chains (cycle detection), method dictionaries, instance specifications, class names
- Compiled methods: header decoding, literal frames, primitive indices
- Reachability (optional): BFS from roots to find unreachable objects
The validator works on raw file offsets — it does NOT relocate pointers. All
Oop validation uses the original startOfMemory base address as stored in the
image header.
All Spur variants (32-bit and 64-bit) use a 64-bit object header:
Bits 0-21: Class index (22 bits, index into class table)
Bit 23: Immutable flag
Bits 24-28: Object format (5 bits, 0-31)
Bit 29: Remembered flag
Bit 30: Pinned flag
Bit 31: Grey flag (GC)
Bits 32-53: Identity hash (22 bits)
Bit 55: Marked flag (GC)
Bits 56-63: Slot count (0-254, or 255 = overflow header)
Objects with > 254 slots have a two-word header: the preceding 64-bit word contains the actual slot count in its low 56 bits.
On a 52 MB Pharo 13 image (740K objects):
Validation only: ~0.3s
With reachability: ~0.9s
SHA-256 export: ~0.7s
Catalog export: ~0.7s
Hierarchy export: ~0.4s
Graph export: ~0.4s
Other repos in this collection:
- iospharo — Pharo Smalltalk VM for iOS and Mac Catalyst (interpreter-only, low-bit oop encoding for ASLR compatibility).
- pharo-headless-test — Headless Pharo test runner with a fake GUI; clicks menus, takes screenshots, runs SUnit without a display.
- soogle — Smalltalk code search engine that indexes packages across Pharo, Squeak, GemStone and more.
- claude-skills — Open source skills for Claude Code: reusable knowledge and algorithms packaged as
.claude/skills/markdown files.
MIT License. See LICENSE.