Skip to content

test: zero tests verify compiled code produces correct results #74

@avrabe

Description

@avrabe

Summary

The entire Rust test suite (~895 tests) checks structural properties — opcode names, ELF format validity, no-crash — but none verify that compiled code actually computes correct results. There are no semantic correctness tests.


Current state

  • Structural tests: verify things like "the output contains an ADD instruction", "the ELF header is valid", "compilation does not panic"
  • Semantic tests: zero tests that compile a WASM program, execute the ARM output (or simulate it), and verify the result is correct
  • Renode tests (Bazel-only, tests/ directory): these may test end-to-end semantics via ARM Cortex-M4 emulation, but they are not part of the standard cargo test workflow

Why this matters

This is how issues like the silent NOP catch-all (#35) go undetected — if no test ever checks "did i32.add 2 3 produce 5?", a codegen bug that emits NOP instead of ADD passes the entire test suite.

The Rocq proofs provide formal assurance for a subset of operations, but as documented in the proof divergence issue, the proofs verify a different program than the Rust compiler actually produces. Without semantic tests, there is no automated check that the shipped compiler produces correct output for any operation.


Recommendation

Add tests that:

  1. Compile simple WASM programs (single-function, single-operation)
  2. Either simulate the ARM output or use a known-good interpreter to verify correctness
  3. At minimum, cover the i32 operations that have Rocq proofs (add, sub, mul, div, rem, and/or/xor, shl/shr, eq/ne/lt/gt/le/ge)

A lightweight approach: for each WASM op, compile (i32.const A) (i32.const B) (i32.OP) and verify the ARM instruction sequence would produce the correct result via a simple Thumb-2 interpreter or by cross-checking register values against expected outputs.

This would also serve as a regression test to detect any future divergence between the Rocq proofs and the Rust implementation.

Metadata

Metadata

Assignees

No one assigned

    Labels

    qualityCode quality improvementstestingTest coverage and test infrastructure

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions