fix: async adapter string params + shim routing #96
Merged
Add cross-memory copy for string/list INPUT params in the async callback adapter. When the call crosses a memory boundary, the adapter allocates in callee memory via cabi_realloc and copies string data from caller memory before calling [async-lift].
Text functions still fail because the text_processor uses intra-component forwarding functions (call_indirect through fixup table) that route to canonical task.return. These forwarding functions bypass our shim. Fix requires the component wrapper to provide shim functions instead of canonical task.return for shimmed imports.
73/73 P2 runtime tests pass. P3 compute functions correct.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Export task.return shims from fused module ($task_return_shim_N)
- Component wrapper aliases shim exports instead of using canonical task.return for shimmed imports (fixes intra-component forwarding)
- Fix component_realloc_index to prefer module 0's realloc
Text functions: shim routing works (no more "invalid task.return signature") but param copy uses wrong pointer_pair_positions for mixed-type params (copies shift instead of text_ptr). Needs the sync adapter's position mapping logic.
73/73 P2 tests pass. P3 compute functions correct.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Export task.return shims ($task_return_shim_N) from fused module
- Component wrapper aliases shim exports instead of canonical task.return
- Add string param cross-memory copy (caller→callee via cabi_realloc)
- Fix component_realloc_index to prefer module 0
- Debug logging for param copy positions
Text functions: shim routing works but pointer_pair_positions has wrong offsets for mixed-type params (e.g., caesar(u32, string) reports position [0] instead of [1]). Needs resolver-level fix for async adapter param flattening.
73/73 P2 tests pass. P3 compute functions correct.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Compute pointer pair positions from CALLER's flat param types
instead of the CALLEE's component type order. The caller's locals
are in canon-lower order which may differ from the callee's
component type param order.
Results:
caesar 3 "hello" → runs without crash (param copy works!)
but result not printed (string result delivery incomplete)
analyze "hello world" → partial output with garbage pointers
(complex record type not fully handled)
prime/fibonacci/factorial/collatz → all correct
73/73 P2 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
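The position computation described in this commit can be illustrated with a small model. This is a minimal sketch, not the resolver's code: `ParamTy`, `flat_len`, and this `pointer_pair_positions` signature are hypothetical stand-ins, and the flattening assumes each string or list lowers to an `(i32 ptr, i32 len)` pair while scalars occupy one flat slot.

```rust
/// Hypothetical model of component-level parameter types
/// (illustrative only, not the project's real type definitions).
#[derive(Clone, Copy, Debug, PartialEq)]
enum ParamTy {
    U32,
    U64,
    Str,     // assumed to flatten to (ptr: i32, len: i32)
    ListU32, // assumed to flatten to (ptr: i32, len: i32)
}

impl ParamTy {
    /// Number of flat core-wasm values this parameter occupies.
    fn flat_len(self) -> usize {
        match self {
            ParamTy::U32 | ParamTy::U64 => 1,
            ParamTy::Str | ParamTy::ListU32 => 2,
        }
    }

    /// True when the parameter carries a pointer that needs cross-memory copy.
    fn is_pointer_pair(self) -> bool {
        matches!(self, ParamTy::Str | ParamTy::ListU32)
    }
}

/// Walks the parameters in the CALLER's order and returns the flat index of
/// the `ptr` slot of every (ptr, len) pair.
fn pointer_pair_positions(params: &[ParamTy]) -> Vec<usize> {
    let mut positions = Vec::new();
    let mut flat_idx = 0;
    for p in params {
        if p.is_pointer_pair() {
            positions.push(flat_idx);
        }
        flat_idx += p.flat_len();
    }
    positions
}
```

Under these assumptions, `pointer_pair_positions(&[ParamTy::U32, ParamTy::Str])` yields `[1]` for a `caesar(u32, string)`-shaped signature, which is the position the earlier commit message says was expected.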
Reverting the forwarding function body replacement approach: the forwarding functions have different types than the shims (dispatch type (i32,i32) vs actual task.return type).
The wrapper-level shim routing handles the import table correctly. The forwarding function's call_indirect table[N] resolves to the import function, which the wrapper provides as the shim export. The types match for (i32,i32) → () shims (string results).
Caesar runs without crash but produces empty output; the shim globals may not be receiving values. Needs deeper table/import tracing.
73/73 P2 tests pass. Compute functions correct.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Patch element segments to replace task.return import references with shim function references. This ensures call_indirect through element-segment-initialized tables calls the shim instead of the stub import.
Root cause identified for remaining string function issue: the forwarding function's table index (i32.const 2) uses ORIGINAL component import numbering, but the element segment uses MERGED import numbering. Position 2 in the merged element segment is 'analyze', not 'caesar-cipher' as the forwarding function expects. This is a merger-level table index remapping issue.
73/73 P2 tests pass. Compute functions correct.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix task.return shim matching by using element segment positions instead of name-based matching. Each component's forwarding module has an element segment that maps table positions to task.return imports. After patching these with shim references, build a (comp_idx, func_name) → globals lookup using the correct positions.
For components with direct task.return calls (no forwarding module), fall back to name-based matching using original function names extracted from the main module's imports.
Results on stock wasmtime 41:
prime 7 → 7 is prime ok
fibonacci 10 → fibonacci(10) = 55 ok
factorial 5 → 120 ok
collatz 27 → 111 steps ok
uppercase → HELLO ok
reverse → dlrow olleh ok
caesar → ifmmp ok
count_chars → 11 ok
search → returns ptr values todo
frequencies → out-of-bounds todo
8/10 commands correct. 73/73 P2 tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Plumb component-level result types from CanonicalEntry::TaskReturn
through to TaskReturnShimInfo and the adapter, so cross-memory copy
of list returns can use element-aware byte counts.
For list<T>, byte_count = len * cabi_size_of(T) instead of the
previous byte-string assumption. This fixes search-positions
(list<u32>: now 4 bytes per element) and unblocks analyze
(text-stats record return). Adds cabi_size_align() implementing
the canonical ABI size/alignment rules for value types.
Result-type plumbing strategy:
- For each async callee component, build name -> result_type map
from canonical Lift entries (core_func_index -> export name via
core_entity_order traversal of CoreInstanceExport aliases, plus
type_index -> ComponentTypeKind::Function results).
- Match each shim to a result type by original_func_name, falling
back to ordered claim from the source's TaskReturn list when
the shim's source core import was rerouted to a forwarding
function.
- Pre-resolve Type(idx) references into self-contained value
types via resolve_component_val_type so the adapter doesn't
need source-component access at codegen time.
Also fixes an index-space bug where merged_idx was compared to
import-vector position instead of merged function index, causing
original_func_name extraction to fail for any component with
non-function imports interleaved before its task-return imports.
Adds emit_patch_nested_indirections + collect_indirections for
walking list<record-with-string> elements and patching nested
(ptr, len) pairs after bulk copy. Currently disabled for nested
record types -- the inner cross-memory string copy traps OOB and
needs further investigation. Lists of plain types work end-to-end.
Results on stock wasmtime 41:
prime, fibonacci, factorial, collatz ok
transform uppercase/reverse, caesar ok
count_chars ok
analyze (text-stats record) ok (was: garbage)
search (list<u32>) ok (was: panic)
frequencies (list<record<string,u32>>) todo (still panics)
9/10 P3 commands correct. 203/203 P2 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
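The element-aware sizing in this commit can be sketched as below. This is an illustrative model rather than the project's `cabi_size_align`: the `ValTy` enum and the function names are hypothetical, only a handful of value types are covered, and 32-bit linear memory is assumed (so a string or list is a 4-byte pointer plus a 4-byte length).

```rust
/// Hypothetical subset of component value types (not the project's real enum).
#[derive(Clone, Debug)]
enum ValTy {
    U8,
    U32,
    U64,
    Str,
    List(Box<ValTy>),
    Record(Vec<ValTy>),
}

/// Rounds `offset` up to the next multiple of `align`.
fn align_to(offset: u32, align: u32) -> u32 {
    (offset + align - 1) / align * align
}

/// Returns (size, alignment) in bytes, assuming 32-bit pointers.
fn size_align(ty: &ValTy) -> (u32, u32) {
    match ty {
        ValTy::U8 => (1, 1),
        ValTy::U32 => (4, 4),
        ValTy::U64 => (8, 8),
        // (ptr, len) pair: two 4-byte fields.
        ValTy::Str | ValTy::List(_) => (8, 4),
        ValTy::Record(fields) => {
            let mut offset = 0;
            let mut max_align = 1;
            for f in fields {
                let (size, align) = size_align(f);
                // Each field starts at its own alignment boundary.
                offset = align_to(offset, align) + size;
                max_align = max_align.max(align);
            }
            // Total size is padded to the record's alignment.
            (align_to(offset, max_align), max_align)
        }
    }
}

/// Byte count of a list buffer: element count times element size.
/// Returns None on 32-bit overflow instead of wrapping.
fn list_byte_count(len: u32, elem: &ValTy) -> Option<u32> {
    len.checked_mul(size_align(elem).0)
}
```

With this model a `list<u32>` of 3 elements occupies 12 bytes rather than the 3 bytes a byte-string assumption would copy, and a `record<string, u32>` element is 12 bytes with 4-byte alignment.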
Two changes to the nested-indirection patching helper:
1. Read (string ptr, len) directly from CALLEE memory at
`list_ptr + i*elem_size` instead of from caller memory after
bulk copy. The bulk copy is still needed for the record body
(count field, etc.), but for indirect fields whose values are
callee-side addresses we want the source-of-truth value
regardless of what the bulk copy produced.
2. Wrap the inner cross-memory copy in a bounds check. If
`old_ptr + buf_len` exceeds the callee's current linear memory
size, skip the patch instead of triggering an unrecoverable
OOB trap inside the adapter. With the skip in place, frequencies
still fails (the unpatched callee pointer panics in the runner),
but the failure mode is now contained to the runner side rather
than the adapter, and other typed lists with valid pointer
fields will be patched correctly.
The frequencies failure root cause is still under investigation:
loaded values from the apparent-correct callee record offsets
exceed the callee memory size, suggesting either the layout is not
canonical or the list pointer doesn't point where expected. Search,
analyze, and the other 7 P3 commands continue to work.
276/276 P2 tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
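The bounds check in item 2 amounts to the comparison sketched here. This is a host-side model under stated assumptions, not the emitted wasm: `range_in_bounds` and `WASM_PAGE_BYTES` are hypothetical names, and the arithmetic is widened to 64 bits so that `old_ptr + buf_len` cannot wrap.

```rust
/// Size of one WebAssembly linear-memory page in bytes.
const WASM_PAGE_BYTES: u64 = 65_536;

/// Returns true when the byte range [old_ptr, old_ptr + buf_len) lies
/// entirely inside a memory that currently has `mem_pages` pages.
/// The sum is computed in u64, so a huge pointer cannot wrap around
/// and appear to be in bounds.
fn range_in_bounds(old_ptr: u32, buf_len: u32, mem_pages: u32) -> bool {
    let end = u64::from(old_ptr) + u64::from(buf_len);
    end <= u64::from(mem_pages) * WASM_PAGE_BYTES
}
```

A caller in this model would skip the patch whenever `range_in_bounds` returns false, mirroring the skip-instead-of-trap behavior the commit describes.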
The wit-bindgen async lowering for `list<record>` returning component functions allocates the records buffer via `Cleanup::new`, whose drop guard fires at the end of the async block, between EXIT and when our adapter reads the shim globals. By that point the records buffer has been freed and overwritten with allocator free-list patterns (0xFFFFFFFF sentinels for multi-record cases; less visible corruption for single records because the dealloc is lazy enough for the data to survive).
The fix: extend the task.return shim to deep-copy both the records buffer and each indirect string into a fresh callee-side allocation before storing the stabilized pointer to globals. Because the stable buffer is owned by the callee allocator directly (not by the Cleanup guard), it survives the async-block teardown. The adapter's existing cross-memory copy then operates on stable data.
Implementation:
- generate_stabilizing_shim emits callee-side memcpy for the records buffer, then loops over indirection offsets to copy each sub-buffer (currently strings). Written into the shim body when result_type is list<T> with at least one indirection via collect_indirections.
- collect_indirections / cabi_size_align in adapter/fact.rs are now pub(crate) so the shim generator can use them; adapter/fact is also pub(crate).
- Adapter retains its memory-size bounds check on the inner patch copy as a safety net for other typed-list cases not yet stabilized.
Results on stock wasmtime 41:
prime, fibonacci, factorial, collatz ok
transform uppercase/reverse, caesar ok
analyze (text-stats record) ok
search (list<u32>) ok
frequencies (list<record<string,u32>>) ok (was: trap / garbage strings)
10/10 P3 commands correct. 276/276 P2 tests pass.
Known follow-up: batch (list<u64> input + list<record<string,u64,u64>> output) still returns partial garbage because param-side copy of list<u64> uses byte-count = element-count instead of element-count * 8. Separate bug on the input-side typed copy; will fix in a follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
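The deep-copy idea behind the stabilizing shim can be modeled on the host side as follows. This is only a sketch under several assumptions: `Mem` is a hypothetical stand-in for callee linear memory with a bump allocator, the element layout is hard-coded for `record<string, u32>` (string ptr at offset 0, len at offset 4, 12 bytes per record), and the real shim is emitted as wasm rather than written like this.

```rust
/// Hypothetical model of callee linear memory with a bump allocator.
struct Mem {
    bytes: Vec<u8>,
    next: usize, // next free offset
}

impl Mem {
    /// Allocates `size` bytes at `align`, growing the memory if needed.
    fn alloc(&mut self, size: usize, align: usize) -> usize {
        let start = (self.next + align - 1) / align * align;
        let end = start + size;
        if end > self.bytes.len() {
            self.bytes.resize(end, 0);
        }
        self.next = end;
        start
    }

    fn read_u32(&self, at: usize) -> u32 {
        let b = &self.bytes[at..at + 4];
        u32::from_le_bytes([b[0], b[1], b[2], b[3]])
    }

    fn write_u32(&mut self, at: usize, v: u32) {
        self.bytes[at..at + 4].copy_from_slice(&v.to_le_bytes());
    }
}

// Assumed layout of record<string, u32> on 32-bit memory.
const ELEM_SIZE: usize = 12;
const STR_PTR_OFFSET: usize = 0;
const STR_LEN_OFFSET: usize = 4;

/// Copies the records buffer and every string it points to into fresh
/// allocations, patches the string pointers, and returns the stable pointer.
fn stabilize(mem: &mut Mem, list_ptr: usize, len: usize) -> usize {
    let stable = mem.alloc(len * ELEM_SIZE, 4);
    mem.bytes.copy_within(list_ptr..list_ptr + len * ELEM_SIZE, stable);
    for i in 0..len {
        let rec = stable + i * ELEM_SIZE;
        let s_ptr = mem.read_u32(rec + STR_PTR_OFFSET) as usize;
        let s_len = mem.read_u32(rec + STR_LEN_OFFSET) as usize;
        let new_s = mem.alloc(s_len, 1);
        mem.bytes.copy_within(s_ptr..s_ptr + s_len, new_s);
        // Point the stable record at the stable copy of its string.
        mem.write_u32(rec + STR_PTR_OFFSET, new_s as u32);
    }
    stable
}
```

In this model, overwriting the original records and string bytes after `stabilize` returns leaves the stable copy readable, which is the property the shim relies on once the `Cleanup` guard has freed the original buffer.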
The async adapter's caller→callee param copy treated len as a byte count, which only matches string (1 byte per code unit). For typed lists like list<u32> or list<u64>, len is the element count and the actual buffer size is len * sizeof(T), so the adapter was copying too few bytes, truncating the input.
Fix: consult `site.requirements.param_copy_layouts` (same source the sync adapter uses) and multiply len by the per-element byte size when allocating and copying. Mirrors the result-side typed sizing that landed in commits 4569b55 / e5cd80f for the output path.
Before: `batch 5 7 10` returned fibonacci(5), ..., collatz(5), then garbage for the next inputs, because only the first u64 was copied.
After: all 9 records correct.
batch 5 10 15 →
fibonacci: input=5 output=5
is_prime: input=5 output=1
collatz_steps: input=5 output=5
fibonacci: input=10 output=55
is_prime: input=10 output=0
collatz_steps: input=10 output=6
fibonacci: input=15 output=610
is_prime: input=15 output=0
collatz_steps: input=15 output=17
11/11 P3 commands correct on stock wasmtime 41.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address the confirmed Mythos finding LS-A-7: the three transcoding
emitters (emit_utf8_to_utf16_transcode, emit_utf16_to_utf8_transcode,
emit_latin1_to_utf8_transcode) were computing the destination
allocation size via LocalGet(len); I32Const(K); I32Mul; Call(realloc)
with no upper-bound check on len and no null check on the realloc
return.
Impact on unfixed code:
(a) For len > u32::MAX / K the i32.mul wraps mod 2^32 to a small
alloc_size; cabi_realloc returns a short buffer; the transcode
loop writes up to K*len bytes out of bounds.
(b) For OOM, cabi_realloc returns 0; without a null check the
transcode loop writes into callee memory starting at offset 0.
This commit adds two guards to each emitter:
1. Before the multiply:
LocalGet(len); I32Const(u32::MAX/K); I32GtU; If; Unreachable; End
2. After the realloc call:
LocalGet(out_ptr); I32Eqz; If; Unreachable; End
Both guards trap via wasm `unreachable`, which matches the Canonical
ABI's requirement that lift/lower trap rather than silently overwrite.
Added three byte-scan regression tests (LS-A-7 PoC):
- ls_a_7_utf8_to_utf16_emits_overflow_and_null_guards
- ls_a_7_utf16_to_utf8_emits_overflow_and_null_guards
- ls_a_7_latin1_to_utf8_emits_overflow_and_null_guards
Each test emits the corresponding transcode body and scans the raw
bytes for the two guard sequences. They fail on the unfixed emitter
and pass once both guards are present.
Also appended LS-A-7 as `approved` to safety/stpa/loss-scenarios.yaml.
169/169 lib tests pass. 11/11 P3 commands correct on stock wasmtime 41.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
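The arithmetic behind impact (a) and the two guards can be checked with a small host-side model. This sketch does not emit wasm; the function names are hypothetical, `None` stands in for a wasm trap, and `k` plays the role of the per-emitter constant K.

```rust
/// Models the unguarded sequence `local.get len; i32.const k; i32.mul`:
/// wasm i32.mul wraps modulo 2^32.
fn unguarded_alloc_size(len: u32, k: u32) -> u32 {
    len.wrapping_mul(k)
}

/// Models guard 1: trap (None) when `len > u32::MAX / k`, otherwise
/// return the exact product, which is then known not to overflow.
fn guarded_alloc_size(len: u32, k: u32) -> Option<u32> {
    if k > 1 && len > u32::MAX / k {
        return None;
    }
    Some(len * k)
}

/// Models guard 2: a zero pointer returned by realloc is treated as a trap.
fn checked_ptr(out_ptr: u32) -> Option<u32> {
    if out_ptr == 0 {
        None
    } else {
        Some(out_ptr)
    }
}
```

For example, with `k = 2` and `len = 0x8000_0001`, the unguarded product wraps to 2 bytes, while the guarded version rejects the length before any allocation is requested.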
Commit fcad26a fixed the three transcoding emitters. A follow-up audit found 15 more sites across adapter/fact.rs and lib.rs with the same class of bug: an i32.mul on a caller- or callee-controlled length feeding cabi_realloc, with no overflow check and no null check on the return. Every fused adapter emitted by the affected sites was potentially exploitable for out-of-bounds writes into the target memory (wrap-to-small on large len; write-at-zero on OOM).
Introduces two pub(crate) helpers in adapter/fact.rs:
- emit_checked_realloc(body, realloc_func, result_local): consume 4 stack-pushed realloc args, store result, trap via unreachable if the pointer is 0 (LS-A-7 leg b).
- emit_overflow_guard(body, len_local, k): trap via unreachable if len_local > u32::MAX / k before the multiply that feeds the allocation (LS-A-7 leg a). No-op when k <= 1.
Applies both helpers at the audited sites.
HIGH-severity (caller-len drives the mul): 13 sites including both bulk copies and variant-arm copies in generate_memory_copy_adapter, generate_params_ptr_adapter, generate_retptr_adapter, emit_inner_pointer_fixup, generate_async_callback_adapter, plus the stabilizing shim in lib.rs.
Null-check only: 3 sites where length is constant or bounded (emit_patch_nested_indirections, a single-alloc result path in generate_memory_copy_adapter, and the outer params-area alloc in generate_params_ptr_adapter).
169/169 lib tests pass (including the 3 LS-A-7 byte-scan regressions)
11/11 P3 commands correct on stock wasmtime 41
clippy clean
Follow-up: add a cross-file CI lint that scans any emitted adapter for a bare Call(realloc) without an immediately following null guard, which catches future regressions without requiring each site to be re-audited.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
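The shape of the two helpers can be sketched against a toy instruction list. The helper names come from the commit message, but everything else here is an assumption for illustration: `Instr` is a hypothetical stand-in for the real encoder's instruction type, `body` is modeled as a plain `Vec<Instr>`, and the exact instruction sequences in adapter/fact.rs may differ.

```rust
/// Hypothetical, minimal instruction model (not the real encoder's type).
#[derive(Clone, Debug, PartialEq)]
enum Instr {
    LocalGet(u32),
    LocalSet(u32),
    I32Const(i32),
    I32GtU,
    I32Eqz,
    If,
    Unreachable,
    End,
    Call(u32),
}

/// Emits `if len_local > u32::MAX / k { unreachable }`. No-op when k <= 1,
/// because multiplying by 0 or 1 cannot overflow.
fn emit_overflow_guard(body: &mut Vec<Instr>, len_local: u32, k: u32) {
    if k <= 1 {
        return;
    }
    body.push(Instr::LocalGet(len_local));
    // The limit is reinterpreted as i32 bits; I32GtU compares it unsigned.
    body.push(Instr::I32Const((u32::MAX / k) as i32));
    body.push(Instr::I32GtU);
    body.push(Instr::If);
    body.push(Instr::Unreachable);
    body.push(Instr::End);
}

/// Emits the realloc call (its 4 arguments are assumed to be on the stack),
/// stores the result, and traps if the returned pointer is 0.
fn emit_checked_realloc(body: &mut Vec<Instr>, realloc_func: u32, result_local: u32) {
    body.push(Instr::Call(realloc_func));
    body.push(Instr::LocalSet(result_local));
    body.push(Instr::LocalGet(result_local));
    body.push(Instr::I32Eqz);
    body.push(Instr::If);
    body.push(Instr::Unreachable);
    body.push(Instr::End);
}
```

Centralizing the guard sequences in two helpers means each audited site only has to call them in the right order (overflow guard, multiply, push realloc arguments, checked realloc) instead of re-deriving the sequence.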
… math
Mythos discover on attestation.rs flagged real correctness bugs but couldn't satisfy the oracle because every defect depends on SystemTime::now(), which Kani can't model symbolically.
This commit isolates the SystemTime dependency behind thin wrappers:
chrono_timestamp() -> SystemTime::now() -> chrono_timestamp_from(secs: u64)
generate_uuid() -> SystemTime::now() -> generate_uuid_from(entropy: u128)
The pure functions become directly testable with pinned inputs.
Additionally, chrono_timestamp_from now uses Howard Hinnant's civil_from_days algorithm (400-year Gregorian cycle) instead of the broken 365-days-per-year + 30-days-per-month approximation. The old code drifted one day every 4 years and produced invalid ISO 8601 dates like 2026-02-30 because every month was modeled as 30 days.
New pinned tests (all 6 passing):
- test_chrono_timestamp_from_epoch (1970-01-01T00:00:00Z)
- test_chrono_timestamp_from_2025_new_year (boundary after 2024 leap)
- test_chrono_timestamp_from_2025_march_boundary
- test_chrono_timestamp_from_2024_leap_march (Feb 29 -> Mar 1)
- test_generate_uuid_from_pinned_zero
- test_generate_uuid_from_distinct_entropy_differs
All 16 attestation tests pass. No external crate dependency added.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
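For reference, Howard Hinnant's civil_from_days computation can be written in a few lines of std-only Rust. This is a sketch of the published algorithm restricted to non-negative day counts (which is all a `u64` seconds input can produce); the body of `chrono_timestamp_from` shown here is an assumption about how the pieces fit together, not the code in attestation.rs.

```rust
/// Converts days since 1970-01-01 into (year, month, day) in the proleptic
/// Gregorian calendar, following Hinnant's civil_from_days.
fn civil_from_days(days: u64) -> (u64, u64, u64) {
    // Shift the epoch to 0000-03-01 so leap days fall at the end of a year.
    let z = days + 719_468;
    let era = z / 146_097; // 400-year cycles
    let doe = z % 146_097; // day of era, 0..=146_096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // 0..=365, March-based
    let mp = (5 * doy + 2) / 153; // 0 = March ... 11 = February
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    // January and February belong to the next civil year.
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Formats seconds since the Unix epoch as an ISO 8601 UTC timestamp.
fn chrono_timestamp_from(secs: u64) -> String {
    let (year, month, day) = civil_from_days(secs / 86_400);
    let rem = secs % 86_400;
    format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}Z",
        year,
        month,
        day,
        rem / 3_600,
        rem % 3_600 / 60,
        rem % 60
    )
}
```

Because the function takes the seconds value as a parameter, inputs such as `0` or `1_709_164_800` (2024-02-29) can be pinned in tests without touching `SystemTime::now()`.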
Adds the portable Mythos-style pipeline scripts that were used to discover LS-A-7 (transcoder overflow) and its class-wide siblings:
scripts/mythos/HOWTO.md the 4-phase workflow + rationale
scripts/mythos/rank.md per-file threat-model ranking rubric
scripts/mythos/discover.md Mythos-verbatim discovery prompt
scripts/mythos/validate.md fresh-validator prompt
scripts/mythos/emit.md loss-scenario YAML template
First run outcome: 1 confirmed vulnerability + 15 class-sibling sites found and fixed on this branch. Keeping the scripts in-tree so future runs can replay the pipeline against new code.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced Apr 21, 2026
Adds meld-core/tests/realloc_safety.rs as a cross-cutting regression gate for loss-scenario LS-A-7. Where the 3 byte-scan tests in adapter/fact.rs exercise each transcode emitter individually, this test fuses two programmatic components end-to-end and then walks every function body in the output, verifying that every call whose target is a cabi_realloc-family function is followed by a contiguous i32.eqz; if; unreachable; end null guard within 8 instructions.
The intent is to catch future regressions without requiring every new emitter to be audited or have a bespoke byte-scan test; any site that forgets the guard shows up as a concrete (function_idx, byte_offset, target) entry when the test fails.
Fixtures are a self-contained caller+callee pair (string -> u32) reused from the SR-12 shape in tests/adapter_safety.rs, built inline with wasm-encoder. No filesystem assets.
Test name: ls_a_7_every_realloc_call_has_null_guard
Run: cargo test --test realloc_safety
On the current branch tip this scans 1 call site and passes; the test guards against vacuous success by asserting at least one realloc index was resolved and at least one call was scanned.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
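The scanning rule this test enforces can be sketched over a toy instruction list. This is an illustrative model only: the real test parses fused wasm output, whereas here `Instr`, `GUARD`, `WINDOW`, and `unguarded_realloc_calls` are hypothetical names, and "within 8 instructions" is interpreted as the whole guard fitting inside the 8 instructions after the call.

```rust
/// Hypothetical, minimal instruction model (not the real parser's type).
#[derive(Clone, Debug, PartialEq)]
enum Instr {
    Call(u32),
    LocalSet(u32),
    LocalGet(u32),
    I32Eqz,
    If,
    Unreachable,
    End,
    Other,
}

/// The contiguous null-guard sequence the scan looks for.
const GUARD: [Instr; 4] = [Instr::I32Eqz, Instr::If, Instr::Unreachable, Instr::End];

/// How many instructions after the call are searched for the guard.
const WINDOW: usize = 8;

/// Returns the index of every call to a realloc-family function that is not
/// followed by the guard sequence inside the search window.
fn unguarded_realloc_calls(body: &[Instr], realloc_funcs: &[u32]) -> Vec<usize> {
    let mut missing = Vec::new();
    for (i, instr) in body.iter().enumerate() {
        let is_realloc = match instr {
            Instr::Call(f) => realloc_funcs.contains(f),
            _ => false,
        };
        if !is_realloc {
            continue;
        }
        let end = (i + 1 + WINDOW).min(body.len());
        let window = &body[i + 1..end];
        let guarded = window.windows(GUARD.len()).any(|w| w == &GUARD[..]);
        if !guarded {
            missing.push(i);
        }
    }
    missing
}
```

An empty result means every realloc call in the body is guarded; a non-empty result gives the offending call positions, analogous to the (function_idx, byte_offset, target) entries the real test reports.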
Summary
Follow-up to PR #95. Fixes task.return shim routing for the component wrapper and adds infrastructure for string param cross-memory copy.
Changes
- Export task.return shims from the fused module (`$task_return_shim_N`)
- Component wrapper aliases shim exports instead of canonical `task.return` for shimmed imports (fixes intra-component forwarding through indirect table)
- Add string param cross-memory copy via `cabi_realloc` + `memory.copy`
- Fix `component_realloc_index` to prefer module 0's realloc
Status
P3 compute functions (prime, fibonacci, factorial, collatz) produce correct results on stock wasmtime. String-returning functions need a resolver-level fix for `pointer_pair_positions` offset mapping with mixed-type params.
73/73 P2 runtime tests pass.
🤖 Generated with Claude Code