fix: async adapter string params + shim routing by avrabe · Pull Request #96 · pulseengine/meld

avrabe · 2026-04-14T03:55:56Z

Summary

Follow-up to PR #95. Fixes task.return shim routing for the component wrapper and adds infrastructure for string param cross-memory copy.

Changes

Export task.return shims from fused module ($task_return_shim_N)
Component wrapper aliases shim exports instead of canonical task.return for shimmed imports (fixes intra-component forwarding through indirect table)
Add string param cross-memory copy via cabi_realloc + memory.copy
Fix component_realloc_index to prefer module 0's realloc

Status

P3 compute functions (prime, fibonacci, factorial, collatz) produce correct results on stock wasmtime. String-returning functions need a resolver-level fix for pointer_pair_positions offset mapping with mixed-type params.

73/73 P2 runtime tests pass.

🤖 Generated with Claude Code

Add cross-memory copy for string/list INPUT params in the async callback adapter. When the call crosses a memory boundary, the adapter allocates in callee memory via cabi_realloc and copies string data from caller memory before calling [async-lift].

Text functions still fail because the text_processor uses intra-component forwarding functions (call_indirect through fixup table) that route to canonical task.return. These forwarding functions bypass our shim. Fix requires the component wrapper to provide shim functions instead of canonical task.return for shimmed imports.

73/73 P2 runtime tests pass. P3 compute functions correct.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Export task.return shims from fused module ($task_return_shim_N)
- Component wrapper aliases shim exports instead of using canonical task.return for shimmed imports (fixes intra-component forwarding)
- Fix component_realloc_index to prefer module 0's realloc

Text functions: shim routing works (no more "invalid task.return signature") but param copy uses wrong pointer_pair_positions for mixed-type params (copies shift instead of text_ptr). Needs the sync adapter's position mapping logic.

73/73 P2 tests pass. P3 compute functions correct.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Export task.return shims ($task_return_shim_N) from fused module
- Component wrapper aliases shim exports instead of canonical task.return
- Add string param cross-memory copy (caller→callee via cabi_realloc)
- Fix component_realloc_index to prefer module 0
- Debug logging for param copy positions

Text functions: shim routing works but pointer_pair_positions has wrong offsets for mixed-type params (e.g., caesar(u32, string) reports position [0] instead of [1]). Needs resolver-level fix for async adapter param flattening.

73/73 P2 tests pass. P3 compute functions correct.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Compute pointer pair positions from CALLER's flat param types instead of the CALLEE's component type order. The caller's locals are in canon-lower order, which may differ from the callee's component type param order.

Results:
- caesar 3 "hello" → runs without crash (param copy works!) but result not printed (string result delivery incomplete)
- analyze "hello world" → partial output with garbage pointers (complex record type not fully handled)
- prime/fibonacci/factorial/collatz → all correct

73/73 P2 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
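The position bug above can be illustrated with a minimal sketch of how (ptr, len) pair positions fall out of a flattened parameter list. The `ValType` enum and both function names here are hypothetical stand-ins, not meld's actual types; the only thing taken from the commit messages is that caesar(u32, string) should report position [1].

```rust
/// Hypothetical, simplified stand-in for a component-level value type.
#[allow(dead_code)]
#[derive(Clone, Debug)]
enum ValType {
    U32,
    U64,
    String,
    List(Box<ValType>),
}

/// Number of core wasm values a type flattens to in this simplified model.
fn flat_count(ty: &ValType) -> usize {
    match ty {
        ValType::U32 | ValType::U64 => 1,
        // Strings and lists flatten to a (ptr, len) pair.
        ValType::String | ValType::List(_) => 2,
    }
}

/// Flat index of the `ptr` half of every (ptr, len) pair in the param list.
fn pointer_pair_positions(params: &[ValType]) -> Vec<usize> {
    let mut positions = Vec::new();
    let mut flat_idx = 0;
    for ty in params {
        if matches!(ty, ValType::String | ValType::List(_)) {
            positions.push(flat_idx);
        }
        // Advance by the flattened width, not by one per component param.
        flat_idx += flat_count(ty);
    }
    positions
}

fn main() {
    // caesar(u32, string): the string pointer sits at flat index 1.
    let caesar = [ValType::U32, ValType::String];
    println!("{:?}", pointer_pair_positions(&caesar));
}
```

Walking the flat list in the caller's order is the point of the sketch: indexing by component-level param position instead would put the string at the wrong flat slot as soon as a scalar precedes it.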

Reverting the forwarding function body replacement approach: the forwarding functions have different types than the shims (dispatch type (i32,i32) vs actual task.return type).

The wrapper-level shim routing handles the import table correctly. The forwarding function's call_indirect table[N] resolves to the import function, which the wrapper provides as the shim export. The types match for (i32,i32) → () shims (string results).

Caesar runs without crash but produces empty output; the shim globals may not be receiving values. Needs deeper table/import tracing.

73/73 P2 tests pass. Compute functions correct.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Patch element segments to replace task.return import references with shim function references. This ensures call_indirect through element-segment-initialized tables calls the shim instead of the stub import.

Root cause identified for remaining string function issue: the forwarding function's table index (i32.const 2) uses ORIGINAL component import numbering, but the element segment uses MERGED import numbering. Position 2 in the merged element segment is 'analyze', not 'caesar-cipher' as the forwarding function expects. This is a merger-level table index remapping issue.

73/73 P2 tests pass. Compute functions correct.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Fix task.return shim matching by using element segment positions instead of name-based matching. Each component's forwarding module has an element segment that maps table positions to task.return imports. After patching these with shim references, build a (comp_idx, func_name) → globals lookup using the correct positions.

For components with direct task.return calls (no forwarding module), fall back to name-based matching using original function names extracted from the main module's imports.

Results on stock wasmtime 41:
- prime 7 → 7 is prime: ok
- fibonacci 10 → fibonacci(10) = 55: ok
- factorial 5 → 120: ok
- collatz 27 → 111 steps: ok
- uppercase → HELLO: ok
- reverse → dlrow olleh: ok
- caesar → ifmmp: ok
- count_chars → 11: ok
- search → returns ptr values: todo
- frequencies → out-of-bounds: todo

8/10 commands correct. 73/73 P2 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Plumb component-level result types from CanonicalEntry::TaskReturn through to TaskReturnShimInfo and the adapter, so cross-memory copy of list returns can use element-aware byte counts. For list<T>, byte_count = len * cabi_size_of(T) instead of the previous byte-string assumption. This fixes search-positions (list<u32>: now 4 bytes per element) and unblocks analyze (text-stats record return).

Adds cabi_size_align() implementing the canonical ABI size/alignment rules for value types.

Result-type plumbing strategy:
- For each async callee component, build name -> result_type map from canonical Lift entries (core_func_index -> export name via core_entity_order traversal of CoreInstanceExport aliases, plus type_index -> ComponentTypeKind::Function results).
- Match each shim to a result type by original_func_name, falling back to ordered claim from the source's TaskReturn list when the shim's source core import was rerouted to a forwarding function.
- Pre-resolve Type(idx) references into self-contained value types via resolve_component_val_type so the adapter doesn't need source-component access at codegen time.

Also fixes an index-space bug where merged_idx was compared to import-vector position instead of merged function index, causing original_func_name extraction to fail for any component with non-function imports interleaved before its task-return imports.

Adds emit_patch_nested_indirections + collect_indirections for walking list<record-with-string> elements and patching nested (ptr, len) pairs after bulk copy. Currently disabled for nested record types: the inner cross-memory string copy traps OOB and needs further investigation. Lists of plain types work end-to-end.

Results on stock wasmtime 41:
- prime, fibonacci, factorial, collatz: ok
- transform uppercase/reverse, caesar: ok
- count_chars: ok
- analyze (text-stats record): ok (was: garbage)
- search (list<u32>): ok (was: panic)
- frequencies (list<record<string,u32>>): todo (still panics)

9/10 P3 commands correct. 203/203 P2 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
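As a rough illustration of the element-aware sizing described above, here is a small sketch of canonical-ABI-style size/alignment for a handful of value types, plus the `len * size_of(T)` byte count. The `ValType` enum and the function names `size_align` and `list_byte_count` are assumptions made for this example; they are not the signatures of meld's `cabi_size_align()`, and only a subset of types is modeled (32-bit pointers assumed).

```rust
/// Hypothetical subset of component value types, for illustration only.
#[allow(dead_code)]
#[derive(Clone, Debug)]
enum ValType {
    U8,
    U32,
    U64,
    String,
    List(Box<ValType>),
    Record(Vec<ValType>),
}

/// Round `offset` up to the next multiple of `align` (align >= 1).
fn align_to(offset: u32, align: u32) -> u32 {
    (offset + align - 1) / align * align
}

/// (size, alignment) in bytes, following the canonical ABI layout rules
/// for the modeled subset.
fn size_align(ty: &ValType) -> (u32, u32) {
    match ty {
        ValType::U8 => (1, 1),
        ValType::U32 => (4, 4),
        ValType::U64 => (8, 8),
        // Strings and lists are stored as an (i32 ptr, i32 len) pair.
        ValType::String | ValType::List(_) => (8, 4),
        ValType::Record(fields) => {
            let (mut size, mut align) = (0u32, 1u32);
            for field in fields {
                let (fs, fa) = size_align(field);
                size = align_to(size, fa) + fs; // place field at aligned offset
                align = align.max(fa);
            }
            (align_to(size, align), align) // pad the tail to the record alignment
        }
    }
}

/// Bytes occupied by `len` elements of `elem`; None if the product overflows u32.
fn list_byte_count(len: u32, elem: &ValType) -> Option<u32> {
    len.checked_mul(size_align(elem).0)
}

fn main() {
    let freq = ValType::Record(vec![ValType::String, ValType::U32]);
    println!("record<string,u32> -> {:?}", size_align(&freq));
    println!("list<u32> x3 -> {:?} bytes", list_byte_count(3, &ValType::U32));
}
```

Under these rules a `list<u32>` of three elements occupies 12 bytes rather than 3, which is the difference between the byte-string assumption and the typed copy.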

Two changes to the nested-indirection patching helper:

1. Read (string ptr, len) directly from CALLEE memory at `list_ptr + i*elem_size` instead of from caller memory after bulk copy. The bulk copy is still needed for the record body (count field, etc.), but for indirect fields whose values are callee-side addresses we want the source-of-truth value regardless of what the bulk copy produced.
2. Wrap the inner cross-memory copy in a bounds check. If `old_ptr + buf_len` exceeds the callee's current linear memory size, skip the patch instead of triggering an unrecoverable OOB trap inside the adapter.

With the skip in place, frequencies still fails (the unpatched callee pointer panics in the runner), but the failure mode is now contained to the runner side rather than the adapter, and other typed lists with valid pointer fields will be patched correctly.

The frequencies failure root cause is still under investigation: loaded values from the apparent-correct callee record offsets exceed the callee memory size, suggesting either the layout is not canonical or the list pointer doesn't point where expected.

Search, analyze, and the other 7 P3 commands continue to work. 276/276 P2 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
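The two changes above can be modeled in plain Rust over a byte slice standing in for callee memory. This is a sketch under assumptions: `patchable_strings`, `read_u32_le`, and `demo_memory` are names invented for the example, the real adapter emits wasm instructions rather than running host code, and the slice length stands in for the callee's current linear memory size.

```rust
/// Read a little-endian u32 from `mem`, or None if the read would go out of bounds.
fn read_u32_le(mem: &[u8], offset: usize) -> Option<u32> {
    let bytes = mem.get(offset..offset.checked_add(4)?)?;
    Some(u32::from_le_bytes(bytes.try_into().ok()?))
}

/// For each list element, read the (ptr, len) string field straight from
/// callee memory and keep only the pairs whose byte range fits in memory.
/// Pairs that fail the bounds check are skipped rather than trapping.
fn patchable_strings(
    mem: &[u8],
    list_ptr: u32,
    count: u32,
    elem_size: u32,
    field_offset: u32,
) -> Vec<(u32, u32)> {
    let mut ok = Vec::new();
    for i in 0..u64::from(count) {
        let base = u64::from(list_ptr) + i * u64::from(elem_size) + u64::from(field_offset);
        let Ok(base) = usize::try_from(base) else { continue };
        let (Some(ptr), Some(len)) = (read_u32_le(mem, base), read_u32_le(mem, base + 4)) else {
            continue;
        };
        // Widen to u64 so ptr + len cannot wrap before the comparison.
        if u64::from(ptr) + u64::from(len) <= mem.len() as u64 {
            ok.push((ptr, len));
        }
    }
    ok
}

/// Two 12-byte records: the first has a valid string field, the second a
/// pointer resembling freed-allocator debris.
fn demo_memory() -> Vec<u8> {
    let mut mem = vec![0u8; 64];
    mem[0..4].copy_from_slice(&32u32.to_le_bytes());
    mem[4..8].copy_from_slice(&5u32.to_le_bytes());
    mem[12..16].copy_from_slice(&0xFFFF_FFFFu32.to_le_bytes());
    mem[16..20].copy_from_slice(&4u32.to_le_bytes());
    mem
}

fn main() {
    let mem = demo_memory();
    println!("{:?}", patchable_strings(&mem, 0, 2, 12, 0));
}
```

The widening to 64 bits before the comparison matters: a 32-bit `ptr + len` could wrap and make an out-of-range pair look valid.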

The wit-bindgen async lowering for `list<record>` returning component functions allocates the records buffer via `Cleanup::new`, whose drop guard fires at the end of the async block, between EXIT and when our adapter reads the shim globals. By that point the records buffer has been freed and overwritten with allocator free-list patterns (0xFFFFFFFF sentinels for multi-record cases; less visible corruption for single records because the dealloc is lazy enough for the data to survive).

The fix: extend the task.return shim to deep-copy both the records buffer and each indirect string into a fresh callee-side allocation before storing the stabilized pointer to globals. Because the stable buffer is owned by the callee allocator directly (not by the Cleanup guard), it survives the async-block teardown. The adapter's existing cross-memory copy then operates on stable data.

Implementation:
- generate_stabilizing_shim emits callee-side memcpy for the records buffer, then loops over indirection offsets to copy each sub-buffer (currently strings). Written into the shim body when result_type is list<T> with at least one indirection via collect_indirections.
- collect_indirections / cabi_size_align in adapter/fact.rs are now pub(crate) so the shim generator can use them; adapter/fact is also pub(crate).
- Adapter retains its memory-size bounds check on the inner patch copy as a safety net for other typed-list cases not yet stabilized.

Results on stock wasmtime 41:
- prime, fibonacci, factorial, collatz: ok
- transform uppercase/reverse, caesar: ok
- analyze (text-stats record): ok
- search (list<u32>): ok
- frequencies (list<record<string,u32>>): ok (was: trap / garbage strings)

10/10 P3 commands correct. 276/276 P2 tests pass.

Known follow-up: batch (list<u64> input + list<record<string,u64,u64>> output) still returns partial garbage because param-side copy of list<u64> uses byte-count = element-count instead of element-count * 8. Separate bug on the input-side typed copy; will fix in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The async adapter's caller→callee param copy treated len as a byte count, which only matches string (1 byte per code unit). For typed lists like list<u32> or list<u64>, len is the element count and the actual buffer size is len * sizeof(T), so the adapter was copying too few bytes, truncating the input.

Fix: consult `site.requirements.param_copy_layouts` (same source the sync adapter uses) and multiply len by the per-element byte size when allocating and copying. Mirrors the result-side typed sizing that landed in commits 4569b55 / e5cd80f for the output path.

Before: `batch 5 7 10` returned fibonacci(5), ..., collatz(5), then garbage for the next inputs because only the first u64 was copied. After: all 9 records correct.

batch 5 10 15 →
- fibonacci: input=5 output=5
- is_prime: input=5 output=1
- collatz_steps: input=5 output=5
- fibonacci: input=10 output=55
- is_prime: input=10 output=0
- collatz_steps: input=10 output=6
- fibonacci: input=15 output=610
- is_prime: input=15 output=0
- collatz_steps: input=15 output=17

11/11 P3 commands correct on stock wasmtime 41.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Address the confirmed Mythos finding LS-A-7: the three transcoding emitters (emit_utf8_to_utf16_transcode, emit_utf16_to_utf8_transcode, emit_latin1_to_utf8_transcode) were computing the destination allocation size via LocalGet(len); I32Const(K); I32Mul; Call(realloc) with no upper-bound check on len and no null check on the realloc return.

Impact on unfixed code:
(a) For len > u32::MAX / K the i32.mul wraps mod 2^32 to a small alloc_size; cabi_realloc returns a short buffer; the transcode loop writes up to K*len bytes out of bounds.
(b) For OOM, cabi_realloc returns 0; without a null check the transcode loop writes into callee memory starting at offset 0.

This commit adds two guards to each emitter:
1. Before the multiply: LocalGet(len); I32Const(u32::MAX/K); I32GtU; If; Unreachable; End
2. After the realloc call: LocalGet(out_ptr); I32Eqz; If; Unreachable; End

Both guards trap via wasm `unreachable`, which matches the Canonical ABI's requirement that lift/lower trap rather than silently overwrite.

Added three byte-scan regression tests (LS-A-7 PoC):
- ls_a_7_utf8_to_utf16_emits_overflow_and_null_guards
- ls_a_7_utf16_to_utf8_emits_overflow_and_null_guards
- ls_a_7_latin1_to_utf8_emits_overflow_and_null_guards

Each test emits the corresponding transcode body and scans the raw bytes for the two guard sequences. They fail on the unfixed emitter and pass once both guards are present.

Also appended LS-A-7 as `approved` to safety/stpa/loss-scenarios.yaml.

169/169 lib tests pass. 11/11 P3 commands correct on stock wasmtime 41.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
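To make leg (a) concrete, the wraparound and the guard condition can be reproduced with ordinary `u32` arithmetic. This is a host-side model of what the emitted wasm computes, not the emitter itself; the function names are made up for the example.

```rust
/// What the unfixed emitter effectively computed: i32.mul wraps mod 2^32.
fn unchecked_alloc_size(len: u32, k: u32) -> u32 {
    len.wrapping_mul(k)
}

/// Model of the guard: refuse (the wasm traps) when len > u32::MAX / k.
/// For k <= 1 the multiply cannot overflow, so no check is needed.
fn guarded_alloc_size(len: u32, k: u32) -> Option<u32> {
    if k > 1 && len > u32::MAX / k {
        None // corresponds to `unreachable` in the emitted adapter
    } else {
        Some(len * k) // cannot overflow once the guard has passed
    }
}

fn main() {
    let (len, k) = (0x8000_0001u32, 2u32);
    // 2 * 0x8000_0001 = 0x1_0000_0002, which wraps to 2 bytes.
    println!("unchecked: {} bytes", unchecked_alloc_size(len, k));
    println!("guarded:   {:?}", guarded_alloc_size(len, k));
}
```

With K = 2 and len = 0x8000_0001, the unguarded size is 2 bytes while the transcode loop would go on to write roughly 4 GiB, which is the out-of-bounds write the guard turns into a trap.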

Commit fcad26a fixed the three transcoding emitters. A follow-up audit found 15 more sites across adapter/fact.rs and lib.rs with the same class of bug: an i32.mul on a caller- or callee-controlled length feeding cabi_realloc, with no overflow check and no null check on the return. Every fused adapter emitted by the affected sites was potentially exploitable for out-of-bounds writes into the target memory (wrap-to-small on large len; write-at-zero on OOM).

Introduces two pub(crate) helpers in adapter/fact.rs:
- emit_checked_realloc(body, realloc_func, result_local): consume 4 stack-pushed realloc args, store result, trap via unreachable if the pointer is 0 (LS-A-7 leg b).
- emit_overflow_guard(body, len_local, k): trap via unreachable if len_local > u32::MAX / k before the multiply that feeds the allocation (LS-A-7 leg a). No-op when k <= 1.

Applies both helpers at the audited sites.
- HIGH-severity (caller-len drives the mul): 13 sites including both bulk copies and variant-arm copies in generate_memory_copy_adapter, generate_params_ptr_adapter, generate_retptr_adapter, emit_inner_pointer_fixup, generate_async_callback_adapter, plus the stabilizing shim in lib.rs.
- Null-check only: 3 sites where length is constant or bounded (emit_patch_nested_indirections, a single-alloc result path in generate_memory_copy_adapter, and the outer params-area alloc in generate_params_ptr_adapter).

169/169 lib tests pass (including the 3 LS-A-7 byte-scan regressions). 11/11 P3 commands correct on stock wasmtime 41. clippy clean.

Follow-up: add a cross-file CI lint that scans any emitted adapter for a bare Call(realloc) without an immediately following null guard, which catches future regressions without requiring each site to be re-audited.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
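The shape of the two helpers can be sketched against a toy instruction list. Everything below is an assumption for illustration: the `Instr` enum is a stand-in (the real code presumably targets wasm-encoder's instruction type, where `If` also carries a block type), and the bodies only mirror the guard sequences quoted in the commit messages.

```rust
/// Toy instruction set; a stand-in for a real wasm instruction type.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    LocalGet(u32),
    LocalSet(u32),
    I32Const(i32),
    I32Mul,
    I32GtU,
    I32Eqz,
    If,
    Unreachable,
    End,
    Call(u32),
}

/// Trap if `len_local > u32::MAX / k`. No-op when k <= 1, since the
/// following multiply cannot overflow in that case.
fn emit_overflow_guard(body: &mut Vec<Instr>, len_local: u32, k: u32) {
    if k <= 1 {
        return;
    }
    body.extend([
        Instr::LocalGet(len_local),
        Instr::I32Const((u32::MAX / k) as i32), // bit pattern of the unsigned limit
        Instr::I32GtU,
        Instr::If,
        Instr::Unreachable,
        Instr::End,
    ]);
}

/// Call realloc (its four args are assumed to be on the stack already),
/// store the result, and trap if the returned pointer is 0.
fn emit_checked_realloc(body: &mut Vec<Instr>, realloc_func: u32, result_local: u32) {
    body.extend([
        Instr::Call(realloc_func),
        Instr::LocalSet(result_local),
        Instr::LocalGet(result_local),
        Instr::I32Eqz,
        Instr::If,
        Instr::Unreachable,
        Instr::End,
    ]);
}

fn main() {
    let mut body = Vec::new();
    emit_overflow_guard(&mut body, 1, 2); // guard before `len * 2`
    emit_checked_realloc(&mut body, 7, 3); // hypothetical func/local indices
    for instr in &body {
        println!("{instr:?}");
    }
}
```

Centralizing both sequences in helpers means a call site cannot get one guard without the other by accident, which is presumably why the audit replaced the hand-rolled sequences.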

… math

Mythos discover on attestation.rs flagged real correctness bugs but couldn't satisfy the oracle because every defect depends on SystemTime::now(), which Kani can't model symbolically.

This commit isolates the SystemTime dependency behind thin wrappers:
- chrono_timestamp() -> SystemTime::now() -> chrono_timestamp_from(secs: u64)
- generate_uuid() -> SystemTime::now() -> generate_uuid_from(entropy: u128)

The pure functions become directly testable with pinned inputs.

Additionally, chrono_timestamp_from now uses Howard Hinnant's civil_from_days algorithm (400-year Gregorian cycle) instead of the broken 365-days-per-year + 30-days-per-month approximation. The old code drifted one day every 4 years and produced invalid ISO 8601 dates like 2026-02-30 because every month was modeled as 30 days.

New pinned tests (all 6 passing):
- test_chrono_timestamp_from_epoch (1970-01-01T00:00:00Z)
- test_chrono_timestamp_from_2025_new_year (boundary after 2024 leap)
- test_chrono_timestamp_from_2025_march_boundary
- test_chrono_timestamp_from_2024_leap_march (Feb 29 -> Mar 1)
- test_generate_uuid_from_pinned_zero
- test_generate_uuid_from_distinct_entropy_differs

All 16 attestation tests pass. No external crate dependency added.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
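For reference, a self-contained sketch of Howard Hinnant's civil_from_days conversion, wrapped in a pure timestamp formatter that takes seconds as input. The function name `timestamp_from` and the exact output format are assumptions for this example and may differ from the repository's chrono_timestamp_from.

```rust
/// Convert days since 1970-01-01 to (year, month, day) in the proleptic
/// Gregorian calendar, following Howard Hinnant's civil_from_days.
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, [0, 146096]
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // [0, 399]
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year, [0, 365]
    let mp = (5 * doy + 2) / 153; // March-based month, [0, 11]
    let day = (doy - (153 * mp + 2) / 5 + 1) as u32; // [1, 31]
    let month = (if mp < 10 { mp + 3 } else { mp - 9 }) as u32; // [1, 12]
    // January and February belong to the next civil year.
    let year = yoe + era * 400 + i64::from(month <= 2);
    (year, month, day)
}

/// Pure ISO 8601 UTC timestamp from seconds since the Unix epoch.
fn timestamp_from(secs: u64) -> String {
    let days = (secs / 86_400) as i64;
    let rem = secs % 86_400;
    let (y, m, d) = civil_from_days(days);
    format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}Z",
        y,
        m,
        d,
        rem / 3_600,
        (rem % 3_600) / 60,
        rem % 60
    )
}

fn main() {
    println!("{}", timestamp_from(0));
    println!("{}", timestamp_from(1_709_164_800)); // leap day in 2024
}
```

Because the function takes `secs` as a parameter instead of reading the clock, leap-year boundaries such as Feb 29 → Mar 1 can be pinned in tests.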

Adds the portable Mythos-style pipeline scripts that were used to discover LS-A-7 (transcoder overflow) and its class-wide siblings:
- scripts/mythos/HOWTO.md: the 4-phase workflow + rationale
- scripts/mythos/rank.md: per-file threat-model ranking rubric
- scripts/mythos/discover.md: Mythos-verbatim discovery prompt
- scripts/mythos/validate.md: fresh-validator prompt
- scripts/mythos/emit.md: loss-scenario YAML template

First run outcome: 1 confirmed vulnerability + 15 class-sibling sites found and fixed on this branch. Keeping the scripts in-tree so future runs can replay the pipeline against new code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds meld-core/tests/realloc_safety.rs as a cross-cutting regression gate for loss-scenario LS-A-7. Where the 3 byte-scan tests in adapter/fact.rs exercise each transcode emitter individually, this test fuses two programmatic components end-to-end and then walks every function body in the output, verifying that every call whose target is a cabi_realloc-family function is followed by a contiguous i32.eqz; if; unreachable; end null guard within 8 instructions.

The intent is to catch future regressions without requiring every new emitter to be audited or have a bespoke byte-scan test: any site that forgets the guard shows up as a concrete (function_idx, byte_offset, target) entry when the test fails.

Fixtures are a self-contained caller+callee pair (string -> u32) reused from the SR-12 shape in tests/adapter_safety.rs, built inline with wasm-encoder. No filesystem assets.

Test name: ls_a_7_every_realloc_call_has_null_guard
Run: cargo test --test realloc_safety

On the current branch tip this scans 1 call site and passes; the test guards against vacuous success by asserting at least one realloc index was resolved and at least one call was scanned.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
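The scanning idea can be sketched over a toy instruction list. This is not the code in realloc_safety.rs: the `Instr` enum, the function name, and the reading of "within 8 instructions" (the whole four-instruction guard must fit inside the eight instructions after the call) are assumptions made for the example.

```rust
/// Toy instruction set; the real test walks decoded wasm function bodies.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    Call(u32),
    LocalSet(u32),
    LocalGet(u32),
    LocalTee(u32),
    I32Eqz,
    If,
    Unreachable,
    End,
    Other,
}

/// The contiguous null-guard sequence expected after a realloc call.
const GUARD: [Instr; 4] = [Instr::I32Eqz, Instr::If, Instr::Unreachable, Instr::End];
/// How many instructions after the call are searched (assumed interpretation).
const WINDOW: usize = 8;

/// Return the indices of realloc calls that are NOT followed by the guard.
fn unguarded_realloc_calls(body: &[Instr], realloc_funcs: &[u32]) -> Vec<usize> {
    let mut missing = Vec::new();
    for (i, instr) in body.iter().enumerate() {
        let Instr::Call(target) = instr else { continue };
        if !realloc_funcs.contains(target) {
            continue; // not a cabi_realloc-family call
        }
        let end = (i + 1 + WINDOW).min(body.len());
        let guarded = body[i + 1..end]
            .windows(GUARD.len())
            .any(|w| w == &GUARD[..]);
        if !guarded {
            missing.push(i);
        }
    }
    missing
}

fn main() {
    let body = vec![
        Instr::Call(7), // guarded call
        Instr::LocalSet(0),
        Instr::LocalGet(0),
        Instr::I32Eqz,
        Instr::If,
        Instr::Unreachable,
        Instr::End,
        Instr::Call(7), // bare call, no guard follows
        Instr::LocalSet(1),
        Instr::Other,
    ];
    println!("unguarded call indices: {:?}", unguarded_realloc_calls(&body, &[7]));
}
```

Reporting indices rather than a boolean mirrors the stated intent: a failing run should point at the concrete site that forgot the guard.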

avrabe and others added 15 commits April 13, 2026 21:25

This was referenced Apr 21, 2026

Cosmetic: compare_version drops pre-release suffixes #98

Open

Design: resource-fallback sentinel in build_resource_type_to_import #99

Open

avrabe merged commit 30cf088 into main Apr 21, 2026
4 checks passed

avrabe deleted the fix/async-string-params-and-issue-92 branch April 21, 2026 19:33

avrabe merged 16 commits into main from fix/async-string-params-and-issue-92