feat(#374): lower memory.copy/memory.fill (bulk-memory) — bounds-checked, trap-correct + v0.11.49#376

Merged
avrabe merged 3 commits into main from feat/374-bulk-memory on Jun 19, 2026

@avrabe avrabe commented Jun 19, 2026

Contributor

#374 — bulk-memory lowering (memory.copy / memory.fill)

Closes the largest remaining falcon on-target gap: 19 sites (11 memory.fill + 8 memory.copy). Like the scalar floats (#369) and the earlier i64.load/store case (#372), bulk-memory fell through the decoder's `_ => None` arm and (since v0.11.46 / GI-FPU-001) loud-skipped the whole function. Unlike #372, the lowering did not exist at all.

What landed

  • WasmOp::MemoryCopy / MemoryFill + decoder arms (memory 0 only — a non-zero memory index still loud-skips, preserving the GI-FPU-001 honesty contract)
  • stack effect pop 3, push 0
  • optimizer decline → direct-selector fallback (#120/#188/#372 pattern)
  • bounds-checked byte-loop lowering in select_with_stack:
    • fill: STRB loop writing the low byte of val
    • copy: memmove byte loop; dst > src copies backward so overlapping copies don't corrupt the source
    • --safety-bounds software: inline UDF guarded by a local skip branch, end-exclusive trap (off+len > size or u32 overflow; == size ok), matching wasmtime. (Self-contained image doesn't relocate a body Trap_Handler branch, so UDF; on silicon UsageFault/HardFault routes to Trap_Handler via the vector table.)
    • 3 dead popped operands reused as walking pointers — only R12 extra, no temp allocation
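
For readers who want the intended semantics in executable form, the following is a minimal reference model in Rust, not the emitted Thumb code and not taken from the synth sources. The names `Trap`, `in_bounds`, `memory_fill` and `memory_copy` are hypothetical. It assumes linear memory is a plain byte slice and that both ranges are checked before any byte is written.

```rust
/// Marker for an out-of-bounds trap in this model (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct Trap;

/// End-exclusive bounds check: `off + len == size` is allowed,
/// `off + len > size` or a u32 overflow is not.
fn in_bounds(off: u32, len: u32, size: usize) -> bool {
    match off.checked_add(len) {
        Some(end) => (end as usize) <= size,
        None => false, // u32 overflow counts as out of bounds
    }
}

/// Model of memory.fill: writes the low byte of `val` to `len` bytes at `dst`.
pub fn memory_fill(mem: &mut [u8], dst: u32, val: u32, len: u32) -> Result<(), Trap> {
    if !in_bounds(dst, len, mem.len()) {
        return Err(Trap);
    }
    let byte = val as u8; // truncation keeps only the low byte, like STRB
    let mut p = dst as usize; // walking pointer
    let mut n = len; // remaining count
    while n != 0 {
        mem[p] = byte;
        p += 1;
        n -= 1;
    }
    Ok(())
}

/// Model of memory.copy with memmove semantics.
pub fn memory_copy(mem: &mut [u8], dst: u32, src: u32, len: u32) -> Result<(), Trap> {
    let size = mem.len();
    if !in_bounds(dst, len, size) || !in_bounds(src, len, size) {
        return Err(Trap);
    }
    let (d, s, n) = (dst as usize, src as usize, len as usize);
    if d > s {
        // Destination is above the source: copy from the last byte down,
        // so source bytes are read before they are overwritten.
        for i in (0..n).rev() {
            mem[d + i] = mem[s + i];
        }
    } else {
        // Destination is at or below the source: a forward copy is safe.
        for i in 0..n {
            mem[d + i] = mem[s + i];
        }
    }
    Ok(())
}
```

In this model a zero-length operation at `off == size` succeeds, while any `off > size` traps even when `len` is zero, because the check is on the end offset rather than on the bytes actually touched.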

Verification

  • scripts/repro/bulk_memory_374_differential.py: 16/16 vs wasmtime over discriminating vectors (forward, overlap dst>src backward, overlap dst<src forward, self-copy, len==0, dst+len==size & src+len==size boundaries NO-trap, OOB dst/src TRAP, low-byte fill) — compares full 64 KiB image and trap outcome.
  • Frozen-safe: control_step 0x00210A55 (13/13), flight_seam 0x07FDF307, div_const 338/338 byte-identical.
  • Unit tests: decoder / stack-check / selector (*_374).
  • RISC-V continues to loud-skip bulk-mem (symbol absent, warning names the op) — silent-drop class not reintroduced.
  • cargo test --workspace --exclude synth-verify green; fmt + clippy clean; rivet GI-MEM-002 (+VER-001), non-xref errors 0.
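
The differential gate compares two things per vector: whether the run trapped, and the full memory image afterwards. A rough sketch of that comparison, written in Rust for consistency with the crate code rather than mirroring the Python script, could look like the block below. `Outcome`, `Divergence` and `compare` are hypothetical names; how each side is actually executed is out of scope here.

```rust
/// Result of running one test vector on one engine (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Outcome {
    /// The operation trapped.
    Trapped,
    /// The operation completed; holds the full memory image afterwards.
    Completed(Vec<u8>),
}

/// First observed difference between the reference and the candidate.
#[derive(Debug, PartialEq, Eq)]
pub enum Divergence {
    /// One side trapped and the other did not.
    TrapMismatch,
    /// Both completed but the images differ in size.
    LengthMismatch,
    /// Both completed; the images differ first at `offset`.
    ByteMismatch { offset: usize, reference: u8, candidate: u8 },
}

/// Returns None when both outcomes agree, otherwise the first divergence.
pub fn compare(reference: &Outcome, candidate: &Outcome) -> Option<Divergence> {
    match (reference, candidate) {
        (Outcome::Trapped, Outcome::Trapped) => None,
        (Outcome::Completed(r), Outcome::Completed(c)) => {
            if r.len() != c.len() {
                return Some(Divergence::LengthMismatch);
            }
            r.iter()
                .zip(c.iter())
                .enumerate()
                .find(|(_, (a, b))| a != b)
                .map(|(offset, (&a, &b))| Divergence::ByteMismatch {
                    offset,
                    reference: a,
                    candidate: b,
                })
        }
        _ => Some(Divergence::TrapMismatch),
    }
}
```

Reporting the first differing offset, rather than a bare pass or fail, makes an overlap-direction bug easy to spot, since the mismatch lands inside the overlapping region.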

Release gate

v0.11.49 is prepped (pin sweep + CHANGELOG). Holding the tag for gale's falcon-v1.56.fused.wasm G474RE round-trip before closing #374 — the one path unicorn can't see is UDF → UsageFault/HardFault → vector-table → Trap_Handler (the in-bounds copy/fill loop is the high-fidelity, solidly-verified part).

🤖 Generated with Claude Code

avrabe and others added 3 commits June 19, 2026 06:30
19 falcon sites (11 fill + 8 copy). Same _ => None class as #369/#372; loud-skips
on v0.11.47. No WasmOp::MemoryCopy/MemoryFill, no decoder arm, no lowering — a real
lowering to build (decode + stack-effect + bounds-checked copy/fill loop with
memmove-overlap + OOB trap). Repro only; fix = next block, own release.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ked, trap-correct

memory.copy/memory.fill fell through the decoder `_ => None` and (since
v0.11.46/GI-FPU-001) loud-skipped the whole function — 19 falcon sites, the
largest remaining bulk-mem gap. Unlike #372 the lowering did not exist at all.

- WasmOp::MemoryCopy / MemoryFill + decoder arms (memory 0 only; non-zero
  memory index loud-skips, preserving the GI-FPU-001 honesty contract)
- stack effect: pop 3, push 0 (wasm_stack_check)
- optimizer decline -> direct selector fallback (#120/#188/#372 pattern)
- select_with_stack lowering: fill = STRB byte loop (low byte of val); copy =
  memmove byte loop with direction by dst/src order (dst>src copies backward).
  Bounds (Software mode) trap via inline UDF guarded by a LOCAL skip branch,
  end-EXCLUSIVE (off+len>size or u32-overflow traps; ==size ok), matching
  wasmtime. The 3 dead popped operands are reused as walking pointers (only R12
  extra) — no temp allocation.
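
As an illustration of the first two bullets, here is a small Rust sketch of a decoder arm that accepts memory 0 only, plus the pop 3 / push 0 stack effect. It is not the synth decoder: `BulkOp`, `decode_bulk` and `apply_stack_effect` are hypothetical names, the input is assumed to be the bytes following the 0xFC prefix, and the sub-opcode and memory indices are assumed to use their one-byte LEB128 encodings.

```rust
/// The two bulk-memory ops covered here (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BulkOp {
    MemoryCopy,
    MemoryFill,
}

/// Decodes the bytes that follow the 0xFC prefix.
/// Returns the op and the number of bytes consumed, or None when the
/// encoding is not handled, so the caller can skip the function loudly.
pub fn decode_bulk(bytes: &[u8]) -> Option<(BulkOp, usize)> {
    match bytes {
        // memory.copy: sub-opcode 0x0A, then dst and src memory indices.
        [0x0A, 0x00, 0x00, ..] => Some((BulkOp::MemoryCopy, 3)),
        // memory.fill: sub-opcode 0x0B, then one memory index.
        [0x0B, 0x00, ..] => Some((BulkOp::MemoryFill, 2)),
        // Non-zero memory index, truncated input, or any other op.
        _ => None,
    }
}

/// Applies the stack effect (pop 3, push 0) to an operand-stack depth.
/// Returns None on underflow.
pub fn apply_stack_effect(op: BulkOp, depth: usize) -> Option<usize> {
    match op {
        BulkOp::MemoryCopy | BulkOp::MemoryFill => depth.checked_sub(3),
    }
}
```

Returning `None` for a non-zero memory index keeps that case on the same path as any other unsupported op instead of silently treating it as memory 0.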

Gate (value-level, no silicon): bulk_memory_374_differential.py 16/16 vs
wasmtime over discriminating vectors (forward, overlap dst>src backward, overlap
dst<src forward, self-copy, len==0, dst/src+len==size boundaries NO trap, OOB
dst/src TRAP, low-byte fill). Frozen-safe: control_step 0x00210A55 13/13,
flight_seam 0x07FDF307, div_const 338/338 byte-identical. Unit tests for
decoder/stack-check/selector. rivet GI-MEM-002 (+VER-001).

Falcon silicon (falcon-v1.56.fused.wasm) gates the release before #374 closes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…sweep + changelog

Pin sweep 0.11.48 -> 0.11.49 (workspace.package + 10 path-dep pins + MODULE.bazel
+ Cargo.lock synth-* packages). CHANGELOG v0.11.49 with falsification.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@avrabe avrabe merged commit 39010ce into main Jun 19, 2026
13 checks passed
@avrabe avrabe deleted the feat/374-bulk-memory branch June 19, 2026 05:22
codecov Bot commented Jun 19, 2026

Codecov Report

❌ Patch coverage is 81.40162% with 69 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/synth-synthesis/src/instruction_selector.rs | 80.54% | 64 Missing ⚠️ |
| crates/synth-synthesis/src/optimizer_bridge.rs | 50.00% | 4 Missing ⚠️ |
| crates/synth-core/src/wasm_stack_check.rs | 92.85% | 1 Missing ⚠️ |
