
arm backend uses soft-float for all f32/f64 — hardware FPU never used (cortex-m4f/m7/m7dp identical); blocks REQ-PIX-001 hard-float #369

@avrabe


arm backend emits soft-float for all f32/f64 — the Cortex-M hardware FPU is never used (m4f/m7/m7dp identical)

synth compiles every floating-point op to inline integer soft-float; it never emits VFP/FPU instructions, so the cortex-m4f / cortex-m7 / cortex-m7dp targets are FPU-in-name-only and produce byte-identical output. For a floating-point-heavy real-time flight core (falcon: 2044 f32 + 480 f64 ops) on a board whose whole point is the hardware FPU, this is a major on-target performance gap.

Reproduce (synth v0.11.45)

# any falcon fused core; committed public artifact:
#   pulseengine/jess repro/synth-underflow/falcon-v1.56.fused.wasm
synth compile falcon-v1.56.fused.wasm --target cortex-m7dp --cortex-m -o out.elf
arm-none-eabi-objdump -d out.elf | grep -cE 'v(add|sub|mul|div|ldr|str|cvt)\.(f32|f64)'
#   => 0   (no VFP single OR double instructions)
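The grep above only matches mnemonics that carry an `.f32`/`.f64` suffix directly after the opcode. GNU objdump typically prints `vldr`, `vstr` and `vmov` without such a suffix, so a broader count can serve as a cross-check. The following is a minimal sketch; the function name `count_vfp` and the exact mnemonic list are assumptions for illustration, not part of synth or binutils.

```shell
# Count VFP-looking instructions in `objdump -d` output read from stdin.
# Matches a whitespace-delimited mnemonic that starts with "v" plus a common
# VFP operation, with or without a type suffix such as .f32 or .f64.
# The mnemonic list is an assumption and is not exhaustive.
count_vfp() {
  # grep -c still prints 0 when nothing matches but exits non-zero,
  # so the exit status is neutralised with `|| true`.
  grep -cE '[[:space:]]v(add|sub|mul|div|sqrt|abs|neg|cmp|cvt|mov|ldr|str|push|pop|fma|mla)[a-z0-9.]*[[:space:]]' || true
}

# Usage:
#   arm-none-eabi-objdump -d out.elf | count_vfp
```

A result of 0 from both this count and the original grep would make it harder to attribute the finding to a gap in the pattern.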

Measured on the falcon-v1.66 loom-fused core (14954-instruction ELF): 0 VFP instructions and 0 __aeabi_d* soft-float library calls, so all floating point is inline integer code (the op mix is dominated by ldr.w/str.w/movw/add.w/cmp/ite). The cortex-m7 (single-precision) and cortex-m7dp (double-precision) outputs are byte-identical (58682 B each), which confirms that the target's FPU is not consulted.
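The byte-identical claim can be checked with a plain file comparison. This is a sketch only: `same_bytes` is a hypothetical helper name, and the usage lines assume that `--target cortex-m7` is accepted the same way as the `cortex-m7dp` value in the reproduce step.

```shell
# Print "identical" if the two files have exactly the same bytes, else "differ".
same_bytes() {
  if cmp -s "$1" "$2"; then
    echo identical
  else
    echo differ
  fi
}

# Usage (flags copied from the reproduce step; the cortex-m7 value is assumed):
#   synth compile falcon-v1.56.fused.wasm --target cortex-m7   --cortex-m -o m7.elf
#   synth compile falcon-v1.56.fused.wasm --target cortex-m7dp --cortex-m -o m7dp.elf
#   same_bytes m7.elf m7dp.elf
```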

Impact (jess / Pixhawk 6X-RT, REQ-PIX-001 / AFD-024)

The RT1176 M7 has an fpv5-d16 double-precision FPU; the STM32F4 (Phase-1) has fpv4-sp-d16, which is single-precision only. falcon's cascade is FP-heavy and hard-real-time. Soft-float will be far slower than the hardware FPU and may threaten the control-loop deadline on-target, and it makes REQ-PIX-001's "fpv5-d16 hard-float" unachievable as specified. There is also no --fpu/hard-float flag, and the ELF carries no ARM FP build attributes (Tag_FP_arch), so a consumer cannot tell the FPU mode from the artifact.
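One way a consumer could check the FPU mode from an artifact is to read the ARM build attributes with `readelf -A`. The sketch below only extracts the `Tag_FP_arch` value from that output; `fp_arch_attr` is a hypothetical helper name, and the exact value strings printed by readelf are deliberately not assumed here.

```shell
# Read `readelf -A` output on stdin and print the Tag_FP_arch value,
# or "none" when the attribute is absent (as reported for synth output).
fp_arch_attr() {
  awk -F': ' '
    /Tag_FP_arch:/ { print $2; found = 1 }
    END            { if (!found) print "none" }
  '
}

# Usage:
#   arm-none-eabi-readelf -A out.elf | fp_arch_attr
```

The same approach would apply to Tag_ABI_VFP_args, which records whether floating-point arguments are passed in VFP registers.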

Suggested fix

On FPU targets (cortex-m4f → fpv4-sp-d16; cortex-m7 → fpv5-sp-d16; cortex-m7dp → fpv5-d16), lower f32/f64 ops to hardware VFP (vadd/vmul/vldr/vstr/vcvt on s/d registers) instead of inline soft-float, and emit the matching ARM EABI FP build attributes (Tag_FP_arch, Tag_ABI_VFP_args) so the mode is verifiable. On the single-precision targets only f32 can use the FPU; f64 has to stay soft-float there. Keep soft-float for the no-FPU targets (cortex-m3, rv32i). Happy to test against the committed falcon core same-day; falcon is a good FP-heavy fixture.
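For comparison, a hard-float reference build of a small C kernel with GCC shows the kind of VFP code and build attributes the fix would be expected to produce. The mapping below is a sketch: `gcc_fp_flags` is a hypothetical helper, the target names are the synth targets named in this issue, and the FPU names use GCC's `-mfpu` spellings.

```shell
# Hypothetical helper: print GCC flags for a reference build that matches
# the FPU each synth target name implies.
gcc_fp_flags() {
  case "$1" in
    cortex-m4f)  echo "-mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard" ;;
    cortex-m7)   echo "-mcpu=cortex-m7 -mfpu=fpv5-sp-d16 -mfloat-abi=hard" ;;
    cortex-m7dp) echo "-mcpu=cortex-m7 -mfpu=fpv5-d16 -mfloat-abi=hard" ;;
    cortex-m3)   echo "-mcpu=cortex-m3 -mfloat-abi=soft" ;;
    *)           echo "unknown target: $1" >&2; return 1 ;;
  esac
}

# Usage, with a hypothetical kernel.c containing a few float/double ops:
#   arm-none-eabi-gcc $(gcc_fp_flags cortex-m7dp) -mthumb -O2 -c kernel.c -o ref.o
#   arm-none-eabi-objdump -d ref.o
#   arm-none-eabi-readelf -A ref.o
```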

Filed from the jess feature loop (REQ-PIX-001); tracked jess-side as AFD-024.
