Motivation
Borrowed from lua-cjson's test methodology, complementing (not overlapping) qjson's existing proptest + JSONTestSuite coverage:
Exhaustive enumeration of the encoding space. lua-cjson's tests/genutf8.pl generates every UTF-16 code point and every surrogate pair, plus all 0–255 octets, and roundtrips them. qjson currently relies on proptest (random sampling) for the string/escape space — statistically strong, but it can miss specific boundary code points (e.g. a single mishandled value near U+10FFFF, the surrogate range D800–DFFF, or a specific \uXXXX pair). A deterministic enumeration test gives exact, reproducible boundary coverage and pinpoints regressions instantly.
Note: qjson already has data-driven file-corpus tests (tests/json_test_suite.rs, tests/third_party_fixtures.rs) that assert accept/reject + roundtrip outcomes. This issue is specifically about exhaustive value-space enumeration, which those corpora do not provide.
Proposed work
Exhaustive roundtrip test (deterministic)
- Enumerate all Unicode scalar values (0x0000..=0x10FFFF minus the surrogate range) and assert:
  - encode → decode roundtrip is identity;
  - \uXXXX and surrogate-pair escape decoding produces the correct scalar (lone/invalid surrogates are rejected in EAGER and surface lazily in LAZY).
- Enumerate all 0–255 raw bytes for the string-content / control-char / UTF-8 validation paths.
- Run under both the scalar and the AVX2/NEON string validators (src/validate/strings/) so it doubles as a backend cross-check, like scanner_crosscheck.rs.
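The enumeration above could be shaped roughly as follows. This is a minimal sketch that uses only the Rust standard library: `scalar_values`, `escape_json` and `decode_escapes` are hypothetical stand-ins invented for illustration, not qjson APIs, and a real test would replace the reference decoder with calls into qjson's EAGER and LAZY decode paths and each string-validator backend.

```rust
// Sketch for a file such as tests/unicode_exhaustive.rs.
// All helpers are hypothetical std-only stand-ins, not qjson APIs.

/// Every Unicode scalar value: 0x0000..=0x10FFFF minus surrogates.
fn scalar_values() -> impl Iterator<Item = char> {
    // In this range, char::from_u32 returns None exactly for D800..=DFFF.
    (0u32..=0x10FFFF).filter_map(char::from_u32)
}

/// Render a scalar as \uXXXX escapes (a surrogate pair above U+FFFF).
fn escape_json(c: char) -> String {
    let mut units = [0u16; 2];
    c.encode_utf16(&mut units)
        .iter()
        .map(|u| format!("\\u{:04X}", u))
        .collect()
}

/// Reference decoder for a run of \uXXXX escapes.
/// Returns None for malformed input or lone/misordered surrogates.
fn decode_escapes(s: &str) -> Option<String> {
    let mut units = Vec::new();
    let mut rest = s;
    while !rest.is_empty() {
        let hex = rest.strip_prefix("\\u")?.get(..4)?;
        if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        units.push(u16::from_str_radix(hex, 16).ok()?);
        rest = &rest[6..];
    }
    String::from_utf16(&units).ok()
}

#[test]
fn every_scalar_roundtrips_through_escapes() {
    for c in scalar_values() {
        assert_eq!(decode_escapes(&escape_json(c)), Some(c.to_string()));
    }
}

#[test]
fn lone_surrogates_are_rejected() {
    for u in 0xD800u32..=0xDFFF {
        assert_eq!(decode_escapes(&format!("\\u{:04X}", u)), None);
    }
}

#[test]
fn single_raw_bytes_match_utf8_rules() {
    // A lone byte is valid UTF-8 only if it is ASCII. A JSON string validator
    // would additionally reject unescaped control bytes below 0x20.
    for b in 0u8..=255 {
        assert_eq!(std::str::from_utf8(&[b]).is_ok(), b < 0x80);
    }
}
```

Because the loops are deterministic, a failure names the exact code point or byte that regressed, which random sampling cannot guarantee.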
Affected files
tests/ (new exhaustive roundtrip test, e.g. tests/unicode_exhaustive.rs)
Complements #65 (Lua encode error coverage) — this is about value-space roundtrip correctness.