Skip to content

Fix postfix-sep handling for optional elements with discriminators#1680

Open
olabusayoT wants to merge 1 commit into
apache:mainfrom
olabusayoT:daf-3085-usmtf-bug
Open

Fix postfix-sep handling for optional elements with discriminators#1680
olabusayoT wants to merge 1 commit into
apache:mainfrom
olabusayoT:daf-3085-usmtf-bug

Conversation

@olabusayoT

Copy link
Copy Markdown
Contributor
  • Primary bug (seq2): in a postfix-separated anyEmpty sequence, when an optional complex element contains a dfdl:discriminator that evaluates to true (committing to that element) and the child parses successfully, but the postfix separator is subsequently not found, Daffodil aborted with an internal Assert.invariant failure instead of producing a clean "Failed to find postfix separator" parse error. The discriminator had already resolved the outer PoU, so the subsequent unconditional Assert.invariant(!pstate.isPointOfUncertaintyResolved(pou)) in the child-successful+sep-failed path fired. This was fixed by checking isPointOfUncertaintyResolved before deciding to resetToPointOfUncertainty (undo the entire parse attempt since the mark was created) vs resolvePointOfUncertainty (commit to whatever was just parsed, just clean up the checkpoint).
  • Related bug (seq3): in the same postfix+anyEmpty configuration, an optional simple-delimited element with zero-length content followed by its separator which should be AbsentRep per DFDL spec 9.2.4 was not recognized as absent, causing the sequence to stop prematurely instead of continuing to the next child. Two necessary fixes:
    -- PostfixSeparatorHelper's child-failed+sep-found path called computeParseAttemptStatus with the post-separator bit position, making the isZL flag incorrect. Switch to computeFailedParseAttemptStatus with hasZLChildAttempt (captured before sep.parse1). A second Assert.invariant in the isSimpleDelimited guard of the same path is replaced by folding the PoU-resolved check into the condition. Then promote MissingItem to AbsentRep for non-positional optional elements per the spec.
    -- The outer scalar loop's AbsentRep case unconditionally set isDone=true. Check Separated.isPositional, and for non-positional (anyEmpty) elements we call moveOverOneGroupIndexOnly and continue, while for positional elements we keep the existing trailing-suppression isDone=true behavior.
  • removed an erroneous setBitPos0b(currentPos) from the MissingItem+optional+pou branch in parseOneInstanceWithMaybePoU that incorrectly skipped past a prefix/infix separator needed by the next element.
  • added tests seq2–seq7: discriminator+missing-sep error, absent-then-present non-positional, required-element-absent error, positional trailingEmpty isDone path, both-absent non-positional, and present-then-absent non-positional.

DAFFODIL-3085

 * Primary bug (seq2): in a postfix-separated anyEmpty sequence, when an optional
 complex element contains a dfdl:discriminator that evaluates to true (committing
 to that element) and the child parses successfully, but the postfix separator is
 subsequently not found, Daffodil aborted with an internal Assert.invariant
 failure instead of producing a clean "Failed to find postfix separator" parse
 error. The discriminator had already resolved the outer PoU, so the subsequent
 unconditional Assert.invariant(!pstate.isPointOfUncertaintyResolved(pou)) in
 the child-successful+sep-failed path fired. This was fixed by checking
 isPointOfUncertaintyResolved before deciding to resetToPointOfUncertainty
 (undo the entire parse attempt since the mark was created) vs
 resolvePointOfUncertainty (commit to whatever was just parsed, just clean up
 the checkpoint).
 * Related bug (seq3): in the same postfix+anyEmpty configuration, an optional
 simple-delimited element with zero-length content followed by its separator
 which should be AbsentRep per DFDL spec 9.2.4 was not recognized as absent,
 causing the sequence to stop prematurely instead of continuing to the next child.
 Two necessary fixes:
 - PostfixSeparatorHelper's child-failed+sep-found path called
 computeParseAttemptStatus with the post-separator bit position, making the
 isZL flag incorrect. Switch to computeFailedParseAttemptStatus with
 hasZLChildAttempt (captured before sep.parse1). A second Assert.invariant
 in the isSimpleDelimited guard of the same path is replaced by folding the
 PoU-resolved check into the condition. Then promote MissingItem to AbsentRep
 for non-positional optional elements per the spec.
 - The outer scalar loop's AbsentRep case unconditionally set isDone=true.
 Check Separated.isPositional, and for non-positional (anyEmpty) elements
 we call moveOverOneGroupIndexOnly and continue, while for positional elements
 we keep the existing trailing-suppression isDone=true behavior.
 * removed an erroneous setBitPos0b(currentPos) from the
 MissingItem+optional+pou branch in parseOneInstanceWithMaybePoU that
 incorrectly skipped past a prefix/infix separator needed by the next element.
 * added tests seq2–seq7: discriminator+missing-sep error, absent-then-present
 non-positional, required-element-absent error, positional trailingEmpty isDone
 path, both-absent non-positional, and present-then-absent non-positional.

 DAFFODIL-3085
@olabusayoT olabusayoT force-pushed the daf-3085-usmtf-bug branch from f14951a to 74347a3 Compare June 9, 2026 15:46
} else {
// pou was resolved by a discriminator somewhere, so
// we need to resolve the outer POU here
pstate.resolvePointOfUncertainty()

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't feel right to me. I think the only thing that should resolve PoUs are things like discriminators, initiated content, etc.

It's not even clear to me why we need a PoU here. Similarly InfixPrefixSeparatorHelperMixin has this:

            pstate.withPointOfUncertainty(
              "parseOneWithInfixOrPrefixSeparator",
              childParser.context
            ) { pou =>
              sep.parse1(pstate)

              val rv = if (pstate.processorStatus eq Success) {
                SeparatorParseStatus.SeparatorFound
              } else {
                // reset to the point of uncertainty to discard the errors
                pstate.resetToPointOfUncertainty(pou)
                SeparatorParseStatus.SeparatorFailed
              }
              rv
            }

That creates a PoU, but doesn't care if it was resolved or not, and doesn't reset back to anything if it fails. Seems like this PoU isn't actually doing anything.

I wonder if all these things are actually relying on an outter separate, and this inner was is actually incorrect to have? The parseOneInstance already has the logic to determine whether or not a PoU is needed and then creates one, and then has logic based on if the PoU was resolved or not. Is that really the only PoU we need for all these separated things?

And the fact that you added a change to resolve the outer PoU maybe indicates this is the case? If this PoU weren't created, then the discriminator would resolve the outer PoU and we shouldn't need this logic.

That said, there is code below where we do reset back to the beginning of this PoU so we do need some PoU here? Is it possible that rather than creating a new PoU this helpers should be getting passed in the outter PoU and then can reset back to the beginning of it? And I wonder if we need an ability to reset back to a PoU without discarding it, so that we can reset back to it again multiple times?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants