Add Reader.EmbeddedFiles with a cycle-guarded name-tree walk #5
Merged
Inspectors that read PDF attachments (e.g. ZUGFeRD/Factur-X invoice XML) otherwise hand-roll the EmbeddedFiles name-tree walk, where a cyclic or self-referential /Kids graph drives the recursion into a stack overflow. Add Reader.EmbeddedFiles() []EmbeddedFile: it walks the tree once with a visited-set (reference cycles) and a depth cap (inline-nested /Kids) and returns entries in tree order, mirroring DocumentInfo's accessor shape. Tests: flat, nested /Kids, cyclic /Kids terminates, and no-attachments.
Why
Inspectors that read PDF attachments (ZUGFeRD/Factur-X invoice XML, etc.) otherwise hand-roll the EmbeddedFiles name-tree walk, and a cyclic or self-referential /Kids graph drives that recursion into a stack overflow, which is fatal and unrecoverable. (Reproduced here: without the guard, the cyclic test crashes with fatal error: stack overflow.)

What
Reader.EmbeddedFiles() []EmbeddedFile walks the catalog's EmbeddedFiles name tree once and returns the attachments in tree order.

The walk is guarded by a visited-set (reference cycles) and a depth cap (inline-nested /Kids), so a hostile tree can neither loop nor overflow the stack. The shape mirrors the existing DocumentInfo() accessor; an ordered slice (not a map) keeps the result deterministic.

Tests
TestEmbeddedFiles (flat), TestEmbeddedFilesNestedKids, TestEmbeddedFilesCyclicKidsTerminates (the cycle that previously overflowed), TestEmbeddedFilesNone. Full suite + go vet pass; gofmt-clean.
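
For reviewers who want the idea in isolation, the guarded walk described under What can be sketched roughly as below. This is not the code from this PR: Node, walkNames, and maxDepth (including its value of 32) are hypothetical names chosen for illustration, and the node type is a simplified in-memory stand-in for the reader's real object model. A non-zero ID plays the role of an indirect object reference, while ID 0 marks an inline dictionary that only the depth cap can bound.

```go
package main

import "fmt"

// Node is a hypothetical, simplified name-tree node (not the library's type).
// ID stands in for an indirect object number; 0 means an inline dictionary.
type Node struct {
	ID    int
	Names []string // leaf entries, reduced to names only for this sketch
	Kids  []*Node
}

// maxDepth is an assumed cap; the PR text does not state the real value.
const maxDepth = 32

// walkNames returns names in tree order. It skips any indirect node it has
// already visited (reference cycles) and stops descending past maxDepth
// (inline-nested /Kids that carry no object number to remember).
func walkNames(root *Node) []string {
	var out []string
	seen := make(map[int]bool)

	var visit func(n *Node, depth int)
	visit = func(n *Node, depth int) {
		if n == nil || depth > maxDepth {
			return
		}
		if n.ID != 0 {
			if seen[n.ID] {
				return // already walked: a cycle or a shared subtree
			}
			seen[n.ID] = true
		}
		out = append(out, n.Names...)
		for _, k := range n.Kids {
			visit(k, depth+1)
		}
	}
	visit(root, 0)
	return out
}

func main() {
	// A two-node cycle: root -> leaf -> root.
	leaf := &Node{ID: 2, Names: []string{"factur-x.xml"}}
	root := &Node{ID: 1, Kids: []*Node{leaf}}
	leaf.Kids = []*Node{root}

	fmt.Println(walkNames(root)) // prints "[factur-x.xml]"
}
```

In this sketch the visited-set alone would not stop an inline dictionary that nests itself, because there is no object number to record; that is the case the depth cap covers, at the cost of visiting such a node up to maxDepth+1 times before giving up.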