fix(supply_chain): parse pyproject.toml with tomllib instead of requirements regex #95
Open
Shrotriya-lalit wants to merge 1 commit into
…rements regex
_analyze_dependencies routed pyproject.toml through _extract_packages_from_requirements, which treats every TOML key (requires-python, name, description, authors…) as a package name, causing false-positive SC5/SC6 findings on metadata fields.
Add _extract_packages_from_pyproject (Python 3.11+ tomllib, stdlib) that reads only [project].dependencies, [project.optional-dependencies], and [build-system].requires: the three PEP 621 / PEP 517 dependency arrays. A frozen set of PEP 621 metadata keys acts as a secondary guard. Malformed TOML is caught and returns [] so the analyzer never crashes.
Closes NVIDIA#2
Signed-off-by: Lalit Shrotriya <shrotriya.lalit@outlook.com>
Problem
`_analyze_dependencies` routed `pyproject.toml` through `_extract_packages_from_requirements`, a line-by-line regex parser designed for the `requirements.txt` format. TOML keys such as `requires-python`, `name`, `description`, and `authors` all look like dependency strings to the regex, producing false-positive SC5/SC6 findings on standard PEP 621 metadata fields.
Example: a `pyproject.toml` with `name = "myproject"` triggered an SC6 (typosquatting) alert for a non-existent package called `myproject`.
Root Cause
The requirements parser treats every non-comment line as a potential package
specifier. TOML files have a completely different structure and must be parsed
semantically, not line-by-line.
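
To illustrate the failure mode, here is a minimal sketch of a line-by-line extractor applied to TOML. The regex and the function `naive_extract` are hypothetical stand-ins, not the project's actual `_extract_packages_from_requirements`, so the exact false positives may differ from what the analyzer reported.

```python
import re

# Hypothetical pattern: grab the leading name-like token of a line, the way a
# requirements.txt parser might. The real parser may differ in detail.
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def naive_extract(content: str) -> list[str]:
    """Treat every non-blank, non-comment line as a package specifier."""
    packages = []
    for line in content.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        match = _NAME_RE.match(stripped)
        if match:
            packages.append(match.group(1))
    return packages


PYPROJECT = """\
[project]
name = "myproject"
requires-python = ">=3.11"
dependencies = ["requests>=2.31"]
"""

print(naive_extract(PYPROJECT))
# prints ['name', 'requires-python', 'dependencies']
```

In this sketch the TOML keys are reported as packages, while the one real dependency (`requests`) is never seen, because it sits inside an array value rather than at the start of a line.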
Fix
Add `_extract_packages_from_pyproject(content)` using `tomllib` (Python 3.11+ stdlib, no new dependencies) that reads only the three PEP 621 / PEP 517 dependency arrays:
- `[project].dependencies`
- `[project.optional-dependencies].<group>`
- `[build-system].requires`
A `frozenset` of PEP 621 metadata keys (`name`, `version`, `description`, `authors`, …) acts as a secondary guard. Malformed TOML raises `TOMLDecodeError`, which is caught, and the function returns `[]` so the analyzer never crashes.
Tests
- `[project].dependencies` extracted correctly
- `[project.optional-dependencies]` groups extracted
- `[build-system].requires` extracted
- Malformed TOML returns `[]` without exception
- Metadata keys (`name`, `version`, `description`) never treated as packages
- `pyproject.toml` with metadata fields produces no SC5/SC6 findings
Closes #2
Checklist
- `make lint` passes
- Commits are signed off (`git commit -s`)