Skip to content

fix(meta_analyzer): parse stringified findings array from LLM#91

Open
matt-appno wants to merge 1 commit into
NVIDIA:mainfrom
matt-appno:fix/meta-analyzer-stringified-findings
Open

fix(meta_analyzer): parse stringified findings array from LLM#91
matt-appno wants to merge 1 commit into
NVIDIA:mainfrom
matt-appno:fix/meta-analyzer-stringified-findings

Conversation

@matt-appno

Copy link
Copy Markdown

Problem

MetaAnalyzerResult has a field_validator that repairs overall_assessment when an LLM returns it as a JSON string (mode="before"json.loads), but there is no equivalent validator for the top-level findings array.

Some models/endpoints — reproducibly Claude Sonnet via an OpenAI-compatible gateway (SKILLSPECTOR_PROVIDER=openai pointed at a router) — return findings as a stringified JSON array rather than a real list. This fails Pydantic's list_type validation and crashes the scan:

Error: 1 validation error for MetaAnalyzerResult
findings
  Input should be a valid array [type=list_type, input_value='[\n  {\n    "pattern_id"...', input_type=str]

The process exits with code 2 and no report is written, even though all per-file analyzers ran successfully. I hit this on multiple real scans (a skill and an MCP-server repo).

Fix

Add a mirrored field_validator("findings", mode="before") that json.loads a string payload, falling back to an empty list on parse failure or a non-list result — exactly matching the existing _parse_stringified_assessment behavior for overall_assessment.

Tests

New TestMetaAnalyzerResultFindingsValidator in tests/nodes/test_llm_analyzer_base.py covering:

  • stringified JSON array → parsed into MetaAnalyzerFinding objects
  • native list → unchanged
  • invalid string → []
  • non-list JSON (e.g. an object) → []

Full file suite passes (83 passed); ruff check / ruff format clean.

🤖 Generated with Claude Code

MetaAnalyzerResult had a field_validator to repair `overall_assessment`
when an LLM returns it as a JSON string, but no equivalent for `findings`.
Some models/endpoints (e.g. Claude Sonnet via an OpenAI-compatible
gateway) return the top-level `findings` array as a JSON string, which
fails Pydantic list validation and crashes the scan with exit code 2
before any report is written.

Add a mirrored `field_validator("findings", mode="before")` that
json.loads a string and falls back to an empty list on parse failure or
non-list payloads, matching the existing `overall_assessment` handling.

Covered by new unit tests in TestMetaAnalyzerResultFindingsValidator.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant