feat: add multilingual batch scanner with parallel execution and LLM gap-fill by WhereIs38 · Pull Request #100 · NVIDIA/SkillSpector

WhereIs38 · 2026-06-18T21:07:04Z

Closes #98

Summary

Adds contrib/multilingual/ — a multilingual batch scanner that scans directories of AI agent skills in parallel, with automatic language detection and targeted LLM gap-fill for non-English skills.

Zero changes to src/skillspector/. All integration is via import-time patches that wrap upstream constructors without modifying any source file.

Why this module exists

The upstream project scans one skill at a time — great for depth, but serial execution means LLM latency stacks linearly. I needed to scan many skills quickly, so this module avoids serial bottlenecks by design.

Scale. Each skill runs in its own thread via ThreadPoolExecutor. Scan time is dominated by LLM latency, so with enough API keys to stay under rate limits, adding workers cuts total scan time roughly in proportion: the 23 built-in fixtures finish in ~2 minutes at 8 workers, about the length of one human-agent conversation round. The ceiling is the user's key count, not the code: with 100 keys, a scan of 2000 skills should still finish in minutes.
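A minimal sketch of the per-skill fan-out described above, not the module's actual code. `scan_skill` is a hypothetical stand-in for the real per-skill scan (the PR runs `graph.invoke()` there), and the descending sort order is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def scan_skill(skill_dir: Path) -> dict:
    """Hypothetical stand-in for the per-skill scan (graph.invoke() in the PR)."""
    return {"skill": skill_dir.name, "risk_score": 0}


def scan_all(skill_dirs, scan=scan_skill, workers: int = 4) -> list:
    """Run one scan per skill in a thread pool and collect the results."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to its skill so failures can be attributed.
        futures = {pool.submit(scan, d): d for d in skill_dirs}
        for future in as_completed(futures):
            skill_dir = futures[future]
            try:
                results.append(future.result())
            except Exception as exc:
                # One failing skill is recorded instead of aborting the batch.
                results.append({"skill": skill_dir.name, "error": str(exc)})
    # Assumption: highest risk first; failed scans (no score) sort last.
    return sorted(results, key=lambda r: r.get("risk_score", -1), reverse=True)
```

Threads suit this workload because each scan spends most of its time waiting on network I/O, so the interpreter lock is not the bottleneck.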

Cost. Parallel scanning means high token throughput, so I chose DeepSeek — the cheapest per-token option — for development and testing. The module itself is provider-agnostic: any OpenAI-compatible endpoint works. I couldn't test local models due to hardware constraints (Mac with limited RAM, a 4 GB VRAM Windows machine). That remains a known gap I hope someone with better hardware can fill.

Compatibility. The module is tested on macOS and Windows. runner.py applies a small set of import-time patches so DeepSeek works out of the box; the patches follow standard OpenAI-compatible protocol, so Ollama and other endpoints should work as well. All patches are non-invasive and self-contained within contrib/multilingual/.
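To illustrate what an import-time constructor patch can look like, here is a hedged sketch. `UpstreamClient` and `patch_constructor` are hypothetical names invented for this example; they are not SkillSpector or runner.py identifiers, and the real patches may work differently.

```python
import functools
import inspect


class UpstreamClient:
    """Hypothetical stand-in for an upstream class; not a SkillSpector name."""

    def __init__(self, model, base_url=None):
        self.model = model
        self.base_url = base_url


def patch_constructor(cls, **defaults):
    """Wrap cls.__init__ so arguments the caller omitted receive defaults."""
    original_init = cls.__init__
    if getattr(original_init, "_patched", False):
        return  # idempotent: applying the patch twice changes nothing
    signature = inspect.signature(original_init)

    @functools.wraps(original_init)
    def patched_init(self, *args, **kwargs):
        # Only fill parameters the caller did not pass, positionally or by name.
        bound = signature.bind_partial(self, *args, **kwargs)
        for name, value in defaults.items():
            if name not in bound.arguments:
                kwargs[name] = value
        original_init(self, *args, **kwargs)

    patched_init._patched = True
    cls.__init__ = patched_init


# Applied once at import time; the upstream source file is never edited.
patch_constructor(UpstreamClient, base_url="https://api.deepseek.com/v1")
```

Explicit arguments always win over the injected default, which keeps the patch non-invasive for callers that already configure an endpoint.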

In short: upstream provides the detection algorithms; this contrib provides the reach. If accepted, I'm interested in continuing to improve scalability and external provider compatibility upstream.

What It Does

- Discovery: recursively finds all SKILL.md directories under the input root
- Language detection: Unicode script-ratio heuristic, extending support to Chinese, Japanese, and Korean
- Parallel scan: ThreadPoolExecutor runs graph.invoke() per skill, with configurable --workers
- Gap-fill: targeted LLM pass for 8 rules with no semantic-analyzer equivalent (P5, P6-P8, MP1-MP3, RA1-RA2)
- Aggregated report: terminal / JSON / Markdown, sorted by risk score
- Multi-key API pool: rate-limit-aware scheduler with exponential backoff
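The discovery and language-detection steps in the list above can be sketched as follows. This is an illustration under assumptions, not the module's implementation: the function names, the 0.3 threshold, the chosen Unicode ranges, and the "en" fallback are all hypothetical.

```python
from pathlib import Path


def discover_skills(root: Path) -> list:
    """Return every directory under root that contains a SKILL.md file."""
    return sorted({p.parent for p in root.rglob("SKILL.md") if p.is_file()})


def detect_language(text: str, threshold: float = 0.3) -> str:
    """Guess zh / ja / ko / en from the share of letters in each script."""
    kana = hangul = han = total = 0
    for ch in text:
        if not ch.isalpha():
            continue  # ignore digits, punctuation, and whitespace
        total += 1
        cp = ord(ch)
        if 0x3040 <= cp <= 0x30FF:  # Hiragana and Katakana blocks
            kana += 1
        elif 0xAC00 <= cp <= 0xD7A3 or 0x1100 <= cp <= 0x11FF:  # Hangul
            hangul += 1
        elif 0x4E00 <= cp <= 0x9FFF:  # CJK Unified Ideographs
            han += 1
    if total == 0:
        return "en"  # assumed fallback when there is nothing to measure
    # Japanese mixes kana with Han characters, so kana is checked first.
    if kana / total >= threshold:
        return "ja"
    if hangul / total >= threshold:
        return "ko"
    if han / total >= threshold:
        return "zh"
    return "en"
```

A ratio test like this needs no language model or external dependency, at the cost of misclassifying short or heavily mixed texts near the threshold.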

Evidence (23 built-in fixtures, 8 workers)

| Skill | Static mode (`--no-llm`) | LLM mode |
|---|---|---|
| `ssd1_semantic_injection` | 0/100 | 100/100 |
| `ssd3_nl_exfiltration` | 0/100 | 60/100 |
| `ssd4_narrative_deception` | 10/100 | 100/100 |
| `sdi4_divergence` | 13/100 | 100/100 |
| `safe_skill` | 0/100 | 0/100 ✓ |
| `ssd_clean` | 0/100 | 0/100 ✓ |

LLM semantic analyzers catch entire vulnerability categories invisible to static patterns. Clean skills remain clean — zero false-positive inflation.

How to verify

Prerequisites

Create .env in the repo root with 10 different DeepSeek API keys (the ApiKeyPool rotates across keys to avoid rate-limiting):

cp contrib/multilingual/.env.example .env

Edit .env and fill in:

SKILLSPECTOR_PROVIDER=openai
SKILLSPECTOR_MODEL=deepseek-v4-flash
OPENAI_BASE_URL=https://api.deepseek.com/v1

SKILLSPECTOR_API_KEYS="
  sk-or-xxx1|https://api.deepseek.com/v1|deepseek-v4-flash
  sk-or-xxx2|https://api.deepseek.com/v1|deepseek-v4-flash
  sk-or-xxx3|https://api.deepseek.com/v1|deepseek-v4-flash
  sk-or-xxx4|https://api.deepseek.com/v1|deepseek-v4-flash
  sk-or-xxx5|https://api.deepseek.com/v1|deepseek-v4-flash
  sk-or-xxx6|https://api.deepseek.com/v1|deepseek-v4-flash
  sk-or-xxx7|https://api.deepseek.com/v1|deepseek-v4-flash
  sk-or-xxx8|https://api.deepseek.com/v1|deepseek-v4-flash
  sk-or-xxx9|https://api.deepseek.com/v1|deepseek-v4-flash
  sk-or-xxx10|https://api.deepseek.com/v1|deepseek-v4-flash
"
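For readers wiring up their own keys, the sketch below shows one way the `key|base_url|model` entries above could be parsed and rotated. It is not the PR's ApiKeyPool: `ApiKey`, `parse_api_keys`, `RoundRobinPool`, and `backoff_delay` are hypothetical names, and the full-jitter backoff is one common variant chosen for illustration.

```python
import random
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiKey:
    key: str
    base_url: str
    model: str


def parse_api_keys(raw: str) -> list:
    """Parse one key|base_url|model entry per non-empty line."""
    entries = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = [part.strip() for part in line.split("|")]
        if len(parts) != 3 or not all(parts):
            # The offending line is not echoed, so secrets stay out of logs.
            raise ValueError("expected key|base_url|model with 3 non-empty fields")
        entries.append(ApiKey(*parts))
    return entries


class RoundRobinPool:
    """Hand out keys in rotation; safe to call from several worker threads."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("at least one API key is required")
        self._keys = list(keys)
        self._lock = threading.Lock()
        self._next = 0

    def acquire(self) -> ApiKey:
        with self._lock:
            key = self._keys[self._next % len(self._keys)]
            self._next += 1
            return key


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter, capped at `cap` seconds."""
    return random.uniform(0.0, min(cap, base * 2 ** attempt))
```

A caller would typically sleep for `backoff_delay(attempt)` after a rate-limit response and then retry with the next key from the pool.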

Activation

source .venv/bin/activate

Unit tests (no API keys needed, < 2s)

pytest contrib/multilingual/tests/ -v

Test 1 — Static mode (no LLM required, ~0.7s, default 4 workers)

python -m contrib.multilingual.batch_scan tests/fixtures/ --no-llm -f terminal

Expected: 23/23 skills, ~0.7 s, 8 CRITICAL / HIGH findings.

Test 2 — LLM parallel mode (requires API keys, ~2 min)

python -m contrib.multilingual.batch_scan tests/fixtures/ -f terminal --workers 8

Expected: 23/23 skills, ~2 min, 15 CRITICAL / HIGH findings (LLM catches semantic injection, narrative deception, and other vulnerabilities that static patterns miss).

Test 3 — Single-worker mode (for free-tier API keys)

python -m contrib.multilingual.batch_scan tests/fixtures/ -f terminal --workers 1

Testing

18 unit tests in contrib/multilingual/tests/ cover discovery, language detection, JSON / Markdown report formatting, and an end-to-end --no-llm scan, so the deterministic components are fully covered. LLM-dependent output is inherently non-deterministic and requires live API keys; for those paths, the static-vs-LLM comparison in the README provides more meaningful evidence than any mock-based test could. make lint passes on the upstream codebase.

🤖 Generated with Claude Code

Signed-off-by: WhereIs38 <CinderellaDoyle@icloud.com>

README.md
DESIGN.md
CONTRIBUTING.md

…gap-fill

- Parallel scan via ThreadPoolExecutor, configurable --workers
- Unicode-based language detection (zh, ja, ko)
- LLM gap-fill for 8 rules with no semantic-analyzer equivalent
- Aggregated terminal / JSON / Markdown reports
- Multi-key API pool with exponential backoff
- Zero changes to src/skillspector/
- Cross-platform: macOS + Windows

Signed-off-by: WhereIs38 <CinderellaDoyle@icloud.com>

WhereIs38 force-pushed the feature/multilingual-batch-scanner branch 2 times, most recently from e4ecae7 to a32aa67 Compare June 19, 2026 08:06

WhereIs38 force-pushed the feature/multilingual-batch-scanner branch from a32aa67 to 22de8d6 Compare June 19, 2026 08:18

WhereIs38 wants to merge 1 commit into NVIDIA:main from WhereIs38:feature/multilingual-batch-scanner
