feat(providers): local agent-CLI providers (claude/codex/gemini), no API key by AbhiramDwivedi · Pull Request #52 · NVIDIA/SkillSpector

AbhiramDwivedi · 2026-06-14T19:33:08Z

Closes #57

What

Local agent-CLI providers that route Stage-2 LLM analysis through a locally-installed, already-authenticated agent CLI instead of a metered HTTP API — no API key needed, the CLI's own login session is used. Supported today: claude, codex, gemini (via SKILLSPECTOR_PROVIDER=claude_cli / codex_cli / gemini_cli). Adding another CLI is a small registry entry + a ~5-line provider subclass — see the "HOW TO ADD A NEW AGENT CLI" guide in providers/_agent_cli.py.
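
To make the "small registry entry plus a short subclass" claim concrete, here is a minimal sketch of how selection by SKILLSPECTOR_PROVIDER might look. Only the environment variable and the claude_cli / codex_cli / gemini_cli values come from this PR; the class names, the `cli_name` attribute, and `select_provider` are hypothetical and are not the PR's actual code.

```python
import os


class AgentCLIProviderBase:
    """Hypothetical shared base: a subclass only names the CLI it drives."""

    cli_name = ""  # binary name that a per-CLI registry would be keyed on

    def describe(self):
        return f"{type(self).__name__} -> local `{self.cli_name}` session"


class ClaudeCLIProvider(AgentCLIProviderBase):
    cli_name = "claude"


class CodexCLIProvider(AgentCLIProviderBase):
    cli_name = "codex"


class GeminiCLIProvider(AgentCLIProviderBase):
    cli_name = "gemini"


# Provider values as documented in the PR description.
PROVIDERS = {
    "claude_cli": ClaudeCLIProvider,
    "codex_cli": CodexCLIProvider,
    "gemini_cli": GeminiCLIProvider,
}


def select_provider(environ=None):
    """Return a CLI provider instance, or None when no CLI provider is selected."""
    environ = os.environ if environ is None else environ
    name = environ.get("SKILLSPECTOR_PROVIDER", "").strip().lower()
    cls = PROVIDERS.get(name)
    return cls() if cls is not None else None


if __name__ == "__main__":
    provider = select_provider({"SKILLSPECTOR_PROVIDER": "codex_cli"})
    print(provider.describe())
```

Under this layout, supporting another CLI would mean one new subclass and one new dictionary entry, which is the shape the description suggests.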

How

One hardened chokepoint (providers/_agent_cli.py): shell=False; untrusted prompt via stdin only, never argv; capability stripping verified against the real CLIs — claude: --allowed-tools "" deny-by-default + --permission-mode dontAsk + --strict-mcp-config + --disable-slash-commands; codex: --sandbox read-only + --ephemeral + --ignore-user-config; gemini: --approval-mode plan (read-only, no tool execution). --dangerously-skip-permissions / auto-approve is never used. Scrubbed env; temp CWD; per-call timeout; streamed stdout with a hard size cap (process killed on overflow — no unbounded buffering); model-label validated against argument injection; fail-closed on every error path. A per-CLI CliSpec registry keeps all CLI-specific argv/parse/auth behind one lookup, so the shared security core is unchanged when a CLI is added.
The analyzer seam is untouched. LLM analyzers get their model from get_chat_model(); for CLI providers that returns a minimal ChatOpenAI-compatible adapter backed by provider.complete(), with structured output via prompt-for-JSON + Pydantic validation (fail-closed). HTTP providers are unchanged.
No pinned model versions. A CLI provider forwards no --model by default, so it runs with the user's OWN configured model and thinking level; set SKILLSPECTOR_MODEL to override. (No bundled model_registry.yaml; CLI providers use the package-wide default token budgets.)
is_available() does a real local auth probe (claude auth status / codex login status) so a report's llm_available never claims availability when the CLI is logged out.
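
The subprocess invariants listed above (shell=False, prompt on stdin only, scrubbed env, temp CWD, per-call timeout, bounded streaming, fail-closed) can be sketched as follows. This is an illustration under stated assumptions, not the contents of providers/_agent_cli.py: `run_bounded`, the environment allowlist, the 64 KiB chunk size, and the default limits are all hypothetical.

```python
import os
import subprocess
import sys
import tempfile
import threading


class AgentCLIError(RuntimeError):
    """Every failure raises, so the caller fails closed."""


def _feed_stdin(proc, data):
    # Written from a helper thread so a full pipe cannot deadlock the reader.
    try:
        proc.stdin.write(data)
        proc.stdin.close()
    except OSError:
        pass  # child exited early; the exit-code check reports it


def run_bounded(argv, prompt, timeout=60.0, max_bytes=1_000_000):
    # Assumed allowlist: pass through only what the child needs to start.
    env = {k: os.environ[k] for k in ("PATH", "HOME", "SYSTEMROOT") if k in os.environ}
    timed_out = threading.Event()
    with tempfile.TemporaryDirectory() as cwd:  # throwaway working directory
        proc = subprocess.Popen(
            argv, shell=False, env=env, cwd=cwd,
            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
        )

        def _on_timeout():
            timed_out.set()
            proc.kill()

        timer = threading.Timer(timeout, _on_timeout)
        timer.start()
        writer = threading.Thread(
            target=_feed_stdin, args=(proc, prompt.encode("utf-8")), daemon=True
        )
        writer.start()
        chunks, total = [], 0
        try:
            while True:
                chunk = proc.stdout.read(65536)  # never buffer more than one chunk ahead
                if not chunk:
                    break
                total += len(chunk)
                if total > max_bytes:
                    raise AgentCLIError(f"stdout exceeded {max_bytes} bytes")
                chunks.append(chunk)
            returncode = proc.wait()
        finally:
            timer.cancel()
            if proc.poll() is None:  # overflow or error path: kill the child
                proc.kill()
                proc.wait()
            proc.stdout.close()
            writer.join(timeout=1.0)
    if timed_out.is_set():
        raise AgentCLIError(f"timed out after {timeout}s")
    if returncode != 0:
        raise AgentCLIError(f"exit code {returncode}")
    return b"".join(chunks).decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Stand-in "CLI": the current Python interpreter upper-casing its stdin.
    demo = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(run_bounded(demo, "hello"))
```

The untrusted prompt never appears in argv, and both the overflow and timeout paths kill the child before raising, so a misbehaving CLI cannot hold the scan open or exhaust memory.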

Antigravity (`agy`) — registered but disabled

antigravity_cli is wired into the registry but disabled and fail-closed. Tested end-to-end against the real agy: its --print mode renders to a TTY and returns empty stdout when stdout is a pipe, which is how the runner must capture output, and it takes the prompt as an argv value rather than on stdin, so it cannot be driven programmatically. Its backend is Gemini, so use gemini_cli for that capability. The registry entry documents the finding in one place; enabling it later is a one-function change if agy gains a headless stdout mode.
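
A registry with a disabled entry could look roughly like the sketch below. The claude flags are the ones quoted in this PR; the `CliSpec` fields, `argv_for`, the `disabled_reason` mechanism, and the model-label pattern are assumptions made for illustration, and the real CliSpec may be shaped differently.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


class AgentCLIError(RuntimeError):
    """Raised instead of returning partial results (fail-closed)."""


# Assumed policy: a model label is a plain token and can never start with "-",
# so it cannot be mistaken for another command-line flag.
_MODEL_LABEL = re.compile(r"[A-Za-z0-9][A-Za-z0-9._:/-]{0,127}")


def _model_args(model: Optional[str]) -> List[str]:
    if model is None:
        return []  # no --model: the CLI keeps the user's own configured model
    if not _MODEL_LABEL.fullmatch(model):
        raise AgentCLIError(f"rejected model label: {model!r}")
    return ["--model", model]


def _claude_argv(model: Optional[str]) -> List[str]:
    # Capability-stripping flags as listed in the PR; the prompt is never in argv.
    return [
        "claude", "-p", "--output-format", "json",
        "--allowed-tools", "",
        "--permission-mode", "dontAsk",
        "--strict-mcp-config", "--disable-slash-commands",
        *_model_args(model),
    ]


@dataclass(frozen=True)
class CliSpec:
    build_argv: Optional[Callable[[Optional[str]], List[str]]]
    disabled_reason: Optional[str] = None  # set => the provider always fails closed


REGISTRY = {
    "claude_cli": CliSpec(build_argv=_claude_argv),
    "antigravity_cli": CliSpec(
        build_argv=None,
        disabled_reason="agy --print returns empty stdout on a pipe and takes the prompt via argv",
    ),
}


def argv_for(provider: str, model: Optional[str] = None) -> List[str]:
    spec = REGISTRY.get(provider)
    if spec is None:
        raise AgentCLIError(f"unknown CLI provider: {provider!r}")
    if spec.disabled_reason is not None or spec.build_argv is None:
        raise AgentCLIError(f"{provider} is disabled: {spec.disabled_reason}")
    return spec.build_argv(model)
```

Keeping the reason string on the entry itself means the finding about agy lives in one place, and re-enabling the provider would amount to supplying a `build_argv` function and clearing the reason.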

Test

Fully-mocked unit tests (no CLI required) cover the subprocess invariants, the bounded reader (real subprocesses: normal / overflow-kill / timeout), the adapter + structured-output parsing, provider selection, the registry, no-pinned-model resolution, and the disabled-antigravity guard — so a contributor without any of these CLIs installed runs the entire default suite. An opt-in live harness (tests/integration/test_agent_cli_live.py, marked integration, excluded by default) is parametrized over claude/codex/gemini and skips per-CLI when a binary is absent/unauthenticated. Verified end-to-end: claude/codex/gemini each return real output with no pinned model.
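
As an illustration of how a fully mocked test can exercise the auth probe without any CLI installed, consider the sketch below. The probe commands (claude auth status, codex login status) are the ones named in this PR; the `is_available` helper and the rule "exit code 0 means logged in" are assumptions, not the PR's implementation.

```python
import shutil
import subprocess
from unittest import mock

# Probe commands as named in the PR description.
AUTH_PROBES = {
    "claude": ["claude", "auth", "status"],
    "codex": ["codex", "login", "status"],
}


def is_available(name, timeout=10.0):
    """Hypothetical probe: True only if the binary exists and reports a login."""
    probe = AUTH_PROBES.get(name)
    if probe is None or shutil.which(name) is None:
        return False
    try:
        result = subprocess.run(
            probe, shell=False, capture_output=True, timeout=timeout, check=False
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # fail closed on any probe error
    return result.returncode == 0  # assumption: non-zero means logged out


def _completed(returncode):
    return subprocess.CompletedProcess(args=[], returncode=returncode, stdout=b"", stderr=b"")


def test_logged_out_reports_unavailable():
    with mock.patch("shutil.which", return_value="/usr/bin/claude"), \
         mock.patch("subprocess.run", return_value=_completed(1)):
        assert is_available("claude") is False


def test_logged_in_reports_available():
    with mock.patch("shutil.which", return_value="/usr/bin/codex"), \
         mock.patch("subprocess.run", return_value=_completed(0)):
        assert is_available("codex") is True


def test_missing_binary_never_runs_probe():
    with mock.patch("shutil.which", return_value=None), \
         mock.patch("subprocess.run") as run:
        assert is_available("claude") is False
        run.assert_not_called()
```

Because both `shutil.which` and `subprocess.run` are patched, these tests pass on a machine with none of the CLIs installed.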

🤖 Generated with Claude Code

…API key

Add four agent-CLI LLM providers driven by locally-installed, already-authenticated CLI binaries (claude, codex, gemini, agy) instead of metered HTTP endpoints:

SKILLSPECTOR_PROVIDER=claude_cli -> local `claude` OAuth session, no API key
SKILLSPECTOR_PROVIDER=codex_cli -> local `codex` session
SKILLSPECTOR_PROVIDER=gemini_cli -> local `gemini` session
SKILLSPECTOR_PROVIDER=antigravity_cli -> registered but DISABLED (fail-closed; agy is TTY-only, uncapturable on a pipe)

Security chokepoint: all subprocess I/O goes through run_agent_cli() in _agent_cli.py (shell=False, prompt via stdin only, capability-stripped argv, env scrub, temp CWD, bounded streaming, fail-closed). _agent_cli_base.py is the shared provider base class; the four concrete providers are ~5-line subclasses.

No pinned model: CLI providers forward no --model by default so the user's own configured model is used; SKILLSPECTOR_MODEL overrides. model_registry.yaml files are absent; metadata methods return None and fall through to default token budgets.

The AgentCLIChatModel adapter in llm_utils.py mimics the ChatOpenAI interface (.invoke / .ainvoke / .with_structured_output) backed by the provider's complete() subprocess transport, so existing LLM analyzers (meta_analyzer, semantic_*) work with no code changes.

Rebased and adapted onto upstream provider refactor (a5092dd):
- providers/base.py: adds AgentCLICapable + has_cli_capability alongside upstream's new ChatModelProvider / LLMProvider protocols.
- providers/__init__.py: registers CLI providers in _select_active_provider; preserves upstream's create_chat_model / resolve_chat_model_credentials / NO_LLM_API_KEY_MESSAGE / raise_no_llm_api_key_configured; CLI branch skips create_chat_model (no HTTP transport) and calls raise_no_llm_api_key_configured.
- llm_utils.py: get_chat_model branches on has_cli_capability, returning AgentCLIChatModel for CLI providers and delegating to providers.create_chat_model (which uses upstream's native-client path, e.g. ChatAnthropic) for HTTP ones.
- Tests: merged upstream's ChatAnthropic/create_chat_model/NO_LLM_API_KEY_MESSAGE coverage with PR#52's CLI dispatch/adapter/antigravity/no-pinned-model tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Ram Dwivedi <abhiram.dwivedi@yahoo.com>

AbhiramDwivedi · 2026-06-17T10:52:34Z

Rebased onto the latest main (a5092dd) and adapted onto the recent provider refactor (the new providers/chat_models.py + create_chat_model() seam), so this is mergeable again. Collapsed to a single commit.

Integration summary

HTTP providers now flow through your create_chat_model() (native ChatAnthropic, etc.); the CLI providers branch to a small AgentCLIChatModel adapter selected by a duck-typed has_cli_capability() check — the existing HTTP path is unchanged.
is_llm_available() delegates to the provider's is_available() for CLI providers and falls back to credential resolution for HTTP providers.
No model pinned: CLI providers forward no --model (the user's own CLI-configured model is used); SKILLSPECTOR_MODEL overrides.
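
The duck-typed dispatch summarized above might be sketched like this. The names AgentCLICapable, has_cli_capability, AgentCLIChatModel, get_chat_model, and is_llm_available appear in this PR, but every body below is an assumption written for illustration; the `create_http_model` and `has_http_credentials` callables are hypothetical stand-ins for the upstream HTTP path.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class AgentCLICapable(Protocol):
    """Structural check: any object with these methods counts as a CLI provider."""

    def complete(self, prompt: str) -> str: ...

    def is_available(self) -> bool: ...


def has_cli_capability(provider: object) -> bool:
    return isinstance(provider, AgentCLICapable)


class AgentCLIChatModel:
    """Tiny stand-in for the adapter: .invoke() forwards to provider.complete()."""

    def __init__(self, provider):
        self._provider = provider

    def invoke(self, prompt: str) -> str:
        return self._provider.complete(prompt)


def get_chat_model(provider, create_http_model):
    # CLI providers get the adapter; everything else keeps the HTTP path as-is.
    if has_cli_capability(provider):
        return AgentCLIChatModel(provider)
    return create_http_model(provider)


def is_llm_available(provider, has_http_credentials) -> bool:
    if has_cli_capability(provider):
        return bool(provider.is_available())
    return bool(has_http_credentials())
```

A structural check of this kind leaves HTTP provider classes untouched: they do not need to inherit from anything new or declare that they lack CLI capability.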

Re-verified end-to-end after the rebase: claude, codex, and gemini each return a live response through the full chat_completion → AgentCLIChatModel → CLI path (both plain and structured-output), with no model pinned. Unit suite + ruff green; the hardened subprocess chokepoint (providers/_agent_cli.py) is unchanged from the pre-rebase commit.

486 · 2026-06-18T14:09:15Z

`claude_cli` provider: `_parse_claude_output` rejects the array-shaped `--output-format json` emitted by some Claude Code builds

First, thanks for this PR — the hardened _agent_cli.py chokepoint (stdin-only untrusted content, capability stripping, env scrub, temp CWD, fail-closed) is exactly what's needed to point an LLM at adversarial skill content. Tried it end-to-end with SKILLSPECTOR_PROVIDER=claude_cli and hit one CLI-version compatibility issue worth flagging.

Symptom

With a Claude Code CLI at 2.1.178, every Stage-2 semantic analyzer fails and the scan silently degrades to static-only:

WARNING semantic_security_discovery failed: expected a JSON object from claude, got list: '[{"type":"system","subtype":"init", ...
WARNING semantic_developer_intent  failed: expected a JSON object from claude, got list: ...
WARNING semantic_quality_policy    failed: expected a JSON object from claude, got list: ...

Root cause

On this build, claude -p --output-format json returns a JSON array of stream events rather than a single result object. The assistant text lives in the final element (type == "result"):

$ printf 'Reply OK.' | claude -p --output-format json --allowed-tools "" \
    --permission-mode dontAsk --strict-mcp-config --disable-slash-commands \
  | python3 -c "import json,sys; d=json.load(sys.stdin); \
      print(type(d).__name__, [x.get('type') for x in d])"
list ['system', 'system', 'system', 'system', 'system', 'system', 'system', 'assistant', 'assistant', 'result']

_parse_claude_output only handles the single-dict envelope, so it raises expected a JSON object from claude, got list. The analyzers catch this and fall back to static-only, which is easy to miss unless you check the warnings.

Suggested fix: accept both shapes in _parse_claude_output. If the parsed JSON is a list, select an element carrying a result key, preferring the last one with type == "result" and otherwise falling back to the first element that has a result key, then continue with the existing dict handling:

    if isinstance(envelope, list):
        result_event = None
        for item in envelope:
            if isinstance(item, dict) and "result" in item:
                if str(item.get("type", "")).lower() == "result":
                    result_event = item
                elif result_event is None:
                    result_event = item
        if result_event is None:
            raise AgentCLIError(
                f"claude event array has no element with a 'result' key; "
                f"types={[i.get('type') for i in envelope if isinstance(i, dict)]!r}"
            )
        envelope = result_event

After this change the semantic analyzers run and metadata.llm_available == true, verified against a batch of real skills.
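
For a unit test of the array shape, the branch above can be restated as a standalone function. This is a sketch only: `select_result_envelope` is a hypothetical name (the real logic would live inside _parse_claude_output), and the fixture events are trimmed, illustrative versions of the stream shown earlier rather than captured CLI output.

```python
import json


class AgentCLIError(RuntimeError):
    """Raised on any unparseable or unexpected envelope (fail-closed)."""


def select_result_envelope(raw: str) -> dict:
    """Return the result-bearing dict from either envelope shape."""
    try:
        envelope = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise AgentCLIError(f"claude output is not valid JSON: {exc}") from exc
    if isinstance(envelope, list):
        result_event = None
        for item in envelope:
            if isinstance(item, dict) and "result" in item:
                if str(item.get("type", "")).lower() == "result":
                    result_event = item  # a later type == "result" event wins
                elif result_event is None:
                    result_event = item  # fallback: first element with a result key
        if result_event is None:
            raise AgentCLIError("claude event array has no element with a 'result' key")
        envelope = result_event
    if not isinstance(envelope, dict):
        raise AgentCLIError(
            f"expected a JSON object from claude, got {type(envelope).__name__}"
        )
    return envelope


def test_array_shape_selects_result_event():
    raw = json.dumps([
        {"type": "system", "subtype": "init"},
        {"type": "assistant"},
        {"type": "result", "result": "OK"},
    ])
    assert select_result_envelope(raw)["result"] == "OK"


def test_single_object_shape_still_works():
    assert select_result_envelope('{"type": "result", "result": "OK"}')["result"] == "OK"
```

Covering both shapes in the same test module would guard against a regression in either direction as CLI builds change their output format.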

It may also be worth surfacing a partial-degradation signal (e.g. flip llm_available to false or warn loudly at the end) when the LLM stage was requested but every analyzer failed, so a parser/transport break can't quietly turn a deep scan into a static-only one.
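
One possible shape for that degradation signal is sketched below. Everything in it is hypothetical: the function name, the logger name, and the idea of passing a per-analyzer success map are assumptions, since the thread does not specify how analyzer outcomes are collected.

```python
import logging
from typing import Mapping

logger = logging.getLogger("skillspector.sketch")  # hypothetical logger name


def resolve_llm_available(llm_requested: bool, analyzer_ok: Mapping[str, bool]) -> bool:
    """Report the LLM stage as available only if at least one analyzer succeeded."""
    if not llm_requested:
        return False
    if any(analyzer_ok.values()):
        return True
    # Requested but nothing succeeded: say so loudly instead of degrading silently.
    logger.warning(
        "LLM stage was requested but all %d analyzers failed (%s); results are static-only",
        len(analyzer_ok),
        ", ".join(sorted(analyzer_ok)),
    )
    return False


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    outcomes = {
        "semantic_security_discovery": False,
        "semantic_developer_intent": False,
        "semantic_quality_policy": False,
    }
    print(resolve_llm_available(True, outcomes))
```

With a check like this at the end of a scan, a parser or transport break would show up as llm_available being false in the report rather than only as warnings in the log.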

Happy to open a PR with the parser change + a unit test for the array shape if useful.

AbhiramDwivedi mentioned this pull request Jun 14, 2026

feature: support local agent CLIs (claude/codex) as an LLM provider without an API key #57

Open

AbhiramDwivedi changed the title ~~feat(providers): add claude_cli and codex_cli agent-CLI providers~~ feat(providers): local agent-CLI providers (claude/codex/gemini), no API key Jun 17, 2026

AbhiramDwivedi force-pushed the pr/b-agent-cli-provider branch from 0dbea44 to c4a92ff June 17, 2026 10:49

AbhiramDwivedi wants to merge 1 commit into NVIDIA:main from AbhiramDwivedi:pr/b-agent-cli-provider

2 participants
