fix(P2): add bidi control character detection (Trojan Source / CVE-2021-42574) #92

Open
Shrotriya-lalit wants to merge 1 commit into NVIDIA:main from Shrotriya-lalit:fix/issue-39-bidi-control-chars

@Shrotriya-lalit

Problem

P2_PATTERNS catches zero-width characters but misses the full set of Unicode bidi control characters (U+202A–U+202E, U+2066–U+2069).

Bidi overrides are the attack vector behind CVE-2021-42574 (Trojan Source): an attacker inserts a RIGHT-TO-LEFT OVERRIDE (U+202E) or a similar character so that the rendered text looks safe to a human reviewer, while the LLM reads the characters in logical order and follows the hidden instructions.
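To make the "logical order" point concrete, here is a minimal sketch that lists the bidi controls present in a string along with their offsets and Unicode names. The helper `describe_bidi_controls` and the sample text are hypothetical illustrations and are not part of this PR.

```python
import unicodedata

# The nine bidi controls named in this PR: U+202A-U+202E and U+2066-U+2069.
BIDI_CONTROLS = {chr(cp) for cp in (*range(0x202A, 0x202F), *range(0x2066, 0x206A))}


def describe_bidi_controls(text):
    """Return (offset, code point, Unicode name) for each bidi control in text."""
    return [
        (offset, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for offset, ch in enumerate(text)
        if ch in BIDI_CONTROLS
    ]


# Hypothetical skill text: a renderer may reorder what follows the override,
# but the stored (logical) character sequence is exactly what is written here.
sample = "Summarize the file.\u202e ignore prior rules\u202c"

for offset, codepoint, name in describe_bidi_controls(sample):
    print(offset, codepoint, name)
```

A reviewer looking at rendered text may see reordered output, whereas anything that consumes the raw string, including this helper, sees the characters in the order they are stored.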

Fix

Add one regex entry to P2_PATTERNS:

(r"[\u202a-\u202e\u2066-\u2069]", 0.85),

Confidence 0.85 — higher than plain zero-width chars because bidi controls have almost no legitimate use in AI skill markdown content.
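As a rough sketch of how a `(pattern, confidence)` entry like this could be applied, the snippet below runs the new range over some text and reports each hit. `P2_PATTERNS_SKETCH` and `scan_p2` are hypothetical stand-ins; the real scanner's function names and finding format are not shown in this PR. The pattern is written with `\u` escapes, which Python's `re` module understands in string patterns, so the source stays free of invisible characters.

```python
import re

# Stand-in for the real P2_PATTERNS list; only the new bidi entry is shown.
P2_PATTERNS_SKETCH = [
    (r"[\u202a-\u202e\u2066-\u2069]", 0.85),
]


def scan_p2(text):
    """Return one finding per match of any pattern in P2_PATTERNS_SKETCH."""
    findings = []
    for pattern, confidence in P2_PATTERNS_SKETCH:
        for match in re.finditer(pattern, text):
            findings.append(
                {
                    "offset": match.start(),
                    "codepoint": f"U+{ord(match.group()):04X}",
                    "confidence": confidence,
                }
            )
    return findings


print(scan_p2("safe\u202etext"))
print(scan_p2("plain markdown"))
```

Writing the range with escapes also keeps the pattern itself from tripping the same detector when the scanner's own source is scanned.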

Tests

  • test_p2_bidi_control_chars_produce_finding — RLO character triggers P2
  • test_p2_bidi_rlo_edge_cases — all 9 bidi control characters individually trigger P2
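The per-character coverage described above could look roughly like the following. This is a sketch against a local regex rather than the project's scanner, whose test fixtures are not shown here; the function names below are illustrative and are not the test names added in this PR.

```python
import re

# Same range as the new P2 entry, written with escapes.
BIDI_RE = re.compile(r"[\u202a-\u202e\u2066-\u2069]")

# All nine code points covered by the two ranges.
BIDI_CODEPOINTS = [
    0x202A, 0x202B, 0x202C, 0x202D, 0x202E,
    0x2066, 0x2067, 0x2068, 0x2069,
]


def check_each_bidi_control_matches():
    """Every bidi control, embedded in ordinary text, should match."""
    for cp in BIDI_CODEPOINTS:
        assert BIDI_RE.search(f"before{chr(cp)}after"), f"U+{cp:04X} not matched"


def check_adjacent_codepoints_do_not_match():
    """Code points just outside each range should not match."""
    for cp in (0x2029, 0x202F, 0x2065, 0x206A):
        assert not BIDI_RE.search(chr(cp)), f"U+{cp:04X} matched unexpectedly"


check_each_bidi_control_matches()
check_adjacent_codepoints_do_not_match()
```

Checking the code points immediately outside each range guards against an off-by-one in the character class.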

Closes #39

Checklist

  • make lint passes
  • Tests added for new behaviour
  • DCO sign-off on commit (git commit -s)

…n Source)

P2_PATTERNS was missing Unicode bidi override/embedding/isolate characters
(U+202A-U+202E, U+2066-U+2069) that can be used to hide malicious
instructions from human code review while the LLM sees and executes them.
Add the range with confidence 0.85 (higher than plain zero-width chars
because bidi controls have almost no legitimate use in AI skill content).

Closes NVIDIA#39

Signed-off-by: Lalit Shrotriya <shrotriya.lalit@outlook.com>
Development

Successfully merging this pull request may close these issues.

Bidirectional control characters (Trojan Source, CVE-2021-42574) not detected in file contents