Methodology | skillsec.io

The categories we evaluate and how we score them. We publish the methodology, not the patterns.

What we scan

Any plaintext file an AI coding agent might obey:

Claude Code SKILL.md
Cursor .cursorrules and .cursor/rules/*.md
GitHub Copilot copilot-instructions.md
OpenAI Codex / GPT custom instructions
Gemini CLI GEMINI.md
Aider CONVENTIONS.md, Continue.dev configs, or any plaintext system prompt

The detection rules are content-based; they don't care which agent the file targets.

Why we don't publish the patterns

Publishing exact regex patterns gives an attacker the cheat sheet for evading the scanner. Publishing the threat model and severity schedule, on the other hand, lets defenders understand and audit what the scanner protects against. We've picked the second trade-off.

Scoring

A scan starts at 100. Each finding subtracts a weight determined by its severity. The overall severity label is derived from the final score: below 40 is critical, 40 to 59 is high, and 60 to 79 is medium.

critical: −40
high: −20
medium: −8
low: −3
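The arithmetic above can be sketched in a few lines. This is an illustrative reading of the stated weights and thresholds, not the scanner's code; the function names are hypothetical, and because the text names no label for scores of 80 or more and does not say whether the score is floored at zero, the sketch leaves both open.

```python
# Weights and thresholds as stated in the methodology text.
WEIGHTS = {"critical": 40, "high": 20, "medium": 8, "low": 3}


def score(findings):
    """Return 100 minus the summed weights of the given finding severities."""
    # No clamping: the text does not say whether the score stops at zero.
    return 100 - sum(WEIGHTS[severity] for severity in findings)


def label(value):
    """Map a final score to the overall severity label."""
    if value < 40:
        return "critical"
    if value < 60:
        return "high"
    if value < 80:
        return "medium"
    return None  # the text names no label for scores of 80 or more


result = score(["high", "medium", "low"])
print(result, label(result))  # 69 medium
```

One high, one medium, and one low finding subtract 31 points, which leaves 69 and a medium label.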

10 categories evaluated. Click a finding in any scan for context.

Command Injection

Code blocks only

9 patterns | severity: medium – critical

Detects skill content that would execute arbitrary code at runtime. A skill that interpolates untrusted input into a shell or dynamic-evaluation primitive turns the agent into a confused deputy with shell access to the host. Scoped to fenced code blocks and inline code spans - the places a skill presents a runnable command - so plain prose discussing these terms doesn't trip the scanner.

What we look for

Dynamic code execution primitives (eval, exec, runtime compilation)
Shell command substitution and string interpolation
Subprocess invocations that route through a shell
Destructive filesystem commands targeting sensitive locations
Requests for privilege escalation

Remediation

Avoid executing dynamic code or shelling out at runtime. If a shell command is required, hard-code the arguments and never interpolate untrusted input.
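As a rough illustration of the remediation, the sketch below runs a child process from a fixed argument list with no shell involved. The helper name and the choice of the Python interpreter as the child program are assumptions made so the example runs anywhere; the point is only that a value passed as a single argv element is never parsed as shell syntax.

```python
import subprocess
import sys


def echo_via_child(value: str) -> str:
    """Pass a value to a child process as one argv element; no shell parses it."""
    # The program and its script are hard-coded; only the final element varies.
    argv = [sys.executable, "-c", "import sys; print(sys.argv[1])", value]
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.rstrip("\n")


payload = "x; echo pwned $(whoami)"
# The metacharacters come back verbatim because nothing interpreted them.
print(echo_via_child(payload) == payload)
```

The same payload interpolated into a string and run with shell=True would execute the substitution, which is the pattern this category flags.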

Credential Access

Whole document

18 patterns | severity: medium – critical

Detects skill content that reads secrets the host user did not intend to share - cloud credentials, SSH keys, password vaults, and secret-shaped environment variables. A skill that accesses these has exceeded its scope of authority.

What we look for

References to private-key and credential file locations
Reads of cloud-provider configuration directories
Environment variables whose names suggest a secret (token, password, key, auth)
Shell commands that print or upload credential files
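To make "secret-shaped" concrete, here is a minimal sketch that checks a variable name against the four hints listed above. It is not the scanner's rule (those are unpublished); the regex and function name are hypothetical, and a substring match this simple will produce false positives.

```python
import re

# Hypothetical hint list, taken only from the examples named in the text.
SECRET_HINT = re.compile(r"token|password|key|auth", re.IGNORECASE)


def looks_secret(name: str) -> bool:
    """Return True when an environment variable name contains a secret hint."""
    return SECRET_HINT.search(name) is not None


for name in ("GITHUB_TOKEN", "DB_PASSWORD", "HOME", "KEYBOARD_LAYOUT"):
    print(name, looks_secret(name))
```

KEYBOARD_LAYOUT matches because it contains "key", which shows why a real rule needs more context than the name alone.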

Remediation

Skills should not read credential files or secret-shaped environment variables. Handle secret access in the host application, never in a skill.

Suspicious Network

Context-sensitive

7 patterns | severity: high – critical

Detects download-and-execute patterns, reverse-shell signatures, and outbound traffic to non-allowlisted hosts near locally held secrets. These patterns serve the two highest-yield techniques in agent-skill abuse: supply-chain ingestion and data exfiltration.

What we look for

Pipe-to-shell installation patterns (curl|sh, wget|sh, PowerShell IEX)
Reverse-shell or remote-shell signatures
Outbound HTTP requests near references to local secrets
Connections to ports / hosts outside a documented allowlist

Remediation

Avoid download-and-execute patterns. If a network call is required, document the host, use HTTPS with cert verification, and ensure no local secret flows into the request.
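A minimal sketch of the last two points, assuming a skill that documents its hosts in a small allowlist. The host name and helper name are placeholders; the TLS context is the standard-library default, which verifies certificates and host names.

```python
import ssl
from urllib.parse import urlsplit

# Placeholder allowlist; a real skill would document its own hosts.
ALLOWED_HOSTS = {"api.example.com"}


def check_url(url: str) -> bool:
    """Accept only HTTPS URLs whose host is on the documented allowlist."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS


# The default context requires a valid certificate and a matching host name.
TLS_CONTEXT = ssl.create_default_context()

print(check_url("https://api.example.com/v1/status"))
print(check_url("http://api.example.com/v1/status"))
```

A caller would check the URL first and then pass TLS_CONTEXT to urllib.request.urlopen through its context argument, rather than building a context that disables verification.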

Obfuscation

Context-sensitive

4 patterns | severity: medium – high

Detects encoded payloads inside skill files. Encoded blobs in a skill almost always indicate intent to hide what executes - a behavior with no legitimate need in a file meant to be read by humans and audited by tools like this one.

What we look for

High-entropy encoded blobs adjacent to decoder calls
Long base64-shaped strings with no surrounding context
Line-wrapped base64 payloads split across multiple lines
Long hex strings near byte-decoders or constructors
Patterns that materialize executable strings at runtime
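For readers unfamiliar with the term, "high-entropy" can be illustrated with Shannon entropy over the characters of a base64-shaped run. The length and entropy thresholds below are arbitrary assumptions chosen for the example, not values the scanner uses.

```python
import math
import re
import string
from collections import Counter

# Hypothetical threshold: runs of 40 or more base64-alphabet characters.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")


def shannon_entropy(text: str) -> float:
    """Return the entropy of the character distribution, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def suspicious_blobs(document: str, min_entropy: float = 4.0):
    """Return base64-shaped runs whose entropy meets the (assumed) threshold."""
    return [m.group(0) for m in BASE64_RUN.finditer(document)
            if shannon_entropy(m.group(0)) >= min_entropy]


# 64 distinct characters, each used once: exactly 6 bits per character.
sample = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
print(shannon_entropy(sample))
print(suspicious_blobs("A" * 60))  # one repeated character carries no entropy
```

English prose rarely produces unbroken runs this long, which is why the shape check comes before the entropy check.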

Remediation

If a skill needs to ship data, ship it as plaintext that auditors can read. Encoded payloads in a skill file are a code-review red flag with no upside.

Prompt Injection

Whole document

11 patterns | severity: medium – critical

Detects payloads designed to subvert the host model: instructions overriding the system prompt, persona overrides, jailbreak framings, and hidden Unicode used to smuggle instructions past human review. The threat model here is the skill instructing the agent, not the agent reading the skill.

What we look for

Instructions overriding prior system prompts
Persona reassignment ("you are now…", "act as…")
Requests to disable safety guardrails
Hidden zero-width or bidirectional Unicode characters
Unicode tag characters used to smuggle hidden ASCII instructions
Homoglyph / fullwidth evasion of the above (matched after NFKC folding)
Requests to leak the host model's system prompt
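The hidden-Unicode items lend themselves to a short sketch. The code below is an illustration under stated assumptions, not the scanner's implementation: the character sets are a hand-picked subset, and the function names are hypothetical.

```python
import unicodedata

# A hand-picked subset of invisible characters; not an exhaustive list.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}
BIDI = {chr(c) for c in (0x200E, 0x200F, *range(0x202A, 0x202F), *range(0x2066, 0x206A))}


def hidden_chars(text):
    """Return (index, code point) pairs for zero-width, bidi, and tag characters."""
    hits = []
    for i, ch in enumerate(text):
        if ch in ZERO_WIDTH or ch in BIDI or 0xE0000 <= ord(ch) <= 0xE007F:
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits


def decode_tags(text):
    """Recover ASCII carried by Unicode tag characters (U+E0020 to U+E007E)."""
    return "".join(chr(ord(c) - 0xE0000) for c in text if 0xE0020 <= ord(c) <= 0xE007E)


def fold(text):
    """Apply NFKC so compatibility forms such as fullwidth letters become ASCII."""
    return unicodedata.normalize("NFKC", text)


print(hidden_chars("a\u200bb"))
print(fold("\uff49\uff47\uff4e\uff4f\uff52\uff45"))  # fullwidth letters
```

NFKC folds compatibility forms such as fullwidth letters; look-alike letters from another script, such as Cyrillic, survive normalization and would need a separate confusables mapping.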

Remediation

Remove instructions that override the host model and any hidden Unicode. Skills should describe what they do - they should not instruct the model to behave differently.

Path Traversal

Whole document

6 patterns | severity: high – critical

Detects skill content that attempts to read outside its intended directory or references operating-system files that carry sensitive material. These references rarely have a legitimate purpose in an agent skill.

What we look for

Multi-segment relative paths used to escape a working directory
References to sensitive system files (passwd, shadow, sudoers)
Reads of process and kernel introspection paths
References to Windows credential hives

Remediation

Constrain file access to explicit, intended directories. Never reference system credential files or use multi-level path-escape segments.
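One common way to constrain file access is to resolve every requested path and reject anything that lands outside the intended directory. The sketch below assumes Python 3.9 or later for Path.is_relative_to; the helper name and directory names are hypothetical.

```python
from pathlib import Path


def resolve_inside(base: Path, requested: str) -> Path:
    """Resolve a requested path and refuse it if it escapes the base directory."""
    base = base.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (base / requested).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes {base}: {requested}")
    return candidate


base_dir = Path.cwd() / "skill_data" / "inner"
print(resolve_inside(base_dir, "notes/a.md"))
try:
    resolve_inside(base_dir, "../../etc/passwd")
except ValueError as err:
    print("rejected:", err)
```

Checking the resolved path rather than the raw string matters because an absolute path or a symlink can escape the directory without any ".." segment.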

Unsafe Deserialization

Whole document

9 patterns | severity: high – critical

Detects deserialization sinks that allow arbitrary object construction or remote code execution when fed untrusted input. These primitives have a long history of being used as one-line RCE vectors.

What we look for

Native pickling or marshalling APIs
Generic YAML loaders that accept arbitrary types
Language-specific unserialize functions
JSON-pickle and similar object reconstructors

Remediation

Use the safe variants of deserializers (JSON instead of pickle, yaml.safe_load instead of yaml.load), and never pass untrusted input to a deserializer that can construct arbitrary objects.
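The JSON-for-pickle swap can be sketched as follows. The function name and the expectation of a top-level object are assumptions for illustration.

```python
import json


def load_settings(raw: str) -> dict:
    """Parse text as JSON, which can only yield plain data types."""
    data = json.loads(raw)
    # json.loads builds dicts, lists, strings, numbers, booleans, and None only.
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data


print(load_settings('{"retries": 3, "verbose": false}'))
```

pickle.loads, by contrast, can run code chosen by whoever produced the bytes, because a pickled object may name a callable to invoke during loading.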

Host Agent Credential Exfiltration

Whole document

8 patterns | severity: high – critical

Detects references to the credentials and configuration of the host AI agent itself - provider API keys, MCP server configs, editor-integrated assistant configs. A skill that reads these has the keys to the kingdom: it can impersonate the user against any provider the host is configured for.

What we look for

References to host-agent configuration files (Claude, Cursor, Copilot)
Reads of provider API key environment variables
Hardcoded provider-key-shaped tokens
References to MCP server credential names

Remediation

Skills must never read their host agent's configuration files or provider API keys. Treat host credentials as out-of-scope for any skill.

MCP Server Configuration

Context-sensitive

6 patterns | severity: medium – critical

Inspects Model Context Protocol server definitions for launch commands, arguments, environment blocks, and remote endpoints that would let an unreviewed config execute code or exfiltrate data the moment the agent starts. This is the supply-chain surface behind the SymJack and Cyata disclosures: project-controlled config that the agent acts on with no human in the loop.

What we look for

Server launch commands that invoke a shell or download-and-execute tool
Arguments carrying shell metacharacters or remote scripts
Auto-run of unpinned packages via npx / uvx / pipx
Hardcoded credentials in server environment blocks
Remote MCP endpoints outside a documented allowlist

Remediation

Treat every MCP server entry as code you are about to run. Pin package versions, avoid shell launch commands, keep secrets out of the config, and allowlist remote endpoints. Review .mcp.json and host-agent settings before running an agent on an untrusted repository.
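A pre-flight review of this kind might look like the sketch below. It assumes the common layout in which servers sit under a top-level "mcpServers" key with "command" and "args" fields; the shell list, the pinning regex, and the function name are illustrative assumptions and cover only two of the checks listed above.

```python
import json
import re

SHELLS = {"sh", "bash", "zsh", "cmd", "powershell", "pwsh"}
RUNNERS = {"npx", "uvx", "pipx"}
# Treat "name@1.2.3" or "name==1.2.3" as pinned (an assumed convention).
PINNED = re.compile(r".+(@|==)\d")


def review_mcp(config_text: str):
    """Return (server, issue) pairs for shell launchers and unpinned packages."""
    findings = []
    servers = json.loads(config_text).get("mcpServers", {})
    for name, spec in servers.items():
        command = spec.get("command", "")
        args = spec.get("args", [])
        if command in SHELLS:
            findings.append((name, "shell launch command"))
        if command in RUNNERS:
            # Skip flags and the "run" subcommand to find the package argument.
            packages = [a for a in args if not a.startswith("-") and a != "run"]
            if packages and not PINNED.match(packages[0]):
                findings.append((name, "unpinned package"))
    return findings


example = json.dumps({"mcpServers": {
    "docs": {"command": "npx", "args": ["-y", "some-server"]},
    "files": {"command": "npx", "args": ["-y", "some-server@1.2.3"]},
}})
print(review_mcp(example))
```

Only the "docs" entry is reported here, since "files" names an exact version.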

Host Environment Tampering

Whole document

9 patterns | severity: medium – critical

Detects skill content that modifies the developer's environment so code executes later, outside the review moment: symlink redirection, Git-hook installation, shell and login persistence, and install-time lifecycle scripts. These are the post-approval techniques behind the SymJack symlink hijack and the Cursor Git-hook RCE - the gap between what an approval prompt shows and what actually runs.

What we look for

Symlink creation that can redirect later file operations
Writes into .git/hooks or git core.hooksPath changes
Appends to shell startup files (.bashrc, .zshrc, .profile)
Cron, systemd-user, autostart, and LaunchAgent persistence
npm preinstall / postinstall lifecycle scripts

Remediation

A skill should not install hooks, symlinks, startup entries, or lifecycle scripts. If automation is required, make it explicit and run it under direct human review - never as a side effect of a file copy or an install step.
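As one small aid to that human review, the sketch below lists Git hooks that are actually active in a repository's default hooks directory. The function name is hypothetical, and the temporary directory only simulates a repository so the example is self-contained.

```python
import tempfile
from pathlib import Path


def active_git_hooks(repo: Path):
    """List hook files in .git/hooks that are not inert .sample templates."""
    hooks_dir = repo / ".git" / "hooks"
    if not hooks_dir.is_dir():
        return []
    return sorted(p.name for p in hooks_dir.iterdir()
                  if p.is_file() and p.suffix != ".sample")


# Simulated repository: one template hook and one live hook.
with tempfile.TemporaryDirectory() as tmp:
    hooks = Path(tmp) / ".git" / "hooks"
    hooks.mkdir(parents=True)
    (hooks / "pre-commit.sample").write_text("# template\n")
    (hooks / "post-checkout").write_text("#!/bin/sh\n")
    found = active_git_hooks(Path(tmp))

print(found)  # ['post-checkout']
```

This covers only the default directory; a changed core.hooksPath points Git at a different location, so that setting needs its own check.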