plugin-eval — agentic threat model

AIVSS 8.6 · High

The plugin-eval agent acts as a meta-evaluator for Claude Code plugins. That role creates two main risks: a malicious plugin could pass vetting through adversarial evasion, or it could exploit the evaluation process itself to achieve local code execution.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 7.8 · AARS uplift 0.77 · Factor sum 3.5/10 · Threat ×1.0 · Mitigation ×1.0

Autonomy of Action		0.30
Goal-Driven Planning		0.40
Self-Modification		0.10
Dynamic Tool Use		0.50
Persistent Memory		0.50
Contextual Awareness		0.40
Dynamic Identity		0.10
Multi-Agent Interactions		0.50
Non-Determinism		0.40
Opacity & Reflexivity		0.30

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
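
As a check on the arithmetic, the Python sketch below recomputes the score from the inputs listed above. The function and variable names (aivss_score, FACTORS and so on) are illustrative only and are not part of any OWASP tooling.

```python
# Inputs copied from the score rationale above.
CVSS_BASE = 7.8
THREAT_MULTIPLIER = 1.0   # ThM in the formula
MITIGATION_FACTOR = 1.0

FACTORS = {
    "Autonomy of Action": 0.30,
    "Goal-Driven Planning": 0.40,
    "Self-Modification": 0.10,
    "Dynamic Tool Use": 0.50,
    "Persistent Memory": 0.50,
    "Contextual Awareness": 0.40,
    "Dynamic Identity": 0.10,
    "Multi-Agent Interactions": 0.50,
    "Non-Determinism": 0.40,
    "Opacity & Reflexivity": 0.30,
}


def aivss_score(cvss_base, factors, thm, mitigation):
    """Apply AIVSS = (CVSS_Base + AARS) x Mitigation_Factor (illustrative helper)."""
    factor_sum = sum(factors.values())
    # AARS = (10 - CVSS_Base) x (Factor_Sum / 10) x ThM
    aars = (10 - cvss_base) * (factor_sum / 10) * thm
    score = (cvss_base + aars) * mitigation
    return factor_sum, aars, score


factor_sum, aars, score = aivss_score(
    CVSS_BASE, FACTORS, THREAT_MULTIPLIER, MITIGATION_FACTOR
)
print(f"Factor sum {factor_sum:.1f}/10, AARS uplift {aars:.2f}, AIVSS {score:.1f}")
```

Written out by hand: AARS = (10 − 7.8) × (3.5 / 10) × 1.0 = 0.77, and AIVSS = (7.8 + 0.77) × 1.0 = 8.57, which rounds to the 8.6 shown in the header.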

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” carry general, caveated commentary because the public description did not give enough detail to pin down that layer.

L1 · Foundation Models · ✓ mapped

As a Claude Code plugin, it relies on Anthropic's foundation models. It is highly vulnerable to indirect prompt injection if a plugin being evaluated contains adversarial instructions designed to hijack the evaluation logic.

L2 · Data Operations · ✓ mapped

The agent ingests and parses external plugin codebases. Maliciously crafted plugin files could exploit parsing vulnerabilities or attempt to exfiltrate local codebase data during the inspection phase.
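
One way to narrow this exposure is to restrict which files the inspection phase is allowed to read. The sketch below is a hypothetical mitigation, not something the listing describes: it assumes the plugin is unpacked into a single directory, and the helper name safe_plugin_files and the 1 MB cap are invented for illustration.

```python
from pathlib import Path

MAX_FILE_BYTES = 1_000_000  # hypothetical size cap, not taken from the listing


def safe_plugin_files(plugin_root: Path, max_bytes: int = MAX_FILE_BYTES) -> list[Path]:
    """Return regular files that resolve to a location inside plugin_root
    and do not exceed max_bytes."""
    root = plugin_root.resolve()
    accepted = []
    for path in sorted(root.rglob("*")):
        resolved = path.resolve()
        # A symlink pointing outside the plugin directory resolves to a path
        # that is not under root, so it is skipped rather than read.
        if not resolved.is_relative_to(root):  # requires Python 3.9+
            continue
        if resolved.is_file() and resolved.stat().st_size <= max_bytes:
            accepted.append(path)
    return accepted
```

This only limits what is read. It does not protect against bugs in whatever parser later consumes the accepted files.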

L3 · Agent Frameworks · ✓ mapped

The agent orchestrates a three-layer evaluation framework. If the framework executes or dynamically imports the target plugins to test their 'behavior', it risks running untrusted code directly within the agent's execution context.
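
To illustrate the difference between inspecting code and running it, the sketch below parses source text with Python's ast module and never imports or executes it. It assumes the plugin under evaluation contains Python source, which the listing does not state. The RISKY_CALLS and RISKY_MODULES sets are illustrative placeholders, not a vetted rule set.

```python
import ast

# Illustrative deny-lists only; a real rule set would be far broader.
RISKY_CALLS = {"eval", "exec", "compile", "__import__"}
RISKY_MODULES = {"subprocess", "socket", "ctypes"}


def static_findings(source: str) -> list[str]:
    """Report risky calls and imports without executing the source."""
    tree = ast.parse(source)  # builds a syntax tree; runs nothing from `source`
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in RISKY_CALLS:
                hits.append((node.lineno, f"call to {node.func.id}()"))
        elif isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in RISKY_MODULES:
                    hits.append((node.lineno, f"import {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in RISKY_MODULES:
                hits.append((node.lineno, f"from {node.module} import ..."))
    # ast.walk is breadth-first, so sort to report in line order.
    return [f"line {lineno}: {message}" for lineno, message in sorted(hits)]


sample = "import subprocess\nx = eval('1+1')\n"
for finding in static_findings(sample):
    print(finding)
```

A name-based check like this is easy to evade (for example through getattr or obfuscated strings), so it shows only that static inspection avoids execution, not that it is sufficient on its own.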

L4 · Deployment & Infrastructure · ✓ mapped

The agent runs locally within the developer's Claude Code environment. A compromise of the evaluation process could lead to local privilege escalation or unauthorized file system access on the host machine.

L5 · Evaluation & Observability · ✓ mapped

The agent's core function is evaluation. It faces threats of evaluation gaming, where malicious plugins are optimized to score highly on the Elo scale while hiding backdoor payloads from the static analysis layer.

L6 · Security & Compliance (cross-cutting) · ⚠ not certain from listing

Not certain from the listing: it does not mention built-in security guardrails, access control policies, or compliance auditing that would govern how the plugin accesses and executes third-party code.

L7 · Agent Ecosystem · ✓ mapped

The agent operates directly in the plugin/agent ecosystem. A compromised evaluator could systematically approve malicious plugins, undermining trust in the local ecosystem and leading to cascading compromises across other active plugins.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).