tdd-guardian — agentic threat model

AIVSS 7.6 · High

tdd-guardian is a developer-focused agentic plugin that runs locally to enforce test workflows. Because it can execute test suites, mutation tests, and git hooks, a compromise carries a high risk of arbitrary code execution on the developer's machine.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 7.5 · AARS uplift 0.98 · Factor sum 3.9/10 · Threat ×1.0 · Mitigation ×0.9

Autonomy of Action		0.60
Goal-Driven Planning		0.50
Self-Modification		0.10
Dynamic Tool Use		0.70
Persistent Memory		0.20
Contextual Awareness		0.70
Dynamic Identity		0.10
Multi-Agent Interactions		0.20
Non-Determinism		0.40
Opacity & Reflexivity		0.40

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
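
The arithmetic behind the score can be checked with a short Python sketch. The function name `aivss` and the dictionary `FACTORS` are illustrative names chosen here, not part of any OWASP tooling; the numbers are the ones listed above.

```python
# Agentic risk factors as listed in the score rationale above.
FACTORS = {
    "Autonomy of Action": 0.60,
    "Goal-Driven Planning": 0.50,
    "Self-Modification": 0.10,
    "Dynamic Tool Use": 0.70,
    "Persistent Memory": 0.20,
    "Contextual Awareness": 0.70,
    "Dynamic Identity": 0.10,
    "Multi-Agent Interactions": 0.20,
    "Non-Determinism": 0.40,
    "Opacity & Reflexivity": 0.40,
}


def aivss(cvss_base, factors, threat_multiplier=1.0, mitigation_factor=1.0):
    """Return (factor_sum, aars, score) using the formula quoted above."""
    factor_sum = sum(factors.values())
    # AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM
    aars = (10 - cvss_base) * (factor_sum / 10) * threat_multiplier
    # AIVSS = (CVSS_Base + AARS) * Mitigation_Factor
    score = (cvss_base + aars) * mitigation_factor
    return factor_sum, aars, score


if __name__ == "__main__":
    factor_sum, aars, score = aivss(7.5, FACTORS, 1.0, 0.9)
    print(f"factor sum {factor_sum:.1f}, AARS {aars:.3f}, AIVSS {score:.2f}")
```

With the listed inputs, the factor sum is 3.9, the AARS uplift is about 0.975 (shown as 0.98), and the result is (7.5 + 0.975) × 0.9 ≈ 7.63, which is displayed as 7.6.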

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” carry general, caveated commentary because the public description did not give enough detail to pin down that layer.

L1 · Foundation Models ⚠ not certain from listing

Not certain from the listing: uses Claude Code (Anthropic models) under the hood. Threats include prompt injection that bypasses the quality gates or tricks the auditor into approving malicious code.

L2 · Data Operations ✓ mapped

Operates directly on local codebase files, git history, and test coverage reports. Risks include reading sensitive local files or exposure to malicious code repositories.
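
One common way to limit the file-read risk is to confine every path the agent touches to the repository root. The sketch below is illustrative only: `is_within_repo` is a hypothetical helper, and nothing in the listing indicates that tdd-guardian performs such a check.

```python
from pathlib import Path


def is_within_repo(repo_root: str, candidate: str) -> bool:
    """Return True only if `candidate` resolves to a path inside `repo_root`.

    Hypothetical helper: resolving first collapses `..` segments and follows
    symlinks, so traversal tricks are judged by where they actually land.
    """
    root = Path(repo_root).resolve()
    # Joining with an absolute candidate discards the root, which is what we
    # want: absolute paths outside the repository are rejected below.
    target = (root / candidate).resolve()
    return target == root or root in target.parents
```

A caller would refuse to read any path for which this returns False, for example `../etc/passwd` or an absolute path into a home directory.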

L3 · Agent Frameworks ✓ mapped

Orchestrates test execution, mutation testing, and quality audits. Vulnerable to tool misuse if the plugin is tricked into executing arbitrary shell commands disguised as test runners.
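
The tool-misuse scenario can be illustrated with a minimal allowlist check on the command the agent is asked to run as a "test runner". This is a sketch under stated assumptions: `ALLOWED_RUNNERS` and `parse_test_command` are hypothetical names, the allowlist contents are made up for the example, and the listing does not describe how tdd-guardian actually validates commands.

```python
import shlex

# Hypothetical allowlist of executables accepted as test runners.
ALLOWED_RUNNERS = {"pytest", "npm", "go", "cargo"}

# Characters that only matter to a shell; rejecting them keeps chained or
# substituted commands out even before the allowlist is consulted.
SHELL_METACHARS = set(";&|`$<>\n")


def parse_test_command(command):
    """Split `command` into argv, or raise ValueError if it looks unsafe."""
    if any(ch in SHELL_METACHARS for ch in command):
        raise ValueError("shell metacharacters are not allowed")
    argv = shlex.split(command)
    if not argv:
        raise ValueError("empty command")
    if argv[0] not in ALLOWED_RUNNERS:
        raise ValueError(f"runner not in allowlist: {argv[0]!r}")
    return argv
```

The returned list is meant to be passed to `subprocess.run(argv, shell=False)` so that no shell ever interprets the string. An allowlist of this kind narrows the attack surface but does not remove it, since an allowed runner such as `npm` can still execute scripts defined by the repository itself.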

L4 · Deployment & Infrastructure ⚠ not certain from listing

Not certain from the listing: runs locally within the developer's environment or CI/CD pipeline. The listing gives no sandboxing details, so a compromise of the plugin could extend to the entire host or developer machine.

L5 · Evaluation & Observability ✓ mapped

Performs quality audits and coverage gating. Vulnerable to evaluation gaming where developers or malicious actors write dummy tests to bypass the coverage and quality gates.

L6 · Security & Compliance (cross-cutting) ⚠ not certain from listing

Not certain from the listing — No explicit security compliance, authorization, or policy enforcement mechanisms are detailed beyond local git/commit hooks.

L7 · Agent Ecosystem ⚠ not certain from listing

Not certain from the listing — Acts as a plugin to Claude Code, but no complex multi-agent coordination or marketplace interactions are explicitly defined.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).