tdd-guardian — agentic threat model
tdd-guardian is a developer-focused agentic plugin that runs locally to enforce test workflows. Because it can execute test suites, mutation tests, and git hooks, a compromise carries a high risk of arbitrary code execution on the developer's machine.
OWASP AIVSS score rationale
| Agentic risk factor | Score |
| --- | --- |
| Autonomy of Action | 0.60 |
| Goal-Driven Planning | 0.50 |
| Self-Modification | 0.10 |
| Dynamic Tool Use | 0.70 |
| Persistent Memory | 0.20 |
| Contextual Awareness | 0.70 |
| Dynamic Identity | 0.10 |
| Multi-Agent Interactions | 0.20 |
| Non-Determinism | 0.40 |
| Opacity & Reflexivity | 0.40 |
Scored with the canonical OWASP AIVSS formula (see the AIVSS calculator reference); the agentic risk factors are estimated from the agent’s described capabilities.
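To make the factor table easier to scan, the scores can be ranked from highest to lowest. The sketch below is only a reading aid: it does not reproduce the OWASP AIVSS formula, which the listing does not spell out, and the names `FACTORS` and `top_factors` are hypothetical.

```python
# Agentic risk factor scores copied from the table above.
FACTORS = {
    "Autonomy of Action": 0.60,
    "Goal-Driven Planning": 0.50,
    "Self-Modification": 0.10,
    "Dynamic Tool Use": 0.70,
    "Persistent Memory": 0.20,
    "Contextual Awareness": 0.70,
    "Dynamic Identity": 0.10,
    "Multi-Agent Interactions": 0.20,
    "Non-Determinism": 0.40,
    "Opacity & Reflexivity": 0.40,
}


def top_factors(factors, n=3):
    """Return the n highest-scoring (name, score) pairs, highest first.

    sorted() is stable, so factors with equal scores keep their table order.
    """
    return sorted(factors.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    for name, score in top_factors(FACTORS):
        print(f"{name}: {score:.2f}")
```

Ranked this way, Dynamic Tool Use and Contextual Awareness (0.70 each) lead, followed by Autonomy of Action (0.60), which matches the summary's emphasis on the plugin's ability to run commands against the local codebase.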
MAESTRO 7-layer threat model
Per-layer threats for this agent. Layers tagged “Not certain from the listing” carry general, caveated commentary because the public description did not pin down that layer.
Layer 1 (Foundation Models): Not certain from the listing. The plugin uses Claude Code (Anthropic models) under the hood. Threats include prompt injection that bypasses the quality gates or tricks the auditor into approving malicious code.
Layer 2 (Data Operations): Operates directly on local codebase files, git history, and test coverage reports. Risks include reading sensitive local files and exposure to malicious code repositories.
Layer 3 (Agent Frameworks): Orchestrates test execution, mutation testing, and quality audits. It is vulnerable to tool misuse if the plugin is tricked into executing arbitrary shell commands disguised as test runners.
Layer 4 (Deployment and Infrastructure): Not certain from the listing. Runs locally within the developer's environment or CI/CD pipeline. The listing gives no sandboxing details, so a compromise could lead to full compromise of the host or developer machine.
Layer 5 (Evaluation and Observability): Performs quality audits and coverage gating. It is vulnerable to evaluation gaming, where developers or malicious actors write dummy tests to bypass the coverage and quality gates.
Layer 6 (Security and Compliance): Not certain from the listing. No explicit security compliance, authorization, or policy enforcement mechanisms are detailed beyond local git/commit hooks.
Layer 7 (Agent Ecosystem): Not certain from the listing. Acts as a plugin to Claude Code, but no complex multi-agent coordination or marketplace interactions are explicitly defined.
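The tool-misuse threat described above (arbitrary shell commands disguised as test runners) is commonly narrowed by validating a proposed command against an allowlist before running it without a shell. The following is a minimal sketch of that idea, not tdd-guardian's actual behavior: `ALLOWED_RUNNERS` and `parse_test_command` are hypothetical names, and the listed runners are assumptions chosen for illustration.

```python
import shlex

# Hypothetical allowlist: executables that may be launched as test runners.
ALLOWED_RUNNERS = {"pytest", "jest", "vitest"}

# Characters that only matter when a string is interpreted by a shell.
SHELL_METACHARACTERS = set(";&|`$<>\n")


def parse_test_command(command: str) -> list[str]:
    """Return an argv list for an allowlisted runner, or raise ValueError."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        raise ValueError("shell metacharacters are not allowed")
    argv = shlex.split(command)  # raises ValueError on unbalanced quotes
    if not argv or argv[0] not in ALLOWED_RUNNERS:
        raise ValueError(f"runner not allowlisted: {argv[:1]}")
    return argv
```

The returned list is meant to be passed to `subprocess.run(argv, shell=False)`, so the string is never handed to a shell. This only constrains which executable starts; it does nothing about malicious code inside the tests themselves, which is why the missing sandboxing details noted above still matter.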
MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).