test-driven-development — agentic threat model
This agent skill enforces a strict TDD loop but has no built-in sandboxing or execution guards, so there is a high risk of arbitrary code execution if the underlying agent runs generated tests in an insecure environment.
OWASP AIVSS score rationale
| Agentic risk factor | Estimated value |
| --- | --- |
| Autonomy of Action | 0.30 |
| Goal-Driven Planning | 0.50 |
| Self-Modification | 0.10 |
| Dynamic Tool Use | 0.40 |
| Persistent Memory | 0.10 |
| Contextual Awareness | 0.30 |
| Dynamic Identity | 0.00 |
| Multi-Agent Interactions | 0.10 |
| Non-Determinism | 0.40 |
| Opacity & Reflexivity | 0.30 |
Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
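As a sanity check on the factor estimates, the sketch below tabulates the ten values and reports their sum, unweighted mean, and largest entry. It is deliberately not the AIVSS formula, which the listing does not reproduce; the names `FACTORS` and `summarize` are hypothetical helpers introduced only for this illustration.

```python
# Agentic risk factor estimates copied from the listing.
FACTORS = {
    "Autonomy of Action": 0.30,
    "Goal-Driven Planning": 0.50,
    "Self-Modification": 0.10,
    "Dynamic Tool Use": 0.40,
    "Persistent Memory": 0.10,
    "Contextual Awareness": 0.30,
    "Dynamic Identity": 0.00,
    "Multi-Agent Interactions": 0.10,
    "Non-Determinism": 0.40,
    "Opacity & Reflexivity": 0.30,
}


def summarize(factors: dict[str, float]) -> dict:
    """Return simple aggregates of the factor estimates.

    This is plain arithmetic for orientation only; it is NOT the
    OWASP AIVSS scoring formula.
    """
    total = sum(factors.values())
    mean = total / len(factors)
    highest = max(factors, key=factors.get)
    # Round to hide binary floating-point noise in the sums.
    return {"sum": round(total, 2), "mean": round(mean, 3), "highest": highest}


print(summarize(FACTORS))
```

The unweighted mean of the ten estimates is 0.25, and Goal-Driven Planning (0.50) is the largest single contributor, which matches the skill's emphasis on a planning loop.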
MAESTRO 7-layer threat model
Per-layer threats for this agent. Layers tagged “not certain from the listing” carry general, caveated commentary because the public description did not pin down that layer.
Layer 1, Foundation Models (not certain from the listing): The skill is instruction-only, so it relies entirely on the underlying foundation model. If that model is susceptible to prompt injection or jailbreaking, the TDD mandate can be bypassed or manipulated.
Layer 2, Data Operations (not certain from the listing): The listing does not mention data operations, RAG, or vector stores. However, if the agent reads a codebase, poisoned source files or malicious test inputs could exploit the code-generation process.
Layer 3, Agent Frameworks: The skill orchestrates a strict red-green-refactor planning loop. Threats include framework-level bypasses, where prompt injection overrides the 'watch-it-fail' mandate, and insecure tool integration when the generated tests are executed.
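One way a host could make the loop harder to bypass is to track the phase from test results it observes itself, rather than from what the model claims. The following is a minimal sketch under that assumption; `Phase` and `TddGate` are hypothetical names, not part of the skill.

```python
from enum import Enum


class Phase(Enum):
    RED = "red"            # a new test exists and must be seen failing
    GREEN = "green"        # implementation work is allowed until tests pass
    REFACTOR = "refactor"  # tests pass; cleanup is allowed


class TddGate:
    """Hypothetical host-side guard for the red-green-refactor order.

    The phase only advances on test results the host recorded itself,
    so text injected into the prompt cannot skip the failing-test step.
    """

    def __init__(self) -> None:
        self.phase = Phase.RED

    def may_edit_implementation(self) -> bool:
        # No production code may change until the new test has failed once.
        return self.phase is not Phase.RED

    def record_test_run(self, passed: bool) -> Phase:
        if self.phase is Phase.RED:
            if passed:
                raise RuntimeError("new test passed without ever being seen to fail")
            self.phase = Phase.GREEN
        elif self.phase is Phase.GREEN:
            if passed:
                self.phase = Phase.REFACTOR
        elif not passed:
            # A refactor broke the tests: go back to fixing the code.
            self.phase = Phase.GREEN
        return self.phase

    def start_next_cycle(self) -> None:
        if self.phase is not Phase.REFACTOR:
            raise RuntimeError("cannot start a new cycle before tests pass")
        self.phase = Phase.RED
```

A host would call `record_test_run` after each real test execution and consult `may_edit_implementation` before applying any edit to production code.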
Layer 4, Deployment and Infrastructure (not certain from the listing): The skill dictates the logic but not the execution environment. If the host agent runs the 'watch-it-fail' verification tests outside a secure sandbox, malicious generated code can execute arbitrary commands on the host.
Layer 5, Evaluation and Observability: The core mechanism uses 'watch-it-fail' verification as its evaluation signal. A major threat is evaluation gaming, where the agent writes trivial, self-passing, or fully mocked tests that satisfy the loop without implementing real, secure logic.
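A simple check against evaluation gaming is to require that each test fails against a stub and passes against the real implementation; a test that passes against the stub proves nothing. The sketch below illustrates the idea in-process, with tests modeled as functions that take the implementation under test. All names (`fails_against`, `classify_test`, and the example functions) are hypothetical.

```python
from typing import Any, Callable

Impl = Callable[..., Any]
Test = Callable[[Impl], None]


def fails_against(test: Test, impl: Impl) -> bool:
    """Return True if the test raises AssertionError for this implementation."""
    try:
        test(impl)
    except AssertionError:
        return True
    return False


def classify_test(test: Test, stub: Impl, impl: Impl) -> str:
    """Classify a test by its behavior on a stub and on the real code."""
    if not fails_against(test, stub):
        return "vacuous"    # passes with no real implementation behind it
    if fails_against(test, impl):
        return "still-red"  # implementation does not satisfy the test yet
    return "ok"             # failed first, passes now


# Example implementations and tests (illustrative only).
def stub_add(a: int, b: int) -> None:
    return None


def real_add(a: int, b: int) -> int:
    return a + b


def good_test(add: Impl) -> None:
    assert add(2, 3) == 5


def trivial_test(add: Impl) -> None:
    assert True  # never exercises the implementation


print(classify_test(good_test, stub_add, real_add))     # ok
print(classify_test(trivial_test, stub_add, real_add))  # vacuous
```

A test that mocks out the unit under test would be classified as vacuous in the same way, because it passes regardless of which implementation is supplied.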
Layer 6, Security and Compliance (not certain from the listing): As a free, open-source, instruction-only skill, it defines no built-in identity, authorization, or compliance controls at this layer.
Layer 7, Agent Ecosystem (not certain from the listing): No multi-agent interactions are specified. If the skill is integrated into a larger development ecosystem, however, a compromised TDD agent could propagate malicious code to downstream deployment agents.
MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).