TestDriver.ai — agentic threat model

AIVSS 8.7 · High

TestDriver.ai replaces deterministic selectors with AI vision to drive application UIs. This reduces maintenance overhead, but the reliance on visual interpretation introduces non-determinism and susceptibility to visual prompt injection, which could be exploited to manipulate test execution or compromise staging environments.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 7.5 · AARS uplift 1.23 · Factor sum 4.9/10 · Threat ×1.0 · Mitigation ×1.0

Autonomy of Action		0.70
Goal-Driven Planning		0.60
Self-Modification		0.20
Dynamic Tool Use		0.70
Persistent Memory		0.30
Contextual Awareness		0.80
Dynamic Identity		0.20
Multi-Agent Interactions		0.10
Non-Determinism		0.70
Opacity & Reflexivity		0.60

Scored with the canonical OWASP AIVSS formula; agentic risk factors are estimated from the agent’s described capabilities.
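The arithmetic behind these figures can be checked directly. The following is a minimal Python sketch of the formula as quoted in this section; the function name `aivss` and its parameter names are illustrative assumptions and are not taken from any OWASP tooling.

```python
# Agentic risk factors exactly as listed in the rationale (each 0.0 to 1.0).
FACTORS = {
    "Autonomy of Action": 0.70,
    "Goal-Driven Planning": 0.60,
    "Self-Modification": 0.20,
    "Dynamic Tool Use": 0.70,
    "Persistent Memory": 0.30,
    "Contextual Awareness": 0.80,
    "Dynamic Identity": 0.20,
    "Multi-Agent Interactions": 0.10,
    "Non-Determinism": 0.70,
    "Opacity & Reflexivity": 0.60,
}


def aivss(cvss_base, factors, threat_multiplier=1.0, mitigation_factor=1.0):
    """Apply the formula quoted above (hypothetical helper, not an official API).

    AARS  = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM
    AIVSS = (CVSS_Base + AARS) * Mitigation_Factor
    """
    factor_sum = sum(factors.values())
    aars = (10 - cvss_base) * (factor_sum / 10) * threat_multiplier
    score = (cvss_base + aars) * mitigation_factor
    return factor_sum, aars, score


factor_sum, aars, score = aivss(7.5, FACTORS)
print(f"Factor sum: {factor_sum:.1f}/10")
print(f"AARS uplift: {aars:.3f}")
print(f"AIVSS: {score:.3f}")
```

With a CVSS base of 7.5 and both multipliers at 1.0, the factor sum is 4.9, the AARS uplift is 2.5 × 0.49 = 1.225 (displayed as 1.23), and the AIVSS is 7.5 + 1.225 = 8.725, which rounds to the 8.7 reported at the top of the page.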

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” carry general, caveated commentary because the public description did not pin down that layer.

L1 · Foundation Models · ✓ mapped

Uses AI vision models to interpret user interfaces. This introduces susceptibility to adversarial UI elements (visual prompt injection) where malicious or unexpected UI layouts could trick the model into executing unintended actions or bypassing critical test assertions.

L2 · Data Operations · ⚠ not certain from listing

Not certain from the listing — details on how screenshots, DOM states, and test data are stored, processed, or protected against data exfiltration and baseline poisoning are not provided.

L3 · Agent Frameworks · ✓ mapped

Translates visual understanding into execution steps (clicks, keystrokes). A flaw in the orchestration framework could lead to tool misuse, such as executing destructive actions (e.g., clicking 'Delete Account') if the vision model misinterprets the UI context.
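To make the tool-misuse scenario concrete, here is a minimal Python sketch of a check placed between the vision model's proposed action and its execution. Everything in it is hypothetical: `ProposedAction`, `DESTRUCTIVE_KEYWORDS`, and `is_allowed` are illustrative names, and nothing here reflects TestDriver.ai's actual API or internals, which the listing does not describe.

```python
from dataclasses import dataclass

# Hypothetical keyword list; a real deployment would tune this per application.
DESTRUCTIVE_KEYWORDS = ("delete", "remove", "destroy", "drop", "reset")


@dataclass(frozen=True)
class ProposedAction:
    """An action the vision model wants to perform (illustrative type)."""
    kind: str          # e.g. "click" or "type"
    target_label: str  # text the model read from the targeted UI element


def is_allowed(action, allow_destructive=False):
    """Return False for actions whose target label looks destructive,
    unless the test step has explicitly opted in."""
    label = action.target_label.lower()
    if any(word in label for word in DESTRUCTIVE_KEYWORDS):
        return allow_destructive
    return True


print(is_allowed(ProposedAction("click", "Sign in")))
print(is_allowed(ProposedAction("click", "Delete Account")))
```

A check like this only narrows the problem: it depends on the label the model itself read, so a misread or adversarially rendered label could still slip past it.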

L4 · Deployment & Infrastructure · ⚠ not certain from listing

Not certain from the listing — the hosting environment, runner sandboxing, and network isolation controls for executing these vision-driven tests are not specified.

L5 · Evaluation & Observability · ⚠ not certain from listing

Not certain from the listing — the mechanisms for monitoring test execution drift, logging visual decision-making paths, and establishing guardrails against runaway test loops are not detailed.

L6 · Security & Compliance (cross-cutting) · ⚠ not certain from listing

Not certain from the listing — compliance certifications (e.g., SOC2), access controls, and enterprise governance policies for managing test runner credentials are not mentioned.

L7 · Agent Ecosystem · ✓ mapped

Operates as an automated agent within the software development lifecycle. If integrated into CI/CD pipelines, a compromised or manipulated testing agent could be used to falsely pass malicious builds or block legitimate deployments.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).

These scores are auto-generated from public information (the agent's own listing, docs, and repository) using the canonical OWASP AIVSS formula and the MAESTRO framework. They are an estimate for guidance, not a penetration test, audit, or certification.