Droidrun — agentic threat model

AIVSS 9.4 · Critical

Droidrun presents a high-risk profile because it translates natural language commands into direct on-device actions on Android and iOS, so prompt injection or tool misuse becomes a direct vector for device compromise.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 8.8 · AARS uplift 0.65 · Factor sum 5.4/10 · Threat ×1.0 · Mitigation ×1.0

Agentic risk factor	Score
Autonomy of Action	0.80
Goal-Driven Planning	0.70
Self-Modification	0.10
Dynamic Tool Use	0.90
Persistent Memory	0.30
Contextual Awareness	0.80
Dynamic Identity	0.20
Multi-Agent Interactions	0.30
Non-Determinism	0.70
Opacity & Reflexivity	0.60

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
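
The score can be re-derived from the published inputs. The following is a minimal sketch in Python that applies the formula exactly as stated in this listing to the ten factor values shown; the function and variable names (`aivss_score`, `FACTORS`, and so on) are illustrative and are not part of any OWASP tooling.

```python
# Inputs copied from the score card in this listing.
CVSS_BASE = 8.8
THREAT_MULTIPLIER = 1.0   # ThM
MITIGATION_FACTOR = 1.0

FACTORS = {
    "Autonomy of Action": 0.80,
    "Goal-Driven Planning": 0.70,
    "Self-Modification": 0.10,
    "Dynamic Tool Use": 0.90,
    "Persistent Memory": 0.30,
    "Contextual Awareness": 0.80,
    "Dynamic Identity": 0.20,
    "Multi-Agent Interactions": 0.30,
    "Non-Determinism": 0.70,
    "Opacity & Reflexivity": 0.60,
}


def aivss_score(cvss_base, factors, thm, mitigation):
    """Apply the formula as written in the listing.

    AARS  = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM
    AIVSS = (CVSS_Base + AARS) * Mitigation_Factor
    """
    factor_sum = sum(factors.values())
    aars = (10 - cvss_base) * (factor_sum / 10) * thm
    aivss = (cvss_base + aars) * mitigation
    return factor_sum, aars, aivss


factor_sum, aars, aivss = aivss_score(
    CVSS_BASE, FACTORS, THREAT_MULTIPLIER, MITIGATION_FACTOR
)
print(f"Factor sum:  {factor_sum:.1f}/10")
print(f"AARS uplift: {aars:.2f}")
print(f"AIVSS:       {aivss:.1f}")
```

Worked by hand: the factors sum to 5.4, so AARS = (10 − 8.8) × 0.54 × 1.0 = 0.648, and AIVSS = (8.8 + 0.648) × 1.0 = 9.448, which rounds to the 9.4 shown.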

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” carry general, caveated commentary because the public description did not pin down that layer.

L1 · Foundation Models ⚠ not certain from listing

Not certain from the listing. Droidrun is a framework and likely relies on external LLMs (e.g., from OpenAI or Anthropic) or local models to parse natural language, which would expose it to prompt injection and adversarial reprogramming that translate into malicious device commands.

L2 · Data Operations ⚠ not certain from listing

Not certain from the listing. The listing does not detail how the framework handles device state, screenshots, or UI hierarchy data; sensitive data could be exposed or exfiltrated if device telemetry is sent to untrusted model endpoints.

L3 · Agent Frameworks ✓ mapped

Droidrun orchestrates LLM agents to control mobile OS environments. The primary threat is tool misuse and insecure tool integration, where malicious or hijacked natural language commands are translated into destructive device actions (e.g., deleting files, sending unauthorized messages).

L4 · Deployment & Infrastructure ✓ mapped

The framework operates directly on Android and iOS devices or emulators. This presents severe risks of privilege escalation, host compromise, and unauthorized access to device APIs (ADB, accessibility services) if the execution environment is not strictly sandboxed.

L5 · Evaluation & Observability ⚠ not certain from listing

Not certain from the listing. The listing does not mention built-in guardrails, execution logging, or real-time monitoring that would detect anomalous or malicious device commands before they are executed on the target OS.

L6 · Security & Compliance (cross-cutting) ⚠ not certain from listing

Not certain from the listing. The listing for this open-source framework does not detail built-in authentication, authorization, or policy enforcement mechanisms that restrict which commands can be executed on connected mobile devices.
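
To make "policy enforcement" concrete, the following is a minimal sketch of a default-deny gate that an operator might place between an agent's proposed action and the device. Everything in it is hypothetical: the `Action` type, the `authorize` function, and the action names are illustrative assumptions and are not taken from Droidrun's actual API or tool set.

```python
from dataclasses import dataclass, field

# Hypothetical action names, chosen for illustration only.
ALLOWED_ACTIONS = {"tap", "swipe", "input_text", "screenshot"}
# Higher-impact actions that require explicit human confirmation.
CONFIRM_ACTIONS = {"send_message", "install_app"}


@dataclass
class Action:
    """A device action proposed by the agent (hypothetical shape)."""
    name: str
    args: dict = field(default_factory=dict)


def authorize(action: Action, confirmed: bool = False) -> str:
    """Return 'allow', 'needs_confirmation', or 'deny' for an action.

    Default-deny: anything not explicitly listed is rejected.
    """
    if action.name in ALLOWED_ACTIONS:
        return "allow"
    if action.name in CONFIRM_ACTIONS:
        return "allow" if confirmed else "needs_confirmation"
    return "deny"


if __name__ == "__main__":
    proposed = [
        Action("tap", {"x": 120, "y": 480}),
        Action("send_message", {"to": "contact", "body": "hello"}),
        Action("delete_file", {"path": "/sdcard/example.txt"}),
    ]
    for action in proposed:
        print(action.name, "->", authorize(action))
```

The design choice worth noting is the default-deny fall-through: an action that a hijacked prompt invents, such as the file deletion above, is rejected unless someone has deliberately listed it.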

L7 · Agent Ecosystem ⚠ not certain from listing

Not certain from the listing. The framework supports 'LLM agents', but it is unclear whether it facilitates multi-agent coordination or marketplace integrations. If it does, a rogue agent could in theory gain control of the device interface.

MAESTRO: the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).