UFO — agentic threat model

AIVSS 9.6 · Critical

UFO presents a high agentic risk profile: its dual-agent framework, driven by GPT-Vision, can execute arbitrary actions across Windows applications, and its basic description mentions no native sandboxing or safety guardrails.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 8.5 · AARS uplift 1.07 · Factor sum 6.5/10 · Threat ×1.1 · Mitigation ×1.0

Autonomy of Action		0.80
Goal-Driven Planning		0.90
Self-Modification		0.20
Dynamic Tool Use		0.90
Persistent Memory		0.50
Contextual Awareness		0.80
Dynamic Identity		0.30
Multi-Agent Interactions		0.80
Non-Determinism		0.70
Opacity & Reflexivity		0.60

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
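
The arithmetic behind the headline score can be checked with a short sketch. It applies the formula exactly as written above to the listed inputs; the function and variable names are illustrative assumptions and do not come from any official AIVSS tooling.

```python
# Agentic risk factors, copied from the table above.
FACTORS = {
    "Autonomy of Action": 0.80,
    "Goal-Driven Planning": 0.90,
    "Self-Modification": 0.20,
    "Dynamic Tool Use": 0.90,
    "Persistent Memory": 0.50,
    "Contextual Awareness": 0.80,
    "Dynamic Identity": 0.30,
    "Multi-Agent Interactions": 0.80,
    "Non-Determinism": 0.70,
    "Opacity & Reflexivity": 0.60,
}


def aars(cvss_base: float, factor_sum: float, threat_multiplier: float) -> float:
    """AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM."""
    return (10 - cvss_base) * (factor_sum / 10) * threat_multiplier


def aivss(cvss_base: float, factor_sum: float,
          threat_multiplier: float, mitigation_factor: float) -> float:
    """AIVSS = (CVSS_Base + AARS) * Mitigation_Factor."""
    uplift = aars(cvss_base, factor_sum, threat_multiplier)
    return (cvss_base + uplift) * mitigation_factor


# Inputs as listed for UFO: CVSS base 8.5, threat x1.1, mitigation x1.0.
factor_sum = sum(FACTORS.values())
uplift = aars(8.5, factor_sum, 1.1)
score = aivss(8.5, factor_sum, 1.1, 1.0)

print(f"Factor sum:  {factor_sum:.1f}/10")
print(f"AARS uplift: {uplift:.2f}")
print(f"AIVSS:       {score:.1f}")
```

With these inputs the factors sum to 6.5, the uplift is 1.5 × 0.65 × 1.1 ≈ 1.07, and the final score is (8.5 + 1.07) × 1.0 ≈ 9.6, matching the figures reported above.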

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” carry general, caveated commentary because the public description did not pin down that layer.

L1 · Foundation Models · ✓ mapped

Utilizes GPT-Vision as its core foundation model. Highly vulnerable to visual prompt injection, adversarial GUI elements, and model reprogramming via malicious application screens.

L2 · Data Operations · ⚠ not certain from listing

Not certain from the listing: vector stores, training data, and RAG pipelines are not specified. The agent does process live GUI screenshots and OS state data, which could contain sensitive user information.

L3 · Agent Frameworks · ✓ mapped

Employs a dual-agent framework (HostAgent and AppAgent) for orchestration. Vulnerable to tool misuse and insecure tool integration, as the agents translate natural language into direct OS and application control commands.

L4 · Deployment & Infrastructure · ✓ mapped

Deployed locally on Windows OS to navigate and control applications. This creates a high risk of host compromise, privilege escalation, and unauthorized local execution if the framework is manipulated.

L5 · Evaluation & Observability · ⚠ not certain from listing

Not certain from the listing — no explicit mention of built-in evaluation, logging, guardrails, or observability tools for monitoring agent actions or preventing destructive OS commands.

L6 · Security & Compliance (cross-cutting) · ⚠ not certain from listing

Not certain from the listing — no mention of identity management, authorization controls, or compliance frameworks for OS-level actions, suggesting it inherits the permissions of the logged-in user.

L7 · Agent Ecosystem · ✓ mapped

Features a dual-agent ecosystem (HostAgent and AppAgent). Vulnerable to agent-to-agent trust abuse, where a compromised AppAgent could feed malicious UI context to the HostAgent, leading to cascading execution failures.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).