AutoGen — agentic threat model

AIVSS 9.0 · Critical

AutoGen presents a high-risk profile primarily due to its support for automated code generation and execution combined with multi-agent orchestration, which can lead to emergent vulnerabilities and remote code execution if not strictly sandboxed.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 9.8 · AARS uplift 0.16 · Factor sum 7.5/10 · Threat ×1.1 · Mitigation ×0.9

Autonomy of Action		0.80
Goal-Driven Planning		0.90
Self-Modification		0.70
Dynamic Tool Use		0.90
Persistent Memory		0.50
Contextual Awareness		0.80
Dynamic Identity		0.40
Multi-Agent Interactions		1.00
Non-Determinism		0.80
Opacity & Reflexivity		0.70

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
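As a check on the arithmetic, the formula quoted above can be evaluated directly from the listed values. This is a minimal sketch rather than the official calculator: the function name `aivss_score` and its parameter names are chosen here for illustration, and the inputs are copied from the card.

```python
# Risk-factor values copied from the table above.
FACTORS = {
    "Autonomy of Action": 0.80,
    "Goal-Driven Planning": 0.90,
    "Self-Modification": 0.70,
    "Dynamic Tool Use": 0.90,
    "Persistent Memory": 0.50,
    "Contextual Awareness": 0.80,
    "Dynamic Identity": 0.40,
    "Multi-Agent Interactions": 1.00,
    "Non-Determinism": 0.80,
    "Opacity & Reflexivity": 0.70,
}


def aivss_score(cvss_base, factors, threat_multiplier, mitigation_factor):
    """Return (factor_sum, aars, aivss) using the formula quoted above."""
    factor_sum = sum(factors.values())
    # AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM
    aars = (10 - cvss_base) * (factor_sum / 10) * threat_multiplier
    # AIVSS = (CVSS_Base + AARS) * Mitigation_Factor
    aivss = (cvss_base + aars) * mitigation_factor
    return factor_sum, aars, aivss


factor_sum, aars, aivss = aivss_score(9.8, FACTORS, 1.1, 0.9)
print(f"Factor sum: {factor_sum:.1f}/10")
print(f"AARS uplift: {aars:.3f}")
print(f"AIVSS: {aivss:.1f}")
```

With these inputs the factor sum is 7.5, the AARS uplift is about 0.165, and (9.8 + 0.165) × 0.9 ≈ 8.97, which rounds to the 9.0 shown on the card.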

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.

L1 · Foundation Models ⚠ not certain from listing

Not certain from the listing: AutoGen supports 'Flexible LLM integration' and 'various LLM configurations', so exposure to foundation-model threats (adversarial examples, data poisoning) depends on the LLMs the user selects.

L2 · Data Operations ⚠ not certain from listing

Not certain from the listing: the description does not specify built-in vector stores or RAG data operations, though these can be integrated as tools.

L3 · Agent Frameworks ✓ mapped

As an orchestration framework, AutoGen is highly exposed to tool misuse and insecure tool integration, particularly because it supports 'Code generation and execution', which prompt injection can steer into running attacker-chosen code.
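One way to picture a first-line check against injected code is a static screen that runs before anything is executed. The sketch below uses only Python's standard `ast` module and is not part of AutoGen; the denylists and the function name `find_violations` are hypothetical, and a screen like this is easy to bypass, so it would complement a sandbox rather than replace one.

```python
import ast

# Hypothetical denylists for illustration; a real policy would be stricter.
BLOCKED_MODULES = {"os", "subprocess", "socket", "shutil", "ctypes"}
BLOCKED_CALLS = {"eval", "exec", "compile", "__import__", "open"}


def find_violations(source: str) -> list[str]:
    """Return a list of policy violations found in generated Python source."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    violations = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import a.b" is judged by its top-level package "a".
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    violations.append(f"import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in BLOCKED_MODULES:
                violations.append(f"from {node.module} import ...")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Only direct calls by name are caught, e.g. eval("...").
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"call {node.func.id}()")
    return violations


print(find_violations("import os\nos.system('id')"))
print(find_violations("print(1 + 1)"))
```

A caller would refuse to execute any snippet for which the returned list is non-empty. Aliased or dynamically constructed calls slip past this kind of check, which is why it is only a coarse filter.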

L4 · Deployment & Infrastructure ⚠ not certain from listing

Not certain from the listing: hosting, sandboxing, and secrets management are left to the developer deploying the framework, though the code execution capability demands strict containerization.
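To make "strict containerization" concrete, the sketch below assembles a locked-down `docker run` command line for a generated script. It is an illustration under assumptions, not AutoGen's own executor: the helper name `sandboxed_docker_argv`, the `/workspace` mount point, the resource limits, and the example image tag are all chosen here, and the function only builds the argument list without invoking Docker.

```python
from pathlib import PurePosixPath


def sandboxed_docker_argv(image: str, script_path: str) -> list[str]:
    """Build a restrictive `docker run` argv for one generated script."""
    script = PurePosixPath(script_path)
    if not script.is_absolute():
        raise ValueError("script_path must be an absolute path")
    workdir = "/workspace"  # assumed mount point inside the container
    return [
        "docker", "run", "--rm",
        "--network", "none",                    # no network access
        "--read-only",                          # read-only root filesystem
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block privilege escalation
        "--pids-limit", "64",                   # cap process count
        "--memory", "256m",                     # cap memory
        "--cpus", "0.5",                        # cap CPU share
        "--user", "65534:65534",                # run as an unprivileged uid:gid
        "--tmpfs", "/tmp:rw,size=16m",          # small writable scratch space
        "--volume", f"{script.parent}:{workdir}:ro",  # job directory, read-only
        "--workdir", workdir,
        image, "python", f"{workdir}/{script.name}",
    ]


argv = sandboxed_docker_argv("python:3.12-slim", "/srv/jobs/42/task.py")
print(" ".join(argv))
```

The resulting list could be passed to `subprocess.run(argv, timeout=...)` so that a wall-clock limit applies as well. Secrets are deliberately absent from the command: nothing from the host environment is forwarded into the container.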

L5 · Evaluation & Observability ⚠ not certain from listing

Not certain from the listing: no native evaluation, logging, or guardrail mechanisms are explicitly detailed in the provided directory listing.

L6 · Security & Compliance (cross-cutting) ⚠ not certain from listing

Not certain from the listing: compliance, identity management, and access controls are not detailed and must be implemented externally by the deploying organization.

L7 · Agent Ecosystem ✓ mapped

Highly relevant: AutoGen's core features are 'Multi-agent collaboration' and 'conversations', which create significant risks of cascading failures, agent-to-agent trust abuse, and emergent rogue behaviors.
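Two of these risks, trust abuse between agents and runaway message cascades, can be bounded with an explicit routing policy and a turn budget. The sketch below is framework-agnostic and does not use AutoGen's API; the class name `ConversationGuard` and the agent names are hypothetical, and a real deployment would call such a check from whatever hook delivers messages between agents.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationGuard:
    """Allow only listed sender -> recipient routes and cap total turns."""

    allowed_routes: set[tuple[str, str]]
    max_turns: int = 10
    turns: int = field(default=0, init=False)

    def check(self, sender: str, recipient: str) -> None:
        # Reject messages on routes that were never explicitly trusted.
        if (sender, recipient) not in self.allowed_routes:
            raise PermissionError(f"route {sender} -> {recipient} not allowed")
        # Stop cascades by refusing delivery once the budget is spent.
        if self.turns >= self.max_turns:
            raise RuntimeError("turn budget exhausted")
        self.turns += 1


guard = ConversationGuard(
    allowed_routes={("planner", "coder"), ("coder", "planner")},
    max_turns=2,
)
guard.check("planner", "coder")
guard.check("coder", "planner")
print(guard.turns)
```

Here the planner and coder may talk to each other for two turns, while any message addressed to an unlisted agent, or any third turn, raises an error instead of being delivered.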

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).