OpenAI Codex — agentic threat model
OpenAI Codex acts primarily as a stateless code-generation foundation model rather than an autonomous agent, meaning its direct agentic risk is low; however, its downstream risk is high if generated code is executed without human-in-the-loop validation or sandboxing.
OWASP AIVSS score rationale
| Agentic risk factor | Score |
| --- | --- |
| Autonomy of Action | 0.10 |
| Goal-Driven Planning | 0.10 |
| Self-Modification | 0.00 |
| Dynamic Tool Use | 0.10 |
| Persistent Memory | 0.00 |
| Contextual Awareness | 0.40 |
| Dynamic Identity | 0.00 |
| Multi-Agent Interactions | 0.00 |
| Non-Determinism | 0.60 |
| Opacity & Reflexivity | 0.70 |
Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
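For readers who want to sort or aggregate the factor values, the table can be loaded into a short script. This is a minimal sketch for reading the table only: the sum and mean it computes are plain descriptive statistics, not the OWASP AIVSS score, whose formula is not reproduced in this listing. The names `FACTORS` and `top_factors` are hypothetical.

```python
# Factor values copied from the table in this listing.
FACTORS = {
    "Autonomy of Action": 0.10,
    "Goal-Driven Planning": 0.10,
    "Self-Modification": 0.00,
    "Dynamic Tool Use": 0.10,
    "Persistent Memory": 0.00,
    "Contextual Awareness": 0.40,
    "Dynamic Identity": 0.00,
    "Multi-Agent Interactions": 0.00,
    "Non-Determinism": 0.60,
    "Opacity & Reflexivity": 0.70,
}


def top_factors(factors, n=3):
    """Return the n highest-scoring factors, highest first."""
    return sorted(factors.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Descriptive statistics only; this is NOT the AIVSS formula.
total = round(sum(FACTORS.values()), 2)
mean = round(total / len(FACTORS), 2)

print("sum of factors:", total)
print("mean factor:", mean)
for name, score in top_factors(FACTORS):
    print(f"{name}: {score:.2f}")
```

Sorting this way makes it easy to see that the highest estimated factors are the ones tied to model behavior (opacity, non-determinism, contextual awareness) rather than to autonomy or tool use.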
MAESTRO 7-layer threat model
Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.
Layer 1 (Foundation Models): As a code-generation foundation model, Codex is susceptible to adversarial prompt injection (including indirect injection via comments or code in its context) and to model reprogramming, either of which can steer it into emitting malicious or backdoored code.
Layer 2 (Data Operations): Not certain from the listing. Training data operations and ingestion pipelines are proprietary, but a model trained on public code is exposed to training data poisoning, where malicious actors deliberately publish code containing subtle vulnerabilities.
Layer 3 (Agent Frameworks): Not certain from the listing. Codex is a model rather than an agent framework, but insecure integration by developers (e.g., passing Codex outputs directly to an exec() function) represents a severe tool-misuse threat.
Layer 4 (Deployment and Infrastructure): Not certain from the listing. Hosting infrastructure for the API and interactive demos is not detailed; any live demo that runs generated code would need strict sandboxing to prevent arbitrary code execution.
Layer 5 (Evaluation and Observability): Not certain from the listing. Guardrails and output filtering mechanisms are not specified, which leaves blind spots where insecure, vulnerable, or plagiarized code could be generated without detection.
Layer 6 (Security and Compliance): Not certain from the listing. Compliance alignments (such as SOC2 or the EU AI Act) are not mentioned, which raises potential compliance and intellectual property risks around the copyright and licensing of generated code.
Layer 7 (Agent Ecosystem): Not certain from the listing. No multi-agent ecosystem or marketplace interactions are described, though downstream integration into IDEs and developer workflows creates a wide attack surface.
MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).
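The tool-misuse threat noted above (feeding model output straight into exec()) can be reduced by gating execution. The sketch below is illustrative only: it statically rejects code that uses a small deny-list of imports and calls, requires an explicit human approval flag, and then runs the code in a separate interpreter process with a timeout. The deny-lists and the names `find_violations` and `run_generated_code` are hypothetical, and a deny-list plus a subprocess is not a real sandbox; production use would add OS-level isolation such as a container or VM.

```python
import ast
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical deny-lists for illustration; a real policy would be stricter.
BLOCKED_MODULES = {"os", "subprocess", "socket", "shutil", "ctypes"}
BLOCKED_CALLS = {"exec", "eval", "compile", "__import__", "open"}


def find_violations(source: str) -> list[str]:
    """Statically inspect generated code and list policy violations."""
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        return [f"syntax error: {err.msg}"]
    violations = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [(node.module or "").split(".")[0]]
        else:
            names = []
        violations += [f"blocked import: {n}" for n in names if n in BLOCKED_MODULES]
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in BLOCKED_CALLS):
            violations.append(f"blocked call: {node.func.id}")
    return violations


def run_generated_code(source: str, approved_by_human: bool,
                       timeout_s: float = 5.0) -> str:
    """Run generated code only after static checks and human approval."""
    violations = find_violations(source)
    if violations:
        raise PermissionError("; ".join(violations))
    if not approved_by_human:
        raise PermissionError("human approval required before execution")
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "generated.py"
        script.write_text(source, encoding="utf-8")
        # Separate process, isolated mode (-I), bounded runtime.
        result = subprocess.run(
            [sys.executable, "-I", str(script)],
            capture_output=True, text=True, timeout=timeout_s, cwd=workdir,
        )
    return result.stdout
```

A caller would pass the model's output as `source` and set `approved_by_human=True` only after a reviewer has read it. Static deny-lists are easy to bypass (for example through attribute access or `importlib`), so this gate is best treated as one layer in front of stronger isolation rather than a replacement for it.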
These scores are auto-generated from public information (the agent's own listing, docs, and repository) using the canonical OWASP AIVSS formula and the MAESTRO framework. They are an estimate for guidance, not a penetration test, audit, or certification.