promptfoo-evaluation — agentic threat model

AIVSS 9.3 · Critical

This agent presents a high-risk profile: it can execute arbitrary Python evaluation scripts and run Promptfoo commands directly on the host system, so a compromised agent gives an attacker a direct path to remote code execution.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 8.8 · AARS uplift 0.53 · Factor sum 4.0/10 · Threat ×1.1 · Mitigation ×1.0

Autonomy of Action		0.60
Goal-Driven Planning		0.50
Self-Modification		0.30
Dynamic Tool Use		0.80
Persistent Memory		0.20
Contextual Awareness		0.40
Dynamic Identity		0.10
Multi-Agent Interactions		0.20
Non-Determinism		0.50
Opacity & Reflexivity		0.40

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
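
As a quick check, the score can be recomputed from the formula and the factor values listed above. The sketch below is only a worked arithmetic example; the function and variable names are illustrative and are not part of any official AIVSS tooling.

```python
# Inputs copied from the score breakdown above.
CVSS_BASE = 8.8
THREAT_MULTIPLIER = 1.1    # ThM in the formula
MITIGATION_FACTOR = 1.0

# Agentic risk factors as listed (each on a 0.0-1.0 scale).
FACTORS = {
    "Autonomy of Action": 0.60,
    "Goal-Driven Planning": 0.50,
    "Self-Modification": 0.30,
    "Dynamic Tool Use": 0.80,
    "Persistent Memory": 0.20,
    "Contextual Awareness": 0.40,
    "Dynamic Identity": 0.10,
    "Multi-Agent Interactions": 0.20,
    "Non-Determinism": 0.50,
    "Opacity & Reflexivity": 0.40,
}


def aivss_score(cvss_base, factors, thm, mitigation):
    """Return (factor_sum, aars, aivss) using the formula quoted above."""
    factor_sum = sum(factors.values())
    # AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM
    aars = (10 - cvss_base) * (factor_sum / 10) * thm
    # AIVSS = (CVSS_Base + AARS) * Mitigation_Factor
    aivss = (cvss_base + aars) * mitigation
    return factor_sum, aars, aivss


factor_sum, aars, score = aivss_score(
    CVSS_BASE, FACTORS, THREAT_MULTIPLIER, MITIGATION_FACTOR
)
print(f"Factor sum:  {factor_sum:.1f}/10")
print(f"AARS uplift: {aars:.2f}")
print(f"AIVSS:       {score:.1f}")
```

With these inputs the uplift is 1.2 × 0.4 × 1.1 = 0.528, which rounds to the 0.53 shown, and the total is 9.328, which rounds to 9.3.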

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.

L1 · Foundation Models · ✓ mapped

Utilizes LLMs for generating configurations, writing Python assertions, and acting as an 'llm-rubric' judge. Vulnerable to prompt injection that could manipulate evaluation results or influence generated Python code.

L2 · Data Operations · ✓ mapped

Handles test cases, few-shot examples, and evaluation datasets. Vulnerable to data poisoning where malicious test cases or assertions are injected to bypass security evaluations.

L3 · Agent Frameworks · ✓ mapped

Orchestrates the generation of promptfooconfig.yaml and Python scripts. Insecure tool integration is a major threat here, as the framework must safely handle and execute generated code.
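
One way to picture "safely handling" generated code is a static pre-check that runs before any generated script is executed. The sketch below is a hypothetical illustration, not something the listing says the agent does: the denylists and the function name `find_risky_constructs` are assumptions, and a denylist like this is easy to bypass, so it could at most complement real isolation.

```python
import ast

# Hypothetical denylists chosen for illustration; a real policy would differ.
BLOCKED_MODULES = {"os", "subprocess", "socket", "shutil", "ctypes"}
BLOCKED_CALLS = {"eval", "exec", "compile", "__import__", "open"}


def find_risky_constructs(source: str) -> list[str]:
    """Parse generated Python (without running it) and list flagged constructs."""
    findings = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # e.g. "import os" or "import os.path"
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    findings.append(f"line {node.lineno}: import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            # e.g. "from subprocess import run"
            if (node.module or "").split(".")[0] in BLOCKED_MODULES:
                findings.append(f"line {node.lineno}: from {node.module} import ...")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Direct calls such as eval(...) or open(...)
            if node.func.id in BLOCKED_CALLS:
                findings.append(f"line {node.lineno}: call to {node.func.id}()")
    return findings


# Two made-up examples of generated assertion code.
benign = "def get_assert(output, context):\n    return 'refund' in output.lower()\n"
hostile = "import os\ndef get_assert(output, context):\n    return eval(output)\n"

print(find_risky_constructs(benign))
print(find_risky_constructs(hostile))
```

The check only inspects the syntax tree, so nothing in the generated script is executed during review. Obfuscated code (for example, attribute lookups via `getattr`) would pass unnoticed, which is why this is a review aid rather than a security boundary.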

L4 · Deployment & Infrastructure · ✓ mapped

Runs Promptfoo and Python evaluation scripts directly on the host. This presents a severe threat of host compromise, privilege escalation, and arbitrary code execution if the generated scripts are not strictly sandboxed.
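
To make the sandboxing concern concrete, the sketch below shows a minimal way to reduce the exposure of a generated script: run it in a separate interpreter process with a timeout, a throwaway working directory, and a stripped environment. This is an assumption-laden illustration, not the agent's actual behavior, and it is not a true sandbox; the child process still has the invoking user's filesystem and network access. The helper name `run_untrusted_script` is hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def run_untrusted_script(source: str, timeout_s: float = 10.0) -> subprocess.CompletedProcess:
    """Run generated Python in a child process with basic containment.

    Illustrative only: limits runtime and hides environment variables,
    but does NOT restrict filesystem or network access.
    """
    # Pass almost nothing from the parent environment, so API keys and
    # tokens are not visible. SYSTEMROOT is kept because Windows needs it.
    minimal_env = {k: os.environ[k] for k in ("SYSTEMROOT",) if k in os.environ}

    with tempfile.TemporaryDirectory() as workdir:
        script_path = os.path.join(workdir, "assertion.py")
        with open(script_path, "w", encoding="utf-8") as fh:
            fh.write(source)
        return subprocess.run(
            [sys.executable, "-I", script_path],  # -I: isolated mode
            cwd=workdir,            # throwaway working directory
            env=minimal_env,        # stripped environment
            capture_output=True,
            text=True,
            timeout=timeout_s,      # raises subprocess.TimeoutExpired on overrun
        )


# A fake secret in the parent process should not reach the child.
os.environ["FAKE_API_KEY"] = "not-a-real-key"
result = run_untrusted_script(
    "import os\nprint(os.environ.get('FAKE_API_KEY'))\n"
)
print(result.returncode, result.stdout.strip())
```

Stronger isolation (containers, virtual machines, or OS-level sandboxes with no network and a read-only filesystem) would be needed to address the host-compromise and privilege-escalation threats described for this layer.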

L5 · Evaluation & Observability · ✓ mapped

Acts as an evaluation tool itself, but is vulnerable to evaluation gaming or blind spots if the 'llm-rubric' or Python assertions are manipulated to falsely report successful security passes.

L6 · Security & Compliance (cross-cutting) · ⚠ not certain from listing

Not certain from the listing — there is no mention of built-in authentication, access controls, or policy enforcement mechanisms to restrict who can run host-level evaluation scripts.

L7 · Agent Ecosystem · ⚠ not certain from listing

Not certain from the listing — the agent appears to operate as a standalone utility for prompt testing, with no explicit multi-agent coordination or ecosystem marketplace interactions described.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).