Mentat — agentic threat model

AIVSS 8.2 · High

Mentat poses a high local security risk: it modifies multiple files across a codebase directly from the command line, so a successful prompt injection could introduce malicious code or exfiltrate sensitive project context.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 8.4 · AARS uplift 0.67 · Factor sum 4.2/10 · Threat ×1.0 · Mitigation ×0.9

Autonomy of Action		0.60
Goal-Driven Planning		0.70
Self-Modification		0.10
Dynamic Tool Use		0.60
Persistent Memory		0.20
Contextual Awareness		0.80
Dynamic Identity		0.10
Multi-Agent Interactions		0.00
Non-Determinism		0.60
Opacity & Reflexivity		0.50

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
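
As a check on the arithmetic, the formula and the figures listed above can be reproduced in a few lines of Python. This is a minimal sketch: the function and variable names (`aars_uplift`, `aivss_score`, `FACTORS`) are illustrative and do not come from any official AIVSS tooling; only the formula and the input values are taken from this page.

```python
# Agentic risk factors exactly as listed on this page.
FACTORS = {
    "Autonomy of Action": 0.60,
    "Goal-Driven Planning": 0.70,
    "Self-Modification": 0.10,
    "Dynamic Tool Use": 0.60,
    "Persistent Memory": 0.20,
    "Contextual Awareness": 0.80,
    "Dynamic Identity": 0.10,
    "Multi-Agent Interactions": 0.00,
    "Non-Determinism": 0.60,
    "Opacity & Reflexivity": 0.50,
}


def aars_uplift(cvss_base: float, factor_sum: float, threat_multiplier: float) -> float:
    """AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM."""
    return (10 - cvss_base) * (factor_sum / 10) * threat_multiplier


def aivss_score(cvss_base: float, factor_sum: float,
                threat_multiplier: float, mitigation_factor: float) -> float:
    """AIVSS = (CVSS_Base + AARS) * Mitigation_Factor."""
    uplift = aars_uplift(cvss_base, factor_sum, threat_multiplier)
    return (cvss_base + uplift) * mitigation_factor


factor_sum = sum(FACTORS.values())
uplift = aars_uplift(8.4, factor_sum, 1.0)
score = aivss_score(8.4, factor_sum, 1.0, 0.9)

print(f"Factor sum:  {factor_sum:.1f}")  # 4.2
print(f"AARS uplift: {uplift:.2f}")      # 0.67
print(f"AIVSS:       {score:.1f}")       # 8.2
```

The unrounded result is about 8.16, which rounds to the 8.2 shown in the headline score.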

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.

L1 · Foundation Models · ✓ mapped

Utilizes GPT-4, exposing the tool to prompt injection, indirect prompt injection via poisoned codebase files, and model-reprogramming risks that could alter code generation behavior.

L2 · Data Operations · ✓ mapped

Integrates deeply with project context and Git. Risks include data exfiltration of sensitive local files, environment variables, or hardcoded secrets if the agent is manipulated into reading and transmitting them.
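
One way to reduce this exposure is to screen file contents for secret-like strings before they are added to the model's context. The sketch below is illustrative only and does not describe how Mentat behaves: `find_secret_like_lines` and the three patterns are hypothetical, and a real deployment would more likely rely on a dedicated secret scanner with a much larger rule set.

```python
import re

# Hypothetical, deliberately small rule set; it will miss many real secrets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "credential_assignment": re.compile(
        r"(api[_-]?key|secret|token|password)\w*\s*[:=]\s*['\"]?[^\s'\"]{8,}",
        re.IGNORECASE,
    ),
}


def find_secret_like_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for each line that looks like a secret."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits


sample = "name = 'demo'\nAPI_KEY = 'abcd1234efgh5678'\nprint('hello')\n"
print(find_secret_like_lines(sample))  # [(2, 'credential_assignment')]
```

A caller could skip or redact any file with hits before it reaches the model, at the cost of occasional false positives.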

L3 · Agent Frameworks · ✓ mapped

Orchestrates multi-file edits directly. Insecure tool integration could allow an attacker to craft malicious inputs that force the agent to write arbitrary code, delete files, or execute unauthorized Git commands.
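
A common guard against arbitrary file writes is to confirm that every path the agent proposes to edit resolves inside the project root. The following is a minimal sketch under that assumption; `is_inside_project` is a hypothetical helper and not part of Mentat.

```python
import tempfile
from pathlib import Path


def is_inside_project(project_root: str, candidate: str) -> bool:
    """True if `candidate` resolves to the project root or a path beneath it.

    Relative candidates are interpreted against the root; absolute ones are
    taken as-is. resolve() collapses ".." and follows existing symlinks, so
    traversal and symlink escapes are both compared by their real location.
    """
    root = Path(project_root).resolve()
    target = (root / candidate).resolve()
    return target == root or root in target.parents


with tempfile.TemporaryDirectory() as root:
    print(is_inside_project(root, "src/app.py"))      # True
    print(is_inside_project(root, "../outside.txt"))  # False
```

The check covers write targets only; deletions and Git commands would need their own allowlist.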

L4 · Deployment & Infrastructure · ✓ mapped

Runs locally via a Command-Line Interface (CLI). If executed in an unsandboxed environment, a compromise of the agent translates directly to local host compromise and potential privilege escalation.

L5 · Evaluation & Observability · ⚠ not certain from listing

Not certain from the listing — there is no mention of built-in guardrails, output verification, or logging mechanisms to detect anomalous file modifications before they are written.
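
To illustrate the kind of pre-write verification this layer refers to, the sketch below renders a proposed edit as a unified diff and counts the changed lines, so a human or a policy check can review the change before it reaches disk. It is a generic example built on Python's standard `difflib` module; `render_diff` and `count_changes` are hypothetical names, and nothing here reflects Mentat's actual behavior.

```python
import difflib


def render_diff(path: str, old_text: str, new_text: str) -> str:
    """Return a unified diff of a proposed edit, or "" if nothing changed."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)


def count_changes(diff_text: str) -> tuple[int, int]:
    """Count (added, removed) lines, ignoring the +++ and --- file headers."""
    added = removed = 0
    for line in diff_text.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            added += 1
        elif line.startswith("-") and not line.startswith("---"):
            removed += 1
    return added, removed


proposed = render_diff("app.py", "x = 1\ny = 2\n", "x = 1\ny = 3\n")
print(proposed)
print(count_changes(proposed))  # (1, 1)
```

Logging the diff and its counts for every write would also give an audit trail for spotting unusually large or unexpected modifications.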

L6 · Security & Compliance (cross-cutting) · ⚠ not certain from listing

Not certain from the listing — as an open-source CLI tool, it likely lacks enterprise-grade access controls, compliance certifications, or centralized policy enforcement, relying instead on the user's local permissions.

L7 · Agent Ecosystem · ⚠ not certain from listing

Not certain from the listing — Mentat operates as a standalone developer tool and does not explicitly feature multi-agent collaboration or marketplace integrations.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).

These scores are auto-generated from public information (the agent's own listing, docs, and repository) using the canonical OWASP AIVSS formula and the MAESTRO framework — an estimate for guidance, not a penetration test, audit, or certification. See the scoring methodology. Are you the vendor? Factual corrections are free.