code-review-mcp — agentic threat model
The code-review-mcp agent presents a high-risk profile due to its ingestion of untrusted pull request diffs combined with active write access to GitHub repositories via GITHUB_TOKEN, creating a direct vector for prompt injection to execute unauthorized repository actions.
OWASP AIVSS score rationale
| Autonomy of Action | 0.70 | |
| Goal-Driven Planning | 0.40 | |
| Self-Modification | 0.10 | |
| Dynamic Tool Use | 0.60 | |
| Persistent Memory | 0.20 | |
| Contextual Awareness | 0.80 | |
| Dynamic Identity | 0.50 | |
| Multi-Agent Interactions | 0.10 | |
| Non-Determinism | 0.70 | |
| Opacity & Reflexivity | 0.60 |
Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
MAESTRO 7-layer threat model
Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.
Not certain from the listing — the underlying foundation model is not specified, but it is highly vulnerable to indirect prompt injection embedded within untrusted PR diffs, which could reprogram the model to approve malicious code or leak secrets.
The agent ingests pull request diffs and repository files as its primary data source. The primary threat is data poisoning and indirect prompt injection via malicious code comments or payload strings in the PR diffs.
The agent uses the Model Context Protocol (MCP) to list, inspect, and review PRs. Insecure tool integration is a major threat if the agent can be manipulated into executing arbitrary git commands or abusing the GitHub API beyond the intended review scope.
The agent relies on a GITHUB_TOKEN for authentication. If the hosting environment or the token itself is compromised, attackers gain direct write/read access to the target repository, potentially leading to supply chain compromise.
Not certain from the listing — there is no mention of logging, guardrails, or evaluation frameworks to detect if the agent has been compromised by a prompt injection attack or is generating biased/incorrect security reviews.
The agent relies on GitHub token authentication. Security controls depend heavily on the scope of the GITHUB_TOKEN (e.g., read-only vs. write permissions); lack of explicit policy enforcement at the agent level is a key gap.
The agent operates as an MCP tool, which can be integrated into broader multi-agent workflows. A compromised review agent could falsely validate malicious code generated or submitted by other upstream agents.
MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).