Pi Coding Agent — agentic threat model

AIVSS 9.4 · Critical

Pi Coding Agent presents a high-risk profile: it executes in the user's terminal and supports dynamic package installation (npm/git) without built-in sandboxing. Its extensibility makes it a powerful developer tool but leaves it exposed to prompt injection that escalates to remote code execution.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 8.8 · AARS uplift 0.63 · Factor sum 5.0/10 · Threat ×1.05 · Mitigation ×1.0

Autonomy of Action		0.60
Goal-Driven Planning		0.70
Self-Modification		0.40
Dynamic Tool Use		0.80
Persistent Memory		0.50
Contextual Awareness		0.60
Dynamic Identity		0.20
Multi-Agent Interactions		0.10
Non-Determinism		0.70
Opacity & Reflexivity		0.40

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
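The arithmetic behind the headline score can be reproduced as a short sketch. It uses only the formula and the values quoted in this section; the function and variable names (`aivss`, `FACTORS`, and so on) are illustrative assumptions, not part of any official AIVSS tooling.

```python
# Agentic risk factors exactly as listed in this section.
FACTORS = {
    "Autonomy of Action": 0.60,
    "Goal-Driven Planning": 0.70,
    "Self-Modification": 0.40,
    "Dynamic Tool Use": 0.80,
    "Persistent Memory": 0.50,
    "Contextual Awareness": 0.60,
    "Dynamic Identity": 0.20,
    "Multi-Agent Interactions": 0.10,
    "Non-Determinism": 0.70,
    "Opacity & Reflexivity": 0.40,
}


def aivss(cvss_base, factors, threat_multiplier, mitigation_factor):
    """Apply the formula quoted above; returns (factor_sum, aars, score)."""
    factor_sum = sum(factors.values())
    # AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM
    aars = (10 - cvss_base) * (factor_sum / 10) * threat_multiplier
    # AIVSS = (CVSS_Base + AARS) * Mitigation_Factor
    score = (cvss_base + aars) * mitigation_factor
    return factor_sum, aars, score


factor_sum, aars, score = aivss(
    cvss_base=8.8, factors=FACTORS, threat_multiplier=1.05, mitigation_factor=1.0
)
print(f"{factor_sum:.1f} {aars:.2f} {score:.1f}")  # prints "5.0 0.63 9.4"
```

With a CVSS base of 8.8 only 1.2 points of headroom remain, so the agentic factors add 1.2 × 0.5 × 1.05 = 0.63, and the unmitigated total of 9.43 is reported as 9.4.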

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” carry general, caveated commentary because the public description did not pin down that layer.

L1 · Foundation Models ⚠ not certain from listing

Not certain from the listing: the agent supports model switching and integration with many model providers, so L1 threats such as adversarial prompt injection or misaligned outputs depend heavily on the user-selected backend model.

L2 · Data Operations ⚠ not certain from listing

Not certain from the listing: the agent features context engineering and session history, but there is no mention of a dedicated vector database or RAG pipeline, so data poisoning and embedding inversion risks remain undefined.

L3 · Agent Frameworks ✓ mapped

The agent uses a minimal coding harness with TypeScript extensions, skills, and prompt templates. This extensibility introduces a high risk of tool misuse or insecure tool integration, especially when loading untrusted extensions.

L4 · Deployment & Infrastructure ⚠ not certain from listing

Not certain from the listing: as a terminal-first CLI and SDK-based tool, hosting and sandboxing are left entirely to the developer's local environment. Run unsandboxed, the agent acts with the invoking user's privileges, which poses a severe risk of local compromise and privilege escalation.

L5 · Evaluation & Observability ✓ mapped

The agent provides tree-structured session history, print/JSON modes, and RPC for observability. However, the listing mentions no built-in guardrails or automated drift detection that would prevent malicious execution.

L6 · Security & Compliance (cross-cutting) ⚠ not certain from listing

Not certain from the listing: the open-source, minimal nature of the harness suggests no built-in identity, authorization, or compliance policies, leaving all governance to the implementing developer.

L7 · Agent Ecosystem ✓ mapped

The agent supports installing shareable packages from npm or git. This introduces significant supply-chain risk: a compromised third-party package or a malicious 'skill' could execute arbitrary code in the developer's terminal.
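To make the supply-chain exposure concrete, the sketch below separates install specs that point at an immutable artifact from those that can silently change between installs. It assumes standard npm spec syntax (`name@version`, `git+https://...#ref`); the helper `is_pinned` is hypothetical and not part of Pi Coding Agent, and treating only an exact `x.y.z` version or a full 40-character commit hash as pinned is a deliberate simplification (pre-release versions, for example, are rejected).

```python
import re

# Hypothetical helper, not part of Pi Coding Agent or npm.
EXACT_SEMVER = re.compile(r"\d+\.\d+\.\d+")   # e.g. 1.3.0 (no ranges, no tags)
FULL_COMMIT = re.compile(r"[0-9a-f]{40}")      # full git commit hash


def is_pinned(spec: str) -> bool:
    """Return True if the install spec names an exact version or commit."""
    if spec.startswith(("git+", "git://")):
        # Git specs pin via the fragment after '#'; branches and tags can move.
        _, sep, ref = spec.partition("#")
        return bool(sep) and FULL_COMMIT.fullmatch(ref) is not None
    # Registry specs: split on the last '@' so scoped names (@scope/pkg) survive.
    name, sep, version = spec.rpartition("@")
    if not sep or not name:
        return False  # bare name resolves to whatever "latest" is at install time
    return EXACT_SEMVER.fullmatch(version) is not None


for spec in ["left-pad@1.3.0", "@scope/pkg@^2.0.0",
             "git+https://example.com/a/b.git#main"]:
    print(spec, "pinned" if is_pinned(spec) else "UNPINNED")
```

A check like this only narrows the window for a swapped package; it does not vouch for the pinned code itself, which still runs with the developer's privileges.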

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).