Vocode — agentic threat model
Vocode's risk posture is centered on real-time voice orchestration, where vulnerabilities can lead to automated voice phishing (vishing), session hijacking, and telephony fraud. Its reliance on external LLM and STT/TTS providers introduces significant supply-chain and data-transit risks.
OWASP AIVSS score rationale
| Agentic risk factor | Score |
| --- | --- |
| Autonomy of Action | 0.50 |
| Goal-Driven Planning | 0.40 |
| Self-Modification | 0.10 |
| Dynamic Tool Use | 0.50 |
| Persistent Memory | 0.30 |
| Contextual Awareness | 0.60 |
| Dynamic Identity | 0.30 |
| Multi-Agent Interactions | 0.20 |
| Non-Determinism | 0.70 |
| Opacity & Reflexivity | 0.60 |
Scored with the canonical OWASP AIVSS formula (see the AIVSS calculator reference). The agentic risk factors are estimates based on the agent’s described capabilities, not measurements.
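To make the factor values easier to reason about, the sketch below aggregates them in Python. It assumes the ten factors are combined by plain summation into a single agentic risk total; that aggregation rule is an assumption here, not something the listing states. The CVSS base score and any threat multiplier are not given in the listing, so no final AIVSS score is computed. The names `FACTORS` and `aggregate_agentic_risk` are illustrative only.

```python
# Factor values copied from the score table in this section.
FACTORS = {
    "Autonomy of Action": 0.50,
    "Goal-Driven Planning": 0.40,
    "Self-Modification": 0.10,
    "Dynamic Tool Use": 0.50,
    "Persistent Memory": 0.30,
    "Contextual Awareness": 0.60,
    "Dynamic Identity": 0.30,
    "Multi-Agent Interactions": 0.20,
    "Non-Determinism": 0.70,
    "Opacity & Reflexivity": 0.60,
}


def aggregate_agentic_risk(factors: dict[str, float]) -> float:
    """Sum the factor values (ASSUMED aggregation rule, see the note above)."""
    for name, value in factors.items():
        # Each factor is expected to lie in the closed interval [0, 1].
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} is out of range: {value}")
    # Round to two decimals to hide floating-point noise from the sum.
    return round(sum(factors.values()), 2)


total = aggregate_agentic_risk(FACTORS)
highest = max(FACTORS, key=FACTORS.get)
print(f"Agentic risk total: {total} of a possible {len(FACTORS)}")
print(f"Highest-scoring factor: {highest}")
```

Under this assumption the largest single contributor is Non-Determinism, which matches the emphasis on unpredictable LLM output elsewhere in the threat model.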
MAESTRO 7-layer threat model
Per-layer threats for this agent. Layers tagged “not certain from the listing” carry general, caveated commentary because the public description did not pin down that layer.
Layer 1, Foundation Models: Integrates directly with external LLM providers. Vulnerable to prompt injection delivered by voice (spoken or over-the-air injection), adversarial audio inputs that bypass LLM safety filters, and misaligned or hallucinated voice outputs.
Layer 2, Data Operations (not certain from the listing): Vocode orchestrates real-time audio streams. Threats include exposure of transient voice data, lack of secure logging for transcriptions, and potential data exfiltration via compromised TTS/STT endpoints.
Layer 3, Agent Frameworks: Orchestrates the critical STT -> LLM -> TTS pipeline. Vulnerabilities include state desynchronization during real-time interruptions, race conditions in conversation handling, and insecure integration with telephony APIs (e.g., Twilio).
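The interruption and race-condition risks in the pipeline can be illustrated with a small asyncio sketch. It is not Vocode's API: `TurnManager`, `start_turn`, and `interrupt` are hypothetical names, and `asyncio.sleep` stands in for TTS playback. The idea shown is to serialize turn changes behind a lock, cancel in-flight speech when the caller barges in, and use a turn counter so stale output is discarded.

```python
import asyncio
from typing import Optional


class TurnManager:
    """Hypothetical helper for barge-in handling; not part of Vocode."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()      # serializes turn changes
        self._turn_id = 0                # bumped on every new turn or interruption
        self._speaking: Optional[asyncio.Task] = None
        self.spoken: list[str] = []      # chunks that were actually played

    async def start_turn(self, chunks: list[str]) -> int:
        async with self._lock:
            await self._cancel_current()
            self._turn_id += 1
            turn = self._turn_id
            self._speaking = asyncio.create_task(self._speak(turn, chunks))
            return turn

    async def interrupt(self) -> None:
        """Caller started talking: stop playback and invalidate the turn."""
        async with self._lock:
            await self._cancel_current()
            self._turn_id += 1

    async def _cancel_current(self) -> None:
        task, self._speaking = self._speaking, None
        if task is not None and not task.done():
            task.cancel()
            try:
                await task               # wait until the task has really stopped
            except asyncio.CancelledError:
                pass

    async def _speak(self, turn: int, chunks: list[str]) -> None:
        for chunk in chunks:
            await asyncio.sleep(0.02)    # stand-in for playing one TTS chunk
            if turn != self._turn_id:    # stale turn: drop remaining output
                return
            self.spoken.append(chunk)


async def demo() -> list[str]:
    manager = TurnManager()
    await manager.start_turn(["Hello,", "how", "can", "I", "help", "you", "today?"])
    await asyncio.sleep(0.05)            # caller barges in partway through
    await manager.interrupt()
    await asyncio.sleep(0.1)             # nothing more should be spoken
    return manager.spoken


spoken = asyncio.run(demo())
print(f"Chunks played before the interruption: {len(spoken)} of 7")
```

Waiting for the cancelled task inside the lock means a new turn cannot start while the old one is still appending output, which is the desynchronization this layer warns about.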
Layer 4, Deployment and Infrastructure (not certain from the listing): As an open-source framework, deployment is developer-managed. Key threats include insecure hosting of the orchestration server, exposed WebSockets for real-time audio, and leaked API keys for LLM/TTS providers.
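One common mitigation for leaked provider keys is to keep them out of source code, load them from the environment at startup, and fail fast when any are missing. The sketch below shows that pattern with the standard library only. The variable names in `REQUIRED_KEYS` are hypothetical placeholders, not names that Vocode or any provider requires, and `mask` is an illustrative helper for log-safe output.

```python
import os
from typing import Mapping

# Hypothetical variable names; substitute whatever the deployment actually uses.
REQUIRED_KEYS = ("LLM_API_KEY", "STT_API_KEY", "TTS_API_KEY")


def load_provider_keys(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the required keys, or raise before the server starts serving calls."""
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in REQUIRED_KEYS}


def mask(secret: str) -> str:
    """Log-safe form of a secret: only the last four characters stay visible."""
    if len(secret) <= 4:
        return "****"
    return "*" * (len(secret) - 4) + secret[-4:]


# Demo with an explicit mapping so the example does not depend on the real environment.
demo_env = {
    "LLM_API_KEY": "demo-llm-key-0001",
    "STT_API_KEY": "demo-stt-key-0002",
    "TTS_API_KEY": "demo-tts-key-0003",
}
keys = load_provider_keys(demo_env)
for name, value in keys.items():
    print(name, mask(value))
```

In a real deployment the call would be `load_provider_keys()` with no argument, so the values come from the process environment or a secrets manager that populates it.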
Layer 5, Evaluation and Observability (not certain from the listing): No explicit mention of built-in guardrails or real-time monitoring. Gaps in logging voice interactions could let prompt injections or abuse go undetected.
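A deployment can close part of this logging gap on its own side by recording each transcribed utterance as a structured audit entry, with obviously sensitive digit runs redacted first. The sketch below is a minimal standard-library illustration. The function names, the `voice_audit` logger name, and the redaction rule (six or more digits, optionally separated by spaces or dashes) are all assumptions chosen for the example, not Vocode features.

```python
import json
import logging
import re

# Six or more digits, optionally separated by single spaces or dashes
# (an illustrative rule aimed at card and account numbers read aloud).
DIGIT_RUN = re.compile(r"\d(?:[\s-]?\d){5,}")

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(name)s %(message)s")
audit_log = logging.getLogger("voice_audit")  # hypothetical logger name


def redact(text: str) -> str:
    """Replace long digit runs with a placeholder before logging."""
    return DIGIT_RUN.sub("[REDACTED]", text)


def transcript_record(call_id: str, speaker: str, text: str) -> str:
    """Build one JSON audit line for a transcribed utterance."""
    return json.dumps(
        {"call_id": call_id, "speaker": speaker, "text": redact(text)},
        sort_keys=True,
    )


record = transcript_record("call-001", "caller", "my card is 4111 1111 1111 1111 ok")
audit_log.info(record)
```

Structured entries like this make it practical to search transcripts later for injection attempts without storing raw payment or account numbers.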
Layer 6, Security and Compliance (not certain from the listing): Compliance details (such as HIPAA for voice data or GDPR for biometric/voice processing) are not specified. Telephony fraud and weak access controls on voice endpoints are key risks.
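For telephony webhooks, one access control is to verify that each inbound request really came from the telephony provider. The sketch below follows the request-signing scheme described in Twilio's webhook security documentation, as understood here: the request URL is concatenated with the POST parameters sorted by name, signed with HMAC-SHA1 using the account's auth token, and Base64-encoded for comparison with the `X-Twilio-Signature` header. The function names are illustrative, and in production the validator shipped with Twilio's official helper library is the safer choice.

```python
import base64
import hashlib
import hmac


def compute_signature(auth_token: str, url: str, params: dict[str, str]) -> str:
    """Signature over the URL plus the POST parameters sorted by name."""
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(
        auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_request(
    auth_token: str, url: str, params: dict[str, str], header_signature: str
) -> bool:
    """Constant-time comparison against the signature header value."""
    expected = compute_signature(auth_token, url, params)
    return hmac.compare_digest(
        expected.encode("utf-8"), header_signature.encode("utf-8")
    )


# Demo values are made up for illustration.
token = "demo-auth-token"
url = "https://example.com/voice/inbound"
params = {"CallSid": "CA0000", "From": "+15550100", "To": "+15550199"}

signature = compute_signature(token, url, params)
print(is_valid_request(token, url, params, signature))

tampered = dict(params, From="+15550123")
print(is_valid_request(token, url, tampered, signature))
```

Rejecting requests that fail this check keeps a spoofed caller or a third party from driving the voice agent through an exposed webhook URL.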
Layer 7, Agent Ecosystem (not certain from the listing): While Vocode supports customizable agents, no multi-agent marketplace is explicitly mentioned. Risks involve untrusted third-party STT/TTS/LLM integrations.
MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).