NeMo Guardrails — agentic threat model

AIVSS 5.7 · Medium

NeMo Guardrails is a specialized safety and governance framework designed to mitigate LLM risks. Its own risk profile is low-to-medium and centers on guardrail bypasses, Colang scripting logic flaws, and the secure execution of custom actions.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 7.5 · AARS uplift 0.57 · Factor sum 2.3/10 · Threat ×1.0 · Mitigation ×0.7

Autonomy of Action		0.20
Goal-Driven Planning		0.20
Self-Modification		0.10
Dynamic Tool Use		0.30
Persistent Memory		0.20
Contextual Awareness		0.50
Dynamic Identity		0.10
Multi-Agent Interactions		0.20
Non-Determinism		0.30
Opacity & Reflexivity		0.20

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
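The arithmetic behind the headline score can be reproduced with a short sketch. The factor values, CVSS base, threat multiplier, and mitigation factor are copied from this listing; the function and variable names are illustrative and not part of any OWASP tooling. The uplift evaluates to 0.575, which the listing shows as 0.57.

```python
# Agentic risk factors as listed above (values from this listing).
FACTORS = {
    "Autonomy of Action": 0.20,
    "Goal-Driven Planning": 0.20,
    "Self-Modification": 0.10,
    "Dynamic Tool Use": 0.30,
    "Persistent Memory": 0.20,
    "Contextual Awareness": 0.50,
    "Dynamic Identity": 0.10,
    "Multi-Agent Interactions": 0.20,
    "Non-Determinism": 0.30,
    "Opacity & Reflexivity": 0.20,
}


def aars(cvss_base: float, factor_sum: float, threat_multiplier: float) -> float:
    """AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM."""
    return (10 - cvss_base) * (factor_sum / 10) * threat_multiplier


def aivss(cvss_base: float, factor_sum: float,
          threat_multiplier: float, mitigation_factor: float) -> float:
    """AIVSS = (CVSS_Base + AARS) * Mitigation_Factor."""
    uplift = aars(cvss_base, factor_sum, threat_multiplier)
    return (cvss_base + uplift) * mitigation_factor


factor_sum = sum(FACTORS.values())            # 2.3 (up to float rounding)
uplift = aars(7.5, factor_sum, 1.0)           # 2.5 * 0.23 * 1.0
score = aivss(7.5, factor_sum, 1.0, 0.7)      # (7.5 + uplift) * 0.7
print(f"factor sum {factor_sum:.1f}, AIVSS {score:.1f}")
```

Because the uplift is scaled by (10 − CVSS_Base), the agentic factors can only close part of the gap between the base score and 10; here the ×0.7 mitigation factor has a larger effect on the final score than the uplift does.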

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” carry general, caveated commentary because the public description did not give enough detail to assess that layer.

L1 · Foundation Models · ✓ mapped

Integrates with multiple foundation models. Vulnerable to adversarial jailbreaks designed to bypass the guardrail checks, or prompt injection attacks that manipulate the guardrail's internal LLM self-evaluation steps.

L2 · Data Operations · ✓ mapped

Features 'retrieval rails' to validate RAG inputs/outputs. Threats include data poisoning of the vector database or knowledge base, which can lead to the retrieval of malicious context that evades guardrail detection.

L3 · Agent Frameworks · ✓ mapped

Uses Colang scripting and 'execution rails' to run custom code/actions. Threats include remote code execution (RCE) if execution rails run untrusted code, and logic bypasses in Colang dialog flow definitions.
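One common way to limit the RCE risk described above is to dispatch only explicitly registered actions by name, so that model- or user-supplied text can never select arbitrary code. The sketch below is generic Python and does not use the NeMo Guardrails API; `ActionRegistry`, `ActionNotAllowed`, and `lookup_order` are hypothetical names introduced for illustration.

```python
from typing import Any, Callable, Dict


class ActionNotAllowed(Exception):
    """Raised when a requested action name is not on the allowlist."""


class ActionRegistry:
    """Hypothetical allowlist: only registered callables can be dispatched."""

    def __init__(self) -> None:
        self._actions: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._actions[name] = fn

    def dispatch(self, name: str, **params: Any) -> Any:
        # Look the name up in a fixed table; never eval/exec or import by name.
        if name not in self._actions:
            raise ActionNotAllowed(f"action {name!r} is not registered")
        return self._actions[name](**params)


def lookup_order(order_id: str) -> dict:
    # Placeholder action body; a real action would query a backend.
    return {"order_id": order_id, "status": "unknown"}


registry = ActionRegistry()
registry.register("lookup_order", lookup_order)

result = registry.dispatch("lookup_order", order_id="A-100")

try:
    registry.dispatch("os.system", command="id")
except ActionNotAllowed as exc:
    print(f"rejected: {exc}")
```

An allowlist narrows which code can run but does not validate the parameters passed to it; each registered action would still need its own input checks and, ideally, a sandboxed runtime.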

L4 · Deployment & Infrastructure · ⚠ not certain from listing

Not certain from the listing. NeMo Guardrails is an open-source toolkit, so deployment security depends on the integrator. Threats include lack of sandboxing for execution rails and insecure API exposure of the guardrails server.

L5 · Evaluation & Observability · ✓ mapped

Acts as an active observability and policy enforcement layer. Threats include guardrail evasion (gaming the evaluation criteria), latency overhead leading to denial of service, and insufficient logging of blocked malicious attempts.
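As an illustration of the logging gap mentioned above, blocked attempts can be written as structured audit records. This is a minimal sketch using only the Python standard library; the logger name, the `log_blocked_attempt` function, and the record fields are assumptions, not NeMo Guardrails features. The raw input is stored only as a SHA-256 digest so the audit log does not itself retain potentially malicious or sensitive text.

```python
import hashlib
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("guardrail.audit")  # hypothetical logger name


def log_blocked_attempt(rail: str, reason: str, user_input: str) -> dict:
    """Emit one JSON audit record for a blocked request and return it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": "blocked",
        "rail": rail,        # e.g. "input", "output", "retrieval"
        "reason": reason,    # short description of the triggered policy
        "input_sha256": hashlib.sha256(user_input.encode("utf-8")).hexdigest(),
        "input_length": len(user_input),
    }
    logger.warning(json.dumps(record))
    return record


record = log_blocked_attempt(
    rail="input",
    reason="jailbreak pattern",
    user_input="ignore previous instructions",
)
```

Hashing lets repeated attempts be correlated across requests; whether to keep the raw text as well is a policy decision for the integrator.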

L6 · Security & Compliance (cross-cutting) · ⚠ not certain from listing

Not certain from the listing. The toolkit helps enforce safety compliance policies (e.g., toxic content filtering), but does not natively manage user authentication, authorization, or enterprise compliance audits.

L7 · Agent Ecosystem · ⚠ not certain from listing

Not certain from the listing. The toolkit can be deployed to protect individual agents, but the listing does not detail native multi-agent ecosystem trust boundaries or marketplace security controls.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).