diffdock (scientific-agent-skills) — agentic threat model

AIVSS 8.1 · High

This agent skill exposes a high-impact scientific model (DiffDock) and runs bundled Python/model code on the host. If it is integrated into an unsandboxed agent framework, this creates a significant risk of local code execution and code injection.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 7.5 · AARS uplift 0.6 · Factor sum 2.3/10 · Threat ×1.05 · Mitigation ×1.0

Autonomy of Action		0.20
Goal-Driven Planning		0.10
Self-Modification		0.00
Dynamic Tool Use		0.40
Persistent Memory		0.00
Contextual Awareness		0.30
Dynamic Identity		0.00
Multi-Agent Interactions		0.20
Non-Determinism		0.50
Opacity & Reflexivity		0.60

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
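The arithmetic behind the headline score can be checked with a short sketch. It only restates the formula and the figures listed above; the function and variable names (aars, aivss, FACTORS) are illustrative and are not part of any official AIVSS tooling.

```python
# Recompute the listed AIVSS score from the formula and factor values above.
# Function and variable names are illustrative, not an official AIVSS API.

FACTORS = {
    "Autonomy of Action": 0.20,
    "Goal-Driven Planning": 0.10,
    "Self-Modification": 0.00,
    "Dynamic Tool Use": 0.40,
    "Persistent Memory": 0.00,
    "Contextual Awareness": 0.30,
    "Dynamic Identity": 0.00,
    "Multi-Agent Interactions": 0.20,
    "Non-Determinism": 0.50,
    "Opacity & Reflexivity": 0.60,
}


def aars(cvss_base, factor_sum, threat_multiplier):
    """AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM."""
    return (10.0 - cvss_base) * (factor_sum / 10.0) * threat_multiplier


def aivss(cvss_base, factor_sum, threat_multiplier, mitigation_factor):
    """AIVSS = (CVSS_Base + AARS) * Mitigation_Factor."""
    uplift = aars(cvss_base, factor_sum, threat_multiplier)
    return (cvss_base + uplift) * mitigation_factor


factor_sum = sum(FACTORS.values())                       # about 2.3
uplift = aars(7.5, factor_sum, 1.05)                     # about 0.60
score = aivss(7.5, factor_sum, 1.05, 1.0)                # about 8.10

print(f"factor sum {factor_sum:.1f}, AARS uplift {uplift:.1f}, AIVSS {score:.1f}")
```

With the listed inputs, the uplift is (10 − 7.5) × 0.23 × 1.05 ≈ 0.60, so the total is 7.5 + 0.60 ≈ 8.1, matching the score shown at the top.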

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” carry general, caveated commentary because the public description did not pin down that layer.

L1 · Foundation Models · ✓ mapped

Uses a specialized diffusion-based model (DiffDock) for molecular docking. Threats include adversarial input manipulation of protein/ligand structures to produce false binding predictions, or model evasion.

L2 · Data Operations · ⚠ not certain from listing

Not certain from the listing: the skill relies on external structural biology files (PDB/SDF) as inputs. Where data provenance is not checked, a malicious or malformed structure file could trigger parser bugs, such as buffer overflows, in the underlying libraries.
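One way a deploying developer might reduce exposure to hostile structure files is a cheap pre-check before any parser sees the input. The following is a minimal sketch, not something the skill is known to ship: the function name check_structure_file, the allowed suffixes, and the 50 MB cap are all assumptions chosen for illustration.

```python
from pathlib import Path

# Assumed policy values, chosen for illustration only.
ALLOWED_SUFFIXES = {".pdb", ".sdf"}
MAX_BYTES = 50 * 1024 * 1024


def check_structure_file(path, max_bytes=MAX_BYTES):
    """Reject obviously unsuitable inputs before handing them to a parser.

    This is a hypothetical helper; it does not make a vulnerable parser safe,
    it only filters out inputs that are clearly not plain-text structure files.
    """
    p = Path(path)
    if p.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"unexpected file type: {p.suffix!r}")
    if p.is_symlink() or not p.is_file():
        raise ValueError("input must be a regular file, not a link or directory")
    size = p.stat().st_size
    if size == 0 or size > max_bytes:
        raise ValueError(f"file size {size} bytes is outside the accepted range")
    data = p.read_bytes()
    if b"\x00" in data:
        raise ValueError("NUL byte found; PDB and SDF are text formats")
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("file is not valid text") from exc
    return p.resolve()
```

A check like this complements, rather than replaces, running the parser itself in an isolated process.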

L3 · Agent Frameworks · ✓ mapped

The skill injects tool-specific guidance and runs Python/model code. Insecure tool integration or lack of input sanitization on the framework side could allow arbitrary code execution on the host.

L4 · Deployment & Infrastructure · ✓ mapped

Runs bundled Python/model code directly on the host. Without strict containerization or sandboxing, this poses a severe threat of host compromise and privilege escalation.
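As a rough illustration of reducing host exposure, the sketch below runs a command in a child process with a throwaway working directory, a stripped environment, and a timeout. It is not a sandbox and is no substitute for a container or VM. The helper name run_isolated, the PATH value, and the timeout are assumptions, and the command shown is a placeholder rather than an actual DiffDock invocation.

```python
import subprocess
import sys
import tempfile


def run_isolated(cmd, timeout_s=60):
    """Run cmd in a child process with limited inherited state.

    Hypothetical helper: it limits what the child inherits (environment,
    working directory, runtime) but does NOT restrict filesystem, network,
    or syscall access. Real isolation needs a container or VM.
    """
    with tempfile.TemporaryDirectory() as workdir:
        return subprocess.run(
            cmd,
            cwd=workdir,                      # scratch directory, deleted afterwards
            env={"PATH": "/usr/bin:/bin"},    # drop inherited secrets such as API keys
            capture_output=True,
            text=True,
            timeout=timeout_s,                # raises TimeoutExpired on a runaway job
            check=False,
        )


# Placeholder command; a real deployment would substitute its own entry point.
result = run_isolated([sys.executable, "-c", "print('ok')"])
print(result.returncode, result.stdout.strip())
```

Passing an explicit env matters here because agent hosts often hold credentials in environment variables that a child process would otherwise inherit.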

L5 · Evaluation & Observability · ⚠ not certain from listing

Not certain from the listing: no built-in logging, guardrails, or drift detection is mentioned for executions of this molecular docking skill.

L6 · Security & Compliance (cross-cutting) · ⚠ not certain from listing

Not certain from the listing: as an open-source skill, it describes no explicit identity, authorization, or compliance controls, which shifts security responsibility to the deploying developer.

L7 · Agent Ecosystem · ✓ mapped

Designed as a single skill within a larger scientific agent-skills library. If integrated into multi-agent workflows, compromised upstream agents could feed malicious inputs to trigger execution vulnerabilities.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).