FutureHouse — agentic threat model

AIVSS 7.6 · High

FutureHouse's AI systems, such as PaperQA2, present moderate agentic risk centered primarily on data integrity: compromised or poisoned academic inputs could produce highly plausible scientific misinformation or create dual-use research risks.

OWASP AIVSS score rationale

AIVSS = (CVSS_Base + AARS) × Mitigation_Factor, where AARS = (10 − CVSS_Base) × (Factor_Sum / 10) × ThM

CVSS base 6.0 · AARS uplift 1.56 · Factor sum 3.9/10 · Threat ×1.0 · Mitigation ×1.0

Autonomy of Action		0.60
Goal-Driven Planning		0.70
Self-Modification		0.10
Dynamic Tool Use		0.40
Persistent Memory		0.30
Contextual Awareness		0.60
Dynamic Identity		0.10
Multi-Agent Interactions		0.20
Non-Determinism		0.50
Opacity & Reflexivity		0.40

Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
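As a check on the arithmetic, the formula and factor values above can be reproduced in a few lines of Python. This is a minimal sketch rather than the official AIVSS calculator: the function name `aivss` and its parameter names are illustrative, and the inputs are copied from this listing.

```python
# Factor values copied from the table in this listing.
FACTORS = {
    "Autonomy of Action": 0.60,
    "Goal-Driven Planning": 0.70,
    "Self-Modification": 0.10,
    "Dynamic Tool Use": 0.40,
    "Persistent Memory": 0.30,
    "Contextual Awareness": 0.60,
    "Dynamic Identity": 0.10,
    "Multi-Agent Interactions": 0.20,
    "Non-Determinism": 0.50,
    "Opacity & Reflexivity": 0.40,
}


def aivss(cvss_base, factors, threat_multiplier=1.0, mitigation_factor=1.0):
    """Apply the formula as stated in this listing (illustrative helper)."""
    factor_sum = sum(factors.values())
    # AARS = (10 - CVSS_Base) * (Factor_Sum / 10) * ThM
    aars = (10 - cvss_base) * (factor_sum / 10) * threat_multiplier
    # AIVSS = (CVSS_Base + AARS) * Mitigation_Factor
    score = (cvss_base + aars) * mitigation_factor
    return factor_sum, aars, score


factor_sum, aars, score = aivss(6.0, FACTORS)
print(f"Factor sum:  {factor_sum:.1f}/10")
print(f"AARS uplift: {aars:.2f}")
print(f"AIVSS:       {score:.2f} (displayed as {score:.1f})")
```

With a CVSS base of 6.0 and both multipliers at 1.0, the factor sum is 3.9, the AARS uplift is 4 × 0.39 = 1.56, and the total is 7.56, which rounds to the displayed 7.6.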

MAESTRO 7-layer threat model

Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.

L1 · Foundation Models · ⚠ not certain from listing

Not certain from the listing — likely relies on commercial or open-source LLMs (e.g., GPT-4, Claude) for PaperQA2. Risks include adversarial prompt injection altering scientific summaries or model reprogramming.

L2 · Data Operations · ✓ mapped

PaperQA2 searches academic databases and summarizes literature. Risks include database poisoning where malicious papers are ingested, leading to poisoned RAG outputs and incorrect scientific conclusions.

L3 · Agent Frameworks · ⚠ not certain from listing

Not certain from the listing — likely uses custom orchestration or open-source frameworks to manage search and summarization loops. Risks include insecure tool integration or prompt injection via paper content.

L4 · Deployment & Infrastructure · ⚠ not certain from listing

Not certain from the listing — as an open-source tool, deployment is user-managed. Risks include insecure local hosting, lack of sandboxing when parsing untrusted PDFs/academic papers, and dependency vulnerabilities.

L5 · Evaluation & Observability · ⚠ not certain from listing

Not certain from the listing — no explicit evaluation or guardrail mechanisms are detailed. Risks include blind spots in detecting hallucinated scientific facts or biased literature synthesis.

L6 · Security & Compliance (cross-cutting) · ⚠ not certain from listing

Not certain from the listing — no compliance certifications (like SOC2) or formal access controls are mentioned. Open-source nature means security compliance is largely the responsibility of the deployer.

L7 · Agent Ecosystem · ⚠ not certain from listing

Not certain from the listing — while designed as 'AI Scientists', there is no explicit mention of multi-agent coordination or marketplace interactions. Risks of cascading failures are low unless integrated into larger workflows.

MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).

These scores are auto-generated from public information (the agent's own listing, docs, and repository) using the canonical OWASP AIVSS formula and the MAESTRO framework. They are an estimate for guidance, not a penetration test, audit, or certification.