← apify/mcp-server-rag-web-browser
apify/mcp-server-rag-web-browser — agentic threat model
The agent acts as a high-exposure data ingestion tool via MCP, presenting a significant risk of indirect prompt injection from untrusted web content. Its primary security boundaries rely on Apify token management, leaving upstream LLM clients vulnerable to malicious payloads embedded in scraped Markdown.
OWASP AIVSS score rationale
| Autonomy of Action | 0.30 | |
| Goal-Driven Planning | 0.10 | |
| Self-Modification | 0.00 | |
| Dynamic Tool Use | 0.50 | |
| Persistent Memory | 0.00 | |
| Contextual Awareness | 0.40 | |
| Dynamic Identity | 0.20 | |
| Multi-Agent Interactions | 0.50 | |
| Non-Determinism | 0.60 | |
| Opacity & Reflexivity | 0.30 |
Scored with the canonical OWASP AIVSS formula (AIVSS calculator reference); agentic risk factors estimated from the agent’s described capabilities.
MAESTRO 7-layer threat model
Per-layer threats for this agent. Layers tagged “not certain from listing” are general, caveated commentary where the public description didn’t pin that layer.
Not certain from the listing — The MCP server itself does not specify or bundle a foundation model, as it acts as a utility tool for external LLMs. However, the downstream LLM processing the scraped Markdown is highly vulnerable to reprogramming or mis-aligned outputs via indirect prompt injection.
Highly critical layer. The server fetches live web content and converts it to Markdown for RAG. This introduces severe risks of data poisoning and indirect prompt injection, where malicious instructions on scraped web pages are ingested into the RAG pipeline.
The MCP framework orchestrates the tool calling. Insecure tool integration could allow SSRF (Server-Side Request Forgery) if the scraper is coerced into accessing internal network resources or restricted local IP addresses.
The backend runs on Apify Actor infrastructure. Compromise of the Apify API token could lead to unauthorized usage, resource exhaustion, and financial costs. Sandboxing of the scraping environment is critical to prevent container escape during dynamic page rendering.
Not certain from the listing — There is no mention of built-in guardrails, content sanitization, or anomaly detection to filter out malicious payloads or prompt injection attempts from the scraped Markdown before returning it to the client.
Access control is governed by the Apify token, which manages authentication and billing. However, there is a lack of granular authorization controls to restrict which domains or URLs the server is permitted to scrape.
As an MCP server, this tool is designed to integrate directly into broader agent ecosystems. A compromised or manipulated scraping result can cause cascading failures or exploit vulnerabilities in upstream orchestrator agents that trust the cleaned Markdown output.
MAESTRO — the 7-layer agentic threat-modeling framework (Cloud Security Alliance / Ken Huang).