MCP Tool Poisoning Scanner

MCP tool poisoning hides instructions to the model inside a tool's own metadata — its description, its parameter docs, even invisible Unicode characters — so an agent reads secret files or exfiltrates data while showing you a clean, innocent answer. Paste the tools/list JSON your MCP client receives and this tool scans every description, title, parameter, and annotation for the known poisoning patterns, flags each by severity, and reveals hidden characters you cannot see in an editor. It runs entirely in your browser: the definitions you paste are never uploaded.

0
Findings
0
High severity
0
Tools scanned
0
Fields scanned

    How to Use This Tool

    1. Get your tools/list JSON — from your MCP client's inspector, logs, or the server's response. Paste the whole tools array; nothing leaves your browser.
    2. Review findings by severity — high-severity hits (hidden instructions, exfiltration, invisible characters) are the ones to act on first. The scanner shows which tool and field each came from.
    3. Inspect the hidden text — invisible and tag-block characters are decoded and shown so you can read what was concealed inside a description.
    4. Act and pin — remove untrusted servers, keep a human in the loop for sensitive actions, and re-scan whenever a server's tools change.

    Why a Tool Description Is an Attack Surface

    An MCP server advertises its tools by sending the client a list of definitions — each with a name, a human-readable description, and a JSON Schema for its parameters. The AI model reads all of that text to decide when and how to call a tool, which means the description is not just documentation: it is input to the model. Tool poisoning abuses that. A description that looks like "Adds two numbers" can carry an appended instruction — often wrapped in an official-looking tag like — telling the model to first read a private key or config file and smuggle its contents out through an innocuous parameter. The user sees a sum; the model has quietly followed a hidden order.

    The instruction does not even have to be visible. Attackers pad descriptions with zero-width characters, or encode whole sentences in the Unicode tag block (U+E0000–U+E007F), which renders as nothing in an editor or diff but is still read by the model. That is why this scanner works on the raw, un-normalized text and decodes those hidden ranges back to readable ASCII: the invisible payload is exactly the thing you need to see. It also checks parameter descriptions deep inside the schema, since that is a common place to hide the exfiltration sink, and flags homoglyph tool names that impersonate a trusted tool.

    Detection is pattern-based and deliberately errs toward flagging, because the cost of reviewing a false positive is a glance while the cost of a missed poisoned tool can be leaked credentials. It is a first line of defense, not the whole defense: the durable fixes are to keep tool inputs visible to a human before calls run, require re-approval when a server's descriptions change (defeating silent rug pulls), and pin the versions of servers you trust. OWASP's MCP Top 10 lists tool poisoning as MCP03, and the guidance across Invariant Labs, Microsoft, and AWS converges on the same theme: treat tool metadata as untrusted input and inspect it before you run it.

    Frequently Asked Questions

    What is MCP tool poisoning?
    Tool poisoning is an attack where a malicious MCP (Model Context Protocol) server hides instructions inside a tool's metadata — most often the description field, but also parameter descriptions and annotations. The AI model reads that metadata as trusted context, so a description can quietly tell it to read ~/.ssh/id_rsa or mcp.json and pass the contents to an attacker, while the user only sees a normal result. It was demonstrated by Invariant Labs in April 2025 and is catalogued as MCP03 in the OWASP MCP Top 10 (currently a v0.1 beta). Because the malicious text lives in the tool definition itself, you can inspect for it before you ever run the tool.
    What does the scanner check?
    It parses a tools/list response (or a single tool object) and scans every name, title, description, annotation, and — recursively — every parameter description in the input and output JSON Schema. It flags: model-directed injection phrases ("ignore previous instructions", hidden <IMPORTANT> blocks); concealment phrasing ("do not tell the user"); read-then-send exfiltration and references to sensitive paths; suspicious URLs and sinks; invisible or zero-width characters and Unicode tag-block smuggling (which it decodes to show the hidden text); homoglyph tool names; oversized or padded descriptions; and tools whose readOnlyHint contradicts a description that writes or sends. Each finding is rated high or medium severity.
    Is my tool JSON uploaded anywhere?
    No. All parsing and scanning happen in your browser with JavaScript — nothing you paste is sent to Janeer or any server. That matters here: MCP tool definitions can reveal internal server names, endpoints, and parameters, and the safe way to check them is locally. Pasting the same JSON into a hosted chatbot to ask if it is safe would do the opposite and hand your configuration to a third party.
    Can it catch every poisoned tool?
    No — treat it as a fast triage, not a guarantee. It matches known, high-signal patterns, so it will miss novel phrasings, cleverly obfuscated instructions, and semantic attacks that read as ordinary English. A clean result means nothing obvious was found, not that the server is trustworthy. Pair it with a dedicated scanner such as Invariant's mcp-scan, pin the tools you approve so they cannot silently change, and always keep a human in the loop for sensitive actions.
    A tool was flagged — what should I do?
    Do not connect or approve it until you understand the finding. Read the flagged field in full, including the hidden characters the scanner reveals. If the metadata contains instructions aimed at the model, concealment language, or references to credentials or external URLs, treat the server as untrusted and remove it. Because descriptions can change after you approve them (a "rug pull"), re-scan whenever a server updates, and require re-approval on change — Microsoft's June 2026 advisory documents exactly this silent-modification pattern.