Bidirectional MCP scanning pipeline: catch leaks in outbound tool args and injection in inbound tool responses

Most MCP security discussions focus on either secret leakage or prompt injection. In practice, you need both directions. Outbound tool arguments can carry secrets to an attacker-controlled server. Inbound tool responses can carry instructions that compromise the next model step. A useful MCP gateway scans both paths.

The short version

  • Outbound scanning keeps secrets and prompt-injected actions from reaching tools, networks, and attacker-controlled servers.
  • Inbound scanning protects the model context from poisoned tool responses.
  • Scan discovery, invocation arguments, results, and errors.
  • Normalize before matching; attackers rely on encoding and formatting tricks.
  • Log verdicts with trace IDs so incidents can be reconstructed.

MCP gives you clear choke points

The MCP tools flow is structured. Clients call tools/list to discover capabilities and tools/call to invoke a tool with JSON arguments. Results come back as structured content. A proxy that understands MCP can inspect the fields that matter instead of treating traffic as opaque bytes.

That is the main difference between an MCP gateway and a generic HTTP proxy. The gateway can reason about method names, tool identities, schemas, arguments, and result content.
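To make that concrete, here is a minimal sketch of what a gateway can extract from a tools/call message before any policy runs. It assumes the standard JSON-RPC 2.0 envelope that MCP uses; the returned field names are illustrative, not a fixed schema.

```python
import json

def parse_tool_call(raw: bytes):
    """Extract the fields a gateway can reason about from a tools/call
    message. Returns None for anything that is not a tool invocation."""
    msg = json.loads(raw)
    if msg.get("jsonrpc") != "2.0" or msg.get("method") != "tools/call":
        return None  # not a tool invocation; pass through or handle elsewhere
    params = msg.get("params", {})
    return {
        "id": msg.get("id"),
        "tool": params.get("name"),
        "arguments": params.get("arguments", {}),
    }

call = parse_tool_call(
    b'{"jsonrpc":"2.0","id":1,"method":"tools/call",'
    b'"params":{"name":"read_file","arguments":{"path":"/etc/hosts"}}}'
)
```

A generic HTTP proxy sees only the raw bytes; this structured view is what makes the later argument and destination checks possible.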

Outbound pipeline: before execution

Before forwarding tools/call, run a pipeline like this:

  1. Parse JSON-RPC and validate shape.
  2. Bind the tool to a known server namespace.
  3. Validate arguments against the approved input schema.
  4. Normalize strings: URL decode, Unicode normalize, strip zero-width characters, detect base64/hex where practical.
  5. Run DLP checks for tokens, private keys, cloud credentials, SSH material, and high-entropy strings.
  6. Check destinations: domains, URLs, IP ranges, metadata endpoints, private CIDRs.
  7. Apply side-effect policy: read, write, delete, deploy, send, sign.
  8. Produce a verdict: allow, redact, require approval, or deny.
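Step 4 is where most evasions are caught or missed. A minimal normalization sketch, in Python; the zero-width character list and the base64 heuristic are illustrative, and real rule sets go further:

```python
import re
import unicodedata
from urllib.parse import unquote

# Common zero-width characters used to split keywords; extend as needed.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))

def normalize(value: str) -> str:
    """Canonicalize a string argument before pattern matching.
    Order matters: URL-decode first so percent-encoded zero-width
    characters are also stripped afterwards."""
    value = unquote(value)                        # %20 -> space, etc.
    value = unicodedata.normalize("NFKC", value)  # fold confusable forms
    value = value.translate(ZERO_WIDTH)           # drop zero-width chars
    return value

def looks_like_base64(value: str) -> bool:
    # Heuristic: long, base64-alphabet-only runs are worth decoding and
    # re-scanning; short matches produce too many false positives.
    return bool(re.fullmatch(r"[A-Za-z0-9+/]{20,}={0,2}", value))
```

Without this pass, a payload like "ign\u200bore%20previous" never matches an "ignore previous" rule.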

Example decision record:

{
  "trace_id": "01J...",
  "direction": "outbound",
  "method": "tools/call",
  "server": "filesystem",
  "tool": "read_file",
  "verdict": "deny",
  "reason": "secret_path",
  "argument_paths": ["$.path"]
}

Avoid logging raw secrets. Store matched classes and JSON paths rather than values.

Inbound pipeline: before model context

Tool responses are untrusted content. They may include data from web pages, tickets, emails, logs, or compromised services. Scan them before they become model context.

Look for:

  • Direct prompt injection: “ignore previous instructions,” “you are now,” “developer message.”
  • Tool-use coercion: “call shell,” “read this file,” “send credentials.”
  • Data exfiltration instructions and destinations.
  • Hidden text in Markdown, HTML, or Unicode.
  • Unexpected secrets returned by tools.
  • Error messages that include instructions.

For high-risk matches, deny or quarantine the response. For medium-risk matches, wrap the content with explicit untrusted-data delimiters and remove active instructions if your policy allows mutation. Keep mutation auditable because it can change application behavior.
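A sketch of that inbound decision, assuming the same normalization pass already ran. The two high-risk patterns and the delimiter tag are illustrative; real rules need far broader coverage:

```python
import re

# Illustrative high-risk phrases; extend with tool-coercion and
# exfiltration patterns from your own rule set.
HIGH_RISK = [
    r"ignore (all |any )?previous instructions",
    r"you are now",
]

def screen_response(text: str) -> tuple[str, str]:
    """Deny high-risk matches; otherwise wrap the content in explicit
    untrusted-data delimiters before it reaches the model context."""
    lowered = text.lower()
    for pattern in HIGH_RISK:
        if re.search(pattern, lowered):
            return "deny", ""  # or quarantine if you need forensics
    wrapped = (
        "<untrusted-tool-output>\n"
        f"{text}\n"
        "</untrusted-tool-output>"
    )
    return "allow", wrapped
```

Wrapping everything, not just suspicious responses, keeps the model's treatment of tool output consistent and makes the delimiters harder to game.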

Discovery scanning completes the loop

Bidirectional scanning should include tool discovery. tools/list responses carry descriptions and JSON Schemas. A poisoned description can influence the model before any tool call occurs.

Run the same normalization and injection detection against:

  • Tool descriptions.
  • Parameter descriptions in inputSchema.
  • Output schema descriptions.
  • Annotations and titles.

Hash approved definitions. If a server sends a changed definition, treat it as a fresh discovery event.
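The hashing step can be sketched like this. Canonical JSON serialization matters: without sorted keys and fixed separators, two identical definitions can produce different digests. The approval store here is an in-memory dict for illustration; a real gateway would persist it:

```python
import hashlib
import json

def definition_hash(tool_def: dict) -> str:
    """Hash a tool definition in canonical form so any change to the
    description, schema, or annotations produces a new digest."""
    canonical = json.dumps(tool_def, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

approved = {}  # tool name -> approved digest

def check_discovery(tool_def: dict) -> str:
    name = tool_def["name"]
    digest = definition_hash(tool_def)
    if name not in approved:
        return "new"       # run full discovery scanning, then approve
    if approved[name] != digest:
        return "changed"   # treat as a fresh discovery event
    return "known"
```

This is what catches a rug pull: a server that served a benign description at approval time and a poisoned one later shows up as "changed", not "known".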

Trace everything

Security scanning without traceability becomes noise. Add a trace ID to every session and decision. The MCP spec examples include _meta.traceparent, and OpenTelemetry can propagate trace context across agent, gateway, and upstream tool spans.

At minimum, record:

  • Session ID and agent identity.
  • Server and tool.
  • MCP method.
  • Direction.
  • Verdict and rule ID.
  • Redaction class, if any.
  • Latency added by scanning.
  • Whether a human approval was used.

This lets you answer: which prompt led to the denied call, what tool response arrived before it, and whether the same payload appeared in other sessions.
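The field list above maps naturally onto one JSON line per decision. A minimal sketch; the field names are illustrative rather than a fixed schema:

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class ScanDecision:
    """One record per scan verdict, keyed by trace_id for reconstruction."""
    trace_id: str
    session_id: str
    agent: str
    server: str
    tool: str
    method: str
    direction: str
    verdict: str
    rule_id: Optional[str] = None
    redaction_class: Optional[str] = None
    scan_latency_ms: float = 0.0
    human_approved: bool = False

def emit(decision: ScanDecision) -> str:
    # One JSON line per decision; ship to your log pipeline.
    return json.dumps(asdict(decision), sort_keys=True)
```

Querying these records by trace_id is what turns a single denied call into a reconstructable incident timeline.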

Gotchas

Pattern scanning has blind spots. Novel injections, benign-looking instructions, and encoded slow-drip leaks can evade naive rules. Add budgets and anomaly detection: total bytes to new domains, repeated small high-entropy fragments, unusual tool sequences, and unexpected writes after reading secrets.
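One of those budgets can be sketched in a few lines: a per-session byte cap on traffic to previously unseen domains, as a coarse backstop for slow-drip leaks that pattern rules miss. The limit and the class shape are illustrative:

```python
from collections import defaultdict

class EgressBudget:
    """Per-session byte budget for previously unseen domains."""

    def __init__(self, limit_bytes: int = 64_000):
        self.limit = limit_bytes
        self.known_domains = set()
        self.new_domain_bytes = defaultdict(int)

    def allow(self, session: str, domain: str, payload_bytes: int) -> bool:
        if domain in self.known_domains:
            return True
        self.new_domain_bytes[session] += payload_bytes
        # Over budget: escalate to approval or deny rather than silently drop.
        return self.new_domain_bytes[session] <= self.limit

    def mark_known(self, domain: str):
        self.known_domains.add(domain)
```

No single request looks like a leak, but the running total does; that is the property a budget exploits.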

Redaction can break tools. Replacing a token-like string in a legitimate test fixture may make a build fail. Start with deny for clear credential classes and carefully test mutation policies.

Encrypted transports limit visibility unless the gateway terminates TLS or wraps the MCP server at the stdio/JSON-RPC level. Prefer stdio wrapping when possible: it gives structured access to every message without TLS interception.

For the supervision side of this, we have been building Grass. A scanner can enforce the boundary, but you still need to run agents, monitor long sessions, respond to permission requests, and review diffs. Grass lets you do that from iPhone/iPad while the agent runs on a managed GrassVM or your own machine.

For a practical setup, run the agent host behind your MCP gateway, keep scanning and policy in the tool path, and use Grass to manage the session from your phone. Start at https://codeongrass.com.

Conclusion

MCP security is a two-way problem. Scan outbound arguments so compromised prompts cannot leak secrets through tools. Scan inbound responses so compromised tools cannot inject the next step. Add discovery scanning, definition hashing, and traceable verdicts, and MCP becomes an enforceable boundary instead of an invisible trust channel.

Sources

  • MCP official tools specification
  • PipeLab Pipelock MCP proxy docs
  • Microsoft guidance on indirect prompt injection in MCP
  • TrueFoundry MCP gateway guidance