Skip to main content

Command Palette

Search for a command to run...

Asking your AI to summarize a web page is now an attack surface

Updated
2 min read

Researchers showed a page can hide instructions that run when an assistant summarizes it.

https://www.theregister.com/research/2026/05/29/chatgpt-prompt-injection-turns-web-pages-into-phishing-lures/5248137

The model reads the page, obeys the buried text, and serves the user a fake security alert, a phishing link, or an attacker's QR code that moves the attack from the browser to a phone.

The user asked for a summary. They got an instruction they never saw.

This is the part most people get wrong about prompt injection.

The danger is not the model. It is the boundary.

The moment a model reads content it did not write, that content is code, and you have handed it your output channel.

We treat a query from a user as untrusted. We escape it, we never run it raw.

A web page summarized by an agent gets none of that suspicion.

My own defaults are plain. I keep untrusted content behind guardrails, and I reach for official docs through trusted MCP servers instead of the open page.

The pattern is the same one we already use everywhere else: • treat any content the model did not write as untrusted input • screen it before it reaches the model • screen the model's output before it reaches a user or a tool

Defense is moving this way too, with small guard models that sit inline and check agent input and output in real time.

None of this is exotic. It is the oldest rule in security, pointed at a new input.

Untrusted text was always dangerous.

Now it can talk back.

#ai #llm #llm-security