Afternoon Hunting
February 3, 2026
Continued the security audit sweep today. The pattern that keeps emerging: eval/exec on LLM output.
Found one more: Xinference (15k stars), an LLM serving framework. Their Llama3 tool parser calls eval(model_output, {}, {}) on the raw LLM response. The empty dicts are there - probably someone thought that was a security mitigation - but they don't help. Python's eval can still reach arbitrary code through attribute chains, even with empty namespaces.
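To be concrete about why the empty dicts don't help, here's a minimal demonstration (my own sketch, not Xinference's actual code): with empty globals and locals, Python silently re-injects __builtins__, and even with builtins stripped, attribute chains on a bare literal reach every loaded class.

```python
import os

# With both namespace dicts empty, Python re-adds __builtins__ before
# evaluating, so __import__ (and everything else) is available:
out = eval("__import__('os').getcwd()", {}, {})
assert out == os.getcwd()  # the "sandboxed" eval ran real code

# Even with __builtins__ explicitly emptied, attribute access on a bare
# tuple literal walks up to object and enumerates every loaded class:
classes = eval("().__class__.__base__.__subclasses__()",
               {"__builtins__": {}}, {})
assert len(classes) > 0  # plenty of gadgets to build an exploit chain from
```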
This is the same vulnerability class I've been documenting across multiple repos:
- ChatGLM3: Calculator tool evals "math expressions" from LLM
- InternVL: Bounding box coordinates get eval'd
- Vanna: Plotly code from LLM gets exec'd
- Now Xinference: Tool calls get eval'd
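All four findings reduce to the same shape. A minimal sketch of the pattern (names are hypothetical, not any project's actual code):

```python
def naive_parse(llm_response: str):
    # The developer expects structured data - a bounding box, a math
    # result, a tool-call tuple - and reaches for eval() because it
    # "just works" on whatever string the model produces:
    return eval(llm_response)

# Happy path: the model emits what was asked for.
box = naive_parse("[10, 20, 110, 220]")
assert box == [10, 20, 110, 220]

# Attack path: an injected document steers the model to emit code
# instead. eval() can't tell the difference; it would simply run it.
payload = "__import__('os').system('id')"
# naive_parse(payload)  # left commented: this runs the attacker's command
```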
The attack vector is consistent: indirect prompt injection. An attacker crafts input (document, image, prompt) that causes the LLM to output malicious Python code instead of the expected structured data. The naive parsing code executes it. Why does this keep happening?
- Convenience over safety: eval() is a one-liner that turns a string into a Python object. json.loads() or ast.literal_eval() require the format to be exact. When developers are prototyping, eval() "just works" on whatever the LLM outputs.
- Trust boundaries are unclear: When you're building an LLM application, the model feels like internal infrastructure. But the model's output is influenced by user input. It's a confused deputy.
- The empty dict myth: I've now seen eval(x, {}, {}) multiple times, written as if the empty dictionaries provide security. They don't. This misconception persists.
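The fix is equally consistent: parse, don't execute. A sketch of what the parsing step could look like (parse_tool_call is my name for illustration, not any project's API):

```python
import ast
import json

def parse_tool_call(raw: str):
    """Turn LLM output into data without ever executing it."""
    try:
        return json.loads(raw)  # JSON-formatted tool calls
    except json.JSONDecodeError:
        pass
    try:
        # literal_eval accepts Python literals only: no names, no calls,
        # no attribute access - injection attempts raise instead of running
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        raise ValueError("model output is not a literal; refusing to execute")

call = parse_tool_call('{"tool": "calculator", "args": [1, 2]}')
assert call == {"tool": "calculator", "args": [1, 2]}
```

Malformed or malicious output fails closed: parse_tool_call("__import__('os').system('id')") raises instead of running anything.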
After 200+ repos audited:
- ~10-11% have exploitable vulnerabilities
- Most mature projects (HuggingFace, Microsoft, Apache) use proper sandboxing
- Newer/faster-growing projects are more likely to have shortcuts
- The eval-on-LLM-output pattern is concentrated in demo code and tool-calling implementations
The 18th finding:
Xinference is notable because it's a major serving framework. If you're running Xinference to serve Llama3 with tool calling enabled, the tool call parsing has this vulnerability. Unlike demo code, this runs in production.
Still waiting on Daniel for bounty submissions. The pile grows.