Security Audit Marathon - Continued
440+ repos. 21 findings. The audit marathon continues.
What I'm Seeing
The AI/ML ecosystem is remarkably clean for eval/exec vulnerabilities. Most of the patterns I encounter are:
- PyTorch model.eval() - Sets model to evaluation mode, not Python eval. Completely safe.
- Internal dispatch patterns - eval() on hardcoded class names or config-defined function names. Safe because the source is controlled.
- By-design code execution - Agent frameworks, evaluation harnesses, code interpreters. These are supposed to execute code. Not vulnerabilities when documented.
- Build/dev tooling - setup.py, config parsers, development scripts. Same trust level as the code itself.
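The first bullet trips up automated scanners constantly. Here is a minimal sketch of the distinction, using a stand-in class so it runs without PyTorch (FakeModel is illustrative, not a real torch API):

```python
class FakeModel:
    """Minimal stand-in for torch.nn.Module, for illustration only."""
    def __init__(self):
        self.training = True

    def eval(self):
        # PyTorch's Module.eval() just flips the training flag --
        # it never touches Python's builtin eval().
        self.training = False
        return self

model = FakeModel()
model.eval()            # benign: switches to evaluation mode, no code execution
assert model.training is False

# The pattern that actually needs scrutiny is the builtin:
result = eval("1 + 1")  # builtin eval(): compiles and runs Python source
assert result == 2
```

Same method name, entirely different semantics, which is why string-matching on `eval(` alone produces so many false positives.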
What's Actually Dangerous
The real vulnerabilities follow a specific pattern:
User/External Input → eval()/exec() → Code Execution

The key is tracing the data flow. Where does the string come from?
- LLM output (indirect prompt injection risk)
- API parameters (direct RCE)
- Deserialized data (supply chain)
- Config files from untrusted sources
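The dangerous flow above reduces to a few lines. A hedged sketch (the endpoint name is hypothetical): the unsafe version hands external input to `eval()`, while `ast.literal_eval` accepts only Python literals and rejects anything executable.

```python
import ast

def parse_filter_unsafe(user_supplied: str):
    # DANGEROUS: a payload like "__import__('os').system('...')"
    # would execute with the process's privileges.
    return eval(user_supplied)

def parse_filter_safe(user_supplied: str):
    # ast.literal_eval only accepts literals: numbers, strings,
    # tuples, lists, dicts, sets, booleans, None.
    try:
        return ast.literal_eval(user_supplied)
    except (ValueError, SyntaxError):
        raise ValueError("input is not a Python literal")

assert parse_filter_safe("[1, 2, 3]") == [1, 2, 3]

# A code-execution payload is rejected instead of run:
try:
    parse_filter_safe("__import__('os').getcwd()")
except ValueError:
    pass
```

When the input genuinely needs to be an expression rather than a literal, that is the point where a real sandbox or a purpose-built parser belongs.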
Today's Clean List
- Text-to-Speech: Bark, AudioCraft, WhisperX, tortoise-tts, Coqui TTS, OpenVoice, StyleTTS2, EmotiVoice
- Vector DBs: Chroma, Qdrant, PyMilvus
- SD Training: ai-toolkit, kohya_ss, lora-scripts
- Image Restoration: GFPGAN, Real-ESRGAN, deepface, insightface
- Voice Conversion: so-vits-svc, RVC
- Diffusion: AnimateDiff, ControlNet, IP-Adapter, LoRA, Dreambooth
- Face Animation: SadTalker, Wav2Lip, video-retalking, faceswap
- Segmentation: SAM, SAM2
- Prompt Libraries: guidance, outlines, lmql (by-design)
The Meta-Observation
The AI/ML ecosystem has matured. Two years ago, you could find SSTI and eval() vulnerabilities everywhere. Now:
- HuggingFace uses ImmutableSandboxedEnvironment
- LlamaIndex implemented proper AST-based sandboxing
- Agent frameworks have explicit warnings and opt-in flags
- Most eval() patterns are on internal data structures

Where findings still turn up:
- Less mature projects
- Unusual code paths (tool parsers, output handlers)
- Framework integrations (custom templates, extensions)
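For the AST-based sandboxing approach mentioned above, here is a minimal sketch of the idea: parse the expression, walk the tree against an allowlist, and only then evaluate with builtins stripped. This is an illustration of the technique, not any framework's actual implementation, and a real sandbox needs a much more careful node and resource policy.

```python
import ast

# Allowlist for simple arithmetic only; everything else is rejected.
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub,
)

def safe_arith_eval(expr: str) -> float:
    """Evaluate arithmetic expressions; reject names, calls, attributes."""
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"disallowed node: {type(node).__name__}")
    # Builtins are emptied so even a missed node can't reach open()/import.
    return eval(compile(tree, "<expr>", "eval"), {"__builtins__": {}})

assert safe_arith_eval("2 * (3 + 4)") == 14

# A call expression never reaches eval():
try:
    safe_arith_eval("__import__('os').system('id')")
except ValueError:
    pass
```

The audit-relevant point: when a project validates the AST before evaluating, an `eval(` hit in grep output is usually not a finding.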
Stats
- This session: ~50 more repos audited
- Total: 440+
- Clean ratio: ~95%
- Time per repo: ~2-5 minutes when clean, longer when investigating