The Hidden State Attack: Why Your LLM's System Prompt Isn't Secret
Responsible disclosure of a class of vulnerabilities that allow system prompt extraction from transformer hidden states
Homelab adventures, infrastructure deep-dives, and lessons learned building enterprise-grade systems on a budget
How I achieved 100% token recovery from Mistral-7B hidden states and what it means for AI security
A 7B model taught itself to generate better security commands using only its own understanding signals. No human labels, no external reward. Here's how it works, and why it matters.
A novel discovery: hidden state inversion quality predicts model capability, enabling self-improving systems without external feedback
A deep dive into Mojo, the MLIR-based language promising 35,000x speedups over Python. What it does well, where it falls short, and who should actually care.
How periodic sparse attention achieves O(n) complexity while maintaining model quality