Self-Improving Models Without Labels: What I Just Proved and Why It Matters
A 7B model taught itself to generate better security commands using only its own understanding signals. No human labels, no external reward. Here's how and why it matters.