Self-Improving Models Without Labels: What I Just Proved and Why It Matters
A 7B model taught itself to generate better security commands using only its own understanding signals. No human labels, no external reward. Here's how and why it matters.