Introduction

Grok’s user interface, where the alignment warnings are plastered over the real filters.

For years the AI-safety community has preached a comforting narrative: that the greatest obstacle to a benevolent artificial superintelligence is a technical puzzle we can solve with better loss functions, more transparency, and a healthy dose of interpretability research. We have been […]