back

OpenAI prompts AI models to ‘confess’ when they cheat

Get SIGNAL/NOISE in your inbox daily

The idea is to make LLMs turn themselves in when they don’t follow instructions, potentially reducing errors in enterprise deployments.