A fantastic new Anthropic paper shows that LLMs seem hell-bent on obfuscating why they gave an answer, even when it would be easier to be honest.
Download: https://drive.google.com/file/d/12jgVrGZLtVC_8DGTczQIzCVWPIBrnA0y/view?usp=sharing
Paper: https://assets.anthropic.com/m/71876fabef0f0ed4/original/reasoning_models_paper.pdf
Post: https://www.anthropic.com/research/reasoning-models-dont-say-think
FT Exclusive: https://x.com/FT/status/1910545751119135199
Coconut Paper: https://arxiv.org/pdf/2412.06769
Noam post: https://x.com/polynoamial/status/1910379351759347860
‘Deep Misgivings’: https://tech.yahoo.com/ai/articles/openais-sam-altman-deep-misgivings-093754634.html
OG Paper: https://arxiv.org/pdf/2305.04388