AI Explained posts from Patreon

A Robot in Your House, 2026-27. Excl. Interview w/ Sunday Co-founder Tony Zhao on their Breakthrough Method

https://x.com/tonyzzhao?lang=en

https://www.sunday.ai/

https://x.com/sundayrobotics

https://www.sunday.ai/journal

Wired: 2025-12-21 19:23:03 +0000 UTC

5 Takeaways from Sutskever Breaking Silence + Opus 4.5

Interview Transcript: https://www.dwarkesh.com/p/ilya-sutskever-2

The Information Exclusive: https://www.theinformation.com/articles/openai-ceo-braces-possible-economic-headwinds-catching-resurgent-google?rc=sy0ihq

Opus 4.5: 2025-11-27 18:45:05 +0000 UTC

Last 24 Hours: Signs of Introspection in LLMs

Before this paper from Anthropic, out on the 29th, I was a lot more skeptical about LLMs self-reporting their internal state. This is not proof that they can, but partial proof of circuits showing true introspective capability, and more than that, the ability to map a question about it to those circuits.

Plus a big update to lmcouncil.ai.

2025-10-30 18:40:22 +0000 UTC

Exploring the Frontiers of Intelligence … by co-creating LMcouncil.ai

If Simple Bench tests common sense, what is the best test of practical wisdom? For me, it was finding out whether 2025 language models could build the app I have been meaning to make for two years. They can’t, not yet, but they can get very close with a determined human in the loop.

lmcouncil.ai is a combination of SmartGPT, Karpathy’s vision for an ‘LLM Council’ and my own touches. Compare model outputs, call votes, se...

2025-10-15 15:47:00 +0000 UTC

The 'Blockers to the Singularity' - and the plans to remove them - OpenAI paper

A double-length vid on all the juiciest bits from OpenAI's most recent paper, as well as my framework on the blockers to the singularity. A little bit of Math on why GenAI hallucinates, and why scale is not all you need, plus the crazy Google AlphaSoftware paper.

Paper: https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-...

2025-09-19 14:40:42 +0000 UTC

Claude Can Now End the Chat - but When and Why?

Anthropic say that their biggest model, Opus, can now end a chat. But when will it, and why? And is AI model welfare the most overblown or under-discussed topic in AI? The highlights from a 62-page report.

Anthropic Post: https://www.anthropic.com/research/end-subset-conversations?s=09

Announcement: 2025-09-01 13:34:33 +0000 UTC

METR Doubling-Times Star Joel Becker, on Developer Slowdown With AI & Amodei Automation Prediction

Highlights from an interview with one of the most productive AI researchers around, Joel Becker, who contributed to the famed METR doubling times analysis and the recent eye-opening study on developers being slowed down when using AI. Billions of life-decisions and trillions of investment ride on the exact contours of AI progress, so it was definitely worth a chat. With paper context, my analysis, and more.

Clips:

06:03 - Models Used

06:30 - Incremental Progress Hides Expon...

2025-08-20 12:20:52 +0000 UTC

Deep-ish Thoughts on New Gemini DeepThink (plus GPT-5 delay, HLE errors)

Reflections on this moment in AI progress, and why I am positive about the short-to-medium term outlook from here (plus data and examples).

DeepThink Release: https://blog.google/products/gemini/gemini-2-5-deep-think/

HLE Errors: https://www.futurehouse.org/research-announcements/hle...

2025-08-04 11:57:45 +0000 UTC

Our New Age of Artificial, Intelligent Surveillance

No one is talking about how LLMs enable a new era of mass intelligent surveillance. This is not just Grok 4 @ Doge, this is world-wide, as we speak. The second documentary, exclusive, early and ad-free on Patreon.

Gemini (10m tokens): https://arxiv.org/abs/2403.05530

NYT: Anthropic CEO on regulation and transparency: 2025-07-12 12:06:10 +0000 UTC

No. 1 Superforecaster & AI 2027 Author Eli Lifland - On Our Differing Timelines to Superintelligence (New Podcast Series Potentially!)

A debate on whether superintelligence will arise by 2027, with No. 1-ranked RAND Superforecaster & AI 2027 co-author Eli Lifland. Set to be a new series, where we discuss on the podcast how progress is going with respect to our divergent timelines.

AI 2027: https://ai-2027.com/

Key Timeline and Eli Update: https://ai-2027.com/res...

2025-06-19 10:07:39 +0000 UTC

Claude 4 Simple SOTA Insights + Leaked System Prompt

What is Claude 4 Opus understanding that others aren't, to hit new records? And what is going on in that leaked system prompt? 10 highlights of the system prompt plus a quick update on Simple. Oh, and it looks like GPT-5 is finally arriving in July...

Full Leaked System prompt: https://raw.githubusercontent.com/elder-plinius/CL4R1...

2025-06-02 13:50:55 +0000 UTC

A New Twist in the ChatGPT Sycophancy Saga

Remember when, years ago (actually just 3 weeks), ChatGPT started to go crazy? Being a yes-man with an ‘improved personality’ according to Altman? Well, it turns out there are 3 new theories as to why that happened, and a study that at the beginning of the video I thought turned things on its head. By the end, as you can see, I am not so sure. So is ChatGPT fixed?

Is GPT-4o Fixed? 2025-05-14 16:18:58 +0000 UTC

Next-level reasoning: The Good News and Bad News - 2 new papers analysed

Paper 1: Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? https://arxiv.org/pdf/2504.13837

Paper 2: Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification https://arxiv.org/pdf/2502.01839

Tweet: 2025-04-25 15:13:52 +0000 UTC

Paper: AI Doesn't Say What It Thinks. AI Orgs: It Could Be Your Friend

A fantastic new Anthropic paper shows that LLMs seem hell-bent on obfuscating why they gave an answer, even when it would be easier to be honest.

Download: https://drive.google.com/file/d/12jgVrGZLtVC_8DGTczQIzCVWPIBrnA0y/view?usp=sharing

Paper: 2025-04-14 16:19:25 +0000 UTC

“OpenAI is Not God”: DeepSeek, Liang Wenfeng and the R1 Phenomenon

The documentary revealing just how much the world got wrong about DeepSeek, what motivates the man behind the company, and what's next.

Incredible Editing: Hassan Iq.

Sources:

Liang Wenfeng 2023 Interview: https://web.archive.org/web/20241228030725/https://www.chinatalk.media/p/deepseek-from-hedge-fund-to-frontie...

2025-04-03 22:27:01 +0000 UTC

'Claude 3.7 Knows it is Being Tested' - New Research, Theory of Mind and Consciousness

New research: 'Claude 3.7 knows it is being tested'. But can it model your mind? And is it having a subjective experience? Plus a guest appearance from the me of ... March 2023.

Download Link: https://drive.google.com/file/d/1g2L_1K6AZnKOykt-eyQzJ0yCUVmleGAb/view?usp=sharing

Apollo Research post: 2025-03-24 16:48:31 +0000 UTC

4 AI Trends Emerging in 2025. Patchy, Epic, Expensive, and Deceptive Models

We are less than 6 months into a paradigm that many say 'will get us there'. Let's uncover 4 underrated trends in AI, to get a sneak peek of the future. Things are about to get pricey, stay patchy, be epic and remain deceptive.

Link to Download: https://drive.google.com/file/d/1gPEuYiF2hrFvBPa762OFZiV0miDUR0pd/view?usp=sharing

Pricey: 2025-03-07 17:08:13 +0000 UTC

Content Creation Insights, Update on Timelines, Race Dynamics: Think Sip-by-Sip Podcast

Some chill behind-the-scenes thoughts on being a content creator, an interesting exchange I had this week, and my updated take-off timelines.

https://www.sesame.com/research/crossing_the_uncanny_valley_of_voice#demo

https://www.hume.ai/

2025-03-04 19:10:42 +0000 UTC

Mini-documentary Poll

I want everyone to get exceptional value from this Patreon; I have also had a load of comments about sharing specifically the mini-documentaries on the main channel. So I thought I would turn to you guys for expert advice.

So for the occasional mini-docs (normal AI Explained-style vids/pods etc will still be on here only), what would you feel is best?

(writing next one as we speak, on DeepSeek)

2025-02-17 11:53:27 +0000 UTC

The One Machine to Rule Them All - Origin Stories. Mini-Documentary on How the Founding Vision of Each AGI Lab Went Awry

A different style of video! Ft. professional video editor, and focused on the 'impossibility' of AGI labs (OpenAI, DeepMind, Anthropic etc) keeping to their founding visions...

Feedback appreciated!

2025-02-10 19:30:14 +0000 UTC

Pod 12: Apollo Research Group Interview - Models Try Hard Not to Undergo 'Unlearning', the media, and much more ... - Let's Think Sip-by-Sip

My Dec Apollo video: https://www.patreon.com/posts/media-over-o1s-117630338

Updated Apollo Paper: https://static1.squarespace.com/static/6593e7097565990e65c886fd/t/67869dea6418796241490cf0/1736875562390/in_context_schemin...

2025-01-22 18:35:36 +0000 UTC

'Takeoff Speeds' - my unreleased, now-topical explainer

This week, mid-Jan 2025, Sam Altman announced that he now thought we were in the fast-takeoff timeline. But what did he think before, and what are takeoff speeds anyway? My deep dive (and last unreleased video of the 8-part series).

Download: https://drive.google.com/file/d/1QkgBLNKcQ6TjkA6ljTkiNpqPqpTR2tjW/view?usp=sharing

Superintelligence OG: 2025-01-20 11:54:36 +0000 UTC

Veo 2 vs Sora ... then Veo o3?

Let's kickstart 2025 by comparing Veo 2 to Sora, then asking a bigger question. Can the o3 method be repurposed for different modalities? I piece together the evidence, and argue that we could see step-changes in more than just text-based reasoning.

Link for off-line watching: https://drive.google.com/file/d/1o354AIk6QeIL8LWF5wJX7z-Wd7hocYex/view?usp=sharing

2025-01-03 18:30:12 +0000 UTC

Media Misreporting Over o1's 'Escape' - 70 page report highlights - why o1 did what it did

Everyone saw the headlines - o1 tried to escape. But did it? And why? Don't believe the media hysteria.

Download Link: https://drive.google.com/file/d/1QxnBRfPdG54s5F0a92pC49w5tD3GweMx/view?usp=sharing

Apollo Report: 2024-12-09 16:48:19 +0000 UTC

DeepMind Prof. Tim Rocktäschel on Takeoff Speeds, GDP 2x-ing, Gemini 2, ASI timelines and Automating Science

I quiz Google DeepMind Principal Scientist Prof. Tim Rocktäschel on AGI to ASI timelines, promptbreeding, GDP 2x-ing, Gemini 2 and automating science.

https://geni.us/ArtificialIntelligence

https://en.wikipedia.org/wiki/Summa_Technologiae

Laura Ruis - PROCEDURAL KNOWLEDGE IN PRETRAINING DRIVES REASONING IN L...

2024-12-01 15:18:53 +0000 UTC

AI vs Human Creativity. Can you tell them apart? Plus, a key battleground for 2025 onwards

New reports in the Washington Post and Telegraph got me down a rabbit hole of poetic and artistic exploration. The results were ... interesting. But the conclusions, for business vs labor, could be profound.

Download Link: https://drive.google.com/file/d/1BeSImcc0tgIUS7GTWckVMr0rANTPIJmc/view?usp=sharing

Nature Study: 2024-11-19 19:08:45 +0000 UTC

Pod 10: 4 Reasons Why Data is Now Even More Important: Scaling plateaus, judge rulings, test-time training paper and post-AGI jobs - Let's Think Sip-by-Sip

Today's Bloomberg Piece: https://www.bloomberg.com/news/articles/2024-11-13/openai-google-and-anthropic-are-struggling-to-build-more-advanced-ai?s=09

Reuters Quotes Sutskever: 2024-11-13 16:17:30 +0000 UTC

A Step Toward Nationalizing AI? White House Memo Full Analysis and Context

A White House Memo released in the last 72 hours is hard to ignore. It is the clearest statement yet from the US Govt that it thinks ‘general-purpose AI’ is coming soon, and that it could upend international order. They want to get much more involved, for better or worse. Here are the 28 most impactful excerpts from the Memo, together with tons of context from new and old material from OpenAI, Anthropic, the Pentagon, the NYT and more.

Link for Off-line Watching/Downloa...

2024-10-27 16:29:36 +0000 UTC

Pod 9: Full Simple-Bench Results, o1-preview to Grok-2 - Let's Think Sip by Sip

At last, full Simple Bench results for 13 models, including the new Grok 2, o1-preview, and latest Gemini. Plus, what's coming next.

Link for Download: https://drive.google.com/file/d/1iH1B4qBgA82WdBczlHRlwC_4ZbDCgC0X/view?usp=sharing

https://arxiv.org/html/2409.01374

Simple-Ben...

2024-10-16 14:57:00 +0000 UTC

'Machines of Loving Grace' - Key Highlights. 'All the 21st Century ... by 2036.'

The CEO of one of the leaders in the race to human-level AI (Anthropic) envisions ‘Machines of Loving Grace’, watching over us. Dario Amodei weaves in hundreds of predictions in biology, neuroscience, economies and global politics, all premised on the ‘optimistic scenario’. Released less than 48 hours ago, I give you the full highlights, and my own analysis. Not for the faint-hearted.

Download Link: 2024-10-13 19:05:59 +0000 UTC