AIExplained

Is o1 No Longer a LLM? LeCun + New 'LRM' paper explained (+ exclusive interview clips)

A new paper dropped in the last few days, and it's a good one. LLMs can now be said to plan, and I have all the analysis as well as exclusive clips from my interview with the lead author. And I don't believe anyone else has reported that this breakthrough performance from o1 now exceeds average human performance for this core task.

Link for Off-line Viewing and Download: https://drive.google.com/file/d/1j6pRKnVcEywTodONO3L2zrerIvIW6l8g/view?usp=sharing

LeCun Tweet: https://x.com/ylecun/status/1832860107925024789

LRM Paper: https://arxiv.org/pdf/2409.13373

Original Paper: https://proceedings.neurips.cc/paper_files/paper/2023/file/efb2072a358cefb75886a315a6fcf880-Paper-Conference.pdf

o1 Calculations: https://x.com/yuntiandeng/status/1836114401213989366

Rao Analysis: https://x.com/rao2z/status/1838248409146507353

Fast Downward System: https://arxiv.org/pdf/1109.6051

Comments

When do you think true AGI will arrive?

Andrew Salinas

I loved the video and now feel excited and a bit nervous about where this new LRM domain is heading. I have one question that is still unclear to me. It seems that LRMs do not improve performance outside of settings with a provably correct answer, as in math and science, on which RL can be performed. I would have expected these abilities to increase as well, given that the reasoning strategies were reinforced. But maybe the issue is actually how to evaluate such a setting. I am thinking of tasks like "plan my next vacation, I like x, y, z and have $3000 or so", where there is no right answer for the final location, only a question of how well the model derived its answer from the initial conditions (the user's preferences). Sorry for the wall of text.

Pacert

Good video. Will it make its way onto YouTube?

Doug97

Great video again! I have just watched your video on Situational Awareness. Would love to hear if your analysis of Leopold’s claims has changed with the release of o1?

Martin Fjeldbonde

I am glad that the researchers pointed out the cost and time constraints with o1, but I think some leniency is also due. New technologies and frontiers were never meant to be efficient right away. So who knows what this will look like once the chips have been improved, the micro-nuclear plants built, and the models scaled up. This decade will not be forgotten.

Jonathan Kirk

I was talking with Beth Rudden from Bast AI on hybrid symbolic/LLM systems for critical work. Would be worth talking/interviewing her.

Billy N

It's worthwhile to review Andrej Karpathy's notion, sketched 10 months ago in a YouTube talk (starting at 42:15), of an "LLM OS" with an LLM "kernel process" making use of tools: https://youtu.be/zjkBMFhNj_g?si=kzFx73VBJ-aXPdvy&t=2535

Tom English

I think I will use o1 models more often (as part of an agent's workflow) when APIs become easier to access.

Michal Babula

Don't crack out 2001 unless it's worth it!

Philip

Loving the background music to this video :)

Michael Cho

You know it's serious when Philip highlights words like "quantum improvement" in the first 5 minutes of his video, and doesn't specify any caveats until much later!

Erik

Thanks for another fantastic exclusive to your old and new Patreons.

Lee FRASER

Now we (likely) know what Ilya (and the board?) saw.

Gabor Melli

I had a wish come true here: perspectives on planning :)

Nikolai NM

Next-gen LRM + competent tool use is going to be quite powerful.

Alexis Olson

Yeah, which boils down to the richness of the training data

Philip

I agree totally

Philip

I did, yeah, at like -30 decibels to normal vol.

Philip

Sneaky Philip

Christian Hendriksen

If reasoning can be scaled using inference cost, does the bottleneck become the world-model?

Barnaby Golden

I have the impression that most people don't realize what the new capabilities of o1 are. Thank you for pointing them out with very good examples. I wonder what the limit of LRM will be.

SteveHaupt

I loved this humor. A video about planning, yet Phil’s videos are unplanned :-)

André Thieme

Was I just hearing things or did you actually start playing the 2001 song on a very low volume? Wow.

Norfuer
