AIExplained

'Emergent Behaviors' - Controversial Terms in AI, Explained - New Series

One of my favorite videos in the series! A new bonus series explaining some of the most controversial terms in artificial intelligence, this time covering the term 'emergent behaviors'. Deciding if you think models do - or do not - display emergent behaviors could shape your perspective on AI, so let me know what you think at the end of the video!

Original Emergent Abilities Paper

https://arxiv.org/pdf/2206.07682

Emergent Abilities are a Mirage?

https://arxiv.org/pdf/2304.15004

Jason Wei Rebuttal

https://www.jasonwei.net/blog/common-arguments-regarding-emergent-abilities

9s Chart Source

https://www.assemblyai.com/blog/emergent-abilities-of-large-language-models/

PaLM Tree Graphic

https://miro.medium.com/v2/resize:fit:1400/format:webp/0*6iHzcwteqW42rmqA.gif

Mustafa Suleyman / Yann LeCun Misunderstanding

https://x.com/ylecun/status/1687382572580675584

Wei Weighs In

https://x.com/_jasonwei/status/1687624276827062279

Grokking Paper

https://arxiv.org/pdf/2201.02177

MIT Origin Story

https://www.technologyreview.com/2024/03/04/1089403/large-language-models-amazing-but-nobody-knows-why/

Comments

The first time I heard of emergence was in the context of sociology. The debate was about the perspective that society is emergent, or that societal phenomena "emerge" as facts in the world. My professor was of the opinion that no, there are no emergent facts in the world, neither in the natural world nor in society. In the end, all apparent societal facts can be reduced to the actions of individuals, and nothing is lost when we build everything up from there. That stance was basically an axiom for his theories and thinking. That debate reminds me of what is going on in the video. Depending on how you look at it, a phenomenon is emergent or comes about in a more granular manner. In the end, models can do sums, or write business emails. I don't think describing the assembly changes the fact that a phenomenon emerged. I take the point about predictability, though I wonder to what degree we are talking about "predictability after the fact". I doubt we will always have the right set of criteria to see every ability coming. My guess would be that at least sometimes, we will discover the ability first, and can then figure out how to show it was predictable. Though I admit I'm a bit out of my depth here.

Jörg Weiß

Very interesting. I am not sure if you covered the AI Scientist paper (and its 185 pages) on YouTube? It seems there may be some connections to emergent behaviors, which were not always desired (e.g. the AI Scientist trying to override the time constraints it was under). https://arxiv.org/abs/2408.06292 Anyway, I found this paper to be one of the best illustrations of what the future could hold if AI is leveraged properly. What did you think?

Eddine Maiza

These types of insights are rare and are greatly appreciated. Keep up the good work.

Joshua Davis

Exactly! Emergent behaviour comes from complex systems theory. Only in a perfect world can you measure all the factors and predict the outcome, so that nothing ‘emerges’. But in reality we can never do that, due to the ‘complexity’ of the system. So it appears to me that the people in those papers are changing the well-established meaning of emergent. And honestly, in academia it's not that unusual for some academics to just want to prove a point and make some weird moves, like Gary Marcus and Yann LeCun, who have proved multiple times they’re not as genius as they think they are.

Armin

Great video. But we don’t train language models bit by bit; we suddenly make the models much bigger, meaning the emergent behaviour appears from one model to the next. It's like saying nothing happens randomly because, if we could follow the laws of physics step by step, we could predict what happens. But we can’t follow the cause and effect step by step. Also, you were saying that at some point the models go from memorising to grokking the capabilities. But wasn’t it that LLMs memorise the content or chain of reasoning? How does grokking happen then?

Armin

I wish Patreon had native offline viewing like Spotify and YouTube's paid versions. I often get stuck in situations where I can't access the internet, and I would love to watch videos like this in those situations.

Armin

Thanks as always Jon

Philip

Thank you!

Philip

Yes will keep doing that!

Philip

Great idea, will start doing that via Assembly if I remember!

Philip

Also really, really appreciate the offline viewing/listening links you provide. Hoping you can provide one for this. This is a great series.

Ric

Hey, firstly thanks for putting out the best AI breakdowns on the internet. Is there a way to put transcripts of the video in the description? Makes note-taking way easier!

Nick

Thank you as always kind sir

Cooper Patton

I feel smarter having read this comment. I like that feeling. Many thanks. I'll be stealing some of these analogies! Great comment.

Cooper Patton

So LLMs work kinda like "finding" programs that produce some output, per Anthropic's "Scaling Monosemanticity" paper. If you let a model continue to train without increasing its scale, are you not just "searching" the program space? Would the size of the model then indicate how many programs can be sampled at once and/or joined? Smaller models would then inherently need more compute to find the correct sample of programs to join to satisfy some training metric, whereas larger models are able to hold more programs without discarding any?

Joshua Sellers

Philip, this was, as always, well done and illuminating. I have a few comments to share, not as any kind of rebuttal, but in hopes of sparking further conversation and understanding. Doubtless known but unmentioned by you is that the term emergence has well established uses in other fields of science. In chemistry, it describes unforeseeable properties that emerge when various atoms link up or decompose. For example, when two hydrogen atoms bond with an oxygen atom, we get water, which, at room temperature, behaves very differently from its constituents. (Wetness and its ability to extinguish fire are unforeseeable.) Conversely, if you break table salt into its components, one of them, sodium, will burn ferociously if you drop it in water. These are quantum properties, in the sense that they are either present or absent, with no intermediaries. Emergence also has a meaningful place in evolutionary biology, but it only appears “quantum” because the intermediary steps are invisible to us. Human bipedalism, for example, is an enormously consequential emergent phenomenon. It enabled our ancestors to succeed as hunters and freed our forelimbs to evolve into technology-producing hands. But the fossil record and chimps today make it clear that bipedalism didn’t happen in a day, nor in a generation. As you note, the granularity of the metric may make a critical difference in the retrospective path of an allegedly emergent ability in an LLM. However, and this is key, prospectively such abilities may still be unpredictable. That’s because intermediate steps may occur without our knowing where they are leading, and so may not be measured at all. If alien zoologists had been observing primates some four million years ago, they might have observed that some were trying out bipedalism but it’s unlikely they could have deduced the consequences, any more than a chemist who has only ever seen hydrogen and oxygen separate from one another could guess that water would be wet. 
In short, (well, not so short, I must admit), I think that emergence remains a huge issue in AI development.

Clay Farris Naff

Well explained. Looking forward to the “remaining 1%” on the grokking video and hearing the solution or theories on how to get to 100% accuracy.

moein merati

Thanks for taking another crack at emergence. Personally, I'm still whiplashed that all these capabilities emerged out of LMs at all. I now recall, for example, Yoav Goldberg's excitement back in 2015, and not predicting all that emerged. https://gabormelli.com/RKB/2015_TheUnreasonableEffectivenessofC To prove mirages, please predict future capabilities.

Gabor Melli

Well explained and very interesting and relevant

Daniel Henderson

Awesome video!! I grokked it and liked it a lot

Robert Gomez-Reino

Very informational. Thanks.

Jon Kurishita

