AIExplained

Pod 8: Do we have a straight shot to AGI? 'Don't teach, incentivize' - Let's Think Sip by Sip

Added 2024-09-30 14:58:11 +0000 UTC

Are 'Benchmarks All You Need'? And do we have any conceptual breakthroughs to go before text-based AGI? I bring in the latest OpenAI quotes and reflect deeply on what it all means.

Link for Download: https://drive.google.com/file/d/1QV9h0_-uSADSaH_DGAL8tpHLCbFPdeXE/view?usp=sharing

‘Don’t teach. Incentivize’ Hyung Won Chung: https://www.youtube.com/watch?v=kYWUEV_e2ss

RLHF: https://arxiv.org/pdf/2203.02155

Mensa 98th Percentile: https://x.com/JgaltTweets/status/1836093456831402481

Mark Chen, SVP of Research at OpenAI: https://x.com/markchen90/status/1836068847167914162?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Etweet

Comments

I was curious about this as well - was there an answer in the video? I’m not understanding how they could objectively do this

Oliver

2024-10-20 22:36:01 +0000 UTC

So they have turned the great weakness of LLMs -- their stochastic parroticity -- into a strength. Yes, this is the bitter lesson made manifest. And with it, just as in Go, suddenly an intelligence explosion seems not only possible, but perhaps imminent. I find it a bit ironic that Vernor passed just months ago. You do great work, btw. I've been following since April of 2023, I think. You are about the only source I pay attention to on this topic, and given that this is, as you say, surely the story of the century, that is saying something.

Jason Dowd

2024-10-04 21:19:17 +0000 UTC

It's all limited by time, hardware and power. And the underlying data. If this method had no limit, OpenAI would not be publishing it so openly, nor hinting at how it works.

Philip

2024-10-04 12:33:23 +0000 UTC

Yes would go through again, but at much greater scale (and speed), as the technique is already established (plus hardware improvements etc)

Philip

2024-10-04 12:32:21 +0000 UTC

Yeah that would be the final stage, just pure token gibberish leading to insane results

Philip

2024-10-04 12:31:33 +0000 UTC

You do!

Philip

2024-10-04 12:31:02 +0000 UTC

There is always more work Sebastian, the demand couldn't be higher for what you are doing.

Philip

2024-10-04 12:30:53 +0000 UTC

Great points, and thank you James

Philip

2024-10-04 12:30:25 +0000 UTC

Fantastic podcast. FWIW, the problem domains go way beyond taste. I think any domain like ethics or politics (which involve values as well as facts) will continue to be challenging.

James Maclaurin

2024-10-03 22:41:13 +0000 UTC

You hit the bullseye! Very well summarized and perfectly captured the essence of the paradigm shift.

SteveHaupt

2024-10-02 17:15:49 +0000 UTC

As someone who has just started a PhD about creating "semi functional" medical benchmarks for LLMs (it's taken over a year to get funding and access to patient data unfortunately), with a specific focus on extracting simple ground truths out of patient cases that could be automatically verified, your thoughts on the importance of benchmarks due to this paradigm reflect mine exactly. There's a bit of an ethical bind that this new paradigm puts me in. My benchmarks reflect a great deal of manual work, which I'd rather not do again. So only evaluating local models and trying to ensure that the benchmark does not make it into training data seems like a good idea. However, it now seems that this kind of benchmark might be valuable in the future not just for evaluation, but for training. But by using it for training, you contaminate its evaluation potential and you'd have to make a new benchmark for evaluation. Maybe in the same way we have training and evaluation sets for data, we will have to make training and evaluation benchmarks? As if there wasn't enough work to be done already...

Sebastian

2024-10-01 17:42:05 +0000 UTC

It’ll also be interesting to see how it goes for Harmonic (and DeepMind), which are using things like Lean for verifiable proofs.

Shawn Fumo

2024-10-01 11:58:15 +0000 UTC

As Steve said, will need to be trained, but they probably can use the CoT traces to help in doing that training faster. I’m sure there needs to be a balance since GPT5 might respond better to slightly diff CoT than 4, but it’s probably still a good base to start with.

Shawn Fumo

2024-10-01 11:55:46 +0000 UTC

You have echoed my thoughts almost exactly, and if what you said is even remotely possible I need to get going on a benchmark ASAP! Very excited to start seeing models think pixel by pixel. I think that vision is going to be really hard to solve for niche domains, but I have faith. Looking forward to the new Simple-Bench leaderboard!

Trenton Dambrowitz

2024-10-01 08:29:03 +0000 UTC

This shift in AI training, focusing on getting things right instead of just pleasing us, fundamentally changes how it solves problems.

Michal Babula

2024-10-01 07:33:19 +0000 UTC

I took Karpathy's "cease to speak English in their chain of thought" to mean it'll be thinking in a higher-dimensional non-human language, not switching between Chinese and French. Like those older Facebook negotiator agents that invented their own language. Another way of thinking about it: an advanced AI would be as limited thinking in English as we would be thinking in chimpanzee grunts or dolphin squeaks and clicks.

Brian Crabtree

2024-10-01 07:23:07 +0000 UTC

I believe the next model of OpenAI has to go through the process again, because the reasoning is not a separate module but gets baked into the model directly. The process to get there has a new separate element.

SteveHaupt

2024-10-01 07:07:16 +0000 UTC

benchmarks are all you need + nuclear reactors and vision. Great insights.

Joshua Davis

2024-09-30 21:26:48 +0000 UTC

lol, I asked while listening, then got to the 3rd breakthrough:))

Robert Gomez-Reino

2024-09-30 20:37:56 +0000 UTC

"relying on objectively correct answers..." I never understand this part. How is o1 rewarded for correct reasoning steps? How is it assessed that those steps are correct, and what does the assessing?

Robert Gomez-Reino

2024-09-30 20:35:28 +0000 UTC

Thank you, Philip. If we are to assume that o1 was made as a second layer on top of GPT4 (with RL, a verifier, etc.), can we expect that OpenAI can deploy this layer instantly on top of GPT5? Or does GPT5 have to go through its own RL process to become an o2 model?

moein merati

2024-09-30 19:27:16 +0000 UTC

Hmm, that's really interesting. I had the idea in my mind that a verifier would be a different model, since the problem of checking a correct solution has a much smaller surface area than that of generating a correct solution, so you could train a smaller classifier on just that. I guess using the same model itself to self improve through reinforcement learning gives you better performance in the end. Really cool!

Alan Ispani

2024-09-30 18:21:39 +0000 UTC

I am still skeptical. I think that if it were truly learning concepts it would be able to recursively apply the same concept over and over again. It can't multiply two ten-digit numbers. If it had actually learned how to multiply generally, I would think that would carry forward to numbers of any size. I am biased though: personally, I want us to remain the dominant species on the planet.

theheatdeathiscoming

2024-09-30 16:27:10 +0000 UTC

If what you're saying is true, then it's very easy to conceive that an AI with a shocking level of intelligence on benchmarks will be a good enough tool to help with the creation of the next AI paradigm, whatever it might be. Recursive self-improvement may be only a few months away then?

Manuel Bevand

2024-09-30 16:15:59 +0000 UTC
