AIExplained

Pod 6: No One Agrees @ OpenAI if GPT-4o is 'a smart highschooler' + My Take on Murati, Altman and Sutskever - Let's Think Sip by Sip

Added 2024-06-23 17:35:01 +0000 UTC

There is a clear dividing line emerging at the top of OpenAI, and in AGI labs more broadly. This pod reflects on the 'reasoning' and 'scale' axes, including fascinating new comments from OpenAI researcher Noam Brown about his CTO, Murati, describing GPT-4 as 'a smart highschooler'. Plus my take on Sutskever and Superintelligence and the 'Altman shift.'

OpenAI CTO Mira Murati Comments: https://x.com/tsarnick/status/1803901130130497952

Noam Brown Rebuttal: https://x.com/polynoamial?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor

'Frontier Intelligence'?: https://www.anthropic.com/news/claude-3-5-sonnet

Google DeepMind Reasoning Lead Joins In: https://x.com/denny_zhou/status/1804739992708841504

Altman Compute Fundraise: https://www.wsj.com/tech/ai/sam-altman-seeks-trillions-of-dollars-to-reshape-business-of-chips-and-ai-89ab3db0

'Limit the rate of growth of compute' - Altman: https://openai.com/index/planning-for-agi-and-beyond/

Altman 'Naive Early Views' Interview: https://www.youtube.com/watch?v=fMtbrKhXMWc&t=1229s

Altman View on 'Greatest Threat': https://blog.samaltman.com/machine-intelligence-part-1#:~:text=Progression%20of%20machine%20intelligence%20is,all%20of%20a%20sudden%20go

Comments

Called it: https://www.nytimes.com/2024/07/16/technology/universal-income-openai-silicon-valley.html

Brian Crabtree

2024-07-16 15:41:43 +0000 UTC

On it!

Philip

2024-07-15 16:17:38 +0000 UTC

I think this was super interesting! I'm perfectly fine with your videos being more deeply researched, and this just being your current thoughts and ponderings! 🙂

Adrian Schmidt

2024-07-11 22:10:13 +0000 UTC

I do feel like timelines have changed. I'm not expecting "GPT-5" to be released this year (it was said that GPT-5 was "not ready" only very recently). My gut says it's still in the pre-training or possibly fine-tuning stage and has not started safety training. Maybe Q2/Q3 2025. I am curious: where did GPT-4o come from? Is it a side project trained alongside "GPT-5", is it an early checkpoint of "GPT-5", or are OpenAI producing 3 levels of the model like Anthropic and this is the Haiku-level version, so that maybe "GPT-4.5" will be the middle (if released at all) and "GPT-5" will be an Opus-level model? I was also optimistic that AGI would be "GPT-6" and/or 2026. But "PhD-level" doesn't sound like AGI to me. Now I feel like it will be "GPT-7" and/or 2028 at the earliest. I'm definitely getting the same vibe as you that we're still a little way off. What did Bill Gates say recently, something like there are 1 or 2 more "cranks" to go? Will those 2 cranks get us to AGI? I still believe we'll get there before 2030 and I'm sure Sam Altman was quoted (again) recently saying that he thinks OpenAI's 15-year mission is still on target. But it's probably going to be a lot closer to 2030 than they thought. Though if a guess made in 2015 that AGI will be here by 2030 gets anywhere near close, then that's pretty impressive (or lucky ;-)). It seems to me that reality hit them over the head and the task of achieving AGI (by their vague definition) is more complicated than first thought. It also seems it's a little further away and thus ASI superintelligence testing is not quite required yet. Also, I always go back to the line on the site that all deals including Microsoft's are off once AGI is achieved. That reads to me that they are inclined to drag this out as long as possible. They'll need to find a happy medium though, as Anthropic are pushing more and more. Sonnet 3.5, for me, is a better model than GPT-4o and now Claude is adding really useful features, like artifacts. 
Elon and Grok could be the ones pushing all of them, though. Grok 2 is to arrive in August (so he says) and Grok 3 by the end of the year. He also says his supercluster data centre is only months away and he's still quoting AGI will be here next year. I love Elon's enthusiasm. :-) Ilya's new SSI company is fascinating. How is he going to get funding? Surely investors are not going to get any return. And like you say, is he just going to plonk ASI on us in a few years without warning? Isn't Ilya Mr Safety? I'm not sure what to make of it at the moment. BTW, I like this type of podcast. Well, I like any type of podcast from you, Philip. Very interesting discussion, thank you. Your content is one of the few where I stop and listen/watch without multi-tasking on something else. :-) And no, sorry, I still haven't got to the bottom of this Patreon list (your first podcast). Stuff keeps popping up at the top. :-p

Kol Tregaskes

2024-07-07 21:16:35 +0000 UTC

Hey Phillip, what's your take on this recent paper claiming it can eliminate MatMul operations without significant performance loss? If indeed this scaled to frontier LLMs, it could mean a paradigm shift, allowing them to run locally anywhere, and many orders of magnitude larger models would be practically runnable. We need an analysis on the subject. https://www.reddit.com/r/LocalLLaMA/comments/1ddv967/a_revolutionary_approach_to_language_models_by/

Andronikos Koutroumpelis

2024-06-30 19:00:12 +0000 UTC

Thanks Phil, I enjoyed this format. Perfect for a rainy Sunday afternoon. On topic: It will be interesting to watch what happens when the majority in the industry realize that we have indeed hit a wall. Sure, LLMs have produced many useful tools, costing humans some jobs, but is this transforming the world as we know it yet? Nope. And frankly, I fully expect this whole AI frenzy to follow the Gartner hype cycle, and we're right at the peak, or even past it for a few weeks already. Time to buy some 2025 NVDA puts...

Markus Heinsohn

2024-06-30 15:21:32 +0000 UTC

I don't think Sam has changed his point of view on things, I just think it’s becoming more real and people are starting to worry about losing jobs and such. By talking down AGI, he softens the blow a bit.

Dan Brian

2024-06-29 17:19:29 +0000 UTC

The study of heuristics and biases in psychology is driven by errors that humans routinely make because much of our “reasoning” uses heuristics that get us the right answer most of the time. That’s what you should expect of reasoning that is constrained and evolved. Psychologists don’t take that to mean that people are incapable of reasoning—we are just satisfactory rather than optimal at reasoning. Why don’t we draw the same conclusion from this “evidence” that GPTs don’t reason? Does it really show that they can’t reason as opposed to that they can be fooled (just like humans)?

James Maclaurin

2024-06-28 07:49:01 +0000 UTC

Thank you for your insights!

Patrick Mosby

2024-06-27 05:52:43 +0000 UTC

Thanks mate. I was just joking but appreciate you sharing your perspective. Big fan, many thanks. Let's see how NVIDIA goes, sure will be fascinating.

Cooper Patton

2024-06-26 14:14:25 +0000 UTC

We don't think they're holding back till the election? Also my theory on why OpenAI is so slow is that they're having to use so much compute for inference vs other models simply by virtue of having a larger user base. And perhaps the Apple deal is taking a bunch of resources too. I imagine they're doing a lot of things at once. I'm still optimistic for GPT-5 but you make a great point on memory vs rationalisation, especially given the study on dynamic MMLU type tests you recently talked about. Maybe Yann LeCun has been right all along.

Cooper Patton

2024-06-26 14:12:38 +0000 UTC

I mean if I were guessing stocks I would say Nvidia is overvalued. But AI shares is a broad term, plenty of winners in there amongst the hype. Always return to basic advice though, stay internationally diversified, in low cost index funds, possibly managed futures like DBMF for some hedging.

Philip

2024-06-26 14:11:46 +0000 UTC

You answer him but not me! I'm pretty sure that giving financial advice is amongst the wisest things you can do for your Patreon. Trust me bro. Haha, thanks for clarifying & thanks Jorg, a great question. Keep the podcasts coming!

Cooper Patton

2024-06-26 14:08:14 +0000 UTC

In a new interview he says it looks like we'll probably get "Models better than most humans at most things" by 2025, 2026, or maybe 2027. https://youtu.be/hChm1qIIJLE?si=9kVmq8UfynMWeAqh

Brian Crabtree

2024-06-26 13:57:04 +0000 UTC

Thanks Steve, more coming!

Philip

2024-06-26 12:20:45 +0000 UTC

I would stick by the comments that scale will still yield incredible fruit in the next two years, at the very least in video, avatars and factual recall. But if we don't have a breakthrough in reasoning soon then yes, 5 years seems a reasonable timeframe for research to produce several more candidate breakthroughs.

Philip

2024-06-26 12:20:23 +0000 UTC

Thanks Sean

Philip

2024-06-26 12:17:30 +0000 UTC

:)

Philip

2024-06-26 12:17:16 +0000 UTC

Yeah I mean in the grand scale of history, AGI is pretty much certainly soon, so I am with you.

Philip

2024-06-26 12:17:10 +0000 UTC

Thanks Mike, appreciate it

Philip

2024-06-26 12:16:32 +0000 UTC

We will find out soon!

Philip

2024-06-26 12:16:22 +0000 UTC

Thanks Jon, yes the next generation is key; I feel that 3.5 Sonnet/4o hint at early signs that no new paradigm ('program synthesis or search'?) has yet been found. I am hearing nothing on the grapevine yet of a step-change, other than one slide of a leaked talk.

Philip

2024-06-26 12:16:04 +0000 UTC

Yes, he still believes, for sure. But 3.5 Sonnet feels quite incremental, no?

Philip

2024-06-26 12:14:10 +0000 UTC

Definitely not run out of options! You are right to spot those 6 ways of improving I mentioned in my last video, but my major contention is that naive scaling is not a panacea.

Philip

2024-06-26 12:13:39 +0000 UTC

Exactly!

Philip

2024-06-26 12:12:18 +0000 UTC

He would for sure, I think he singularly converted many, into 'feeling the AGI', perhaps prematurely.

Philip

2024-06-26 12:11:54 +0000 UTC

Almost like active inference, in a way, which definitely could keep working at bigger scales.

Philip

2024-06-26 12:11:13 +0000 UTC

Thanks Ben, great to hear.

Philip

2024-06-26 12:10:32 +0000 UTC

That would be an additional issue - contamination. Where benchmarks test for specific reasoning chains that have been crystallised within models, down to specific steps in equation solving etc.

Philip

2024-06-26 12:10:05 +0000 UTC

I’m not sure if Dario has changed his opinions that much. In the recent Time magazine interview he specifically says he feels like capabilities will keep increasing for at least a few generations. He even mentions he’d be a bit relieved if things did hit a wall, since it’d force all the companies/nations to take a bit of a breather at the same time until new methods were figured out. I don’t think he’d say that if the wall was already hit.

Shawn Fumo

2024-06-26 03:11:19 +0000 UTC

yes, I am also a bit confused. Couple of months ago Phil did a big video on reasoning and how there are so many different ways reasoning can be improved in these models and that we are nowhere near the end of the spectrum and now suddenly it sounds like we have already run out of options?

Jörg Eitner

2024-06-25 09:10:30 +0000 UTC

So am I selling my AI shares? 😂

Cooper Patton

2024-06-25 06:11:08 +0000 UTC

Nice

Libor Burian

2024-06-24 22:28:11 +0000 UTC

I think your point about "ideological belief" might be right on the money. The way people talk about "feel the AGI" fits perfectly into that idea. Also interesting: Shortly after the initial release, Sutskever gave an interview on some podcast (can't remember which). He was asked a hypothetical along the lines of "If we don't see GPT revolutionizing the world within 10 years, what would have gone wrong?"* Sutskever's answer was: Reliability. He identified that as the problem that might actually stop progress; or maybe that would mark the stop of progress. He also expressed his optimism that they would be able to figure it out. I wonder what he'd say today. * Heavily paraphrased.

Jörg Weiß

2024-06-24 22:21:35 +0000 UTC

That was just a hyperbolic example to make clear that LLMs are constrained by their present data and cannot deduce novel insights by reasoning. E.g., if ChatGPT-5 had the dataset from the 1880s, it could never invent radio telegraphy, which Guglielmo Marconi did in 1895. Maybe a future AI which is designed using other/additional methods than an LLM neural network will be able to do this. If the next major LLM update (whether that's called GPT-5 or something else) can take a baby step in that direction -- great. However, they cannot now do this, and it's grasping at straws to expect that scalability will in the very next version create reasoning by a magical emergent mechanism.

Joe Marler

2024-06-24 19:16:48 +0000 UTC

"You illustrated this well: If ChatGPT-5 had the dataset from 1880, it could not answer questions about quantum physics." If any current quatum physics professor lived in 1880 and only had their current knowledge would they be able to answer any questions about Quantum Physics?

Steve Caya

2024-06-24 18:21:00 +0000 UTC

I think Ilya Sutskever would say that the human brain is a neural network that can reason, so there is no reason a scaled-up artificial neural network cannot reason too. So it's just a matter of scale and some mods to the models.

Steve Caya

2024-06-24 18:00:57 +0000 UTC

Phil, to me this podcast was not a casual discussion. It was a breathtaking display of cogent, perceptive insight. Of all the analysts and commentators, only you are providing this level of quality. The items you discussed (once this is widely understood) will affect world events, financial markets, and maybe regulatory posture. I can't understand why nobody else seems to perceive the points you made. Re the Altman et al. tone shift from "true believer", and the apparently evolving viewpoints of Dario Amodei: Your analysis of this is penetrating, almost clairvoyant. Re Sutskever's hard-core belief in LLM scalability and ideological belief that it would produce AGI: I think your opinion is quite plausible. It would explain many things. There are things going on "beneath the covers" which only you have shown the insight to perceive. You are doing what could be called "Boardroom Kremlinology" or "Corporate Leadership Profiling." What you've done reminds me of management consultant Peter Drucker's statement: "The most important thing in communication is hearing what isn’t said." This is also in line with your analysis of the LLM testing landscape. In discussing the Aschenbrenner paper, you mentioned the divergence between current benchmarks vs real-world capability: "Anyone who uses these models daily for coding, mathematics or anything technical will know what I'm talking about." Yes! I know what you are talking about. I use all frontier AIs for hours each day in my work. They are very useful, but also very limited. The question is WHY are they limited? Is the answer "scale is all you need?" From my own experience, I don't think so. You illustrated this well: If ChatGPT-5 had the dataset from 1880, it could not answer questions about quantum physics. I already sense some have misunderstood the ARC benchmark challenge. There is a pell-mell rush to "beat ARC" and thus prove LLMs can reason. That by itself will prove nothing. 
If an LLM gets 95% on ARC, those of us who use such models in the real world can tell if it's majorly improved. If it is and if ARC contributed to that -- good. If it's not, they can't hide behind a benchmark because we will know anyway. You also perceived how Aschenbrenner's "stacked area" OOM chart was misleading. The X-axis was time and the Y-axis showed upward-sloping "OOM contributions". This implied increasing LLM capability, but in fact it was only displaying computational increases. Whether those increases translate to similar AI capability increases at those scales is really unknown. Like you I am optimistic about future improvements in LLM capability. But I don't think that trajectory will be as originally envisioned by the people advocating "scale is all you need." As you stated, we won't have to wait long to know the answer. If the next major versions do not manifest significant increases in reasoning capability on diverse real-world tasks, then the "cat is out of the bag."

Joe Marler

2024-06-24 14:20:13 +0000 UTC

Great episode Philip, love the format as podcast

Andreas Harto

2024-06-24 13:53:01 +0000 UTC

Aaand Mustafa just said GPT-6 is ~2 years away. They still might squeeze in a 4.5 with new post-training reasoning breakthroughs over the next few months. Depends whether he means ~2 years until training is finished, it's announced, or it's released -- probably training is finished.

Brian Crabtree

2024-06-24 12:44:06 +0000 UTC

Could it be that we are looking at AI from the wrong direction? If we compare these models to humans, there’s a big difference in how they operate. Humans constantly adjust their brain “weights” when they encounter new problems and learn on the fly. Wasn’t there a recent paper that discussed this exact method for reasoning in LLMs, suggesting fine-tuning LLMs for new situations? Sure, we currently need to train these models with tons of examples, but this could be reduced by incorporating multimodality. Humans aren’t that great at advanced reasoning without being well-versed in a domain. No one solves complex math problems without proper training. It’s often just applying something familiar to a new problem and then reaching a new conclusion. This is why I’m not fully on board with the notion that LLMs have a significant limitation here. In my daily work, we use these models to categorize incoming emails or analyze whether large documents adhere to certain criteria we have, very specific to our company. These models have definitely not been trained on these examples in the past and can still correctly perform these reasoning tasks. We define everything in context, and even GPT-3.5 does very well without fine-tuning. This might not be advanced mathematical reasoning, but it’s still reasoning and is perfectly adequate for almost any problem we are trying to solve. I could imagine fine-tuning could further improve the models' reasoning capabilities in very specific areas.

Jörg Eitner

2024-06-24 10:58:16 +0000 UTC

I really enjoyed this podcast. Your reflections and observations add so much to what you're presenting. The main channel content, filled with analysis, is very good. But this podcast is great because it builds on all that analysis. It helps us understand your main channel work better. We get to hear your reflections. It's particularly interesting to sense that there's a slowing down or dampening of expectations about AGI. Perhaps the singularity is not imminent.

Ben Dudley

2024-06-24 10:54:37 +0000 UTC

One thing that I find confusing is that if these models memorise reasoning, how come the performance dropped in the benchmark that they changed the numbers in the questions? It seems to me that this is a far more complex issue and doesn’t fit in one or two boxes.

Armin

2024-06-24 10:18:48 +0000 UTC

This is a really great podcast - thank you for sharing. It’s got me thinking about the progress we’ve made and the direction LLMs are heading in new ways!

Sean Betts

2024-06-24 09:51:16 +0000 UTC

Enjoyed it. If that format is easier and faster to produce then it would be nice to get more of it and have more of your thoughts and knowledge of the scene be handed out. The AI stuff moves pretty fast too, although I suspect that the low-hanging fruits are mostly picked and it starts to kinda slow down - we don't seem to be bombarded with mindboggling news every other day anymore. Other than that it also looks like OpenAI is kinda losing its leader role, and developers are moving around to find the place they really want to be (better working environment, better value alignment, maybe better pay). Also people are getting comfortable with the current state of play and are not short-term dooming as much, up to the point where folks tend to dismiss the whole AGI thing entirely, or at least for their own lifetime, going back to "business as usual" (stuff I follow and people I talk to). Not sure if that's a good way. Kinda like the "they promised flying cars" rant, and now they are disappointed.

Shaa Dea

2024-06-24 07:53:35 +0000 UTC

Really enjoyed that. I think the rate you’re producing content and the balance of research versus thoughts such as in this podcast is good.

Mik Quinlan

2024-06-24 06:55:26 +0000 UTC

I totally agree with your larger point though: I think they have evidence that these massive new AI data centers are a good investment. A more meta point is: as geopolitics and secrecy subsume AI, there's going to be less to glean from published papers and public statements and more to glean from correctly analyzing incentive structures and reading between the lines. There will be more alpha in the micro-expressions than the system cards.

Brian Crabtree

2024-06-24 06:22:53 +0000 UTC

Good point! Yes, that might be it.

Christian Hendriksen

2024-06-24 04:59:58 +0000 UTC

I think he showed Gates the scarjo voice mode. Because twice in that interview Gates makes out-of-place comments about voice. Off the top of my head (1) he butts in once saying "like voice-to-voice?" when Sam says something about multimodality (most people at that time would have associated multimodality with image input & output, not voice) and (2) he says something about these models being able to do sales calls (as opposed to customer support - which was WAY outside the Overton window at the time of recording).

Brian Crabtree

2024-06-24 04:58:50 +0000 UTC

I do fall into the camp that "agi" is relatively soon. Even if the transformer architecture is not capable of true reasoning, with the unlimited context we are getting, they should at the least allow researchers to comb through the mountain of research to help find promising avenues and that alone should help accelerate things.

Mike D

2024-06-24 04:23:13 +0000 UTC

I am going to have to concur a bit with Jorg here. I'm working on a restaurant app where it takes in a menu (I've used one that was in Spanish as well) as context using GPT4. I have put it through its paces; it can tell you vegetarian options, seafood options. You can ask it which menu items have a certain ingredient.

Mike D

2024-06-24 04:18:09 +0000 UTC

Side note on the UBI study release date: WorldCoin's employee and investor tokens start unlocking on July 24 to the tune of about 4% inflation per day. The main guy who highlighted this was @DefiSquared on X on May 14 (and the price has dropped by half since then). But his post ends with, "If you see a strategically timed announcements between now and insider unlocks in July- it's sadly not a new play in this industry, but serves to ensure exit liquidity for insiders at eye watering valuations." So I would expect the UBI study to come out right around July 24 - especially because (1) Sam again said it'll be out soon on the May 11 All In podcast and (2) WorldCoin is a big deal (fully diluted market cap of $28B) and the UBI study is very likely to grab headlines and move its price so they're probably being very strategic about the release date. Source: https://x.com/defisquared/status/1790075349373694174?s=46&t=H5vlscpSLDgGdZwup6LiZA

Brian Crabtree

2024-06-24 03:50:23 +0000 UTC

Philip - Really enjoyed the podcast! +1 to the other comments that this didn't feel random/disorganized at all, actually a really interesting synthesis of the available evidence on the signals we're getting from leaders at the AGI labs. I also had a similar question to one of the other commenters if this has changed how you think about timelines.

Mike

2024-06-24 02:43:28 +0000 UTC

great podcast, thanks

Steve Caya

2024-06-24 02:12:23 +0000 UTC

Ilya starting SSI isn't that surprising viewed through the Situational Awareness lens. Israel might just have the foresight to partially brain drain US labs so they have extra leverage headed into a US-led ASI manhattan project. And Andrej trying to leave industry for modern academia (YouTube) could be more about not wanting to build ASI shoulder to shoulder with the NSA in a bunker vs deep learning hitting a wall.

Brian Crabtree

2024-06-24 02:08:54 +0000 UTC

Alternate take: when Mira, Leopold, and Kevin Scott say, "GPT-4 is high school level", they're zooming out and including GPT-4.5 which may come out in a few months with post-training reasoning breakthroughs. Then, when the reasoning leads at OpenAI and DeepMind see these comments, they zoom in and split hairs about GPT-4o vs GPT-4.5 because their contributions are mostly still to come in GPT-4.5.

Brian Crabtree

2024-06-24 01:34:44 +0000 UTC

After carefully watching the progress of AI in the last six months, I do feel that we could be hitting a wall. But I also know most of the top labs are being conservative in releasing new models BEFORE the US election to avoid deep fakes, etc. I feel by the end of this year, we should have a clearer understanding. Either we have huge intelligence releases in NOV or DEC, or small bump ups of cool stuff. If we get no real breakthroughs in the foundation of intelligence by 2025, this means there is more needed than just scaling up LLMs. BTW, I love your podcast.

Jon Kurishita

2024-06-24 00:44:29 +0000 UTC

Far from random, this is an excellent effort at addressing what may be the two most fundamental questions in the AI world today. I want to believe that AGI and ASI will arrive in the relatively near future. But even more than that, I want to be realistic about what's coming and when. This is a well thought out and thought provoking perspective. I will still place my money on the "None of the Above" option when it comes to predicting how this will play out. And I love this type of content for this forum.

Steve DeMoss

2024-06-24 00:22:58 +0000 UTC

From the information available to the public it seems that some kind of wall has been hit. Other than multimodality, which is of course useful, we haven't seen any significant improvements since the start of this race.

Dennis Polyzos

2024-06-23 23:17:01 +0000 UTC

I guess I would still call it reasoning if Google Lens did this very specific job of detecting these items. It also works if I ask for pescatarian options, for example. It feels like moving the goalposts to me if we just call this memorization, while we would probably call it reasoning in humans to go through a menu of non-commodity dishes and determine what agrees with various dietary preferences, don’t you think? Maybe not on the level of a PhD, but a lot of people might already make mistakes at this task ;-)

Jörg Eitner

2024-06-23 21:26:14 +0000 UTC

You make a compelling and thought-provoking case as always. Is any of this causing you to change your predicted timeline? I believe you have said that you expect significant progress over the next one or two years just based on remaining runway for current techniques. And I believe you’ve said AGI (solving ARC challenge etc) is plausible (likely?) in 5 years. Perhaps you’re pushing all this back given persistent challenges with reasoning.

Mark Levine

2024-06-23 21:23:26 +0000 UTC

It took mine 1 second

Armin

2024-06-23 21:23:02 +0000 UTC

It's an interesting point: many would call that memorisation of food items and the class 'vegetarian'; for others it would for sure be reasoning. I guess if Google Lens with translate did the same thing, would it feel as much like reasoning? Thanks for the comment Jörg

Philip

2024-06-23 21:10:57 +0000 UTC

Sorry, but this is a lot of tea leaf reading, and the evidence seems far-fetched. I do not understand how these models are not capable of reasoning if, for example, you give them pictures of a handwritten restaurant menu and they are able to list only the vegetarian dishes even though they are not marked as vegetarian and are even in a different language. Maybe we are looking at the wrong kind of reasoning or are way too limited in what we deem reasoning?

Jörg Eitner

2024-06-23 21:07:56 +0000 UTC

Could be!

Philip

2024-06-23 21:05:12 +0000 UTC

Fair points!

Philip

2024-06-23 21:04:59 +0000 UTC

Yeah for sure the Stargate system implies a lot of latent potential, but could be for inference mainly too, i.e. video

Philip

2024-06-23 21:04:46 +0000 UTC

Let me know if it continues Tom, can speak with them myself

Philip

2024-06-23 21:03:53 +0000 UTC

And he just followed me, so wonder if he heard this pod

Philip

2024-06-23 21:03:36 +0000 UTC

I watched the Dartmouth interview. Although some members of the audience might have been technically sophisticated, the interviewer certainly was not, and that made the interview content pretty simplistic. I think that makes it hard to understand the kindergartener/highschooler/PhD comment as anything more than a metaphor. Also, the fact that both Murati and Aschenbrenner, among other OpenAI folks, use some version of it suggests to me that it is a shorthand they use internally.

Glenn Montague

2024-06-23 19:00:05 +0000 UTC

Slight disagreement with the tone. Sam Altman wouldn't be pitching 7T if he's expecting what's essentially a hallucinating chatbot to world leaders. If all you got is this, do you think people would agree to hand over so much as 1T? Plus Noam Brown is far more optimistic in some tweets, eg "startup founders don't bet your company on frontier models hitting a wall" etc.

Feitian Li

2024-06-23 18:54:19 +0000 UTC

Really interesting podcast, I don't find it as random as you do! I think your argument is solid - however, some data points suggest that there's more to this and I'd like to hear your interpretation: Around September / October 2023, Altman began mentioning some kind of research breakthrough (https://sfstandard.com/2023/11/17/openai-sam-altman-fired-apec-talk/) - of course, this could be hype. But in Altman's appearance on "Unconfuse me" with Bill Gates, Gates offhandedly confirmed Altman had shown him something that impressed Gates. If Altman has stopped believing scale is all you need, then what did this thing imply? And on a more fundamental note: If Microsoft is funding extremely large model runs and associated datacenter investments on the order of 10s of billions of dollars, then I find it reasonable that OpenAI has a more convincing timeline and development plan than just "scale go brrrrr". But then again, Amazon spent 10 billion on Alexa already so, maybe corporations are not that stingy with ultra large investments.

Christian Hendriksen

2024-06-23 18:39:58 +0000 UTC

Is it only me or is the Patreon platform just unbearably slow at loading and downloading?

Tom Tomaszewski

2024-06-23 18:36:05 +0000 UTC

Note that Noam Brown believes scale will solve reasoning "at the limit"

Feitian Li

2024-06-23 17:54:08 +0000 UTC
