AIExplained



'Open Source' - Controversial Terms in AI, Explained - New Series

A new bonus series (2/8 episodes) explaining some of the most controversial terms in artificial intelligence, this time covering the term 'open source'. In some quarters, it's the most controversial term of them all. Here, we mainly focus on the difference between open source and open weights - a key distinction!

Link for Off-line Viewing: https://drive.google.com/file/d/1RX1UI01gBzGyxBnIRbBVgMsL7i87ReMx/view?usp=drive_link

Sutskever Open Source Arguments: https://openai.com/index/openai-elon-musk/

GPT-2 Decision: https://openai.com/index/better-language-models/

https://openai.com/index/gpt-2-1-5b-release/

OpenAI was 'the wrong name' https://www.theverge.com/2023/3/15/23640180/openai-gpt-4-launch-closed-research-ilya-sutskever-interview

Llama 3 - 'Open source'? https://ai.meta.com/blog/meta-llama-3/

Open Source AI 'Official' Definition: https://opensource.org/deepdive/drafts/the-open-source-ai-definition-draft-v-0-0-3

Open (For Business) - More Realistic Motives for Open Source from Big Tech: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4543807

Long term vision to open source AGI, Zuckerberg: https://www.theverge.com/2024/1/18/24042354/mark-zuckerberg-meta-agi-reorg-interview

US Govt will ban open source? https://www.theregister.com/2024/05/23/us_lawmakers_advance_bill_to/

Chinese Military Computer Vision Dataset: https://arxiv.org/abs/2405.12167

Representation Engineering: https://arxiv.org/abs/2310.01405

The 'it' in models is the dataset, jbetker (OpenAI): https://nonint.com/2023/06/10/the-it-in-ai-models-is-the-dataset/

GPT-NewsCorp? https://www.bloomberg.com/news/articles/2024-05-29/openai-strikes-licensing-deals-with-the-atlantic-and-vox-media

Comments

Yes, fascinating. Personally, I have never expected expert-level AI to be open sourced (nearly put a hyphen there, doh, but Grammarly says there is a hyphen ;-)). It doesn't make sense to me to provide the keys to the castle where all your gold is. You want to hoard that gold for as long as possible. AGI or ASI might have other ideas itself but that's for another discussion. I also don't like the idea that AGI and ASI will be controlled by a single source, company, person or government. That doesn't feel right to me but I've seen way too many sci-fi movies where this is the case and things go somewhat badly for humans. ;-) To me, AGI needs to be decentralised - at least at its core - with finetuned/personalised layers for every individual on the planet. It needs to be accessible to everyone; no restrictions linked to region, type of person, wealth, etc. But I am a believer in crypto and the Blockchain and think the current "system" is broken - so I would go and say this of course. :-)

Kol Tregaskes

The only one creating fully-OSS models at this point is Andrej Karpathy, and he had to leave OpenAI to do so.

Carlos Galarza

I love the intellectual honesty of your videos, the perspective, and the learning we get. Open source is not a "fire and forget binary drop" with a kinda-permissive license, but shall indeed include the source, techniques, and how to contribute your changes back! (e.g. Linux kernel) "Open" models are usually a marketing op of early-stage companies, including OpenAI up to GPT-2. They release some products for free to get hype, then they close up. It's part of their business model or GTM strategy, but it's definitely NOT open source, not at all. Thank you 🙏

Enrico Ros

Pheeeew. Will the video message go through in the wrong way? I love your vids Philip, just thinking about the last two. We are getting strong signals that "just" scaling up LLMs, same arch, won't create the next 10x or 20x effect. But do we really believe that ANY lab is working ONLY on scaling up? Do we even know if Omni is just the same arch (regardless of multimodality) as Turbo? You mention that there are good reasons and many open fronts with different approaches that could lead to AGI. While all these combined feel very reasonable, I think the message misses a fundamental part, which is that, even if we would just stick to scaling up LLMs and get "just" a 20-30% improvement from that scale-up and optimizations in the current stack... from the user's point of view, we still haven't unpacked even 10% of what a GPT4-like model can be used for. Imagine now this model capability heavily optimized, potentially scaled, at much higher inference speed (e.g. Edged like you point out) and the value and use cases are just MASSIVE. If we forget this point, and we just say 1) LLMs are in plateau (ala Gary), 2) they are only capable of solving problems that can map to in-training data, so they cannot really create anything new or discover something, they will keep massively hallucinating, and 3) AGI needs a totally different approach, then - all together - it kind of sounds to many (that's what I am reading from many in industry) that current LLMs are not ready for industrial use and we must wait for AGI. I just saw an X post a few days ago from some cool guy with 10s of thousands of X and Youtube followers, making GenAI videos for months... and he was posting he didn't know that LLMs could self-correct their answer when asked simply to check their answer. Ok, he is not an AI scientist of any sort, but some sort of power prompt engineer, gathering and sharing info from a user's perspective. And yet...
I just had an update meeting today with a CERN group we have a Generative AI Collaboration with, and we shared some of our advancements. Providing GPT4, Omni/Turbo (even some Llama3 70b, Groqed) and some smart framework to provide tool functions to these agents, we are achieving great results (20x-30x less cost and much faster than an equivalent human at the same quality) on engineering and operating control systems. And the feeling is that we are totally only scratching the surface. In summary, and sorry for such a long message, totally crafted on the fly, 1) the interest in having smarter Gen AI models is clearly there, but we can, and we will, still do amazing things with the current ones. 2) current LLMs, even if they wouldn't change architecture but were just scaled and optimized in many places, will still be a huge boost and game changer (we see this with Groq already in some use cases). Imagine GPT4-like reasoning - no more - but 100 times faster and cheaper and using CoT50.... 3) the big labs were the only ones capable of creating these HUGE LLMs, but once created, there are thousands of researchers benefiting from that and studying what is happening inside them (e.g. inside Llama3), so I too believe some significant breakthroughs will happen relatively soon, and not necessarily in big labs.

Robert Gomez-Reino

Thanks for covering this topic! I've been involved in the opensource.org discussions somewhat and I've noticed in the discussion about data that people's viewpoints can shift as they get deeper into it. One might start out thinking that "of course we should include all the data used in training" and that would be ideal, but of course most of that data is not under Creative Commons licences, and getting permissions to share all that data is practically impossible. IMO, most models would be rubbish if constrained to just properly licensed content. So, practically, the current grey area of "but we're not copying that data, we just got a machine to look at it" works the best for all of us right now - we all end up with good "openish" models to run locally. (Personally I think restrictive copyright is an entirely obsolete concept in our current/future digital world, but it may take a while for everyone to agree with me). I hope however, that we can bypass this entire mess if we can work out how to use these current models to generate sufficient synthetic original data to build entirely new and 100% open/transparent models. Imagine a huge public database of millions of little summary articles about every conceivable thing, cross-checked and corrected multiple times ...

Martin Dougiamas

If you're looking for comparison points, it may be worth considering something like Snowflake Arctic as an example of a contemporary open source model. If their 'coming soon' pages in the cookbook get updated, it looks like it'll meet the 'official' criteria for being open source AI.

Poss

