AIExplained


Exploring the Frontiers of Intelligence … by co-creating LMcouncil.ai

If Simple Bench tests common sense, what is the best test of practical wisdom? For me, it was finding out whether 2025 language models could build the app I have been meaning to make for two years. They can’t, not yet, but they can get very close with a determined human in the loop.

lmcouncil.ai is a combination of SmartGPT, Karpathy’s vision for an ‘LLM Council’ and my own touches. Compare model outputs, call votes, set leaders, roleplay a full group chat of characters, and on and on …

To be clear, even with 500 hours' work and the best models, you can’t yet cover all edge cases or get bug-free perfection. And that’s with GPT-5-Codex (best for diff patches), Claude 4.5 Sonnet/4.1 Opus for mobile features, and Gemini Deep Think for backend. Even that recommended list changes every week (with Google CodeMender coming for extra security).

But I would love your help:

Comments

Hey, thank you so much for asking! Will be adding custom credit blocks in the new year; at the moment the best option is Max tier for a little bit, then downgrade if needed in, say, Feb or March. I use most of Max tier every month, so even that is not enough!

Philip

How do I pay for an additional subscription at the same tier when I've hit the token limit?

Anon Anon

That's so great to hear! Anytime I need to do another 15 hour day a comment like this keeps me going, means a lot.

Philip

This is brilliant. I’ve always wanted this! And so glad you wound up making it. I trust you to take this great places.

Ben B

yes, did that ages ago but never tested, good point

Philip

Hey Philip, was wondering, could you make perhaps a set of 10 or something new questions to check if maybe, just maybe the simple bench questions have leaked? I mean, no need to go through the rigorous question testing with PhDs, just a few off the top questions to test, say GPT 5.1 and Gemini 3, to see if you get similar scores? Makes sense?

Strangest

I'm looking forward to giving this a go as the weather gets worse and I am spending more time indoors.

Jason Dowd

Please add an option to "delete all" past conversations. Thank you!

Riley Thomson

Quick feedback after signing up with email just now. The confirmation email went to Spam in my protonmail inbox. Maybe something to look into if you want to make onboarding smoother.

clay-loop

Thanks for asking! For the very first prompt the models will not be able to see each other's answers before they are generated, so you would set up the system prompts 'in roles' then kick things off with full context and what is required. FOR ALL SUBSEQUENT ROUNDS, the models see everything said by everyone else, so you can get debates/arguments/consensus etc. The only thing I don't have is an ongoing debate that continues indefinitely until you stop it. A couple of people have asked for that, but it would mean a lot of credits get burnt; if this is exactly what you meant, do let me know. Last, if you meant you want one model to output the consensus point/summary/table of points said by everyone else, that is the 'leader' button at the bottom.

Philip
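
The round behaviour Philip describes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how lmcouncil.ai is actually built: `Member`, `run_council` and the `respond` callable are hypothetical names, the stand-in "model" just reports how many messages it could see, and the sketch assumes members in the same round answer in parallel, so they only see replies from earlier rounds.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for a real model call:
# takes (system_prompt, visible_transcript) and returns a reply string.
ModelFn = Callable[[str, List[str]], str]


@dataclass
class Member:
    name: str
    system_prompt: str  # the 'role' set up before the first prompt
    respond: ModelFn


def run_council(members: List[Member], user_prompt: str, rounds: int) -> List[str]:
    """Run a fixed number of rounds and return the full transcript."""
    transcript = [f"user: {user_prompt}"]
    for _ in range(rounds):
        # Snapshot the transcript so members answering in the same round
        # cannot see each other's replies from that round (assumption).
        visible = list(transcript)
        replies = [
            f"{m.name}: {m.respond(m.system_prompt, visible)}" for m in members
        ]
        transcript.extend(replies)
    return transcript


def count_visible(system_prompt: str, visible: List[str]) -> str:
    # Toy 'model' that only reports how much of the transcript it saw.
    return f"saw {len(visible)} messages"


council = [
    Member("analyst", "You are a cautious analyst.", count_visible),
    Member("critic", "You challenge every claim.", count_visible),
]
log = run_council(council, "Is this plan sound?", rounds=2)
```

In round one each member sees only the user prompt; in round two each sees the prompt plus both first-round replies. A 'leader' step, in this sketch, would simply be one more model call that receives the whole transcript and is asked for a summary. The round count is fixed up front, which matches the point about not running an open-ended debate that burns credits.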

I have not been able to find out how to let the models interact with each other. I would appreciate using system instructions to create, let's say, research team members represented by different models to run virtual research lab meetings where I throw in a topic/question and the models engage with each other to come up with a consensus.

Michaela Liegertová

done, just for you! Let me know what you think!

Philip

one small feature request - Remove background button makes background black. Let us pick a color please. I hate dark themes, yes I'm minority but I like to not damage my eyesight ;)

Sam

Great idea! You mean background image or a viewing mode like 'columns' or 'dim/dark' mode that we currently have?

Philip

I love the initiative! Not related to the AI field, but could be useful to set a default visualization for a shared public council conversation when sharing it. Will keep testing the tool and provide more feedback!

Maico Bernal

Super cool, congratulations on putting all of this together, Philip!

Dorian Iten

Is it the boldness of the colours, the brightness of the screen or both, which you most want options over?

Philip

Very cool! Not AI related but would love some simple themes to have basic colour styling done for me for the different boxes etc. The bold colours are killing my eyes hahaha

Kyle Behrend

On 1, where would you like this placed? I am thinking a 'Submit Idea / Feedback' button below FAQ/Hotkeys... On 2, if anyone runs out of credits on Max tier I would love to hear about it. I like what you said here in principle, like a Custom tier; will review Stripe docs.

Philip

Thanks so much Michael. Do send along your ideas on UI, unless you meant a dedicated mobile app, in which case - working on it.

Philip

Thanks Jim, feedback always appreciated

Philip

Cool idea! A couple of suggestions. #1 Make user feedback possible *directly* within the app, so you don't lose the vast majority of useful ideas as people have them. You could have AI triage and categorize common ideas if you get too much volume. #2 Allow users to purchase token limits in whatever amounts they'd like above some base allocation, either as additional subscription amounts or one-time purchases.

Alexis Olson

Love your idea! This is great. It is something I'm doing manually, asking several LLMs for an answer to get a broad overview or to get several different opinions. And your "Smart GPT" idea with the Council leader is something I guess the top AI development leaders are implementing right now to improve their AIs' answers. It is probably one of the best ways of improving the answers from an LLM. Let's say there is a 1 in 100 chance of getting a genius answer from an LLM. If you ask it 100 times and then pick all the different ideas from all the answers, you increase the chance of getting a genius answer from almost 0 to almost 1. About your app, I would love it if you updated the UI to look more modern. Right now, it doesn't look that great. I think many more people would use your app and trust it more if the UI looked better.

Michael Lulev
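
As a rough check on the arithmetic in the comment above: if each attempt is independent and succeeds with the same probability p (an assumption the comment does not state), the chance of at least one success in n attempts is 1 - (1 - p)^n. The Python sketch below evaluates that; the function names are illustrative only, and it says nothing about the separate problem of recognising the good answer among the others.

```python
import math


def chance_of_at_least_one(p: float, n: int) -> float:
    """Probability that at least one of n independent attempts succeeds."""
    return 1.0 - (1.0 - p) ** n


def attempts_needed(p: float, target: float) -> int:
    """Smallest n for which chance_of_at_least_one(p, n) >= target."""
    if not (0.0 < p < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p and target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


hundred_tries = chance_of_at_least_one(0.01, 100)
tries_for_99_percent = attempts_needed(0.01, 0.99)
```

Under those assumptions, 100 attempts at a 1 in 100 chance give roughly a 63% chance of at least one hit, and it takes 459 attempts to reach 99%. The direction of the argument holds, though "almost 1" needs several hundred samples rather than 100.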

Awesome! I’ve really been needing this. Will try out and pass along. Nice job and great work

Jim Beaver

Already added two features based on the comments in this thread...!

Philip

If anything, motivates me more to make videos. More sign-ups, even on Free, = more videos on all topics. But I can't ask more of those on paid Patreon though, you already do enough to keep channel going indefinitely

Philip

Would love any tips from someone who has attempted the same, keep 'em coming

Philip

Thanks so much Peter

Philip

That is a really good point, will work on that, expect rapid progress

Philip

Yes, though I trust you guys to test it out and give me feedback as to whether it's Explained-worthy

Philip

Where does the brightness come from, if we control the background/bubbles? You mean the button text?

Philip

Super cool Phillip ! Guessing eventually you will do a demo video of this on your youtube channel?

Daniel A Barbatti

Please add light mode I'm scared of the dark :(

Ásgeir Thor Johnson

As a user I want to see how many credits each model uses so I can balance intelligence vs cost when I’m creating councils.

Bill Ray

The system prompt editor being hidden behind the “roleplay” button feels weird. It would be cool to be able to configure the models (system prompt, tools, effort level, etc.) in an intuitive way. Maybe by clicking on the model names which currently is how you replace them. Maybe users could even save individual model configurations and then call these different configurations to their councils.

Bill Ray

Is there a way to let the council deliberate amongst themselves for a few turns, reporting back with a verdict? I realize this is similar to the council leader feature, but I think some ideas would benefit from questioning and reflection.

Bill Ray

Where’s GPT-5 with search? I want a cracked research council!

Bill Ray

Wow I could see myself daily driving something like this if it had the conveniences of the other platforms. At which point I would almost certainly need the max tier. I could also see myself using an LLM Council API/Python library to integrate these “consensus” functionalities into my own projects. I hope this project becomes profitable enough to hire someone/a team to maintain and expand it! Very cool! Looking forward to more AI Explained videos!

Bill Ray

Very cool!

Peter Seelman

Hey Phillip, I released a very similar product last month called https://pipeliney.ai/. It didn't have much success in getting take-up though! I'd love to collab; we have very similar ideas on how this all should work and what could potentially make a good product!

Scott Rowlandson

Yes to both Shawn! I agree on the niche papers, and yes on the topic and roles and having them debate. You initiate and then can ask them to respond to each other

Philip

Great work Phillip! I'll have to check out the app! Is it possible to give a bunch of AI models a topic and roles and have them debate each other? That would be a really cool interaction. Also, I'm looking forward to seeing you cover more niche and experimental AI research papers like back in the day (Mamba for instance). I think your channel is at its strongest when you cover AI research that no one else is talking about rather than just focusing on the major releases.

Shawn Rosofsky

Thank you Clarissa!

Philip

Love the branching! Different goal but very cool also

Philip

Alternative: aiexplained@outlook.com!

Philip

Good luck with the app. I hope it will not distract you too much from the AI Explained and Insider. 🤞

Pavol Vaskovic

29:25 I can't DM you on Patreon 😔

Gilad

Very cool!

Gilad

Very nice Philip! Also, as you have built such an interface, what do you think about this one? Maybe it even provides inspiration. https://x.com/maxleedev/status/1962938776294195498

Manuel

AWESOME PHILIPP

Clarissa Röthig

