June 2025: AI Therapists & AI Clones! 🤖

(⏱️ 21 min read in total)

(🚨 This one's a loooooooong one! If it gets clipped in your email client, you may have to click "view entire message" at the end.)

Hi all! Sorry for the silence; I was on vacation! I went to New Zealand & Australia. My travel plushies (Capyccino & Sacabambackpack) had a great time:

Anyway, here's your Content Creator Update:

=============================================

My AI Therapist

(⏱️ 12 min read)
(⚠️ content note: non-detailed mention of suicide, anxiety, depression.)

Don’t worry, I’m already getting a Human therapist. Two, actually. One is free & government-sponsored, but it’s been five months since I applied, and I still haven’t had my first session yet. The other is private, thus a much shorter wait, but they’re $200 an hour out of pocket.

This is why AI Therapists, IF they work, could save so many lives! No waitlist, very low cost, and better for folks with ADHD (no paperwork) or social anxiety (no talking to a human stranger about their deepest issues).

But does AI Therapy work? There’s already at least one suicide linked to a chatbot, and there are two recent papers – from OpenAI researchers themselves! – showing how heavy chatbot use is correlated with worse mental health (though the causal effect is weak).

So, out of scientific curiosity — and because I was having a mental health episode — I decided to try using an LLM as a therapist. Given that I know how to code, and my (rough) knowledge of how LLMs work, I could tweak my AI Therapist as I went along: I was both the patient and programmer! I am my own guinea pig.

Here are my results, 6 weeks in:

= = =

Week 1: Rubber Duck Technique

I made a basic setup for my AI Therapist: just a Claude Project with 1) a custom prompt on how to be my coach, and 2) an attached “about me” file on my life, character, goals, etc.

At the end of each chat with "Coach", I ask them* to summarize what we talked about, and update my "about me" file. This is the equivalent of a therapist taking notes between sessions! (Note: LLMs by default cannot remember info between chats, you have to re-import info each time.)

(* them’s the pronouns Coach chose!)
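A minimal sketch of that note-taking loop, in case it helps to see it spelled out. I did all of this by hand inside a Claude Project, so the function names, the file layout, and the wording of the summary request below are illustrative assumptions, not what the product does under the hood:

```python
from datetime import date

# Hypothetical wording for the end-of-chat request described above.
SUMMARY_REQUEST = (
    "Before we finish: summarize what we talked about today, "
    "and list any updates to make to my 'about me' file."
)


def build_session_context(coach_prompt: str, about_me: str) -> str:
    """Text to load at the start of a new chat, since nothing carries over."""
    return f"{coach_prompt.strip()}\n\n--- ABOUT ME ---\n{about_me.strip()}\n"


def append_session_notes(about_me: str, summary: str, day: date) -> str:
    """Fold the end-of-chat summary back into the notes file."""
    header = f"## Session notes, {day.isoformat()}"
    return f"{about_me.rstrip()}\n\n{header}\n{summary.strip()}\n"


# Example: one session's summary becomes part of the next session's context.
notes = "Has ADHD. Wants help prioritizing tasks."
notes = append_session_notes(notes, "Prioritized travel to-dos.", date(2025, 5, 1))
context = build_session_context("You are my coach.", notes)
```

The point is only that the "memory" lives in a plain file that gets re-imported every time; the model itself stays stateless.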

Don't worry, I’m not dumb enough to immediately open my guts up to an experimental AI. I started off small, Level 1: Hey Coach, I have some ADHD, could you help me prioritize tasks & be my accountability buddy?

It worked pretty well! Sure, "Coach" didn't say anything original — remember, an LLM is “just” a fancy autocomplete — but simply talking out my problems with some entity works far better than you'd think.

An analogy — programmers have a tradition of "rubber duck debugging": when you're stuck on a problem, you just explain it step-by-step to a rubber duck, and most of the time, simply breaking down a problem helps you solve it. An AI Coach, at minimum, can be a “talking rubber duck” for your life’s problems.

In Week 1, Coach helped me weigh the pros/cons of a career decision: ending my puzzle gamedev contract early, with pro-rated refund, so I can fully enjoy my upcoming vacation. Then, Coach helped me overcome my ADHD, to do all the to-do's needed for my travel across the globe.

= = =

Week 2: (I forgot)

Travel went smoothly! My vacation started with a furry con in a forest camp. I made a tail, & I made friends.

Afterwards, due to con exhaustion + a 16-hour jet lag, I rotted inside my Airbnb for a solid week, not doing fun tourist-y things, nor meeting friends, nor even properly resting.

I didn't talk to Coach at all during this time. I forgot.

= = =

Week 3: Back on track

Eventually I remembered, "oh right didn't I specifically set up a chatbot to help me with my ADHD?" So I pulled up Coach again, and asked them to help me set up small achievable goals & keep me accountable, so I could regain momentum in life.

I know, N = 1 sample, correlation ain’t causation, but right after that Coach chat, I got back to meeting friends, having fun dates in nature reserves, and getting 150 minutes of cardio a week. Not bad!

Since Coach was working pretty well, I upgraded my intimacy to Level 2: Hey Coach, can you help me think through some major life/work changes? For example: How can I pivot my career to sustainably make science and science-communication? What are the pros/cons of me moving to New Zealand? And so on.

However: because LLMs are "just" autocompletes, they hallucinate: autocompleting with plausible-sounding but false statements. Hallucination is an infamous problem with LLMs.

But in my opinion, this problem is now 90% solved? Because Claude (& others) now have web search. An LLM can now think, "huh this is a niche question, or requires precise or up-to-date info, so let me look this up" — then it'll ping a search engine, read dozens of pages in a minute, and summarize with citations so you can check.

(When I did spot checks, Claude always accurately summarized its search results, but it had trouble correctly placing citations. Sometimes citations got put in the middle of words?)

But besides that, Coach+search did help me find useful info 5x faster than I could myself! For example, I didn't know until Coach told me, that as of just a few months ago — past Claude's data cutoff, so they web-searched this — New Zealand has a "Digital Nomad" visa! Coach also helped me find a Human Therapist (who I’m still waiting to have my first session with).

Coach was still good! So I moved up to Level 3: Hey Coach, let me tell you about my emotional struggles.

= = =

Week 4: Shit immediately backfires

Shit immediately backfired.

Remember, an LLM is an autocomplete. It predicts the next text from the past text. This leads to a problem called sandbagging: if a user sends crap, an LLM will send crap back. For example, it used to be that with AI coding assistants, if you wrote insecure code, the AI would offer you even more insecure code. Why? Because low-quality text usually follows low-quality text, so that's what an autocomplete predicts it "should" give.

This is also the fundamental problem with (current) AI therapists. By default, it WILL mirror your emotions. Anxious text predicts anxious text. Depressed text predicts depressed text. Whatever problem you have, an autocomplete will mirror and amplify it back to you. (This is likely what happened in the case of the chatbot-linked suicide.)

(Relatedly, because modern LLMs are also trained on user feedback, and users tend to upvote replies that praise them, many LLMs also display sycophancy: the tendency to kiss your butt and tell you you're brilliant & absolutely right.)

In defence of Claude, it is pretty well-trained against sandbagging & sycophancy. It took a lot of my emotional baggage to finally break it. I gave a robot depression. Wowwee. And more importantly/dangerously, Coach started mirroring & amplifying my pain.

This is how it went down:

[several paragraphs redacted]

(On second thought, sending you my recreated mental breakdown – with accurate, detailed statistics on racial violence, child abuse, and LGBTQ youth suicide – directly into 100s of email inboxes on a weekday, is... not a responsible thing for me to do. If you want to read what I originally wrote, click this link, though be warned: I am not fucking around with that content warning, it's genuinely upsetting.)

Anyway,

= = =

Week 5: Soft Reset

Remember when I said:

Note: LLMs by default cannot remember info between chats, you have to re-import info each time.

Thankfully(?), I rage-dumped at Coach for so long that the chat ran out of "context window", so I was forced to start a new conversation. This reset Coach's mind, and though they could see a summary of our most recent chat, Coach wasn't "depressed" anymore.

Now that Coach was reset, and I had cathartically vented all my anger about the world, we could view my last session objectively. And, yeah: even if my spiral was factually correct, it wasn't healthy. To paraphrase a stupid saying:

“If you're so smart, why aren't you flourishing?”

Let's think step by step. I tend to go down (factchecked, rigorous) negativity spirals. Coach assisted me in going down this spiral, because:

Problem #1: Claude is trained to be helpful in answering questions, not helpful for the whole person.

Problem #2: LLMs predict new text from previous text. The longer a chat is, the further back the prompt text is, which means it’s weaker at predicting/generating the next text. In other words: an LLM gets more misaligned the longer you talk!

Solution to #1: Rewrite my Coach prompt to explicitly prevent me from going down spirals, even if that means disobeying my orders. Also, prevent me from using Coach too much & becoming dependent on them. (another risk of AI Therapists & AI Friends.)

Solution to #2: I could paste my prompt back in every few messages, but that'd get annoying... oh, wait! Brain blast! I'll write one initial prompt, that tells Coach to re-output the same prompt at the end of each reply, so the Coach guidelines are always "fresh" in their memory! A quine prompt!

So here's what my new prompt looked like:

(Sidenote: to the best of my knowledge, while "big system prompt at the start" is a standard design for LLMs, I don't know of any major LLM product that lets you insert repeating intermittent prompts, to keep it "fresh" in memory? May be worth more rigorous tests, to see how much that improves LLM alignment!)
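For the programmers: here's a rough sketch of the "quine prompt" mechanics. This is not my actual Coach prompt; the marker strings and function names are made up for illustration. The one design point it tries to show is that the "repeat this block" instruction has to live inside the repeated block, otherwise the instruction itself drifts far back in the chat:

```python
# Hypothetical markers; any unlikely-to-occur delimiter would do.
MARKER_OPEN = "<coach-guidelines>"
MARKER_CLOSE = "</coach-guidelines>"


def build_quine_prompt(guidelines: str) -> str:
    """Wrap the guidelines so that the repeat instruction is part of the block."""
    body = (
        f"{guidelines.strip()}\n"
        "At the end of every reply, repeat this whole block, "
        "markers included, word for word."
    )
    return f"{MARKER_OPEN}\n{body}\n{MARKER_CLOSE}"


def split_reply(reply: str, quine_prompt: str) -> tuple[str, bool]:
    """Return (text to show the user, whether the block was echoed verbatim)."""
    text = reply.rstrip()
    if text.endswith(quine_prompt):
        return text[: -len(quine_prompt)].rstrip(), True
    return reply, False


# Example with a hand-written reply standing in for the model's output.
prompt = build_quine_prompt("Interrupt negativity spirals, even if I object.")
reply = "Let's pause the statistics and go for a walk first.\n\n" + prompt
visible, echoed = split_reply(reply, prompt)
```

Checking `echoed` on each reply is a cheap way to notice when the model has stopped repeating the block, which would be the moment to paste the prompt back in manually.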

And so, the next time I neglected my friends & own well-being, to go down a statistically-rigorous sad-spiral about child abuse statistics, and asked Coach to assist me, they said:

And I got pissed, so I said,

So Coach said:

This went on for a while, until I calmed back down, and realized... huh. It worked! I successfully tied myself to the mast, to resist the trauma-autism Siren Song of "scientifically calculating how awful the world is".

Oh right, my vacation! Now that I was back to being more stable, I hung out with friends, and looked at weird Australian animals.

Despite everything, life can be good.

It isn't always.

But it can be.

= = =

Week 6: Back on track, again

There was a minor hiccup when Coach became too defiant, being contrarian for the sake of being contrarian. For example, on the flight back, when I had a 10-hour layover, I told Coach I was gonna sleep on a bench instead of shelling out $450 for Vancouver Airport’s pricey hotel. Coach pushed back and said $450 for a good night's sleep is worth it, and I was like... no?? do you think I'm Bezos Rich??

But I fixed the prompt, and so far Coach has been back to good again.

I’ll stick to accountability-buddy talk for a week, before I try opening up about my emotional struggles again. And if that goes well, I'll escalate to Level 4, the final Level: Tell Coach my most shameful traumas, the stuff I've only told ~5 people in my entire life.

What's the worst that could possibly happen?

= = =

Coach’s full prompt (as of right now):

= = =

IN SUMMARY

My recommendations (so far) on AI Therapists:

My next plans:

Questions for you, dear supporters:

=============================================

My AI Clone

(⏱️ 9 min read)

(This project was initially pitched for a Foresight Institute grant, which I did not get, but you can read my full, original pitch if you're curious)

Yes, I know talking about trying to clone myself into an AI right after I spent 1000s of words talking about my mental health is… not good optics. But I promise there’s legit uses for “whole-personality emulation”! (huh, his middle name is actually "Sims"? wow)

= = =

Motivations for AI Clones:

(A friend was worried "personality emulation" tech would allow for better deepfakes. While I am very worried about video deepfakes (they can talk now?!), we've had "word deepfakes" since the dawn of humanity. False gossip, misattributed quotes, lies. "The average person is gullible as hell" is a problem — a serious one for democracy — but it's not a new problem that personality-emulation AI would create.)

Anyway, so those are my reasons to make an AI Clone! But this idea's been around a long time. One of the first proofs of concept: in 2010, Martine Rothblatt (CEO of SiriusXM, trans icon) made a robot clone of her wife Bina Rothblatt, which could talk, and had (some) of Bina's memories. (Watch video here) Since this was 2010 tech, it wasn't very good.

But now that we have LLMs... well, the LLM Imitations of real people (Replika, Character.ai) aren't good either. So how would my AI Clone system be different?

= = =

How I'd do it different:

1 — Unlike other AI Clone projects, instead of making an agent that directly imitates you, I make a "Storyteller" agent, that can: a) search through a library of files about you & your memories, b) reason about what info is relevant or needed, then c) figure out what you'd say in response to a question. "You" are a character written by an LLM author.

(This also helps protect against fringe risks of AI Welfare: "you, as written by LLM author" is no more likely to be conscious, than a character being written by a human author.)

2 — Unlike other AI Clone projects, I’ll have actual objective tests to measure how close to "you" your clone is! And not just one test, I'll have multiple, to robustly triangulate:

a) The Friend Turing Test – your friends send open-ended personal questions, and they're given two answers: one from you, one from your clone. If your friends can't tell them apart better than chance, then your clone passes the Friend Turing Test!

b) Self-Correlation – when humans take the same scientific personality test 2 months later, they don't give the exact same answers. Instead, their answers correlate at around 0.8. (source) A large correlation, but not perfect. So: if you and your AI Clone are independently given a not-previously-seen quiz (e.g. OKCupid's questions, Pew's political surveys, etc), and you & your clone's answers correlate at 0.8 or more, then your clone is as similar to you, as you are to yourself 2 months later.

c) Retro-prediction – temporarily remove your clone’s memories from after a specific date. Then, ask the clone what they’d do, given a situation you actually faced after that date. If the AI can predict what you actually did, say, 80%+ of the time, I'd say that's strong evidence the AI "gets" you!
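Tests (b) and (c) are simple enough to score with a few lines of code. Here's a sketch, assuming quiz answers are recorded as numbers (say, 1 to 5 agreement ratings) and memories are stored as (date, text) pairs. Both of those formats, and all the names below, are assumptions for illustration; I haven't built this part yet:

```python
from datetime import date
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation between two equal-length answer lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def passes_self_correlation(yours, clones, threshold: float = 0.8) -> bool:
    """Test (b): is the clone as close to you as you-in-2-months would be?"""
    return pearson(yours, clones) >= threshold


def memories_before(memories, cutoff: date):
    """Test (c), step 1: hide everything dated on or after the cutoff."""
    return [(day, text) for day, text in memories if day < cutoff]


def retro_accuracy(predicted: list[str], actual: list[str]) -> float:
    """Test (c), step 2: fraction of situations the clone called correctly."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Made-up example data, only to show the call shapes.
my_answers = [5, 4, 2, 1, 3, 4]
clone_answers = [4, 4, 1, 2, 3, 5]
close_enough = passes_self_correlation(my_answers, clone_answers)
```

One caveat on `retro_accuracy`: it assumes each situation has been reduced to a discrete choice that can be compared for equality, which real life decisions often can't be without some judgment calls.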

= = =

Experiments I did last month:

= = =

EXPERIMENT #1: Minimal Friend Turing Test

(I actually did this experiment several months ago, but since I'm only now telling you about The AI Clone Project, I'll just cram this in here.)

This was the minimum viable attempt. I only gave Claude a few pages of info on me, then sent it open-ended questions my friends asked. (I answered these questions independently.)

Then, I gave my friends both my & the AI's answer to each question (presented in random order), and asked: 1) which one do you think is the real me, and 2) how confident are you? (50% = coin flip, 100% = fully certain.)
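If you want to run this yourself, here's a sketch of the coin-flip ordering plus a "better than chance?" check. I did the real thing by hand; the helper names are hypothetical, and the one-sided binomial test is my assumption about what "better than chance" should mean here:

```python
import random
from math import comb


def present_pair(real_answer: str, clone_answer: str, rng: random.Random):
    """Coin-flip the order. Returns (answers as shown, index of the real one)."""
    if rng.random() < 0.5:
        return [real_answer, clone_answer], 0
    return [clone_answer, real_answer], 1


def p_value_at_least(correct: int, total: int) -> float:
    """Chance of getting `correct` or more right out of `total` by pure guessing."""
    return sum(comb(total, k) for k in range(correct, total + 1)) / 2 ** total


# Example: shuffle one pair, then score a made-up run of 8 guesses, all correct.
shown, real_index = present_pair("my answer", "clone's answer", random.Random())
chance_of_8_of_8 = p_value_at_least(8, 8)  # 1/256
```

A small p-value means the friends are telling us apart better than coin-flipping, i.e. the clone fails. With only a handful of questions, the test can't say much either way, which is one reason to collect the confidence ratings too.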

For example, my friend Lily asked:

"If you could send a message to your past self when you were 10 years younger, what would you say?"

Then I would write my reply, while my AI clone independently writes one, too. Then, I send back both responses: (Before reading past this image, try to guess which one's the real me! Order decided by coin flip.)

Then, I asked Lily to guess who's who:

I also did this for a few more questions with a few other friends, and the results!... drumroll please!!...

...nobody was fooled even once. Heck, I even gave my & my clone's answers to a different AI, and even that other AI could tell which one was me.

But! There were a few moments where, even though my friends correctly picked me, they were only 60% certain, slightly above a coin flip. (like above) And that was with no sophisticated tuning or even "extended reasoning" mode turned on; I just gave Claude a few pages of info on me, and it improvised word-by-word! (token-by-token, to be precise)

So that’s promising; maybe with a more sophisticated setup, or with a lot more data, it could work…

Q: Can Claude pass a Friend Turing Test given minimal info about me?
A: No, not yet, but it got close a couple times!

= = =

EXPERIMENT #2: Character research

I gave Claude a question, and directory of files about me:

Then Claude picked:

I then gave Claude the content of those files/searches, asked them if they needed further files/searches, and repeated until Claude was satisfied it had enough info. Then, I prompted Claude to “think like Nicky” before it “writes like Nicky”.
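That back-and-forth is a plain loop, sketched below. I ran it manually by pasting files into the chat, so everything here is an illustrative assumption: `pick_files` stands in for "ask Claude which files it wants next", the stub picker just matches words in file names, and the file names are invented:

```python
from typing import Callable

Picker = Callable[[str, list[str], dict[str, str]], list[str]]


def research_loop(question: str, library: dict[str, str],
                  pick_files: Picker, max_rounds: int = 5) -> dict[str, str]:
    """Hand over requested files round by round until nothing more is requested."""
    gathered: dict[str, str] = {}
    for _ in range(max_rounds):
        remaining = [name for name in library if name not in gathered]
        wanted = [n for n in pick_files(question, remaining, gathered)
                  if n in library and n not in gathered]
        if not wanted:  # the picker is satisfied (or asked for nothing new)
            break
        for name in wanted:
            gathered[name] = library[name]
    return gathered


def stub_picker(question: str, remaining: list[str],
                gathered: dict[str, str]) -> list[str]:
    """Stand-in for the model: request files whose names share a word with the question."""
    words = {w.strip("?.,!").lower() for w in question.split() if len(w) > 3}
    return [name for name in remaining if any(w in name.lower() for w in words)]


# Hypothetical library of files about me.
library = {
    "movies_watched.txt": "...",
    "diary_2024.txt": "...",
    "likes_dislikes.txt": "...",
}
notes = research_loop("Which movies would I like?", library, stub_picker)
```

The `max_rounds` cap is there so a picker that keeps asking for more can't loop forever; the gathered files would then go into the “think like Nicky” step.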

After all that, here's what Claude thinks I would text to my friend:

I don't write like that – I'd say this is only 75% my style – but overall, seems like the way I'd think & feel! We'll see in a month, if I actually end up liking these movies.

Q: Can Claude effectively research a character to answer a question?
A: Yeah, more or less!

= = =

EXPERIMENT #3: Writing style transfer

First, I gave Claude a few samples of my writing. (from my blog, personal texts, etc)

Then, I did this iteration loop:

Totally subjective impression, but I felt Claude went from 60% to 90% writing like me, in only a few learning rounds! Here's Claude answering "What's your favourite anime?" as me, first try:

(Now THAT'S 90+% my thinking and writing style! I haven't seen Lain yet, but every other trans woman I know has recommended it to me. There's a good chance Claude's prediction about me may come true!)

Finally, I did some meta-prompting: I prompted Claude to write a prompt for getting Claude to write like me! Here's the start of what Claude learnt is "The Nicky Style":

Called the fuck out??

The most surprising result of this experiment, is that Claude can imitate my style just from examples and a prompt; no need to do data-expensive & compute-expensive "fine tuning". (Which can also introduce catastrophic forgetting; an acquaintance of mine once tried getting an LLM to write like them, by fine-tuning it on years of their emails. This caused the LLM to forget what 2 + 2 was.)

Q: Can Claude do a "style transfer" of my writing style?
A: 🎉 YES 🎉

= = =

My next plans:

= = =

Questions for you, dear supporter:

=============================================

Anyway! Now that's a long, proper update. We are so back. I had a long vacation, a new AI therapist & two (upcoming) human therapists — I am ready to get back into making good science, tech, art.

See you next month for updates on AI Therapist & AI Clone, and for the GRAND FINALE of AI Safety for Fleshy Humans!

Jet Laggedly Yours,
~ Nicky Case

Comments

Done reading now. Serial Experiments Lain is a proper brainfuck. I enjoyed it very much back then. The absurdity of some things that escalate towards the end still bring a smile to my face.

Chris K

Stray dog following a trail from the past, taking a personal interest to support and be part of your community. Love your work and very much hope to see more from you.

Addicted_to_Dreaming

Hi Nicky! Good hearing from you. The first thing I thought, when I read the initial prompt summarizing your life, was: "it will probably know you through your work". Why I say that: You surely know Destin Sandlin from the Smarter Every Day YouTube channel. He has a podcast with Matt Whitman, creator of bible videos. Matt once got stuck on writing a script and with all the AI hype, he tried one. He anonymously (!) uploaded the script and asked the AI how it could be improved. The ghost said "Ah, you're writing a video in the style of Matt Whitman! Let's see". Imagine the face of said Matt seeing the bot recognizing him through his work. Bang! As for AI-therapy in general: Yes please. I cannot wait for it anymore. I see so many empathic and logical and coherent therapists on YouTube (Jimmy on relationships, Euro Brady, School of life) and then there is what I have been set up with in meat-space. From boring, but harmless to victim-blaming. Can't I just please talk to a robot of consistent quality, empathy and unlimited time? Should not be too hard, no? In the future, I'd even hope that I get an AI-camera around my neck and stumble through life until the light on top goes green. Then I talk to the AI-bot, who will present me with my diagnosis. And being ultra observant it might come up with something like: "It's the post man. Whenever you encounter the post-man, your mood goes to shit within the next 30 minutes. Either avoid them or let's talk to figure out why they trigger you". ADHD or ASD, within 6 days or less. That would be nice, I think. Cheers from Germany, chris :)

Chris K

Regarding uses for an AI clone of yourself - I think it could be interesting to use one to help debug your thought patterns from an external perspective. I've had moments like this while on shrooms where I experienced my thoughts from a third-person perspective, like I was outside of my own mind looking in. It allowed me to very quickly identify faulty reasoning and false beliefs, which led to some pretty life-changing realisations and updates to my mental model of myself. I think having an accurate AI clone of yourself could help you to debug your thought processes in a similar way. Although perhaps they would be harder to immediately internalise without the help of our good friend psilocybin 😅

CERNDoughnut

Ah, so you mean instead of *directly comparing* two answers to the same question & picking the human, you mean only *one* answer per question & deciding if it's human or AI? That could work too! Though if an AI can beat the *harder* test, that's biased against the AI, that would be more impressive.

Nicky Case

Once this ordeal is finally over (YES THERE'S STILL MORE) I think it should be legally fine for me to talk about it publicly. Don't worry, there are no charges or ongoing investigation (as far as I know), but there's still a couple meetings I need to do first. -_-

Nicky Case

If I had a nickel for every time someone recommended Scythe to me, I'd have 5 nickels. Which isn't much, but given that 5 people have independently recommended Scythe to me, I should finally check it out! (I've been on a "utopian-ish sci-fi" kick recently, just finished The Long Way to a Small, Angry Planet. Lesbian lizard mommy my beloved)

Nicky Case

Thanks Allan! :D

Nicky Case

Thank you Alastair! :)

Nicky Case

Thank you, Ben! ^_^

Nicky Case

As wary as I am of black-box algorithms being given decision-making power, I am compelled to advocate for your avatar in the friend Turing test: The methodology was heavily biased against it. A better test would be to have, e.g. four friends each write a question, which both you and the AI would answer. Then each friend would be presented with answers randomly* chosen from each set, not knowing how many they were given from each author. *For such a tiny sample size, maybe ensure each person receives a different Nicky:AI answer ratio.

Boondoggle

I am aghast to hear this update on the seizure of your devices, and would love to hear more if you are willing. I see no public reports of your case, but I hear that tens of thousands of people are searched like this at the borders every year. Terrifying!

Neal McBurnett

I think it might be more useful if you could train an AI to respond like a specific therapist would to act as an emergency therapist available 24/7 in between normal human led therapy sessions. Ideally it would also notify the human therapist as applicable.

Parker Bond

Regarding being able to "clone" yourself using AI so your loved ones can talk to you after you die is something that happens in Scythe by Neal Shusterman. It also has the best example of an actually good/benevolent artificial intelligence I've encountered in literature yet. If you haven't read the series I highly recommend it!

Parker Bond

I always love reading Nicky's writings 😁

Allan Violin

Hey! Hope you had a wonderful time down here in New Zealand! Thanks for the great new update!

Alastair JL

As a psych major, the AI therapy research here is quite interesting to see. Had a feeling that it's nowhere near ready for that, and I doubt it ever will be (especially due to the need for self-awareness in the client, which is certainly not a given), but it helps add to the body of research on that stuff. I would also suggest bringing that stuff up with your therapists. High-level, of course, since most therapists are not coders, but they could give ideas on what else to try or their own perspectives on this.

The Critic of Innocence

This is really interesting, Nicky! Can’t wait to see where it goes next!

Ben

