What's Nicky Learning? Decision Theory, Ottawa, Existential Risk

(Three main sections, that take 16 minutes / 6 minutes / 5 minutes to read.)
(32 minutes reading time in total, not counting links)

Hey all! I haven't done a What's Nicky Learning? post in a while. So here's a thought-dump of what I've been learning recently. (Alas, I did not get any publishable work done last month. This is 10% because the world has been... distracting... but honestly 90% it's just me anxiety-ruminating and media-binging as usual. This Patreon will be paused one more month; you weren't charged on March 1st, but will be on April 1st.)

————————————————————————————————

A Rationalist Valentine's

This comic did not do well on Twitter. I thought there'd be a bigger overlap between "people who know about weird thought experiments in decision theory" and "people who are into cyborg scalie girls".

No seriously, I honestly thought that. I worked so hard on that drawing! 😿

But, speaking of weird thought experiments in decision theory...

————————————————————————————————

A Satisfying Solution to a 50-Year-Old Paradox

(16 min read)

Part I: Newcomb's Paradox

You're approached by an eccentric trillionaire named Omega. They say: I've bought all your personal data, and can predict with 99.9% accuracy what you'll do in the following dilemma. (Omega is nuts, but honest. They really can predict you with 99.9% accuracy.)

The dilemma:

So... would you one-box, or two-box?

This is (a slightly modified version of) Newcomb's Paradox. [wikipedia] You probably think the answer's obvious. And you're right, it is. According to Robert Nozick, the philosopher who first analyzed this puzzle in 1969,

"To almost everyone, it is perfectly clear and obvious what should be done. The difficulty is that these people seem to divide almost evenly on the problem, with large numbers thinking that the opposing half is just being silly."

(According to a 2020 survey of philosophers, it's still near-evenly divisive fifty-plus years later! 31% one-box, 39% two-box, the rest other/undecided.)

The one-boxer says: It's obvious! Omega's 99.9% accurate. So if I one-box, I'm near-guaranteed $1,000,000. But if I two-box, I'm near-guaranteed only $1,000. Clearly, one-boxing is better. (And you can rigorously check this by calculating the choices' "expected values".)

The two-boxer says: It's obvious! Whatever I choose can't change the money that's already in the boxes, so no matter what Omega predicted, I'd get an extra $1,000 by taking both boxes. Clearly, two-boxing is better. (And you can rigorously check this by applying Causal Decision Theory, the standard theory of decision-making found in all economics/game theory textbooks. More on this later.)
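Here's a minimal sketch of both "rigorous checks" side by side. It assumes the payoffs used in the rest of this post: Box A always holds $1,000, and Box B holds $1,000,000 only if Omega predicted you'd one-box (otherwise $0). The function names are made up for illustration.

```python
# Assumed payoffs, inferred from the rest of this post:
# Box A always holds $1,000; Box B holds $1,000,000 only if Omega predicted one-boxing.
BOX_A = 1_000
BOX_B_FULL = 1_000_000
ACCURACY = 0.999  # Omega's stated prediction accuracy


def payout(action: str, predicted: str) -> int:
    """Money received for an action, given what Omega predicted."""
    box_b = BOX_B_FULL if predicted == "one-box" else 0
    return box_b if action == "one-box" else box_b + BOX_A


def expected_value(action: str, accuracy: float = ACCURACY) -> float:
    """The one-boxer's check: Omega's prediction matches the action with probability `accuracy`."""
    other = "two-box" if action == "one-box" else "one-box"
    return accuracy * payout(action, action) + (1 - accuracy) * payout(action, other)


def two_boxing_bonus(predicted: str) -> int:
    """The two-boxer's check: extra money from two-boxing, holding the prediction fixed."""
    return payout("two-box", predicted) - payout("one-box", predicted)


print(round(expected_value("one-box")))  # 999000
print(round(expected_value("two-box")))  # 2000
print(two_boxing_bonus("one-box"), two_boxing_bonus("two-box"))  # 1000 1000
```

Both calculations are correct on their own terms: one-boxing has the far higher expected value, and two-boxing is worth exactly $1,000 more whichever prediction was made. They only disagree about which question matters.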

So... who's right?

Part II: Why Would We Even Care About This Contrived Problem

In Newcomb's Paradox, you can't cause the past to change, but the past was very good at predicting the present.

The story with Omega-the-trillionaire may be contrived, but real people try to predict each other all the time. Specifically, we try to predict if someone will cheat even if their cheating can't be detected or punished.

For evolutionary reasons, most non-autism-spectrum people are good at predicting others' character – (at least face-to-face... online, not so much) – a facial micro-expression, a vocal inflection, the tiniest fidget, can give away your intentions.

For example:

I'll tell you a juicy secret, which would give you 1,000,000 Happiness Points.

However, I'll only tell you if I'm 99.9% sure you won't tell anyone else. Though, telling someone else would give you 1,000 Happiness Points, and I can't find out if you do tell, so no punishment is possible.

Will I tell you my secret?

Well, if I knew you studied Economics/Game Theory 101 (and had no moral scruples), I know you'd think: Well, once Nicky tells me their secret, I'd already get 1,000,000 Happiness Points. My sharing the secret can't cause a different past nor future punishment, but it can cause me to get another 1,000 Happiness Points!

Even if you promise not to tell, I know you know game theory, so I know you'd break your promise. Therefore, I will not tell you my secret, giving you 0 Happiness Points.

Meanwhile, someone who doesn't know game theory at all (or has moral scruples) will get my juicy secret, and 1,000,000 Happiness Points. That is: a "rational" happiness-maximizing agent will get less happiness than an "irrational" agent.

Other real-life Newcomb-like problems:

So: Newcomb's Paradox isn't (just) a niche math problem – it may give us insight into real-life human moral psychology!

Part III: Decisions using Causation, vs Correlation

The standard theory taught in all economics & game theory textbooks is Causal Decision Theory (CDT). It says you should choose the action that, given your situation right now, will cause the best outcome.

In Newcomb's Paradox: whatever Omega predicted, what you do right now cannot cause Box B's contents to change. But you can cause yourself to get an extra $1,000 from Box A. Therefore, you should two-box.

And yet! Someone who follows CDT will (99.9% chance) only get $1,000, while someone who knows no game theory will (99.9% chance) get that cool $1,000,000. CDT, despite advising you to cause the best outcome, did not cause the best outcome.

(And as mentioned above, Newcomb-like problems abound in real life, and CDT will make you do worse on all of them.)

So, mathematician-philosophers have proposed a rival decision theory: Evidential Decision Theory (EDT). It says you should choose the action that, given your situation right now, is correlated with the best outcome.

In Newcomb's Paradox: 99.9% of the time, one-boxing gets you $1,000,000 and two-boxing gets you $1,000. Therefore, you should one-box.

So, following EDT gets you $1,000,000 while CDT only gets you $1,000. Which means, even by CDT's own rules, you ought to "cause" yourself to stop believing in CDT, and follow EDT instead.

Unfortunately, since EDT focuses on correlation instead of causation, it has all the problems of mixing up correlation with causation.

Here's a problem where EDT fails: 99.9% of the time you eat chocolate, you get a headache the day after. So, you strongly consider quitting chocolate for good. But suddenly, you learn it's not chocolate causing headaches, but rather Pre-Menstrual Syndrome (PMS) causing both chocolate cravings & headaches – a mere correlation.

Should you quit chocolate anyway (for headache-related reasons)?

Well, obviously not, and CDT agrees: since now you know chocolate doesn't cause headaches, quitting chocolate won't help. But EDT can't consider causality at all, so EDT still recommends you quit chocolate anyway, and keep getting the pain of headaches without the joy of chocolate, until your lifetime correlation between chocolate & headaches balances out.

(This is another type of real-life Newcomb-like problem, called "Medical Newcomb Problems". The above example was taken from this paper.)

So both causation-based CDT & correlation-based EDT suck at different kinds of real-life Newcomb-like Problems.

If not causation or correlation... what now?

Part IV: Functional Decision Theory, or, a game theory Code of Honor

Before we look at our third theory, let's see CDT's "escape hatch": pre-commitment.

In Newcomb's Paradox, someone who follows CDT – let's call her Carol – only gets $1,000. But if Carol knew in advance she'd face this dilemma in a week, she could credibly pre-commit to one-boxing. For example, she could hire a goon to kick her teeth out if she two-boxed. Omega knows this (from the goon market transaction data), so they predict Carol will one-box, thus Box B will have $1,000,000, and Carol takes just that box – $1,000,000 richer and all teeth intact.

Sadly, Carol doesn't know about Omega's scheme in advance. And it's impossible to predict & pre-commit for every real-life Newcomb-like problem you may face in the future. So:

Can you make a fully general pre-commitment? Can you pre-commit to acting how you wish you could have pre-committed to act?

You may have noticed that some of the above Newcomb-like problems (keeping a secret, avenging a comrade, respecting privacy) all seem to involve "honor". That is, can others reliably predict you'll not cheat, even if cheating can't cause a different past, nor future punishment?

That's honor: the general pre-commitment we need!

So, can we turn "honor" into math?

Functional Decision Theory (FDT) says yes! After a decade-plus of teasing about having a cool new decision theory, Eliezer Yudkowsky (with Nate Soares) finally published this paper in 2017 ("Functional Decision Theory: A New Theory of Instrumental Rationality"), which I only found out about last month.

To compare & contrast:

Causal Decision Theory says: In situation X, do what causes the best results.

Evidential Decision Theory says: In situation X, do what correlates with the best results.

Functional Decision Theory says: In situation X, do what would cause the best result if Functional Decision Theory said to do that in situation X.

In less weird words: FDT says to act however it would have been best to pre-commit to act. It's the game theory equivalent of a "code of honor".

(Is it weird that FDT defines itself partly in terms of itself? Not any weirder than mathematicians defining the Fibonacci Numbers as "each Fibonacci number is the sum of the previous two Fibonacci numbers (and the first two are 0, 1)". In math/programming, that kind of self-reference is called recursion. As for the name: the paper treats your decision as the output of a fixed mathematical function, one that answers the question "which output of this very function would yield the best outcome?", hence "Functional" Decision Theory.)

(I note a parallel with virtue ethics: one thing that confused me about virtue ethicists is that they say stuff like "virtue is doing what a virtuous person would do". Maybe that self-reference was not a bug, but a feature after all?)

Let's see how FDT does in Newcomb's Paradox (where CDT fails) and the Chocolate-Headache Problem (where EDT fails).

Newcomb's Paradox: "Let's see... Omega knows I follow FDT. If FDT says I should two-box, Omega will know I'll two-box, put $0 in Box B, and I'd only get $1,000. But if FDT says I should one-box, Omega will know I'll one-box, put $1,000,000 in Box B, and I get $1,000,000. It's a better result if FDT tells me to one-box, therefore FDT does in fact tell me to one-box."

Chocolate-Headache Problem: "Let's see... if FDT says to quit chocolate, I'll still get headaches without the joy of chocolate. If FDT says to keep eating chocolate, I'll still get headaches, but at least I get chocolate. It's a better result if FDT tells me to eat chocolate, therefore FDT does in fact tell me to eat chocolate."
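The two monologues above follow the same recipe: for each thing FDT could say, ask what the world looks like if FDT says it, then pick the best. Here's a rough sketch of that recipe. It is a toy illustration, not the paper's formalism, and the headache/chocolate utility numbers are invented (only their ordering matters).

```python
ACCURACY = 0.999        # Omega's stated prediction accuracy
HEADACHE = -10          # hypothetical utility of getting headaches
CHOCOLATE_JOY = 3       # hypothetical utility of eating chocolate


def newcomb_value(policy: str) -> float:
    """Expected dollars if my decision procedure outputs `policy`.

    Omega predicts by modelling that same procedure, so the prediction
    tracks the policy (with probability ACCURACY).
    """
    if policy == "one-box":
        return ACCURACY * 1_000_000
    return ACCURACY * 1_000 + (1 - ACCURACY) * 1_001_000


def chocolate_value(policy: str) -> float:
    """Utility if my decision procedure outputs `policy`.

    The headaches come from PMS, not from the policy, so the headache
    term is identical whichever policy is chosen.
    """
    return HEADACHE + (CHOCOLATE_JOY if policy == "eat" else 0)


def fdt_choice(policies, value):
    """Pick the output it would be best for this decision procedure to have."""
    return max(policies, key=value)


print(fdt_choice(["one-box", "two-box"], newcomb_value))  # one-box
print(fdt_choice(["eat", "quit"], chocolate_value))       # eat
```

The difference between the two problems shows up in which terms depend on `policy`: Box B's contents do, the headaches don't.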

What's the subtle difference between these two problems, that FDT can pick up on, but CDT and EDT can't?

For both problems, let's draw a diagram of what-causes-what:
("x causes y" is shown as x → y)

EDT can't pick up on the subtle difference in what-causes-what, because EDT can't pick up on causation at all.

CDT can pick up on causation, but it still sees no useful difference between the two scenarios, because neither has a causal path (shown as →'s) from "your action" to "the important thing".

FDT picks up on causation and the fact that 1) in Newcomb's, there IS a causal path from "your predisposition" to "the important thing", but 2) in Chocolate-Headache, there is NO such path.

(I notice a similarity to the debate over the Big Three main branches of moral philosophy: 1) virtue ethics, which focuses on personal character, 2) deontology, which focuses on actions, and 3) consequentialism, which focuses on the consequences of actions. Since one's character causes one's actions, which then cause consequences, we could draw it like this: character → actions → consequences.)

(So... FDT may be a mathematical reason for why "virtue/character ethics" evolved, biologically and/or culturally, to be most people's moral intuition? At least pre-1950's / non-Western people.)

Part V: Mo' problems

It's not just Omega & chocolate! Here's two more problems where FDT beats CDT: (I don't care about EDT anymore. It can't even deal with correlation-not-causation problems, way too common in real life.)

Twin Prisoner's Dilemma:

Omega uses your data to make a "twin" AI, which imitates your decision-making predisposition with 99.9% accuracy. Then, they pit you against your twin in a "Prisoner's Dilemma" game: each player chooses to give up $0 or $1000, and the other player gets a thousandfold what they gave up. Both players choose independently; they can't communicate.

CDT says: You can't cause what your twin gives you – the AI was already made in the past – you can only cause what you give up. It's better to give up $0 than $1000. So, you and your twin (with 99.9% probability) both give up $0, and both get a thousand times $0, or still $0.

FDT says: You & your twin follow FDT. If FDT says "give up $0", you both get a thousand times $0, or $0. If FDT says "give up $1000", you both get a thousand times $1000, or $1,000,000 (minus giving up $1,000). It's a better result if FDT says to give up $1000, therefore FDT in fact says to give up $1000. So you both do, and get back $1,000,000.

Thus, the CDT twins get nothing, the FDT twins become millionaires (minus a $1,000 fee).
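For the curious, here's the arithmetic as a small sketch, including the 0.1% of cases where your twin does not copy you. The function names are hypothetical, and it assumes that when the twin deviates, it picks the other of the two options.

```python
MULTIPLIER = 1_000      # the other player receives 1000x what you give up
TWIN_ACCURACY = 0.999   # how often the twin makes the same choice as you


def net_payoff(my_gift: int, twin_gift: int) -> int:
    """What I end up with: 1000x the twin's gift, minus what I gave up."""
    return MULTIPLIER * twin_gift - my_gift


def expected_payoff(my_gift: int, other_gift: int, accuracy: float = TWIN_ACCURACY) -> float:
    """Expected net dollars when the twin copies me with probability `accuracy`
    and otherwise gives `other_gift` (assumption: the only alternative option)."""
    return (accuracy * net_payoff(my_gift, my_gift)
            + (1 - accuracy) * net_payoff(my_gift, other_gift))


# Perfectly matched twins, as in the story above:
print(net_payoff(0, 0))          # 0       (CDT twins)
print(net_payoff(1_000, 1_000))  # 999000  (FDT twins)

# Allowing for the 0.1% mismatch:
print(round(expected_payoff(0, 1_000)))  # 1000
print(round(expected_payoff(1_000, 0)))  # 998000
```

Even after accounting for the rare mismatch, the "give up $1000" policy comes out far ahead.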

A contrived example? Perhaps, but consider this: if you grew up in the same culture as someone, read the same books, know the same ideas... you may still be very psychologically different, but in terms of your decision-making predisposition, you're close to being "twins"!

So here's a very real-life analog of the twin problem: voting.

CDT says: Your one vote out of millions is vastly unlikely to cause a different outcome. Therefore, don't bother. (The "irrationality" of voting has been known for centuries [wikipedia], with some founding figures in economics just straight-up saying, "don't vote". example, 1-min article)

FDT says: You and a chunk of your country (say, a fifth), have similar values & decision-making predisposition – as if you're "decision-theory twins"! Let's make it harder: each twin is only 50% likely to do what you do. Now, whether your shared code says to vote or not, half of your twins – so half of a fifth of the country, or a tenth of the country – will do the same. Changing a tenth of the total vote is very likely to cause a different outcome! Therefore, vote!

(Of course, no-one thinks in FDT that explicitly –– they just have the moral instinct closest to FDT: honor. "It's our civic duty to vote".)

Predictive Blackmail:

Omega uses your data to predict (with 99.9% accuracy) if you're susceptible to blackmail. If and only if they predict you are, they'll send you a letter demanding $1,000, or else they'll publish scandalous secrets that cause you $5,000 worth of damage.

Let's say you get the blackmail letter. What now?

CDT says: You can't cause the blackmail to not have already happened. All you can cause now is avoiding $5000 of damage, at the cost of $1000. So, you'd pay up. (And since Omega knows you follow CDT, they can predict you'll pay, so 99.9% chance you'll get the letter.)

FDT says: "Let's see... Omega knows I follow FDT. If FDT says to pay up to blackmail, Omega will send the letter, I'll pay, and be $1000 worse off. If FDT says to not pay up, Omega will know this, and 99.9% of the time will not send me the letter at all, so I lose nothing. It's a better result if FDT says don't pay, therefore FDT in fact says don't pay."

In sum: to get only a 0.1% chance of getting the letter, FDT tells you to not pay the blackmail... even if you're already holding the letter right now.

Is that an obviously wrong result for FDT?

It's strange, but I don't think so!

1) It's basically the "don't negotiate with terrorists" policy.

2) FDT still beats CDT. 99.9% of the time, FDT gets no letter and loses $0, while CDT gets the letter and loses $1000. (And 0.1% of the time, FDT gets a letter and loses $5000, while CDT gets no letter and loses $0.) I won't do the calculation here, but the "expected value" of doing FDT is much higher than CDT's.

(Thus: even by CDT's own rules, you ought to "cause" yourself to stop believing in CDT, and follow FDT instead!)

Note: if Omega was only, say, 50% accurate, then FDT would pay up. But in that case, Omega isn't acting like a rational blackmailer, but more like a random lightning strike. FDT can respond to this change of facts!

3) Another way to think about it: if CDT-Carol knew in advance of Omega's blackmail scheme, she'd pay a goon $2 to kick her teeth out if she gave in to the blackmail, to pre-commit to not paying even if she got the letter. Omega will know this, predict non-payment, so Carol gets no letter. FDT accomplishes the same result without the costly pre-commitment, or needing to know stuff in advance.
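Here's a sketch of the expected-value comparison from point 2, plus the accuracy at which the "never pay" policy stops being worth it. It assumes the setup exactly as described above: a payer gets the letter whenever Omega predicts correctly, and a refuser gets it only when Omega predicts wrongly. The function names are hypothetical.

```python
PAYMENT = 1_000   # what the blackmailer demands
DAMAGE = 5_000    # cost of the published secrets


def expected_loss(pays_if_blackmailed: bool, accuracy: float) -> float:
    """Expected dollars lost by an agent with a fixed policy, given Omega's accuracy."""
    if pays_if_blackmailed:
        # Omega correctly predicts "susceptible" -> sends the letter -> agent pays.
        return accuracy * PAYMENT
    # A refuser only gets the letter when Omega mispredicts, then takes the damage.
    return (1 - accuracy) * DAMAGE


def best_policy(accuracy: float) -> str:
    """The policy with the lower expected loss at this accuracy."""
    pay, refuse = expected_loss(True, accuracy), expected_loss(False, accuracy)
    return "pay" if pay < refuse else "refuse"


print(round(expected_loss(True, 0.999)))   # 999  (CDT: pays)
print(round(expected_loss(False, 0.999)))  # 5    (FDT: refuses)
print(best_policy(0.999))                  # refuse
print(best_policy(0.5))                    # pay

# Break-even: accuracy * PAYMENT == (1 - accuracy) * DAMAGE
print(DAMAGE / (DAMAGE + PAYMENT))         # about 0.833
```

So with these numbers, refusing wins whenever Omega is more than about 83% accurate, and paying wins below that, which matches the note about a 50%-accurate Omega.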

Conclusion

I, uh, spoiled all the main surprises of the paper. Whoops. But if you want technical details & more fun dilemmas, read the paper for yourself!

In sum: Yudkowsky & Soares have come up with the most elegant, satisfying solution to Newcomb's Paradox I've seen so far. Not only that, it works with a wide variety of seemingly unrelated problems: the Chocolate-Headache Problem, Twin Prisoner's Dilemma, predictive blackmail, etc. That's huge evidence that this may in fact be a better decision theory, and deserves to be the new mainstream-textbook theory.

(Not from the original paper: my own hand-wavey connections to honor, character, virtue ethics, and the evolutionary basis of moral psychology. But I hope they were interesting hypotheses worth sharing!)

Could you make a dilemma where FDT loses to other decision theories? Sure: Omega scans your data, then punishes you if you've even heard about FDT. But that's not a fair problem. To the best of the authors' knowledge so far, FDT seems to get the best outcome in all fair problems. (However, they can't prove it yet, without a formal definition of "fair".)

One main reason the authors care about this result is that they're AI-Alignment researchers, so they want a mathematical decision theory that works when others act based on predictions of how you'll act (where CDT fails), or in correlation-not-causation scenarios (where EDT fails). Personally, I think this result also may help explain why our moral intuitions of honor & character biologically/culturally evolved, and why real-life people sometimes do better than Game Theory 101 predicts.

But either way, it's a cool elegant solution to a niche problem that's been bugging mathematician/philosophers for 50 years, so I'm glad we can all finally shut up and move on with our lives.

————————————————————————————————

Living in Ottawa during the Trucker Convoy

(6 min read)

Man who even gives a fig anymore, the Cold War probably re-started last week.

But, if anyone still cares, I live in Ottawa and here's my notes on the recent, 31-day-long, anti-Covid-restriction Freedom Convoy protest. In chronological order:

————————————————————————————————

A mini-review of Toby Ord's book on existential risk

(5 min read)

⭐️⭐️⭐️⭐️☆ 4/5, recommended for compassionate, smart people who want to feel depressed for a few weeks

Toby Ord's book The Precipice [official book site] is about all the horrible ways humanity could go extinct (or worse than extinct), from supervolcanoes, to nuclear war, to "natural"/lab-leak/engineered pandemics, to global totalitarianism, to tail-risk climate change, to unaligned AI. (This book came out a few weeks before the WHO officially declared a pandemic.) But it's not a book of despair! Ord confronts the odds objectively, without sensationalism, and lists many tractable solutions to mitigate these existential threats to humanity's long-term potential.

(Well, after reading it I ended up despairing anyway. Did you know the international Biological Weapons Convention has just 4 employees? Have you seen Wikipedia's list of nuclear close calls?!)

The best and worst thing about this book, though, is that it's so thorough. The book is 468 pages long. The "main" book actually ends halfway – the rest is ALL notes, appendices, and citations. Early on in the book, he spends ~20 pages defending the idea that humanity going extinct is bad.

No, really.

He addresses Epicurus's idea: if sadness is in the human mind, how can it be sad if there are no humans left to grieve? He addresses the economic idea that exponential discounting implies that the entire future of humanity is worth only 20x the next year. He addresses moral systems that place low or zero weight on human welfare, and all on animal/environmental welfare. He addresses problems with utilitarianism, but notes that existential-risk-reduction can still be justified with duty-based or virtue-based meta-ethics. He addresses the idea that the loss of the human species isn't a cosmic loss, given the chance of intelligent aliens in a vast universe.
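(Where does a number like "20x" come from? Here's a back-of-envelope sketch. The 5% yearly discount rate is my assumption, picked because it reproduces the figure; I'm not quoting the book's exact numbers. If each future year is worth 1/(1 + rate) of the year before it, the whole infinite future adds up to 1/rate years' worth.)

```python
RATE = 0.05  # assumed yearly discount rate, chosen for illustration


def discounted_total(rate: float, years: int) -> float:
    """Present value of `years` future years, where year t is worth 1 / (1 + rate) ** t."""
    return sum(1 / (1 + rate) ** t for t in range(1, years + 1))


print(discounted_total(RATE, 100))    # the next century alone gets most of the way there
print(discounted_total(RATE, 2_000))  # two millennia: essentially the full limit
print(1 / RATE)                       # closed-form limit of the series: 20.0
```

Under that assumption, everything beyond the first few centuries contributes almost nothing to the total.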

He's that thorough.

If you're the kind of person that likes that much scientific & philosophical detail – and I do! – pick up the book. All profits go to effective charities! (But, uh, I do strongly value brevity in layperson-targeted books, so that's why I've taken off 1 point from an otherwise perfect 5/5 rating, for this admirable work of rigorous scholarship.)

In a rush? Here's the book in a nutshell:

1) THE PROBLEM

We could all die this century. Loose tongues & cheap action films have made "end of the world" sound so trite, but, no, really, this could happen.

2) THE RISKS

Here's Toby Ord's table of estimated risks:

(Yes, it's "funny" how many people would be relieved to hear that existential risk is only 1-in-6.)

The book has his thorough reasoning for each estimate. Here's my notes on some of his more surprising estimates:

If you're wondering about my personal biggest concerns... as I wrote in last month's Patreon post, on 2/2/22:

I have sincerely lost sleep, realizing there's a 1-in-6 "Russian Roulette" chance that, in my lifetime, I will see an SK-Class Mass Suffering Event, such as (not mutually exclusive): a bioweapon pandemic, nuclear war, World War 3, a new age of autocratic empires.

(Note to self: probably retire the phrase "Russian Roulette".)

3) THE POSSIBLE SOLUTIONS?

But there's hope! In the book's Appendix F – and yes, the book has SEVEN appendices – Ord has a list of concrete policy/research proposals for each existential risk: 5-min read

(Some, like "US rejoins Paris Agreement", happened! Some, like "US and Russia restart nuclear treaties"... urgh... hate to tell you from the year 2022, Toby Ord from 2020...)

Ord is an idealist, but a practical idealist. (The best kind!) He recommends that we, as a society and as individuals, focus on what's cost-effective:

Cost-Effectiveness = Neglectedness × Tractability × Importance

Or: unpicked, low-hanging fruit. Go for a diverse portfolio of cost-effective interventions, a big ol' basket of fruit, and maybe, just maybe, we can live to see humanity's true, trillion-year potential.

Anyway that's the entire book. Or at least, I think that's all that the vast majority of people need to know.

And people very, very badly need to know.

————————————————————————————————

Miscellaneous Bits n' Bobs

Question 1: Was putting a 30-minute long email in your inbox a bit too much? Would you prefer shorter posts, or have the Patreon update just link to a draft blog post, so you can read it in-browser, not in-email-client?

Question 2: Do you have any thoughts, feedback, further resources, or cute cat videos to help me feel less depressed? Dump 'em in the comments below!

I don't know if I have any readers in Ukraine, but wherever you are, I hope you're staying safe. Please, be safe.

~ Nicky

Comments

Wonderful post, way clearer than the original paper which I struggled with so much that I didn't finish reading and hence learned nothing. At least it didn't discourage me enough from the topic itself to not try reading your post. Small note: I think this sentence misses a word (or at least adding a word would make it clearer). Missing word in square bracket: "One main reason the authors care about this result is because they're AI-Alignment researchers, so they want a mathematical decision theory that works when others act based on predictions of [how] you'll act (where CDT fails)"

Marta Krzeminska

> y'all patrons get sneak peeks at educational blog posts, with or without interactives, and help shape 'em for public release

Sounds like a plan!

Anton Iokov

https://www.emergencykitten.com/

Chris K

Ah, so my vague notion was apparently correct after all. Glad the reminder was helpful. It's been a while since I read The Story of Us. Forgot that he actually cites you! Anyway, I love your work. Keep it up! In a way that is ultimately healthy for you, of course. ;)

Syvanus

Thank you Tom, that's really encouraging to hear! :) And this Patreon update *is* the beta test! (I'll be adding a couple sections in response to patron feedback/critiques – whether it's too hard to implement FDT in practice, isn't this just Kant's categorical imperative, what the heck the authors mean when they say "subjunctive not causal dependence", etc)

Nicky Case

Thanks for reminding me to re-read that soon! But yes, I have read that series – in fact, I'm cited in Part 10. ;)

Nicky Case

I've read about a half dozen articles explaining CDT and FDT, yet I couldn't understand the theories until I read yours. This is really, really good, Nicky, thank you! Did you beta test this post?

Tom Lieber

I'm very late to the game here, but I really loved the section on FDT! Also, I have a vague notion that you are already familiar with Tim Urban and The Story of Us but in case you're not, it's a great piece on the "How to understand domestic & foreign politics, from a "bottom-up" perspective." front. https://waitbutwhy.com/2019/08/story-of-us.html

Syvanus

Don't mind me, just checked, and it *is* publically available.

Albert ARIBAUD

Wonder if this post is going to become publically visible at some point? I would like to show it to my daughter. I could copy-paste it for her, of course, but I won't do that unless explicitly allowed, so if it becomes public eventually, I'll just wait. :)

Albert ARIBAUD

I haven't been following you for long so the topic is still new to me, but I'm fascinated by the link between game theory and politics you just drew. I've always been too lazy to get seriously interested in politics, but this angle makes it interesting enough that I actually want to research it this time. Which is a good time for it, since there's a major election coming up in a few months in my country. Thank you for that.

A lama

Ok, repost as the note did not get found .... The length was fine for me. I enjoyed this article a lot, partially because I have spent a lot of time thinking about Newcomb's Paradox. Here is another interesting twist. Assume I get together a large number of like-minded individuals. We don't care about the money. We're committed to Discovering the Truth, instead. And the Truth we want to examine is about the nature of Randomness. So we get ourselves a device that is connected to some sort of really random number generator -- a cesium clock, atmospheric noise, what have you. We only want 2 bits of information here, a 50/50 split corresponding to 'one box' or 'two'. By current scientific understanding we have no way to predicting what the generator will say. And neither does Facebook or any billionaires or anybody else who is living within this universe. And we all line up, having pledged to do whatever our generator says to do. (And we keep our promise.) Now what happens? My position is that either a) the accuracy of the prediction will decline to the point where, if there are enough of us pledged people doing this thing (i.e. almost all of us) the predictive ability will end up at 50%. We have been investigating a natural phenomenon, the predictive power of the billionaire, and demonstrated that it is indeed behaves like a natural phenomenon, and has limits. or b) the predictions continue to hold up. At this point I claim that we are not investigating a natural phenomenon at all. We are investigating a miracle. And proven that at least one miracle exists! Either way, you have generated knowledge about the world, which I would like to have more than the money. All the best from here in Sweden.

Grävling

FWIW, I like this kind of update. Almost like a little zine. I seldom read emails like this anymore, but it really helps that you seem to have the same kinds of interests as I do, and they're varied, so basically everything you write is catnip for me. I can't help but notice that Functional Decision Theory is very close to Kant's Categorical Imperative from _Groundwork of the Metaphysic of Morals_ (1785): "Act only according to that maxim whereby you can, at the same time, will that it should become a universal law." Maybe I'm not understanding FDT fully, but to me it seems like a different way to express the same idea. Is this something the authors of FDT address, this similarity?

Filip Hracek

…And I finally read the rest of the post, including the Misc. section where you basically said half of what I just did. Basically: I agree, go for it.

Kronopath

Why the clown makeup meme? The decision theory section of this post is genuinely fantastic and plays into your strength of being able to explain complex concepts in an understandable way. Seriously: pull that section out of this post and post it to that new blog you’re planning, and you’ll have a fantastic post. Since you’re talking about rationalist-related stuff here: a lot of people around that community have found surprising amounts of success in doing exactly what you just did, taking complicated rationalist or rationalist-adjacent ideas (that they themselves did not come up with) and explaining it in simpler and entertaining language. Examples include Rob Wiblin’s Medium post on Ugh Fields, Lars Doucet’s intro to Georgism (which won the ACX book review contest), and, like, a significant chunk of Scott Alexander’s entire blog.

Kronopath

Try re-posting it as a reply to this comment? I didn't receive any notes (as notified by email notifications) other than the blank-line-less ones above.

Nicky Case

But the note is still AWOL ...

Grävling

checking to see if shift-enter works! Yay! It Does! Thank you!

Grävling

I didn't remove anything! I got the following email notifications with what you wrote: "re: Newcomb's paradox and it's successors." "Grrrr 2! Other people get to insert blank lines and paragraphs in their replies! Where can I learn what they know and I clearly don't!" For newlines, Shift-Enter works on my end! (Very not clearly explained, Patreon UI...) Does Shift-Enter work for you?

Nicky Case

I posted something here. Edited it too. Now it is gone. If Nicky wanted it gone, and removed it, that's ok, but otherwise we have a vanishing response bug ....

Grävling

Wow, I loved this post! Some great food for thought. I'm not always in a position to read long emails when I get them, but I was today and I feel grateful for that experience.

Rachel Helps

1) No, I enjoyed reading it before breakfast (I'm retired so have a fair amount of free time). There are some email correspondents who cause my heart to sink when something from them appears in the inbox but that will never be the case with you! 2) Is it unethical of me to think of running a Newcomb's paradox experiment for real on my grandchildren (6 and 9 years old)? One sweet v a bag of sweets? I'd guess the 6 year old would go for both and so only get one, but the 9 year old I'm not sure.

John Stout

Glad to see this post! 1) No problem, your emails go to my RSS reader anyway. 2) No real links, but, maybe you'd be amused to think about this? FDT feels very "strange loopy" to me, in the Gödel, Escher, Bach way. I wonder if it has any interesting failure modes as a model that are similar to Gödel's Incompleteness Theorem? And if so, if those correspond to any actual areas real humans following reasonable ethics would have trouble with? (After all, a key part of the broad effect of Gödel's theorem was learning that it didn't only apply to completely contrived statements; there are interesting mathematical questions that can be translated to include "This statement is false.")

Eric Willisson

A fascinating post, I found the decision theory piece enlightening-- it's reputation and societal virtues in mathematical form!

Conrad Wong

I think this was great! These are topics dear to my heart, and you've breathed fresh life into some of them - especially with the interesting association between FDT and virtue ethics. As a semi-tangent, the CDT folks are wrong about whether voting is worthwhile *even under their own framework*! In elections with close polls, ties (and near-ties, which imply that ties are not that rare) happen just as often as you would expect given the statistical model where you integrate the chance of a tie given the true electoral lean, against the chance of that lean given the state of the polls. Which is to say, the chance of a tie is several times larger than one divided by the number of voters. If a CDT person would bother voting in an election where the winner was picked by selecting one ballot at random (and they should), then they should vote in a standard election. /rant
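The tie argument above can be checked numerically. The following is only a rough sketch under loudly stated assumptions: every other voter is modeled as an independent coin flip with a shared "lean", and the polls are assumed to leave us with a uniform belief that the lean is somewhere between 45% and 55%. The function names and all the numbers are hypothetical, chosen purely for illustration.

```python
import math

def tie_probability(n_voters: int, lean: float) -> float:
    """Chance of an exact tie among n_voters (even), if each one
    independently votes for candidate A with probability `lean`."""
    half = n_voters // 2
    # Work in log space so the huge binomial coefficient does not overflow.
    log_p = (
        math.lgamma(n_voters + 1)
        - 2 * math.lgamma(half + 1)
        + half * (math.log(lean) + math.log(1 - lean))
    )
    return math.exp(log_p)

def tie_probability_given_polls(n_voters: int, low: float = 0.45,
                                high: float = 0.55, steps: int = 20_000) -> float:
    """Average the tie chance over an ASSUMED uniform belief about the
    lean in [low, high], using a simple midpoint rule."""
    width = (high - low) / steps
    total = sum(
        tie_probability(n_voters, low + (i + 0.5) * width)
        for i in range(steps)
    )
    return total / steps

n = 100_000  # hypothetical number of other voters
ratio = tie_probability_given_polls(n) * n
print(f"tie chance is about {ratio:.1f} times 1/n")
```

Under these assumed numbers the ratio comes out near 10, which matches the claim that a tie is several times more likely than one divided by the number of voters. A wider or narrower belief about the lean would change the multiplier, so treat the figure as illustrative rather than a real election estimate.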

Patrick L

cyborg scalie girl ftw!!! 1. I personally loved this format! It's my first update of yours I'm reading, so it's hard to compare to anything else, but, it was a wonderful read and I took a break between each chapter. Plus knowing how long each part takes to read is awesome.

Detective Chiyo

Email length was not an issue for me. Thank you for writing it 🙂

Randy Gingeleski

Thank yoooouse 🐉🤖

Nicky Case

I like your cyborg scalie girl. <3

Aeryn Light

Thank you for the email! I enjoyed every minute of it. As someone who has been procrastinating while binging the news, your insights and musings shine like a beacon in the dark fog of current events :D Here's a playlist I use as an antidepressant: https://www.youtube.com/watch?v=--9kqhzQ-8Q&list=PLaLpAOF66jPGrtuzk8y1aTF8LlB3sphHC

Sylvester Lan

Brilliant thought-provoking post. Thanks for it! Regarding FDT: this blew my mind, and articulated something I've been struggling with for years. Specifically relevant re: the "voting" example. I literally used to not vote because of CDT, but eventually struggled with realizing that "the kind of people not willing to engage with CDT were WINNING ELECTIONS". So the way I eventually articulated it to myself was through the humility that "I'm not especially unique; there are probably millions of people like me; if I can convince MYSELF to vote (or even simply 'let myself be convinced'), it's reasonable to assume those millions of people will have a similar realization and I can give myself the power to win elections"! Now I have FDT, which is much more cleanly generalizable! SELF-INDULGENT TANGENT, FEEL FREE TO IGNORE: Reading about FDT also has me thinking about something else I've been informally mulling over for a few years now on the topic of metaethics. Here's a sloppy, not-quite-complete articulation of Kant's "Categorical Imperative": an action is good iff it can be universalized (could be adopted universally without being self-defeating). Example: murder is wrong, because if everybody murdered, then nobody would be around to do murders! It's obviously flawed (especially in this mis-articulation), but I present it only as the inspiration for what I've been wrestling with: A DT is good iff its universalization maximizes the good, and an action is good iff it follows from a good DT. This has been the working basis in articulating my moral compass (a work in progress!), and reading this has helped in the process of making it more rigorous. It seems like it's not far off from "FDT + universalization"! Anyways, I'm curious if there are any glaring red flags anyone can see, or if there are other similar metaethical theories I might be interested in. (As I said, I'm engaging with this as a personal, informal struggle, so this line of thought is certainly not novel :P)

Phil Dougherty

Thanks for your feedback, Olu! I hope my post didn't break your email reader 😅 But, yeah if you're prone to depression, I do not recommend The Precipice (I should've been a bit clearer that I wasn't joking, it actually is a mild anxiety/depression hazard). My 5-minute book review + Appendix F linked above + this 20-minute book review ( https://slatestarcodex.com/2020/04/01/book-review-the-precipice/?utm_source=pocket_mylist ) should be able to get all the important ideas through, I hope!

Nicky Case

1. I think it's fine for it to be so long as you've been diligent about headings! 2. I'm in two minds about reading depressing books when I'm prone to depression, but I guess if it turns out the risk is only 1 in 6 for this, maybe 'the end of everything (astrophysically speaking)' will have some kind of uplifting twist also? I doubt it but you never know. Would be really interested in a post about AI and why I should worry about it ever being intelligent enough for unalignment to be a worry, though the links you've linked might be all I need, maybe?

Olu

