What's Nicky Learning? Decision Theory, Ottawa, Existential Risk
Added 2022-03-02 21:51:32 +0000 UTC
(Three main sections, that take 16 minutes / 6 minutes / 5 minutes to read.)
(32 minutes reading time in total, not counting links)
Hey all! I haven't done a What's Nicky Learning? post in a while. So here's a thought-dump of what I've been learning recently. (Alas, I did not get any publishable work done last month. This is 10% because the world has been... distracting... but honestly 90% it's just me anxiety-ruminating and media-binging as usual. This Patreon will be paused one more month; you weren't charged on March 1st, but will be on April 1st.)
Table of Contents:
- A silly Valentine's Day thing I drew
- Functional Decision Theory: a satisfying solution to a rationality paradox [16 min read]
- Reflections on the trucker convoy in my home city, Ottawa, if anyone still cares after all this other worse news. [6 min read]
- A mini-book review of The Precipice, a book on existential risk. [5 min read]
- Miscellaneous bits n' bobs
————————————————————————————————
————————————————————————————————
A Rationalist Valentine's

This comic did not do well on Twitter. I thought there'd be a bigger overlap between "people who know about weird thought experiments in decision theory" and "people who are into cyborg scalie girls".
No seriously, I honestly thought that. I worked so hard on that drawing! 😿
But, speaking of weird thought experiments in decision theory...
————————————————————————————————
————————————————————————————————
A Satisfying Solution to a 50-Year-Old Paradox
(16 min read)
Part I: Newcomb's Paradox
You're approached by an eccentric trillionaire named Omega. They say: I've bought all your personal data, and can predict with 99.9% accuracy what you'll do in the following dilemma. (Omega is nuts, but honest. They really can predict you with 99.9% accuracy.)
The dilemma:

- I show you two boxes, Box A and Box B.
- You can choose to take only Box B (one-boxing), or both Box A and Box B (two-boxing).
- Box A contains $1,000.
- Box B contains $1,000,000 if and only if I predicted you'd one-box (only taking Box B). Otherwise Box B contains $0.
- The money's already in the boxes; what you choose now can't change the past. But remember, I can predict what you'll do with 99.9% accuracy.
- (Also, no tricks: you can't flip a quantum coin, or get someone else to take a box for you, etc. Try anything other than "take only Box B" or "take both boxes", you get nothing & I kick you in the crotch.)
So... would you one-box, or two-box?
This is (a slightly modified version of) Newcomb's Paradox. [wikipedia] You probably think the answer's obvious. And you're right, it is. According to Robert Nozick, the philosopher who first analyzed this puzzle in 1969,
"To almost everyone, it is perfectly clear and obvious what should be done. The difficulty is that these people seem to divide almost evenly on the problem, with large numbers thinking that the opposing half is just being silly."
(According to a 2020 survey of philosophers, it's still near-evenly divisive fifty-plus years later! 31% one-box, 39% two-box, the rest other/undecided.)
The one-boxer says: It's obvious! Omega's 99.9% accurate. So if I one-box, I'm near-guaranteed $1,000,000. But if I two-box, I'm near-guaranteed only $1,000. Clearly, one-boxing is better. (And you can rigorously check this by calculating the choices' "expected values".)
The two-boxer says: It's obvious! Whatever I choose can't change the money that's already in the boxes, so no matter what Omega predicted, I'd get an extra $1,000 by taking both boxes. Clearly, two-boxing is better. (And you can rigorously check this by applying Causal Decision Theory, the standard theory of decision-making found in all economics/game theory textbooks. More on this later.)
So... who's right?
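The one-boxer's "expected value" check can be sketched in a few lines. This is a minimal sketch, assuming Omega is right 99.9% of the time whichever way you choose; the function and constant names are my own, for illustration only.

```python
# Expected payoffs in Newcomb's Paradox, assuming the predictor is right
# with probability ACCURACY no matter which choice you make.
ACCURACY = 0.999
BOX_A = 1_000        # always in Box A
BOX_B = 1_000_000    # in Box B only if Omega predicted one-boxing


def expected_value(choice: str, accuracy: float = ACCURACY) -> float:
    """Average payoff of 'one-box' or 'two-box' given Omega's accuracy."""
    if choice == "one-box":
        # Omega right: Box B is full. Omega wrong: Box B is empty.
        return accuracy * BOX_B + (1 - accuracy) * 0
    if choice == "two-box":
        # Omega right: Box B is empty. Omega wrong: Box B is full.
        return accuracy * BOX_A + (1 - accuracy) * (BOX_A + BOX_B)
    raise ValueError(f"unknown choice: {choice!r}")


if __name__ == "__main__":
    for choice in ("one-box", "two-box"):
        print(f"{choice}: ${expected_value(choice):,.0f}")
```

Under these assumptions, one-boxing averages about $999,000 and two-boxing about $2,000.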
Part II: Why Would We Even Care About This Contrived Problem
In Newcomb's Paradox, you can't cause the past to change, but the past was very good at predicting the present.
The story with Omega-the-trillionaire may be contrived, but real people try to predict each other all the time. Specifically, we try to predict if someone will cheat even if their cheating can't be detected or punished.
For evolutionary reasons, most non-autism-spectrum people are good at predicting others' character – (at least face-to-face... online, not so much) – a facial micro-expression, a vocal inflection, the tiniest fidget, can give away your intentions.
For example:
I'll tell you a juicy secret, which would give you 1,000,000 Happiness Points.
However, I'll only tell you if I'm 99.9% sure you won't tell anyone else. Though, telling someone else would give you 1,000 Happiness Points, and I can't find out if you do tell, so no punishment is possible.
Will I tell you my secret?
Well, if I knew you studied Economics/Game Theory 101 (and had no moral scruples), I know you'd think: Well, once Nicky tells me their secret, I'd already get 1,000,000 Happiness Points. My sharing the secret can't cause a different past nor future punishment, but it can cause me to get another 1,000 Happiness Points!
Even if you promise not to tell, I know you know game theory, so I know you'd break your promise. Therefore, I will not tell you my secret, giving you 0 Happiness Points.
Meanwhile, someone who doesn't know game theory at all (or has moral scruples) will get my juicy secret, and 1,000,000 Happiness Points. That is: a "rational" happiness-maximizing agent will get less happiness than an "irrational" agent.
Other real-life Newcomb-like problems:
- You're a hitchhiker. A car comes by, but the driver will only give you a lift if they predict you'll pay for gas after you've already gotten your lift. Your actions can't cause a different past, nor future punishment. (see also: parfit's hitchhiker, 2-min article)
- You're a warrior. Your comrade will give their life to save yours, but only if they predict you'll avenge them after your life has been saved & they're dead.
- You're a parent. Your kid will trust you to enter their room, but only if they predict you won't read their diary, even if there's no way they can ever know.
So: Newcomb's Paradox isn't (just) a niche math problem – it may give us insight into real-life human moral psychology!
Part III: Decisions using Causation, vs Correlation
The standard theory taught in all economics & game theory textbooks is Causal Decision Theory (CDT). It says you should choose the action that, given your situation right now, will cause the best outcome.
In Newcomb's Paradox: whatever Omega predicted, what you do right now cannot cause Box B's contents to change. But you can cause yourself to get an extra $1,000 from Box A. Therefore, you should two-box.
And yet! Someone who follows CDT will (99.9% chance) only get $1,000, while someone who knows no game theory will (99.9% chance) get that cool $1,000,000. CDT, despite advising you to cause the best outcome, did not cause the best outcome.
(And as mentioned above, Newcomb-like problems abound in real life, and CDT will make you do worse on all of them.)
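Here's a rough sketch of the CDT reasoning above. The key assumption, which is CDT's and not mine, is that the chance Box B is full (called `p_full` here, a name I made up) stays fixed no matter what you choose now. With that held fixed, two-boxing comes out exactly $1,000 ahead for every value of `p_full`.

```python
BOX_A = 1_000
BOX_B = 1_000_000


def payoff(choice: str, box_b_full: bool) -> int:
    """Money received for a choice, given what is already in Box B."""
    box_b = BOX_B if box_b_full else 0
    return box_b + (BOX_A if choice == "two-box" else 0)


def cdt_expected_value(choice: str, p_full: float) -> float:
    """CDT-style average: p_full does not depend on the choice, because
    choosing now cannot change what is already in Box B."""
    return p_full * payoff(choice, True) + (1 - p_full) * payoff(choice, False)


def cdt_choice(p_full: float) -> str:
    """The choice CDT recommends for a given fixed belief about Box B."""
    return max(("one-box", "two-box"),
               key=lambda c: cdt_expected_value(c, p_full))


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.999, 1.0):
        print(p, cdt_choice(p))
```

The catch is that a known two-boxer makes `p_full` tiny in the first place, which is exactly what this calculation leaves out.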
So, mathematician-philosophers have proposed a rival decision theory: Evidential Decision Theory (EDT). It says you should choose the action that, given your situation right now, is correlated with the best outcome.
In Newcomb's Paradox: 99.9% of the time, one-boxing gets you $1,000,000 and two-boxing gets you $1,000. Therefore, you should one-box.
So, following EDT gets you $1,000,000 while CDT only gets you $1,000. Which means, even by CDT's own rules, you ought to "cause" yourself to stop believing in CDT, and follow EDT instead.
Unfortunately, since EDT focuses on correlation instead of causation, it has all the problems of mixing up correlation with causation.
Here's a problem where EDT fails: 99.9% of the time you eat chocolate, you get a headache the day after. So, you strongly consider quitting chocolate for good. But suddenly, you learn it's not chocolate causing headaches, but rather Pre-Menstrual Syndrome (PMS) causing both chocolate cravings & headaches – a mere correlation.
Should you quit chocolate anyway (for headache-related reasons)?
Well, obviously not, and CDT agrees: since now you know chocolate doesn't cause headaches, quitting chocolate won't help. But EDT can't consider causality at all, so EDT still recommends you quit chocolate anyway, and keep getting the pain of headaches without the joy of chocolate, until your lifetime correlation between chocolate & headaches balances out.
(This is another type of real-life Newcomb-like problem, called "Medical Newcomb Problems". The above example was taken from this paper.)
So both causation-based CDT & correlation-based EDT suck at different kinds of real-life Newcomb-like Problems.
If not causation or correlation... what now?
Part IV: Functional Decision Theory, or, a game theory Code of Honor
Before we look at our third theory, let's see CDT's "escape hatch": pre-commitment.
In Newcomb's Paradox, someone who follows CDT – let's call her Carol – only gets $1,000. But if Carol knew in advance she'd face this dilemma in a week, she could credibly pre-commit to one-boxing. For example, she could hire a goon to kick her teeth out if she two-boxed. Omega knows this (from the goon market transaction data), so they predict Carol will one-box, thus Box B will have $1,000,000, and Carol takes just that box – $1,000,000 richer and all teeth intact.
Sadly, Carol doesn't know about Omega's scheme in advance. And it's impossible to predict & pre-commit for every real-life Newcomb-like problem you may face in the future. So:
Can you make a fully general pre-commitment? Can you pre-commit to acting how you wish you had pre-committed to act?
You may have noticed that several of the above Newcomb-like problems (keeping a secret, avenging a comrade, respecting privacy) seem to involve "honor". That is, can others reliably predict you'll not cheat, even if cheating can't cause a different past, nor future punishment?
That's honor: the general pre-commitment we need!
So, can we turn "honor" into math?
Functional Decision Theory (FDT) says yes! After a decade-plus of teasing about having a cool new decision theory, Eliezer Yudkowsky (with Nate Soares) finally published this paper in 2017, which I only found out about last month.
To compare & contrast:
Causal Decision Theory says: In situation X, do what causes the best results.
Evidential Decision Theory says: In situation X, do what correlates with the best results.
Functional Decision Theory says: In situation X, do what would cause the best result if Functional Decision Theory said to do that in situation X.
In less weird words: FDT says to act however it would have been best to pre-commit to act. It's the game theory equivalent of a "code of honor".
(Is it weird that FDT defines itself partly in terms of itself? Not any weirder than mathematicians defining the Fibonacci Numbers as "each Fibonacci number is the sum of the previous two Fibonacci numbers (and the first two are 0, 1)". Self-reference like this is routine in math/programming, where it's called recursion. As for the name: the paper has you treat your decision as the output of a mathematical function, hence "Functional" Decision Theory.)
(I note a parallel with virtue ethics: one thing that confused me about virtue ethicists is that they say stuff like "virtue is doing what a virtuous person would do". Maybe that self-reference was not a bug, but a feature after all?)
Let's see how FDT does in Newcomb's Paradox (where CDT fails) and the Chocolate-Headache Problem (where EDT fails).
Newcomb's Paradox: "Let's see... Omega knows I follow FDT. If FDT says I should two-box, Omega will know I'll two-box, put $0 in Box B, and I'd only get $1,000. But if FDT says I should one-box, Omega will know I'll one-box, put $1,000,000 in Box B, and I get $1,000,000. It's a better result if FDT tells me to one-box, therefore FDT does in fact tell me to one-box."
Chocolate-Headache Problem: "Let's see... if FDT says to quit chocolate, I'll still get headaches without the joy of chocolate. If FDT says to keep eating chocolate, I'll still get headaches, but at least I get chocolate. It's a better result if FDT tells me to eat chocolate, therefore FDT does in fact tell me to eat chocolate."
What's the subtle difference between these two problems, that FDT can pick up on, but CDT and EDT can't?
For both problems, let's draw a diagram of what-causes-what:
("x causes y" is shown as x → y)

EDT can't pick up on the subtle difference in what-causes-what, because EDT can't pick up on causation at all.
CDT can pick up on causation, but it still sees no useful difference between the two scenarios, because neither has a causal path (shown as →'s) from "your action" to "the important thing".
FDT picks up on causation and the fact that 1) in Newcomb's, there IS a causal path from "your predisposition" to "the important thing", but 2) in Chocolate-Headache, there is NO such path.
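To make that difference concrete, here's a toy sketch of "pick the policy that gives the best result if your decision procedure outputs it". It is not the formalism from the paper, just my own illustration. The Newcomb numbers come from the dilemma above; the chocolate utilities and headache probability are made-up placeholders.

```python
ACCURACY = 0.999  # Omega's prediction accuracy


def newcomb(policy: str) -> float:
    """Expected dollars if your decision procedure outputs `policy`.
    Omega's prediction depends on the policy, so Box B does too."""
    p_full = ACCURACY if policy == "one-box" else 1 - ACCURACY
    box_a = 1_000 if policy == "two-box" else 0
    return p_full * 1_000_000 + box_a


# Hypothetical utility numbers, chosen only for illustration.
HEADACHE_PAIN = -10
CHOCOLATE_JOY = 3
P_HEADACHE = 0.5  # set by PMS, not by the policy


def chocolate(policy: str) -> float:
    """Expected utility if your decision procedure outputs `policy`.
    The headache term is the same either way: no path from policy to it."""
    joy = CHOCOLATE_JOY if policy == "eat" else 0
    return P_HEADACHE * HEADACHE_PAIN + joy


def best_policy(world, policies):
    """Choose the policy whose adoption gives the best expected result."""
    return max(policies, key=world)


if __name__ == "__main__":
    print(best_policy(newcomb, ["one-box", "two-box"]))
    print(best_policy(chocolate, ["eat", "quit"]))
```

In the Newcomb model the policy feeds into `p_full`; in the chocolate model `P_HEADACHE` ignores the policy entirely. That one structural difference is why the same procedure one-boxes but keeps eating chocolate.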
(I notice a similarity to the debate over the Big Three main branches of moral philosophy: 1) virtue ethics, which focuses on personal character, 2) deontology, which focuses on actions, and 3) consequentialism, which focuses on the consequences of actions. Since one's character causes one's actions, which then causes consequences, we could draw it like this:)

(So... FDT may be a mathematical reason for why "virtue/character ethics" evolved, biologically and/or culturally, to be most people's moral intuition? At least pre-1950's / non-Western people.)
Part V: Mo' problems
It's not just Omega & chocolate! Here's two more problems where FDT beats CDT: (I don't care about EDT anymore. It can't even deal with correlation-not-causation problems, way too common in real life.)
Twin Prisoner's Dilemma:
Omega uses your data to make a "twin" AI, which imitates your decision-making predisposition with 99.9% accuracy. Then, they pit you against your twin in a "Prisoner's Dilemma" game: each player chooses to give up $0 or $1000, and the other player gets a thousandfold what they gave up. Both players choose independently, they can't communicate.
CDT says: You can't cause what your twin gives you – the AI was already made in the past – you can only cause what you give up. It's better to give up $0 than $1000. So, you and your twin (with 99.9% probability) both give up $0, and both get a thousand times $0, or still $0.
FDT says: You & your twin follow FDT. If FDT says "give up $0", you both get a thousand times $0, or $0. If FDT says "give up $1000", you both get a thousand times $1000, or $1,000,000 (minus giving up $1,000). It's a better result if FDT says to give up $1000, therefore FDT in fact says to give up $1000. So you both do, and get back $1,000,000.
Thus, the CDT twins get nothing, the FDT twins become millionaires (minus a $1,000 fee).
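The twin payoffs can be checked with a short sketch. I'm assuming the twin copies your choice with probability 99.9% and otherwise makes the opposite choice; the names below are mine, not from the paper.

```python
ACCURACY = 0.999      # chance your twin makes the same choice you do
STAKE = 1_000         # dollars a player may give up
MULTIPLIER = 1_000    # the other player receives this many times the amount


def net_payoff(you_give: int, twin_gives: int) -> int:
    """Your net dollars: what the twin's gift becomes, minus what you gave up."""
    return MULTIPLIER * twin_gives - you_give


def expected_net(policy_gives: int, accuracy: float = ACCURACY) -> float:
    """Average net payoff when your shared policy says to give `policy_gives`
    and the twin copies that choice with probability `accuracy`."""
    opposite = STAKE - policy_gives  # the only other allowed choice
    return (accuracy * net_payoff(policy_gives, policy_gives)
            + (1 - accuracy) * net_payoff(policy_gives, opposite))


if __name__ == "__main__":
    for gives in (0, STAKE):
        print(f"policy gives ${gives}: ${expected_net(gives):,.0f}")
```

Under these assumptions the "give up $1000" policy averages about $998,000 net, while the "give up $0" policy averages about $1,000 (from the rare case where the twin gives anyway).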
A contrived example? Perhaps, but consider this: if you grew up in the same culture as someone, read the same books, know the same ideas... you may still be very psychologically different, but in terms of your decision-making predisposition, you're close to being "twins"!
So here's a very real-life analog of the twin problem: voting.
CDT says: Your one vote out of millions is vastly unlikely to cause a different outcome. Therefore, don't bother. (The "irrationality" of voting has been known for centuries [wikipedia], with some founding figures in economics just straight-up saying, "don't vote". example, 1-min article)
FDT says: You and a chunk of your country (say, a fifth), have similar values & decision-making predisposition – as if you're "decision-theory twins"! Let's make it harder: each twin is only 50% likely to do what you do. Now, whether your shared code says to vote or not, half of your twins – so half of a fifth of the country, or a tenth of the country – will do the same. Changing a tenth of the total vote is very likely to cause a different outcome! Therefore, vote!
(Of course, no-one thinks in FDT that explicitly –– they just have the moral instinct closest to FDT: honor. "It's our civic duty to vote".)
Predictive Blackmail:
Omega uses your data to predict (with 99.9% accuracy) if you're susceptible to blackmail. If and only if they predict you are, they'll send you a letter demanding $1,000, or else they'll publish scandalous secrets that cause you $5,000 worth of damage.
Let's say you get the blackmail letter. What now?
CDT says: You can't cause the blackmail to not have already happened. All you can cause now is avoiding $5000 of damage, at the cost of $1000. So, you'd pay up. (And since Omega knows you follow CDT, they can predict you'll pay, so 99.9% chance you'll get the letter.)
FDT says: "Let's see... Omega knows I follow FDT. If FDT says to pay up to blackmail, Omega will send the letter, I'll pay, and be $1000 worse off. If FDT says to not pay up, Omega will know this, and 99.9% of the time will not send me the letter at all, so I lose nothing. It's a better result if FDT says don't pay, therefore FDT in fact says don't pay."
In sum: to get only a 0.1% chance of getting the letter, FDT tells you to not pay the blackmail... even if you're already holding the letter right now.
Is that an obviously wrong result for FDT?
It's strange, but I don't think so!
1) It's basically the "don't negotiate with terrorists" policy.
2) FDT still beats CDT. 99.9% of the time, FDT gets no letter and loses $0, while CDT gets the letter and loses $1000. (And 0.1% of the time, FDT gets a letter and loses $5000, while CDT gets no letter and loses $0.) I won't do the calculation here, but the "expected value" of doing FDT is much higher than CDT's.
(Thus: even by CDT's own rules, you ought to "cause" yourself to stop believing in CDT, and follow FDT instead!)
Note: if Omega was only, say, 50% accurate, then FDT would pay up. But in that case, Omega isn't acting like a rational blackmailer, but more like a random lightning strike. FDT can respond to this change of facts!
3) Another way to think about it: if CDT-Carol knew in advance of Omega's blackmail scheme, she'd pay a goon $2 to kick her teeth out if she gave in to the blackmail, to pre-commit to not paying even if she got the letter. Omega will know this, predict non-payment, so Carol gets no letter. FDT accomplishes the same result without the costly pre-commitment, or needing to know stuff in advance.
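For the curious, here's a small sketch of the expected-value comparison from point 2, plus the accuracy at which refusing stops being the better policy. It assumes Omega sends the letter exactly when they predict you'd pay, and is right with probability `accuracy`; function names are my own.

```python
PAYMENT = 1_000   # what the blackmailer demands
DAMAGE = 5_000    # cost to you if the secrets are published


def expected_loss(policy: str, accuracy: float) -> float:
    """Average dollars lost under a policy, given Omega's accuracy."""
    if policy == "pay":
        # Correctly predicted as a payer: the letter arrives and you pay.
        return accuracy * PAYMENT
    if policy == "refuse":
        # Mispredicted as a payer: the letter arrives, you refuse,
        # and the secrets get published.
        return (1 - accuracy) * DAMAGE
    raise ValueError(f"unknown policy: {policy!r}")


def break_even_accuracy() -> float:
    """Refusing wins when (1 - a) * DAMAGE < a * PAYMENT,
    i.e. when a > DAMAGE / (DAMAGE + PAYMENT)."""
    return DAMAGE / (DAMAGE + PAYMENT)


if __name__ == "__main__":
    for a in (0.999, 0.5):
        print(a, expected_loss("pay", a), expected_loss("refuse", a))
    print(break_even_accuracy())
```

At 99.9% accuracy, the paying policy loses about $999 on average and the refusing policy about $5. At 50% accuracy it flips: about $500 for paying versus $2,500 for refusing. The crossover sits at 5/6, or roughly 83% accuracy.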
Conclusion

I, uh, spoiled all the main surprises of the paper. Whoops. But if you want technical details & more fun dilemmas, read the paper for yourself!
In sum: Yudkowsky & Soares have come up with the most elegant, satisfying solution to Newcomb's Paradox I've seen so far. Not only that, it works with a wide variety of seemingly unrelated problems: the Chocolate-Headache Problem, Twin Prisoner's Dilemma, predictive blackmail, etc. That's huge evidence that this may in fact be a better decision theory, and deserves to be the new mainstream-textbook theory.
(Not from the original paper: my own hand-wavey connections to honor, character, virtue ethics, and the evolutionary basis of moral psychology. But I hope they were interesting hypotheses worth sharing!)
Could you make a dilemma where FDT loses to other decision theories? Sure: Omega scans your data, then punishes you if you've even heard about FDT. But that's not a fair problem. To the best of the authors' knowledge so far, FDT seems to get the best outcome in all fair problems. (However, they can't prove it yet, without a formal definition of "fair".)
One main reason the authors care about this result is that they're AI-Alignment researchers, so they want a mathematical decision theory that works when others act based on predictions of how you'll act (where CDT fails), or in correlation-not-causation scenarios (where EDT fails). Personally, I think this result also may help explain why our moral intuitions of honor & character biologically/culturally evolved, and why real-life people sometimes do better than Game Theory 101 predicts.
But either way, it's a cool elegant solution to a niche problem that's been bugging mathematician/philosophers for 50 years, so I'm glad we can all finally shut up and move on with our lives.
————————————————————————————————
————————————————————————————————
Living in Ottawa during the Trucker Convoy
(6 min read)
Man who even gives a fig anymore, the Cold War probably re-started last week.
But, if anyone still cares, I live in Ottawa and here's my notes on the recent, 31-day-long, anti-Covid-restriction Freedom Convoy protest. In chronological order:
- What's that you say? Truckers are coming to town to protest the vaccine mandates against truckers? Well,
- 1) I'm grateful for truckers in general, someone's gotta keep the spice flowing,
- 2) “I disapprove of what you say, but I will defend to the death your right to say it,” quote Fake Voltaire, and
- 3) I'm pro-vaccine & triple-dosed, but in my humble opinion, vaccine mandates for cross-border truckers don't make sense given: Omicron, already-high (Canadian) vaccine uptake, and better treatments. My cost-benefit analysis:
- If you want to incentivize vaccine uptake, it's better for long-term trust in public health to just pay people to take it, vs threatening working-class people with job loss. Trust in public health institutions is already rock-bottom, why are we frittering away the rest of it.
- While mRNA vaccines are 94% effective against Omicron hospitalization and death [paper] and that's why I highly recommend them, six-month-ago 2-dose is ~0%(!) effective against symptomatic infection, 3-dose is ~67% effective. [paper] Omicron's R0 value of ~10 means you need 90% of people to be actually-immune for herd immunity, and 67% is less than 90% [citation needed], which means we can't get herd immunity against Omicron even if everyone was 3-dosed.
- The above, combined with...
- a) Vaccines + boosters + treatments have cut Covid's infection fatality rate (IFR) by 10-fold(!) 🙌 since Jan 2021 [infographic], making Covid's IFR now (I think?) on par with seasonal influenza. (From this Nov 2020 paper: "Our [Covid IFR estimates] were about 10 times larger than those for seasonal influenza.") IFR isn't everything, but still!
- b) New Omicron cases are almost all community-transmission, not from across the border
- c) Truckers barely interact with anyone
- ...all this means the public benefit of mandates on the 15% of remaining unvaccinated Canadian truckers is very small. At least, the benefit < the cost of enforcing the mandate & eroding public trust, making it harder to deal with future "natural"/bioweapon pandemics, which will likely be worse.
- (It's upsetting how "bundled" politics is: if you're pro/anti-vaccine, you're assumed to be pro/anti-mandate and pro/anti-"trust the [politicians who claim to speak on behalf of] experts". Please, let's de-bundle beliefs!)
- So whatevs, bring it on, maybe Ottawa will stop being so boring for once!
- [Non-stop honking on clogged main streets, loads of downtown businesses close]
- I regret my wish.
- Ok, other than that, notes on the protest's vibe:
- Lots of "F🍁ck Trudeau" signs. With maple leaf. (Note for non-Canadian readers: Justin Trudeau is our prime minister.) Not unique to this protest, but I wonder if protestors think at all on how to persuade people. I fail to see how "F🍁ck Trudeau", even if the reader also hates Trudeau, would persuade them to be anti-Covid-restriction. Let alone convince Trudeau supporters, or Trudeau himself. (I saw exactly one sign that attempted persuasion. Good for them.)
- I didn't personally see any swastikas or confederate flags. I know there were photos circulating online, but they weren't common. I suspect the swastikas were (stupid) attempts at saying Trudeau is the Nazi. Or they're just being childish edgelords.
- Although, re: the horrid "bundling" of politics, it seems most protestors actually were anti-vaccine, not just anti-vaccine-mandate. Lots of signs saying Pfizer (& even masks) are poisonous. Come on folks, just de-bundle it.
- I didn't see it mentioned in any news outlets, but quite a few protestors had Every Child Matters flags, in protest of the recent revelations of the ~215 unmarked children's graves found near a Canadian Indian residential school. [wikipedia] So there were at least some pro-Indigenous-rights protestors in the mix.
- One of my datefriends works at the soup kitchen, Shepherds of Good Hope, where some protestors assaulted the staff. 😢
- After 2020–21, my standard for "is this a peaceful protest" has been lowered to "has someone been actually killed". That did not happen, so... yay?
- But seriously, Russell's Conjugation [wikipedia] is one of the worst sins of politics:
- we do advocacy, they spread propaganda
- we have a strategy, they have an agenda
- we have principles & beliefs, they have dogma & ideology
- we liberated country X, they invaded country X
- etc
- One mental habit I have to avoid Russell's Conjugation is, "If someone did the exact same thing for a cause I care about, would I be okay with it?" (This helps me accurately pinpoint if I'm actually against their methods, or "just" their goals.) So: if downtown roads & critical infrastructure were blocked in the name of, say, pandemic preparation or trans acceptance, would I be ok with it?
- . . .
- Yes, if it was a few days to a week. But 3 weeks? No: I don't want my civil disobedience to hurt the jobs of innocent third-parties, let alone act like jackasses & make them resent my cause. Counterproductive.
- [3 weeks later]
- I am very very reluctant to use police force –– given the potential for police brutality, and given Trudeau's father was involved in an infamous civil rights controversy [wikipedia] –– but also, it's been 3 weeks, there's honking, fireworks, and drunk screaming near my friends' homes late at night, please stop.
- [They finally use police to shoo away & arrest protestors]
- Thanks
- [Trudeau uses the Emergencies Act – which had never been invoked since it became law in 1988 – to freeze the bank accounts of everyone involved in the protest, even if they left before the police declared it illegal, or donated even if they didn't know it was illegal. Without a legal trial. And having your bank account frozen in a cashless society means you can't then pay for a lawyer, or rent, or food.]
- Why, Justin
- Why.
- That is such a disturbing civil-rights precedent to set. What happens when, say, a future government uses that legal precedent to freeze funds of environmental/Indigenous groups who occupy infrastructure to protest? Did you not work through Russell's Conjugation? And it was so unnecessary; the Highway Traffic Act already gave us legal power to tow away the trucks! Why.
- [One week later Trudeau calls off the Emergencies Act, and unfreezes the accounts]
- Thanks
- God
- Please let's all just move on, and have the world's opportunistic pundits stop Russell Conjugating and politics-bundling this stupid, unnecessary debacle in my hometown that persuaded nobody, and nobody learnt a thing. Please, just get this nonsense out of the news cycle.
- [Russia invades Ukraine]
- I regret my wish.
————————————————————————————————
————————————————————————————————
A mini-review of Toby Ord's book on existential risk
(5 min read)
⭐️⭐️⭐️⭐️☆ 4/5, recommended for compassionate, smart people who want to feel depressed for a few weeks
Toby Ord's book The Precipice [official book site] is about all the horrible ways humanity could go extinct (or worse than extinct), from supervolcanoes, to nuclear war, to "natural"/lab-leak/engineered pandemics, to global totalitarianism, to tail-risk climate change, to unaligned AI. (This book came out a few weeks before the WHO officially declared a pandemic.) But it's not a book of despair! Ord confronts the odds objectively, without sensationalism, and lists many tractable solutions to mitigate these existential threats to humanity's long-term potential.
(Well, after reading it I ended up despairing anyway. Did you know the international Biological Weapons Convention has just 4 employees? Have you seen Wikipedia's list of nuclear close calls?!)
The best and worst thing about this book, though, is that it's so thorough. The book is 468 pages long. The "main" book actually ends halfway – the rest is ALL notes, appendices, and citations. Early on in the book, he spends ~20 pages defending the idea that humanity going extinct is bad.
No, really.
He addresses Epicurus's idea: if sadness is in the human mind, how can it be sad if there are no humans left to grieve? He addresses the economic idea that exponential discounting implies that the entire future of humanity is worth only 20x the next year. He addresses moral systems that place low or zero weight on human welfare, and all on animal/environmental welfare. He addresses problems with utilitarianism, but notes that existential-risk-reduction can still be justified with duty-based or virtue-based ethics. He addresses the idea that the loss of the human species isn't a cosmic loss, given the chance of intelligent aliens in a vast universe.
He's that thorough.
If you're the kind of person that likes that much scientific & philosophical detail – and I do! – pick up the book. All profits go to effective charities! (But, uh, I do strongly value brevity in layperson-targeted books, so that's why I've taken off 1 point from an otherwise perfect 5/5 rating, for this admirable work of rigorous scholarship.)
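A quick aside on the "20x" discounting claim mentioned above: it falls out of a geometric series. This is a sketch under my own assumptions, namely a constant 5% annual discount rate and every year being equally valuable before discounting; I'm not claiming this is the exact rate or setup the book uses.

```python
def discounted_total(rate: float, years: int) -> float:
    """Present value of `years` equally valuable years, where the first
    year counts in full and each later year is worth (1 - rate) times
    the year before it."""
    return sum((1 - rate) ** t for t in range(years))


def discounted_limit(rate: float) -> float:
    """Limit of the series above as `years` grows without bound:
    1 + (1 - r) + (1 - r)**2 + ... = 1 / r."""
    return 1 / rate


if __name__ == "__main__":
    print(discounted_total(0.05, 100))     # a century
    print(discounted_total(0.05, 10_000))  # ten millennia
    print(discounted_limit(0.05))          # forever
```

With a 5% rate the total never exceeds 20 years' worth, however long the future is, which is the oddity Ord is pushing back on.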
In a rush? Here's the book in a nutshell:
1) THE PROBLEM
We could all die this century. Loose tongues & cheap action films have made "end of the world" sound so trite, but, no, really, this could happen.
2) THE RISKS
Here's Toby Ord's table of estimated risks:

(Yes, it's "funny" how many people would be relieved to hear that existential risk is only 1-in-6.)
The book has his thorough reasoning for each estimate. Here's my notes on some of his more surprising estimates:
- Nuclear War: It's easy to kill 1 billion people with our ~13,000 nukes, but surprisingly hard to kill all 8 billion, since we're so spread out. But even if it's not "extinction", KILLING ONE BILLION INNOCENT CIVILIANS –– WHO HAVE NO CONTROL OVER THEIR INSANE LEADERS –– IS STILL TERRIBLE, HORRIBLE, NO GOOD, VERY BAD. DO NOT BE TEMPTED EVEN IF IT FEELS MORALLY RIGHT. IT IS NOT.
- Climate Change: Contrary to Extinction Rebellion, none of IPCC's models predict anything close to human extinction. It's still very bad, and very worth solving. (P.S: See this 8-min article from Our World In Data's Hannah Ritchie. Ignore the clickbait title; authors don't get control over the headline, the editor picks that.)
- Unaligned AI: Ord explicitly means super-intelligent AI not aligned with human values, not "just" regular AI that helps someone build a supervirus and/or 1984.
- (Personally, I respectfully disagree. I think AI with human-level domain-general reasoning is 80% likely by 2100, but I don't think that'll cause a runaway self-improvement explosive FOOM! scenario. This deserves a whole future post, but my intuitions were captured by Ramez Naam (Singularity University)'s argument-from-computational-complexity [3-min read], and I think the general idea still holds up even after reading Gwern's counter-arguments. [35-min read] In any case, I currently think most of AI's risk is in supporting other risks, i.e. "helps build supervirus/1984". On the other hand, biotech AI could cure all cancers? eh, ya win some ya lose some)
If you're wondering about my personal biggest concerns... as I wrote in last month's Patreon post, on 2/2/22:
I have sincerely lost sleep, realizing there's a 1-in-6 "Russian Roulette" chance that, in my lifetime, I will see an SK-Class Mass Suffering Event, such as (not mutually exclusive): a bioweapon pandemic, nuclear war, World War 3, a new age of autocratic empires.
(Note to self: probably retire the phrase "Russian Roulette".)
3) THE POSSIBLE SOLUTIONS?
But there's hope! In the book's Appendix F – and yes, the book has SEVEN appendices – Ord has a list of concrete policy/research proposals for each existential risk: 5-min read
(Some, like "US rejoins Paris Agreement", happened! Some, like "US and Russia restart nuclear treaties"... urgh... hate to tell you from the year 2022, Toby Ord from 2020...)
Ord is an idealist, but a practical idealist. (The best kind!) He recommends that we, as a society and as individuals, focus on what's cost-effective:
Cost-Effectiveness = Neglectedness × Tractability × Importance
Or: unpicked, low-hanging fruit. Go for a diverse portfolio of cost-effective interventions, a big ol' basket of fruit, and maybe, just maybe, we can live to see humanity's true, trillion-year potential.
Anyway that's the entire book. Or at least, I think that's all that the vast majority of people need to know.
And people very, very badly need to know.
————————————————————————————————
————————————————————————————————
Miscellaneous Bits n' Bobs
- I might post some of the above as standalone blog articles, for your linking pleasure! Please let me know your feedback (even if just typos) before I toss the bottle out into the ocean.
- I'm thinking that might be my "Patreon model" this year – y'all patrons get sneak peeks at educational blog posts, with or without interactives, and help shape 'em for public release? Thoughts?
- Other stuff I'm researching:
- How to understand domestic & foreign politics, from a "bottom-up" perspective. By "bottom-up", I mean starting with the human psyche (or approximations thereof). Such as:
- Game theory models like Exit/Voice/Loyalty [wikipedia]
- Bruce Bueno de Mesquita et al's Selectorate Theory [18-min CGP Grey video]
- Bruce Bueno de Mesquita's "Expected Utility Model" [22-min article] though it seems none of his models' code is publicly available, which is suspicious & frustrating. 😕
- Public Choice Theory [33-min intro video], which shows how even fully-rational individuals can, as groups of voters or groups of politicians, be collectively irrational.
- (Keep in mind "all models are wrong, but some are useful". Also keep in mind that real human beings are not Game Theory 101 rational actors. Sometimes we act better. Sometimes we act way worse.)
- Any other suggestions?
- Speaking of economics, Lars Doucet's Georgism article [82 min(!!!) read] was pretty nifty! Georgism has a soft spot in my heart, because the first "educational game to explain a complex system" I know of was The Landlord's Game, by Georgist Elizabeth Magie, later plagiarized by Parker Brothers into "Monopoly". I oughta research the arguments for/against Georgism more. (Possible future explorable explanation?)
- What are the neglected low-hanging fruits for preparing for the next "natural"/engineered pandemic? The most contagious viruses (at least, in the developed world with clean water & fewer mosquitos) all use aerosol transmission [infographic]. So if (when?) a bio-terrorist death cult tries to kill everyone, their best bet is to use aerosol transmission. That suggests we have a neglected low-hanging fruit that few people, even in the existential-risk & pandemic-preparation community, are talking about: fixing our air! 💨💨💨
- By analogy: in the developed world, we've practically eliminated all waterborne diseases (e.g. dysentery, cholera) thanks to clean water infrastructure. But our air?... the air in many public places is the equivalent of un-sanitized, still, standing water.
- [Warning: low-confidence back-of-envelope calculations!] For Covid, outdoor transmission was 1/20th as likely as indoor transmission. In comparison, Pfizer's 95% efficacy against Covid Classic meant a vaccinated person was 1/20th as likely to get infected as an unvaccinated person. Which means: if we can get indoor air as fresh as outdoor air, that'd be like automatically vaccinating everyone for every future airborne disease.
- Even if we only get indoor air a fifth as fresh as outdoor air... which means instead of cutting infections 20-fold, we cut them 4-fold... which means only a quarter (25%) of the original risk remains... or equivalent to a 100% – 25% = 75% effective vaccine... that's still more effective than 3-dose Pfizer against Omicron is. And probably enough to eliminate seasonal influenza, with its R0 of a mere ~1.3. (Reminder that "the flu" is really bad: ~30,000 US deaths per year, which is ~50% of the US deaths from opioid overdoses.)
- Here's a great paper on making our air infrastructure as good as our water infrastructure! It claims that the cost of fixing our indoor air would "likely be less than one percent increase in the construction cost of a typical building", while in the US alone, the annual cost of influenza is ~$11.2 billion, and for other respiratory infections, ~$40 billion. (in terms of deaths, healthcare & workplace absenteeism, etc)
- (Note: I'm highlighting the bio-terrorism and $billions coz, ugh, nothing gets major funding until you persuade the national security/business interests. I hate it, but I hate even more the idea of humanity going extinct when some grad student CRISPR's a cyanide-producing gene into the common cold.)
- For other ideas on preventing the next pandemic, see Ord's 5-min list of policy/research recommendations. Other promising ideas not in that list: wastewater/air monitoring, meta-genomic screening, alternatives to and/or a moratorium on Gain-of-Function research, helping the developing world re: clean water & those damn mosquitos.
- A final cautious note: given how, in the very recent past, rightful condemnation of China's government led to wrongful hatred of Chinese people... I'd just like to gently remind you re: Russia that governments (even democratic ones!) very much do not reflect their citizens & ex-pats, let alone everyone of that ethnicity. I'm sure none of you would fall into that trap, but... watch out for it in your circles, ok? In case your uncle starts saying "nuke 'em all" or something?
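(For anyone who wants to poke at the back-of-envelope "fresh air as a vaccine" numbers from the air-quality bullets above, here's a minimal Python sketch. It assumes, as that napkin math implicitly does, that the cut in infection risk scales linearly with how fresh indoor air is relative to outdoor air, with outdoor air giving the full ~20-fold cut. The function names and the clamp at 1-fold are made up for illustration; this is arithmetic, not an epidemiological model.)

```python
# Back-of-envelope sketch of the "fresh air as a vaccine" arithmetic.
# ASSUMPTIONS (low confidence, same as the post's napkin math):
#   - outdoor transmission is ~1/20th as likely as indoor transmission
#   - the fold-reduction scales linearly with indoor-air "freshness"
# All names here are hypothetical, invented for this example.

OUTDOOR_FOLD_REDUCTION = 20.0  # rough figure quoted for Covid


def fold_reduction(freshness_fraction, outdoor_fold=OUTDOOR_FOLD_REDUCTION):
    """How many times fewer infections we'd expect, given indoor air that is
    `freshness_fraction` (0 < f <= 1) as fresh as outdoor air."""
    if not 0 < freshness_fraction <= 1:
        raise ValueError("freshness_fraction must be in (0, 1]")
    # Clamp at 1-fold: under this toy model, air improvements never
    # make things worse than the status quo.
    return max(1.0, outdoor_fold * freshness_fraction)


def equivalent_vaccine_efficacy(fold):
    """An N-fold cut in infections leaves 1/N of the risk,
    which is what a (1 - 1/N)-effective vaccine would do."""
    return 1.0 - 1.0 / fold


for freshness in (1.0, 0.2):
    fold = fold_reduction(freshness)
    efficacy = equivalent_vaccine_efficacy(fold)
    print(f"{freshness:.0%} as fresh as outdoors: "
          f"{fold:.0f}-fold cut, like a ~{efficacy:.0%} effective vaccine")
```

Under those assumptions, fully outdoor-fresh air reproduces the 95% figure, and air a fifth as fresh gives the 75% figure.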
Question 1: Was putting a 30-minute long email in your inbox a bit too much? Would you prefer shorter posts, or have the Patreon update just link to a draft blog post, so you can read it in-browser, not in-email-client?
Question 2: Do you have any thoughts, feedback, further resources, or cute cat videos to help me feel less depressed? Dump 'em in the comments below!
I don't know if I have any readers in Ukraine, but wherever you are, I hope you're staying safe. Please, be safe.
~ Nicky
Comments
Wonderful post, way clearer than the original paper which I struggled with so much that I didn't finish reading and hence learned nothing. At least it didn't discourage me enough from the topic itself to not to try reading your post. Small note: I think this sentence misses a word (or at least adding a word would make it clearer). Missing word in square bracket: "One main reason the authors care about this result is because they're AI-Alignment researchers, so they want a __mathematical __decision theory that works when others act based on predictions of [how] __you'll__ act (where CDT fails)"
Marta Krzeminska
2022-12-06 08:47:36 +0000 UTC
> y'all patrons get sneak peeks at educational blog posts, with or without interactives, and help shape 'em for public release
Sounds like a plan!
Anton Iokov
2022-05-22 13:49:21 +0000 UTC
https://www.emergencykitten.com/
Chris K
2022-05-02 21:08:11 +0000 UTC
Ah, so my vague notion was apparently correct after all. Glad the reminder was helpful. It's been a while since I read The Story of Us. Forgot that he actually cites you! Anyway, I love your work. Keep it up! In a way that is ultimately healthy for you, of course. ;)
Syvanus
2022-03-30 01:52:23 +0000 UTC
Thank you Tom, that's really encouraging to hear! :) And this Patreon update *is* the beta test! (I'll be adding a couple sections in response to patron feedback/critiques – whether it's too hard to implement FDT in practice, isn't this just Kant's categorical imperative, what the heck the authors mean when they say "subjunctive not causal dependence", etc)
Nicky Case
2022-03-29 18:24:42 +0000 UTC
Thanks for reminding me to re-read that soon! But yes, I have read that series – in fact, I'm cited in Part 10. ;)
Nicky Case
2022-03-29 18:24:19 +0000 UTC
I've read about a half dozen articles explaining CDT and FDT, yet I couldn't understand the theories until I read yours. This is really, really good, Nicky, thank you! Did you beta test this post?
Tom Lieber
2022-03-16 05:09:36 +0000 UTC
I'm very late to the game here, but I really loved the section on FDT! Also, I have a vague notion that you are already familiar with Tim Urban and The Story of Us but in case you're not, it's a great piece on the "How to understand domestic & foreign politics, from a "bottom-up" perspective." front. https://waitbutwhy.com/2019/08/story-of-us.html
Syvanus
2022-03-13 02:29:18 +0000 UTC
Don't mind me, just checked, and it *is* publically available.
Albert ARIBAUD
2022-03-06 10:29:29 +0000 UTC
Wonder if this post is going to become publically visible at some point? I would like to show it to my daughter. I could copy-paste it for her, of course, but I won't do that unless explicitly allowed, so if it becomes public eventually, I'll just wait. :)
Albert ARIBAUD
2022-03-05 18:07:04 +0000 UTC
I haven't been following you for long so the topic is still new to me, but I'm fascinated by the link between game theory and politics you just drew. I've always been too lazy to get seriously interested in politics, but this angle makes it interesting enough that I actually want to research it this time. Which is a good time for it, since there's a major election coming up in a few months in my country. Thank you for that.
A lama
2022-03-05 08:11:16 +0000 UTC
Ok, repost as the note did not get found .... The length was fine for me. I enjoyed this article a lot, partially because I have spent a lot of time thinking about Newcomb's Paradox. Here is another interesting twist. Assume I get together a large number of like-minded individuals. We don't care about the money. We're committed to Discovering the Truth, instead. And the Truth we want to examine is about the nature of Randomness. So we get ourselves a device that is connected to some sort of really random number generator -- a cesium clock, atmospheric noise, what have you. We only want 2 bits of information here, a 50/50 split corresponding to 'one box' or 'two'. By current scientific understanding we have no way to predicting what the generator will say. And neither does Facebook or any billionaires or anybody else who is living within this universe. And we all line up, having pledged to do whatever our generator says to do. (And we keep our promise.) Now what happens? My position is that either a) the accuracy of the prediction will decline to the point where, if there are enough of us pledged people doing this thing (i.e. almost all of us) the predictive ability will end up at 50%. We have been investigating a natural phenomenon, the predictive power of the billionaire, and demonstrated that it is indeed behaves like a natural phenomenon, and has limits. or b) the predictions continue to hold up. At this point I claim that we are not investigating a natural phenomenon at all. We are investigating a miracle. And proven that at least one miracle exists! Either way, you have generated knowledge about the world, which I would like to have more than the money. All the best from here in Sweden.
Grävling
2022-03-04 22:42:46 +0000 UTC
FWIW, I like this kind of update. Almost like a little zine. I seldom read emails like this anymore, but it really helps that you seem to have the same kinds of interests as I do, and they're varied, so basically everything you write is catnip for me. I can't help but notice that Functional Decision Theory is very close to Kant's Categorical Imperative from _Groundwork of the Metaphysic of Morals_ (1785): "Act only according to that maxim whereby you can, at the same time, will that it should become a universal law." Maybe I'm not understanding FDT fully, but to me it seems like a different way to express the same idea. Is this something the authors of FDT address, this similarity?
Filip Hracek
2022-03-03 19:17:34 +0000 UTC
…And I finally read the rest of the post, including the Misc. section where you basically said half of what I just did. Basically: I agree, go for it.
Kronopath
2022-03-03 17:21:09 +0000 UTC
Why the clown makeup meme? The decision theory section of this post is genuinely fantastic and plays into your strength of being able to explain complex concepts in an understandable way. Seriously: pull that section out of this post and post it to that new blog you’re planning, and you’ll have a fantastic post. Since you’re talking about rationalist-related stuff here: a lot of people around that community have found surprising amounts of success in doing exactly what you just did, taking complicated rationalist or rationalist-adjacent ideas (that they themselves did not come up with) and explaining it in simpler and entertaining language. Examples include Rob Wiblin’s Medium post on Ugh Fields, Lars Doucet’s intro to Georgism (which won the ACX book review contest), and, like, a significant chunk of Scott Alexander’s entire blog.
Kronopath
2022-03-03 17:10:55 +0000 UTC
Try re-posting it as a reply to this comment? I didn't receive any notes (as notified by email notifications) other than the blank-line-less ones above.
Nicky Case
2022-03-03 16:22:23 +0000 UTC
But the note is still AWOL ...
Grävling
2022-03-03 15:55:10 +0000 UTC
checking to see if shift-enter works! Yay! It Does! Thank you!
Grävling
2022-03-03 15:53:58 +0000 UTC
I didn't remove anything! I got the following email notifications with what you wrote: "re: Newcomb's paradox and it's successors." "Grrrr 2! Other people get to insert blank lines and paragraphs in their replies! Where can I learn what they know and I clearly don't!" For newlines, Shift-Enter works on my end! (Very not clearly explained, Patreon UI...) Does Shift-Enter work for you?
Nicky Case
2022-03-03 15:33:05 +0000 UTC
I posted something here. Edited it too. Now it is gone. If Nicky wanted it gone, and removed it, that's ok, but otherwise we have a vanishing response bug ....
Grävling
2022-03-03 14:40:56 +0000 UTC
Wow, I loved this post! Some great food for thought. I'm not always in a position to read long emails when I get them, but I was today and I feel grateful for that experience.
Rachel Helps
2022-03-03 09:08:35 +0000 UTC
1) No, I enjoyed reading it before breakfast (I'm retired so have a fair amount of free time). There are some email correspondents who cause my heart to sink when something from them appears in the inbox but that will never be the case with you! 2) Is it unethical of me to think of running a Newcomb's paradox experiment for real on my grandchildren (6 and 9 years old)? One sweet v a bag of sweets? I'd guess the 6 year old would go for both and so only get one, but the 9 year old I'm not sure.
John Stout
2022-03-03 08:35:35 +0000 UTC
Glad to see this post! 1) No problem, your emails go to my RSS reader anyway. 2) No real links, but, maybe you'd be amused to think about this? FDT feels very "strange loopy" to me, in the Gödel, Escher, Bach way. I wonder if it has any interesting failure modes as a model that are similar to Gödel's Incompleteness Theorem? And if so, if those correspond to any actual areas real humans following reasonable ethics would have trouble with? (After all, a key part of the broad effect of Gödel's theorem was learning that it didn't only apply to completely contrived statements; there are interesting mathematical questions that can be translated to include "This statement is false.")
Eric Willisson
2022-03-03 05:20:57 +0000 UTC
A fascinating post, I found the decision theory piece enlightening-- it's reputation and societal virtues in mathematical form!
Conrad Wong
2022-03-03 04:23:32 +0000 UTC
I think this was great! These are topics dear to my heart, and you've breathed fresh life into some of them - especially with the interesting association between FDT and virtue ethics. As a semi-tangent, the CDT folks are wrong about whether voting is worthwhile *even under their own framework*! In elections with close polls, ties (and near-ties, which imply that ties are not that rare) happen just as often as you would expect given the statistical model where you integrate the chance of a tie given the true electoral lean, against the chance of that lean given the state of the polls. Which is to say, the chance of a tie is several times larger than one divided by the number of voters. If a CDT person would bother voting in an election where the winner was picked by selecting one ballot at random (and they should), then they should vote in a standard election. /rant
Patrick L
2022-03-03 04:05:48 +0000 UTC
cyborg scalie girl ftw!!! 1. I personally loved this format! It's my first update of yours I'm reading, so it's hard to compare to anything else, but, it was a wonderful read and I took a break between each chapter. Plus knowing how long each part takes to read is awesome.
Detective Chiyo
2022-03-03 03:57:40 +0000 UTC
Email length was not an issue for me. Thank you for writing it 🙂
Randy Gingeleski
2022-03-03 03:37:40 +0000 UTC
Thank yoooouse 🐉🤖
Nicky Case
2022-03-03 02:53:21 +0000 UTC
I like your cyborg scalie girl. <3
Aeryn Light
2022-03-03 01:48:28 +0000 UTC
Thank you for email! I enjoyed every minute of it. As someone who has been procrastinating while binging the news, your insights and musings shine like a beacon in the dark fog of current events :D Here's a playlist I use as anti-depressant: https://www.youtube.com/watch?v=--9kqhzQ-8Q&list=PLaLpAOF66jPGrtuzk8y1aTF8LlB3sphHC
Sylvester Lan
2022-03-03 00:38:20 +0000 UTC
Brilliant thought-provoking post. Thanks for it! Regarding FDT: this blew my mind, and articulated something I've been struggling with for years. Specifically relevant re: the "voting" example. I literally used to not vote because of CDT, but eventually struggled with realizing that "the kind of people not willing to engage with CDT were WINNING ELECTIONS". So the way I eventually articulated it to myself was through the humility that "I'm not especially unique; there are probably millions of people like me; if I can convince MYSELF to vote (or even simply 'let myself be convinced'), it's reasonable to assume those millions of people will have a similar realization and I can give myself the power to win elections"! Now I have FDT which is much more cleanly generalizable! SELF INDULGENT TANGENT- FEEL FREE TO IGNORE: Reading about FDT also has me thinking about something else I've been informally mulling over for a few years now on the topic of metaethics. Here's a sloppy, not-quite-complete articulation of Kant's "Categorical Imperitive": an action is good iff it can be universalized (could be adopted universally without self defeating). Example: murder is wrong, because if everybody murdered, then nobody would be around to do murders! It's obviously flawed (especially in this mis-articulation), but I present it only as the inspiration for what I've been wrestling with: A DT is good iff its universalization maximizes the good, and an action is good iff it follows from a good DT. This has been the working basis in articulating my moral compass (a work in progress!), and reading this has helped in the process of making it more rigorous. It seems like it's not far off from "FDT + universalization"! Anyways, I'm curious if there are any glaring red flags anyone can see, or if there are other similar metaethical theories I might be interested in.
(As I said, I'm engaging with this as a personal, informal struggle, so this line of thought is certainly not novel :P)
Phil Dougherty
2022-03-03 00:03:30 +0000 UTC
Thanks for your feedback, Olu! I hope my post didn't break your email reader 😅 But, yeah if you're prone to depression, I do not recommend The Precipice (I should've been a bit clearer that I wasn't joking, it actually is a mild anxiety/depression hazard). My 5-minute book review + Appendix F linked above + this 20-minute book review ( https://slatestarcodex.com/2020/04/01/book-review-the-precipice/?utm_source=pocket_mylist ) should be able to get all the important ideas through, I hope!
Nicky Case
2022-03-02 22:29:29 +0000 UTC
1. I think it's fine for it to be so long as you've been diligent about headings! 2. I'm in two minds about reading depressing books when I'm prone to depression, but i guess if it turns out the risk is only 1 in 6 for this maybe 'the end of everything (astrophysically speaking)' will have some kind of uplifting twist also? i doubt it but you never know. Would be really interested in a post about AI and why i should worry about it ever being intelligent enough for unalignment to be a worry, though the links you've linked might be all i need, maybe?
Olu
2022-03-02 22:15:39 +0000 UTC