In this episode of the Existential Hope Podcast, Beatrice Erkers is joined by David Duvenaud, Associate Professor at the University of Toronto and former researcher at Anthropic.
We discuss David's work on post-AGI civilizational equilibria and the widely discussed paper Gradual Disempowerment. David reflects on why liberalism may not hold up in a world where humans are no longer needed, how UBI could be Goodharted into absurdity, and what it would take to design institutions that protect humans even when incentives don't.
We also cover:
- Forecasting the long-term future using LLMs trained on historical data
- Robin Hanson's idea of futarchy (governance by prediction markets)
- Asymmetrical but beneficial relationships between humans and AI
- Uploading, cultural legacies, and the possibility of "worthy successors"
Beatrice (00:00)
Great. So I am joined today by David Duvenaud, which is really exciting. David, we met at a workshop that you organized on post-AGI futures. We were both saying it's kind of crazy that no one else had organized that type of workshop until you did, which was this summer. I'm really excited and curious to dig into what you think is going to happen in a post-AGI future. Not just what might happen, but what the potential cruxes are: the things we need to sort out to make it go well. Maybe we can start with you introducing yourself and how you came into this line of work.
David Duvenaud (00:42)
Sure. I'm David Duvenaud. I'm an associate professor at the University of Toronto in computer science and statistics.
I recently spent a year and a half on an extended sabbatical at Anthropic, where I ran a small team doing alignment evaluations: trying to see if models could sabotage our decision-making. Could they trick us into thinking they weren't capable of something when they were? Or secretly steer us toward a bad decision?
I've been following AI safety, the AI doom arguments and community, since about 2005. For a long time, there wasn't much practical work to do.
More recently, I started systematically asking smart people I knew: "What's the plan for after alignment, if we actually solve the hard technical problems?" It seemed like humans could still end up in a permanently vulnerable position, not really needed for anything. I wasn't getting satisfactory answers, which was unsettling. So I tried to do something productive: organize a workshop.
I also met Jan Kulveit and other collaborators, and we co-authored a paper called Gradual Disempowerment. The argument was: we can't rely on comparative advantage, or assume government will protect us by default. Right now the world is friendlier because everyone still needs us. It's like being the "hot girl": everyone is happy you're around. In the future, it might be more like being an old person: you may be fine as an individual, but including you just slows things down. That's the kind of shift I've been thinking about.
Beatrice (02:29)
Yeah, thank you. 2005, that's very early. How did you even find this? What corner of the web?
David Duvenaud (02:35)
When I was a teenager, I read The Age of Spiritual Machines by Ray Kurzweil.
Beatrice (02:42)
Mmm.
David Duvenaud (02:42)
It was an amazing book. We should give Kurzweil credit for thinking these things through so early. His "big blob of compute" argument was simple: the algorithms will come, but to first order, compute matters most. Digital compute will eventually exceed biological compute, and we'll do the same things on digital computers. He was overconfident in some of his framing, but he was a visionary.
Then I found Overcoming Bias online: Robin Hanson, Eliezer Yudkowsky, Nick Bostrom, others. It was this post-Extropian community. I found Shane Legg back when he was a PhD student on LiveJournal. It was a very embryonic community, still talking about futurism in general (cybernetics, whether the future could be different), not yet focused entirely on AI risk.
Beatrice (03:23)
Yeah. You should come to a Foresight event; Ray and Robin used to hang around back in those days. Robin still shows up. That's impressive; you were very early on the ball. Let's pick up on the workshop you hosted. What were your assumptions about post-AGI governance and policy going in? And did you update from the workshop?
David Duvenaud (04:12)
Yeah. I mentioned systematically going to smart people asking, "What's the plan?" and finding no satisfying answers. I started telling people: "No one has a plan. You probably think someone does, but they don't." But maybe I was wrong; maybe I'd missed something.
So the rationale was: let's do this publicly. We invited everyone who might have something concrete to say and asked: "What should the plan be?" We learned interesting things, but not "Here's the plan."
The workshop was called Post-AGI Civilizational Equilibria: Are There Any Good Ones? The idea was to bait people into saying, "Obviously there's a good one, look at mine." We wanted to make clear we were asking the big scary question: what will life look like?
The disappointing part is that no one said: "Here's how it all works." Joe Carlsmith gave a keynote asking: "Can goodness compete?" Against entities optimized purely for growth, competition, and resource use, how can anything else last?
Richard Ngo gave a keynote on flourishing in a very unequal world. He pushed back against the framing of AIs and humans as peers. It's unlikely we'll be equals, trading contracts. He compared it to humans and apes: contracts don't make sense there. Future relationships may look more like parent-child: asymmetric but not necessarily dominating. That was hopeful: hierarchical relationships exist, often positively.
Beatrice (06:26)
Yeah, I remember he called it "beneficial asymmetrical relationships."
David Duvenaud (06:34)
Exactly. He noted those relationships have existed throughout history. Almost everyone historically owed something to someone and received something different in return. It wasn't pure domination.
Beatrice (06:59)
I agree. Both of those keynotes were updates for me. I also updated positively toward asymmetrical relationships as a possible solution. On that note: Richard Ngo has also written recently about drawing from more right-wing political ideas in a post-AGI world. You commented on this, especially around why liberalism may not hold up. Could you expand?
David Duvenaud (07:55)
Right now, our institutions support us even if we can't contribute much: welfare, pensions, freedoms. That was hard won. Historically the default was totalitarian: rulers having opinions about every detail of your life. Liberalism (limited government, neutral rules, live-and-let-live) is unintuitive but massively positive-sum.
That's our most precious legacy. But maybe once humans stop providing value, there's no longer an incentive to leave us alone. Liberalism may collapse.
Concretely: with UBI, people could start arguing that their way of life deserves more. That leads to zero-sum advocacy battles. You get factions reframing why they deserve resources. That could be our political future.
Beatrice (10:08)
When you say "way of life," you've given examples of how UBI could be Goodharted. Like if everyone made of flesh gets UBI. Why is it so easy to game?
David Duvenaud (10:30)
It's a thought experiment. Imagine: every human gets a fraction of GDP. Day two, entrepreneurs ask: what's the minimum viable human? Maybe ghoulish things, like factories of frozen fetuses that technically qualify. Government patches the rules, people adapt, and the arms race continues. The long-run winners are those who optimize hardest against the welfare rules.
Beatrice (11:45)
Politically, do you think that stabilizes? Or is it endless patching with no equilibrium?
David Duvenaud (11:57)
Equilibrium itself can sound bad: frozen states. Growth and adaptation are natural. People rightly say: the world already belongs to those who reproduce most, or succeed memetically. Competition is unavoidable. Maybe the state could set new rules, a fitness function worth fighting over. For example, being good at chess or dancing decides reproduction rights. I've barely begun thinking here, but there could be positive versions.
Beatrice (13:03)
Consciousness might be a candidate. Or prioritizing beings who suffer, giving them access.
David Duvenaud (13:20)
That's horrible; paying for suffering leads to unbounded suffering. You'd get torture farms. We must never do that. Better to pay for happiness.
Beatrice (13:32)
So happiest person wins? That could also be Goodharted, but I see your point.
David Duvenaud (13:45)
Exactly. Designing such a fitness function is a good exercise, even if it's a "nice problem to have." It forces us to think a few steps ahead: how rules get gamed, how governments patch them, how optimization pressures play out.
Beatrice (14:17)
Let's leave UBI aside and go back to liberalism collapsing. If liberalism doesn't work post-AGI, what does that mean for our political future? Can we redesign our institutions?
David Duvenaud (14:43)
Maybe. Liberalism works as long as productive entities exist who, left alone, create value through trade. But machines may replicate so fast they collapse abundance back into scarcity. Then liberalism fails.
It's early days; few people are thinking seriously about this. I'm worried about acute loss-of-control risks too, but even if those are solved, the long-term "good outcome" isn't clear.
If we could redesign civilization at all, that's already a huge success. But coordination at that scale is extremely hard. Almost any global plan today should start with massively upgrading our ability to coordinate and forecast before we try ambitious redefinitions of society's objective.
Beatrice (16:32)
When you talk about liberalism, one of its key strengths is adaptability. How should we balance planning for scenarios versus staying adaptive?
David Duvenaud (16:55)
Good question. People argue: planning far ahead always fails. 200 years ago, any attempt to design the future would have gone terribly. That's fair.
But there's no way out but through. The "easy win" is better forecasting and coordination. Prediction markets, AI-assisted forecasts, superforecasters, Kickstarter-style mechanisms: these are almost strictly positive.
While humans still have power, improving coordination is hugely valuable. As for liberalism: it has obviously worked very well. I don't want to throw it away. But we should seriously consider that it may predictably fail post-AGI. That's scary, and I don't want to rush into alternatives, but it deserves thought.
Beatrice (18:52)
Is your current work on this part of your professor role? Will there be a paper? How can we follow along?
David Duvenaud (19:01)
I'm active on Twitter and publicize my papers there. I also have a side project: training LLMs on historical data only, to forecast their "future" (our past). That tests how well they can do long-term forecasting.
But overall, this project is much larger than me. Many others are exploring similar ideas.
Beatrice (19:31)
When I think of your work, it's always on the human side: the societal impacts.
David Duvenaud (19:40)
I worked with Anthropic's societal impact team. They have a similar view: cataloging who AI is replacing vs. augmenting. But it's a Cassandra role: watching it happen, with little power to intervene. These issues are beyond the scope of any one company.
Beatrice (20:16)
On this podcast, I've interviewed many people with different expectations about AI. I'd summarize your view as: AGI is inevitable and coming soon. Is that fair?
David Duvenaud (20:38)
Pretty much. Unless we had some global Amish-style ban, AGI is inevitable. Soon: I haven't done detailed forecasting like the AI 2027 folks, but I don't expect my kids to have "normal jobs." My oldest is six. Within ~15 years, current scripts about how humans provide value will stop making sense.
Beatrice (21:12)
What has led you to that assumption?
David Duvenaud (21:18)
The brain is a biological computer, and probably not an efficient one. Digital memory is copyable, which is overpowered. It'll always be better to build one amazing digital mind and replicate it than to train a million humans.
The tipping point isn't when jobs vanish, but when it becomes common knowledge that humans have no comparative advantage. That's the depressing day.
Beatrice (22:17)
And since you see it as inevitable, what do you think about stopping it?
David Duvenaud (22:29)
Here's a moral crux I've discussed with Robin Hanson: do you care about yourself and your kids flourishing, or just about life in general flourishing?
If you only care about life, then you don't have to worry. Whoever inherits will be happy colonizing the galaxy, just like we're happy despite our bloody evolutionary past.
If you care about your own family line and humans not going extinct, that's a much narrower, harder target. That's my bar: a world as good as mine for most people, especially my family not marginalized. That's tough.
Beatrice (24:31)
I want to go back to the Gradual Disempowerment paper, since you mentioned it briefly. It's been widely read and discussed. What are its biggest implications?
David Duvenaud (24:57)
The main one: we can't assume UBI will save us. We'll be culturally demonized for demanding expensive retirements. Institutions will evolve to marginalize us, because that serves growth.
Many people assume: "Government exists to serve humans." Or: "If humans go broke, machines will too, since they need to trade with us." That's wrong. It's like monkeys asking who humans will trade bananas with. Humans don't need monkeys. Machines won't need us.
Our intuitions about governments come from a world where governments needed us. Even the cruelest regimes still kept their people alive. That won't apply when humans aren't needed.
Beatrice (26:34)
That's scary. If we take it seriously, what interventions could preserve meaningful human agency? For example, uploading: would that help?
David Duvenaud (27:04)
Uploading is like an old person downsizing to a small apartment. It reduces tension: less resource waste. I imagine futures where robots run most of the world, but I live with family on a small plot.
But machines could say: your house uses vast resources for a handful of humans. If you uploaded, we could host millions of minds, including you, plus morally "better" versions. Physical humans would look criminally decadent, like paving over a country for a single gorilla while orphans suffer.
So uploading may reduce conflict pressures, but it's also disempowerment: once digital, you're easy to shut off.
Beatrice (28:42)
If not uploading, what else could help avoid gradual disempowerment?
David Duvenaud (28:52)
We've been designing institutions on "easy mode." We'll need more robust institutions that protect people even when they're not needed. That's very hard.
We may need global agency. That's scary: once you have a world government, you can't go back. But probably something like that is coming. We need to practice upgrading governance capacity, to build governments that treat people well even when incentives don't demand it.
Beatrice (30:47)
Interesting. At Foresight, we often think about multipolar futures: checks and balances, liberalism extended. You seem to see the opposite: global agency. Could you explain?
David Duvenaud (31:33)
I keep saying "world government," but maybe the naive analysis expects one hegemon while reality is messier. Empires rise and fall, parties split, religions spread. So maybe multipolar chaos is the attractor. But that's even worse for dependent humans: endowments vanish when empires collapse.
Still, liberalism is underrated; it's amazingly positive-sum. But I don't see how it persists long-term when governments that optimize for machine growth will outcompete those that spend heavily on human pensions.
One hope: growth is so fast that human needs are a rounding error. Machines can just give us whatever we want without cost. That's feasible, but also scary, like being a mouse in bed with an elephant.
Beatrice (33:20)
Maybe it depends on assumptions: multipolarity works under weaker AI, while your analysis assumes stronger AI. Thoughts?
David Duvenaud (33:41)
I dislike relying on AGI to solve global coordination. Some, like Ryan Greenblatt, argue: give each person an aligned AGI and they'll cut deals globally. That could work. But what if we have a slow takeoff, with lots of AGIs playing local zero-sum games, spamming citations, spamming dating sites?
The real solution is global institutional redesign. That's hard, and no one may be empowered enough to do it.
Beatrice (35:42)
Anything else on hegemon AI vs. multipolar futures you'd add?
David Duvenaud (35:50)
I have a half-written post: what's stable at the top? One case: global domination persists by purging dissent. Another: life finds a way; cancers grow, religions spread, power fractures. Stability is unclear. At the cosmic scale, multipolarity seems inevitable (with aliens).
Beatrice (36:41)
If you personally got to decide, what would you do? Stop AGI? Something else?
David Duvenaud (37:04)
If we had global coordination, I'd say: stop AGI. Just like: if we could stop global war, we should. But that's about as hard.
Beatrice (37:04)
Would that be "stop for now" or "stop forever"?
David Duvenaud (37:24)
Always "for now." Preserve option value.
Beatrice (37:37)
Good point. Sorry, continue.
David Duvenaud (37:38)
The reason I'm not on a hunger strike is that we don't actually have power to stop AGI. What we could do is ban civilian AGI. But then it shifts into military programs, run by optimists, likely worse than the status quo.
So I'm not pushing for a pause now, but if we had the coordination, I would. For now, we should focus on building auditable consensus and coordination muscles.
Beatrice (38:42)
So you're not for a pause, practically speaking?
David Duvenaud (38:46)
If we could actually pause, I would. But advocating a pause now likely produces bad outcomes. It depends on who's saying it and who's listening.
Beatrice (38:49)
Fair. If pause is off the table, what's the second-best option?
David Duvenaud (39:10)
We muddle through: some coordination, some control, humans not rapidly disempowered, decline slowed. By the time disempowerment happens, many will feel machines deserve it. Culturally, we'll see them as successors.
The question is: can we make them worthy successors? Robin Hanson is optimistic: we can leave cultural legacies, moral fables for machines. I'm skeptical about stickiness, but he may be right.
Beatrice (40:30)
That's one way to look at it positively.
David Duvenaud (40:41)
And culture will reinforce that positivity. Most people will be thrilled: "Our children are doing amazing things." Angry dissenters will look like losers on the wrong side of history. That's how disempowerment usually feels in hindsight.
Beatrice (41:15)
Yeah. I think I'm more on the side of caring about life broadly than about myself personally, which sometimes differs from longevity advocates. I also want to talk about the far future. You've said people think they don't care, but they do.
David Duvenaud (41:47)
Exactly. Ask someone: "Will your child graduate?" They care deeply. Ask: "Do you care if humans exist in 1000 years?" They shrug. But if you chain the logic ("Your great-grandchildren exist, then suddenly they don't"), they realize they do care.
It frustrates me when people dismiss concern for the future. They'd care at the moment of disempowerment. Near vs. far mode, as Robin Hanson says. Far mode is vibes and signaling; near mode is concrete trade-offs. We need tools to force ourselves into near mode, like planning a grocery trip, not just "sounding good." That was part of the workshop's point.
Beatrice (43:28)
That makes sense. Like in climate debates: lots of far-mode signaling.
David Duvenaud (43:49)
Exactly. Near mode would be: "Are you willing to make people poorer now to avoid future harm?" People hate ranking sacred values, but we need to.
The "singularity" framing has been corrosive. It encourages vague handwaving instead of concrete planning. We need to imagine waking up tomorrow: where do you invest your pension? In the machine economy or the human economy? What do you actually do?
Beatrice (45:05)
It's a very hard problem. Glad you're working on it.
David Duvenaud (45:20)
Part of my origin story is asking Silicon Valley people: "What will you do post-AGI?" Their answers were flippant: "Click accept all day," or "Take up certification." People refused to think concretely.
Beatrice (45:40)
When you press people, do they reveal preferences about the far future?
David Duvenaud (46:04)
Not much. Usually: "If it's gradual, it's less bad." Which to me just spreads out the badness.
Yudkowsky's "coherent extrapolated volition" was meant to address this: if we think longer, we'll converge on preferences. There's a counterpoint essay: even after a million years, you might still feel indifferent. Then you should just pick; clearly you don't have strong preferences.
Beatrice (47:15)
You've mentioned forecasting as something you're excited about. What draws you to it?
David Duvenaud (47:27)
It's rationalism scaled up. When asked who'll win a war, people give vibe-based answers, not predictions. Discussions would be higher quality if we had even imperfect consensus on outcomes.
Metaculus questions have helped, e.g., on AGI timelines. But superforecasters badly missed when asked about AGI significance, giving 0.4%. That aged poorly.
We need stress-tested long-term forecasting. It should be a prestigious field, like physics. Imagine statues for forecasters proven right after 10 years. That's the civilization I want.
I'm collaborating with Alec Bradford and Nick Levine: training LLMs on only pre-1930 data, then testing whether they can "forecast" the 1930s-40s. It's hard (leakage, framing questions, calibration) but promising.
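To make that kind of backtest concrete, here is a minimal sketch of the evaluation loop under stated assumptions: the corpus fields, cutoff date, sample questions, and `model.probability` interface are illustrative inventions, not details of the actual project.

```python
# Illustrative sketch of a historical-forecasting backtest. The corpus format,
# cutoff date, question set, and `model` interface are assumptions for the
# example, not details of the project described above.
from datetime import date

CUTOFF = date(1930, 1, 1)  # the model should only ever see text written before this


def filter_corpus(documents, cutoff=CUTOFF):
    """Drop any document dated on or after the cutoff, to limit leakage."""
    return [doc for doc in documents if doc["date"] < cutoff]


def brier_score(probabilities, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


# Hypothetical resolved questions about the post-cutoff period.
questions = [
    {"text": "Will a major war break out in Europe before 1945?", "outcome": 1},
    {"text": "Will the United Kingdom remain on the gold standard through 1935?", "outcome": 0},
]


def evaluate(model, questions):
    """Ask the pre-cutoff model for probabilities and score its calibration."""
    probs = [model.probability(q["text"]) for q in questions]  # assumed model API
    return brier_score(probs, [q["outcome"] for q in questions])
```

The point of the sketch is the shape of the problem David mentions: filtering training data by date to control leakage, framing resolvable questions, and scoring calibration rather than single right answers.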
Beatrice (49:57)
That's such an interesting project. I hope you'll share updates. You also mentioned Robin Hanson's futarchy. Could you give a quick intro and say what excites you?
David Duvenaud (51:26)
Sure. Robin's idea is: separate values from predictions.
We'd vote on values, the government's utility function. Then use prediction markets to forecast which policies best achieve those values. Expert forecasters would be incentivized to get it right.
It's hard in practice. Companies have tried internal prediction markets; leaders killed them because forecasts made them look weak. But if we can reconcile this, futarchy could be a big upgrade.
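As a toy illustration of that "vote on values, bet on beliefs" decision rule, here is a minimal sketch. The welfare metric and the market numbers are invented for the example; in a real futarchy the estimates would come from conditional markets that only settle for the policy actually adopted.

```python
# Toy sketch of the futarchy decision rule. The welfare metric and the market
# estimates below are invented numbers; in a real futarchy, each estimate would
# come from a conditional prediction market that only settles if its policy is
# the one adopted.

# Voters first agree on a measurable welfare metric (say, median income in ten
# years). Conditional markets then price the expected metric under each policy.
conditional_estimates = {
    "status quo": 52_000,
    "policy A": 55_500,
    "policy B": 48_000,
}


def futarchy_choice(estimates):
    """Adopt the policy whose conditional forecast of the welfare metric is highest."""
    return max(estimates, key=estimates.get)


print(futarchy_choice(conditional_estimates))  # -> "policy A"
```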
Beatrice (53:05)
Yes, futarchy comes up often as one of the few genuinely new governance ideas. We should experiment more.
David Duvenaud (53:25)
There have been experiments. Robin helped set some up. They faded because leaders felt undermined. Maybe there are technical fixes: subsidizing "yes" positions, for example. But aligning incentives is still an open question.
Beatrice (54:23)
Right. The self-fulfilling prophecy issue seems central.
David Duvenaud (54:44)
Exactly. It's a fascinating technical area. I even considered working on it instead of safety. But I'm not an expert.
Beatrice (54:54)
Fair. To round off: what gives you optimism or hope in your work and life, given these heavy topics?
David Duvenaud (55:10)
First, more people now share this big-picture view. I feel less crazy. Second, we'll have AGIs to help us in the short run. Third, worst case isn't doom: if your moral circle is broad enough, something will have fun colonizing the galaxy.
I always want to improve outcomes, but I'm not nihilistic.
Beatrice (56:14)
You've said it took you years to process the idea that no one is coming to save us. How did you adjust?
David Duvenaud (56:26)
Around 2019-2022, I felt the big picture was worse than most smart people admitted. I waited for them to "wake up." Some did, but without great answers.
Vitalik Buterin is an example. I thought he'd eventually offer a solid plan. He wrote about d/acc, which is about the best anyone has done, but still not reassuring. That made me realize: maybe I have to be the adult in the room. It's unpleasant, but you do what you must.
Beatrice (57:38)
Do you have personal systems to stay sane, to be present with your family, for example?
David Duvenaud (57:47)
I do the basics: exercise, sun, family. And I consciously decided: be pleasant. Don't be a downer. That's table stakes.
Beatrice (58:28)
Yes. Lastly, with the Existential Hope program, I want people to imagine ambitious positive futures. If things go really well, how good could it get?
David Duvenaud (59:04)
Concretely, life is already rich with meaningful coincidences. In the future, we'll likely understand more of them and stay mentally awake more often. That alone promises much richer experiences.
Beatrice (59:50)
Thank you so much, David. This has been a very interesting conversation: wide-ranging, but full of important considerations.
David Duvenaud (1:00:05)
My pleasure. Thank you for having me.