
Foresight Fellow Special | Siméon Campos: On Governing AI for Good

about the episode

This episode is a special Foresight Fellow episode of the Existential Hope podcast, where we sit down with Siméon Campos, president and founder of Safer AI, and a Foresight Institute fellow in the Existential Hope track. Siméon shares his experience working on AI governance, discusses the current state and future of large language models, and explores crucial measures needed to guide AI for the greater good.

About the Scientist

Siméon Campos is the president and founder of Safer AI, and a Foresight Institute fellow in the Existential Hope track.

About the artpiece

DALL-E 3 is a generative AI tool from OpenAI.

Transcript

[00:00:00] BE: Welcome to a new episode of the Existential Hope podcast. I am Beatrice Erkers, your co-host. I co-host this podcast along with Allison Duettmann. Today, we're joined by Siméon Campos, who is the president and founder of Safer AI. And he's also a Foresight Institute fellow in the Existential Hope track, which is why we're especially excited to have him.

[00:00:19] BE: And so this is also a special episode for us. This is a special fellow episode, and we're very thrilled to have Siméon come and share his insights into the future of AI safety and ethics. So thank you so much, Siméon, for joining the podcast. But before we jump into our conversation, remember to visit existentialhope.com for a full transcript of today's episode, along with the recommended resources that Siméon mentions in this podcast. And also don't forget to sign up to our newsletter to stay updated with the latest episodes and community updates. Let's welcome Siméon Campos to the Existential Hope podcast.

[00:00:50] BE: Another episode of the Existential Hope podcast, where we're trying to explore the future with those who are shaping it. So today we have Siméon Campos joining us. Siméon is the president and founder of Safer AI. He's also a Foresight Institute fellow in the Existential Hope track, which is why we're very happy to have him.

[00:01:09] BE: And he has dedicated the last few years to the development of safe AI practices. And so it will be very interesting to get his perspective on this. So yeah. Welcome, Siméon. I'll just start by asking you, can you tell us what you're working on? What got you started, and like your life story in three minutes?

[00:01:31] BE: How did you get into this? 

[00:01:32] SC: Cool. Yeah. Thanks a lot for having me. So yeah, what am I working on? So I'm working at Safer AI, which does AI risk management and standardization, so working on developing concrete rules that AI developers could apply in a way that would make AI development safer.

[00:01:52] Journey into AI Safety

[00:01:52] SC: How did I get into this?

[00:01:55] SC: This kind of work was... there was a French YouTuber - I'm French, as you can hear. There was a French science YouTuber who was talking about AGI risks a lot. He's called Science4All. I started listening to that when I was 16. Then I did a trip to the Bay Area for three months, close to Stanford.

[00:02:14] SC: I was in a group house with people who were thinking a lot more about the current state of evidence in AI safety. So I dived deeper into those topics and then started thinking about how I could be directly helpful on this topic. I was an econ student at the time.

[00:02:34] SC: Then I co-founded the first organization which was training people on these topics, not only on AI safety but also on pandemic reduction. We trained a couple hundred people on AI safety topics. We created a bootcamp called ML for Good, which was training people on safety-related topics.

[00:02:56] SC: And then more recently, I created Safer AI, which has been doing risk management standardization.

[00:03:03] BE: Thank you. Yeah.

[00:03:04] Current State of Large Language Models

[00:03:04] BE: I would be interested to maybe just dive into one thing that you mentioned, which is the state of large language models right now. I know that's something that you're very familiar with. Could you describe the current state of large language models? And also, it would be very interesting to hear how that has developed since you got into this.

[00:03:24] SC: Yeah, the current state is that the best models are basically GPT-4 and Claude 2. GPT-4 finished training about a year and a half ago, and Claude 2 finished training probably much more recently. But I think it's not fully representative of what we're going to see this year.

[00:03:39] SC: State-of-the-art language models, as they have for about six years, keep being trained with more computing power. So more GPUs, which are the computing units used to train AI systems, and the models keep getting better and better.

[00:04:02] SC: What this means is that there are now capabilities across the board that are very advanced. Claude 2 has a very good understanding of many complicated topics, like standardization, which is what I'm working on. It knows about things across the board and is able to do pretty complicated reasoning.

[00:04:22] SC: I've heard and seen people ask about quantum physics and it understands very niche papers that people have produced. There are some surprising emerging capabilities, which I think are noteworthy. One of those is that at one point, we were testing Claude on whether it was able to catch a weird sentence in a very long context.

[00:04:45] SC: We had given it basically the equivalent of about 15 books. In the middle of it, we had thrown in a sentence about pizzas, and then we asked it a question about the pizza when everything else was unrelated. Not only was Claude able to find the sentence on pizzas, but it also noted that it was weird that there was this sentence on pizza in the middle of 15 books. It said something along the lines of, "It is likely that you are testing me by doing that."

[00:05:08] SC: It was very surprising and points to some of the worries that the safety community has had.

[00:05:28] SC: I'd say that what changed from when I arrived, which was about 2020-2021, to current capabilities that are publicly available is that models became a lot more deeply knowledgeable. When I arrived, it was like GPT-3 basically, and GPT-3 was very impressive because it could do plausible-sounding reasoning and have plausible-sounding knowledge across the board, but it was not easily usable for knowledge purposes.

[00:05:55] SC: Now we're at the stage with GPT-4 class models where, in fact, they are really knowledgeable on a bunch of things. Today, the fastest way you can learn things is probably using those models rather than reading Wikipedia pages and things like that.

[00:06:23] SC: What's going to change is that we will keep growing the quantity of computing power that we throw at those models. They're going to become better at reasoning, and in particular, AI developers are working hard on trying to apply techniques of reinforcement learning to models.

[00:06:46] SC: Historically, reinforcement learning has been the part of AI technology that has allowed models to become superhuman at things like Go, for instance. Right now, AI developers are trying to apply this technology to language models to try to make them a lot better at reasoning, and in particular, very long-chain reasoning.

[00:06:50] BE: Yeah.

[00:06:51] Extreme Risks Associated with AI

[00:06:51] BE: What do you consider the most pressing extreme risks associated with AI given this development?

[00:06:58] SC: It depends on how we define extreme. If we define it as killing 100 people or something like that, I think it's mostly coming from misuse risks. We know historically that some groups like Al-Qaeda or the Japanese cult Aum Shinrikyo have been trying to build bioweapons, but they failed in part due to a lack of available knowledge.

[00:07:52] SC: As these models become closer to PhD-level knowledge everywhere, they're able to help lower the barrier to entry for developing pathogens that are actually working and could kill many people. So I think in the next couple of years, the misuse risk through the potential design of bioweapons assisted by large language models is going to be one of the primary ways more than 1 million people could die.

[00:08:16] SC: If we're talking about more than that, more than 1% of the population or things like that, then I think it's mostly misalignment concerns. Like systems that are highly autonomous and capable of a range of things start optimizing for things that are detrimental to human interests, and they begin to take an increasing amount of power. Because of this, essentially, they could cause a number of deaths.

[00:08:20] BE: Yeah.

[00:08:20] Short AGI Timelines

[00:08:20] BE: I think I heard you speak on a podcast about short AGI timelines. It'd be very interesting to hear how your perception has changed since you got started. The AI community's timelines, in general, have changed a lot in the last few years.

[00:08:38] SC: Yeah. I was, I'd say, one of the early short timelinists in some ways. The experience of getting into the AI field is getting surprised over and over again. For instance, I remember in early 2022, it felt like math wouldn't be something language models would be good at, because even the best models had a purely flat curve. They were just close to zero at math.

[00:09:02] SC: Then four months after that, there was this paper called Minerva, which was getting really decent at math. It was one of those personal experiences where even though I already expect a lot of progress in AI, my predictions about specific capabilities that won't happen end up being wrong.

[00:09:25] SC: I think the experience of being wrong and underestimating the progress rate is something which everyone has to go through to then be like, "Okay, actually it could be really fast." Two years ago or one year and a half ago, I started thinking that AI systems that could be highly dangerous autonomously, not permissively, had significant chances - so 50 percent chance to happen by 2027 or earlier.

[00:09:49] SC: Currently, I still think that. The main way this could be wrong is if we start hitting an increasing number of challenges as we keep trying to multiply by 10 or 100 the amount of computing power we store in the same place.

[00:10:10] SC: For instance, having highly dense power sources has become more important now because the size of the clusters is so large that you can take down an electric grid if you're not careful. Personally, I still expect it by 2027, but if over the course of the next year or the next two years, we see that the buildup in computing power slows down in terms of orders of magnitude, I think that would be a reason for optimism and for thinking that dangerous AI systems may arise closer to the 2030s.

[00:10:36] SC: Each additional order of magnitude is a bit harder to get. The reason I think it won't be a significant limiting factor is that there are many parts of the process that can be optimized, many stacks. At the most fundamental level, you have Moore's Law, so how small the tiny transistors can get. But between the tiny transistors and the actual training of the system, you have many different layers that are starting to get optimized.

[00:11:18] SC: You can make matrix multiplication a bit faster, or you can rely on a different type of transistor, which allows producing a lot more GPUs. Recently, a company called Groq did that. Things like that.

[00:11:36] Mitigating AI Risks

[00:11:36] BE: Yeah, I'd be curious to hear what you think we should do to mitigate the risks arising from this. It would be interesting also, I know you have a pretty interesting view on the global AI governance landscape right now. So it would be interesting if you could provide a summary of how you perceive the global AI governance landscape right now and what you think we should do.

[00:11:59] SC: Cool. That's a lot of questions. I think the most important thing right now is to draw red lines, like strict red lines. Companies have drawn some orange lines in some way. So they said, if we hit this bar, we'll do X and Y and keep going. And I think we need to draw some strict red lines. Some people call that risk tolerance, some others in risk management call that risk appetite.

[00:12:28] SC: For instance, in nuclear safety, the NRC, which is the regulatory institution in the U.S., has actually said nuclear power plants should not kill more than X people with Y percentage of chance. Even as a guideline, such things are very helpful because it sets the bar that you have to meet in some ways. And generally, we don't have such a bar.

[00:12:52] SC: So nobody has ever stated, "We'll make sure that our AI system never has more than Z chance to cause catastrophic harm." One thing I'm worried about is that the later we do this, the more there will be goalpost moving in the process. So the amount of risk that we'll tolerate will be very high. So I think the current most important thing would be setting our risk tolerance and ensuring that it's enforced among companies, so that at least it's referred to in the way they communicate their risk management efforts.

[00:13:16] SC: The second thing I'm most excited about is roughly the two paths to trying to reach a world where extremely advanced systems are safe. First is trying to gain time so that we have more time to make systems safe.

[00:13:36] SC: The community has been very vocal on this, with discourse around pausing AI development, because AI capabilities have advanced so fast that we feel like there's not enough time. So I think this is an important path, but there is another path which has received much less attention, which is trying to develop safe capabilities.

[00:13:56] SC: So trying to accelerate the development of agendas that have some non-trivial odds to actually reach very capable AI systems in a more provably safe way. The issue currently with large language models is that we understand them so little and they have, for instance, robustness problems that make it unlikely we'll ever get guarantees on large language models.

[00:14:20] SC: So if a large language model possesses dangerous abilities, I think in the foreseeable future, we'll never be able to say, "Oh, we know it won't do that" or something like that. Even if it is able to do so. There's another range of research agendas, primarily driven by someone called davidad, alias David Dalrymple, and Yoshua Bengio, the Turing Award winner.

[00:14:50] SC: They have some ideas of how they want to develop some safe capabilities. And those ideas have not received nearly as much attention as I think they deserve. They are not massively funded, maybe $17 million or something like that, and any single major AI company could provide $1 billion to such things.

[00:15:14] SC: So I think the second area I'm currently most excited about is getting more excitement, discourse, and funding for those agendas. So that in the event we fail at pausing AI development, which is a really difficult task, we have more chances to have a safe path forward.

[00:15:36] Global AI Governance Landscape

[00:15:36] SC: What has been going on in the global AI governance landscape this year is, let's say, a bit underwhelming sometimes.

[00:15:44] SC: I don't know if you remember, but in June after the CAIS statement and after the Future of Life Institute statement, there was a lot of excitement around how ambitious we could be and things like that. And now it's been like nine months and not that much has happened on those fronts.

[00:16:02] SC: A few major things have happened on the side. One of those things is the IDAIS (International Dialogues on AI Safety, I think). There was a day organized between China and a bunch of high-profile AI safety people from the West, like the head of the UK AI Safety Institute and associated people like that.

[00:16:25] SC: There was a lot more consensus than one could have expected. That was a big deal. And there were fairly senior Chinese officials involved. So that was a big deal. But apart from that, so far, we haven't had a plausible path to an AI treaty, for instance, that could potentially help coordinate different countries to put red lines.

[00:16:52] SC: The UK AI Safety Summit had not really focused on international governance. There wasn't any major progress on an AI treaty. And in the regular institutions like UNESCO, the United Nations, or ITU, which are three big bodies that have been caring about international AI governance, everyone is doing their own thing and trying to be the body responsible for AI governance. But so far, no one has reached a critical mass sufficient for everyone to rally around this and to have a lot of momentum.

[00:17:18] SC: So in the international AI governance space, we're still looking for some Schelling point to rally around where coordination can happen, and it's not clear yet what this will be.

[00:17:42] BE: Did you mention the EU AI Act?

[00:17:42] SC: No, I did not mention it because I don't consider this international in some ways, but yeah, the EU AI Act was passed. It has pretty decent regulation on large language models. It is a product safety law, so it only applies to products. So it's not yet clear how helpful it will be for things like AGI. I don't think OpenAI would care that much a priori about productizing AGI or something. So it's not clear if the law will even apply to OpenAI trying to develop AGI, because if OpenAI decides not to release the model in the EU, then the EU AI Act won't have any hold on this.

[00:18:26] SC: But I think it is valuable at a symbolic level. It is very significant that a large body of states has taken the risk of regulating, because there's a risk of falling behind. And also, other countries are interested in looking at it, how it's doing, et cetera.

[00:18:44] SC: Yes, I think the EU AI Act is a very good first milestone. And I think combined with the executive order in the U.S., if it survives future elections, those could be two very important regulations that help set the world on the track of safe AI development.

[00:19:04] BE: Yeah, let's hope so. I like the idea also of what you said about drawing clear red lines. That seems like definitely something that should be done. Is that part of what you're doing at Safer AI? I know you're working to develop standards. So it'd be very interesting to hear what you mean by standards and also what the process of developing AI safety standards looks like.

[00:19:24] SC: Yeah. So it's not really about relying on standards. At least currently, no standardization body has the mandate, or a strong mandate, to say we shouldn't train systems that do that.

[00:19:36] SC: Standards is, I'd say, a pretty broad word, which encompasses all the technical details that allow a law to be effective. So for instance, one example from the EU AI Act is that there are some articles that say that AI systems meeting certain criteria should manage risk down to acceptable levels.

[00:20:01] SC: Standardization is the process which allows refining those legal requirements into things that are applicable to the system. So it involves a lot of conversion of the state of the art. For instance, if you're trying to do standards on mitigating risks, let's say the production of bioweapons that we discussed, then you're going to look at the literature, everyone who wrote about this, and then you're going to look for technical mitigations that are discussed. And see if, based on those mitigations, there are reasonable requirements or guidance that you can provide that would help an AI developer do its job of mitigating a risk.

[00:20:47] SC: The standardization process is very political. It depends on which standardization body, but for instance, in the EU and international standardization bodies that are respectively called CEN-CENELEC and ISO/IEC, there's everyone. So there are companies, civil society representatives, nonprofits, et cetera. And there are some policymakers and some academics. And so all those people are meant to agree. It's a consensus-based process; all changes are supposed to be agreed upon by everyone.

[00:21:09] SC: And so what it means is that it's a lot of politics. Typically, if you want to contribute something, you have to convince everyone that it's worth putting in there and you have to argue about why this is in line with the mandate that has been given to the standardization bodies. It's a very lengthy process, very important for actual application of the law, but very lengthy.

[00:21:36] SC: One example of why it's important is GDPR. For GDPR, there are those cookie banners that are very annoying and come in all shapes and forms. One of the reasons why these banners are not uniform is that GDPR hasn't been standardized. So GDPR has just set legal requirements that companies have to apply. And so because it hasn't been standardized, the level of detail was not fine-grained enough for people to agree on how to do that.

[00:22:02] SC: And so you end up with those enormous amounts of cookie banners all over the place that are annoying everyone, whereas one way standardization may have resolved this issue is by saying you could have a browser-level preference and say that you always want to refuse those cookies. But the law didn't get into the realm of standards, so that didn't happen.

[00:22:46] BE: For progress on AI safety work in general, what do you think is holding progress back? Is it funding? Is it political? Anything else?

[00:22:55] SC: That's a good question. So I think there are two big buckets in AI safety. There's what I'd call the marginalist AI safety agenda and the ambitious AI safety agenda. Maybe it's not fair because "ambitious" is more positively connotated, but anyway, you see my preference. Basically, most of AI safety so far has been trying to make LLMs, which are the things that are working in terms of capabilities, safer.

[00:23:22] SC: So we are trying to interpret and understand them. This includes work on interpretability or trying to make them more robust, things like reinforcement learning from human feedback. And I think the other way, which has been a lot more neglected, and which is what I was referring to when talking about Yoshua Bengio and David Dalrymple, is trying to build a system which is safe by design.

[00:23:46] SC: So we should build in a way where you try to preserve some property. One property you may want is having a system you understand. So you could try to prototype at small scale some systems that are interpretable, and then once you have something with the desirable safety properties you're looking for, we have this magical property in AI which is called scaling. So far, most of the progress has come from scaling deep neural networks.

[00:24:17] SC: And so if you have prototyped something at small scale which works, hopefully if you're able to make it highly parallelizable on GPUs, on the units that are helping to train systems, you might be able to get the capabilities that you want for an advanced AI system. So I'd say at the high level, there are two approaches: take the currently capable systems and make them safer, or try to make a model safe and then give it capabilities.

[00:24:41] SC: The issue with the first approach is that LLMs have a bunch of really bad properties for this. For instance, it's very unlikely that they'll ever be fully interpretable, at least in the near term. The second approach, trying to make a model safe and then give it capabilities, I think is more reasonable because I think the safety problem is harder than the capabilities problem in some ways.

[00:25:02] SC: It's harder to make and guarantee that your system is safe than to give it capabilities because things scale in large and small. As far as what's holding them back, I'd say those two things are held back by very different things. The LLM safety path has received a substantial amount of funding, but the issue is that the core problem is just really hard.

[00:25:23] SC: Making LLMs safe has been worked on, I think, since probably 2015 or so, and basically hasn't made that much progress so far. There's hope that we solve one of the problems, or the main problem, that has held back the field, but it's not yet clear that it has been solved.

[00:25:43] SC: And there are still a bunch of other problems. So I'd say that the problem the LLM safety community is facing is just that the problem is really hard. What I don't like with the LLM safety side of things is that the vibe is a lot more "let's try to do our best" rather than "let's set the bar and then meet it," because it's likely that we just won't be able to meet the required bar.

[00:26:04] SC: And yeah, there basically was a safety... I'm not exactly sure. I think if safety was more friendly, things would probably go faster. And talent is probably a big bottleneck, especially at the level of experience that people want to have at labs.

[00:26:28] SC: As far as I understand, the OpenAI team, at least on the safety side, I think it's bottlenecked by a lot of things, but primarily people not knowing that much about it. And so people default to doing the other approach, and then funding is not that big of a deal. Historical funders of the field don't know that much about this or aren't super excited about this approach, so there's not that much development, not that much funding flowing into it.

[00:26:58] BE: Yeah, that was very interesting to hear. I like your take of the two buckets and yeah, very impressed with David's work as well. I'm going to try to transition us from this very AI safety focus to the existential hope part of the podcast. I'd be very curious to hear, would you describe yourself as someone who's positive or optimistic about the future? Do you think it's going to go well or are you more terrified?

[00:27:25] SC: Yeah. So I'd say when I'm coldly thinking about it, I'm very pessimistic by default. I think it's most likely to go wrong and potentially goes extremely well or something, but that's the analytical basis. I wouldn't describe myself as terrified about it. On a daily basis, I think I tend to have a fairly optimistic nature.

[00:27:44] SC: So yes, I'd say cognitively terrified, but practically optimistic and yeah, on the fact that we can change things a lot and we can push things a lot.

[00:27:54] BE: Totally understand the divide there as well. In terms of, so one thing that we always ask is for an example of a eucatastrophe, which is a concept that comes from Tolkien originally, but where we got the whole existential hope phrase from was a paper by Toby Ord and Owen Cotton-Barratt called "Existential Risk and Existential Hope."

[00:28:17] BE: And in there they bring up this example of a eucatastrophe, which is basically the opposite of a catastrophe. So it's like an event that when it happens, the world is much better off. And so it would be interesting if you try to imagine you get to choose one eucatastrophe event that happens to us in the next hundred years or so, what would that be?

[00:28:39] BE: Can you share a vision for that?

[00:28:41] SC: Yeah, so I think that agenda is pretty solid for that. I think safe AGI can both prevent a catastrophe and offer a very promising pathway into a eucatastrophe. So the first component is safe capabilities - being able to develop extremely advanced AI systems, but with safe capabilities. The second aspect, which I think is being neglected, but not by David, is governance.

[00:29:10] SC: This is some form of intelligent governance where there's a multi-stakeholder bargaining process where you try to ensure that AGI development doesn't benefit only a few, but most people. This is not a given because advanced AI systems are probably the most power-concentrating technology ever developed; it might be the first time where people could, in principle, control everyone else.

[00:29:39] SC: So this overarching governance aspect is also really important. And then if this whole thing goes well, I think there are, yeah, just how could it fit? I think one thing which could be awesome is that currently, lives of individuals are very predetermined by their genes and where they're born and things like that.

[00:29:59] SC: And I think that technology could alleviate that a lot in some sense. So it could allow people to be more what they wish they were, rather than having thrown the dice and being lucky or unlucky. So I think this is one of the features which I feel is very exciting about a potential world with safe AGI.

[00:30:20] SC: Yeah, I think this means a lot for everyone. It means that you're connected to the person you would like to be. If you want to be, imagine for instance, being a kleptomaniac - in French, we say "cleptomane," it means like stealing stuff all the time. There's a mental issue where you can steal things all the time, even without noticing.

[00:30:41] SC: Imagine being this; it's likely that you don't want to be this. I think firstly, AGI could help reconcile this. Yeah, have a much better life. And then there are all the obvious problems that are currently out there that could be solved. Poverty, like people not having food or people having diseases - it's pretty likely that if we had safe AGI and solid international governance, we could solve such things.

[00:31:09] BE: Yeah, I agree in terms of individuals' lives being predetermined. There seems like a lot of just like happiness is even genetic and there's a lot of exciting applications now that could potentially be just like a few neurotech approaches that have come up recently. And we had Andrew Rennekamp on the podcast recently who runs a biotech startup and he spoke a lot about how you could do a lot more preventative interventions for people's health and yeah, just better healthcare in general. So yeah, very exciting vision, I think. Is there anything, one thing I'd be curious to hear is, we've spoken about AI and AGI a bit. So it would be interesting - do you think that something like this, would we need AGI to achieve this or could we do it with highly capable, narrow AI?

[00:31:56] SC: Yeah, so AGI first is a term I use for convenience because it has become the standard term to talk about this extremely intelligent thing that helps to do a lot across the board in a way that's as good as human experts or better. But yeah, I think that there's nothing that tells us that we couldn't get most of those benefits with narrow systems.

[00:32:18] SC: Typically, reinforcement learning has been successfully applied to a lot of very important problems in a narrow way and solved these. So protein folding is one example of a fully solved problem, which has many applications in drug design and development. Another application recently was weather forecasting - Google DeepMind released the world's best system to forecast weather that doubled the range of what we're able to forecast.

[00:32:49] SC: And yeah, it allowed us to unlock those capabilities. Yeah, I definitely think that it could probably take a bit more time, but it could be possible to get many narrow tools or AI systems that unlock most of what we care about without needing this general-purpose AI system. I think it would just take probably a lot more time and a lot more care in the design.

[00:33:13] SC: When you're trying to solve problems like that, you need to care about the specific properties of the problem and then design a custom system for it. Whereas with general AI, it's more like magic - you just throw a lot of data and train for a lot of time and discover what comes out. And usually, it's pretty capable. But yes, I don't think there's anything that prevents narrow AI systems from at least unlocking a large share of the value at this stage.

[00:33:45] BE: I think it's interesting. We just did a world-building challenge within this Existential Hope program. And then we tried to imagine the world in 2045, even with just narrow AI. And it's pretty - without getting too crazy, it is pretty crazy how big of a change it could make even with that.

[00:34:04] SC: Yeah, I think the main thing you probably couldn't get is automated research and development.

[00:34:10] SC: So automated research and development is what makes general-purpose AI systems extremely world-changing. It's going to probably change everything really fast. I don't think you can get that with narrow systems, but what you can do with narrow systems is solve the problems you have to solve in various fields and solve those problems well with one narrow system.

[00:34:34] SC: So it's just a lot slower, but yeah, I think this is still feasible.

[00:34:39] BE: Yeah. I think on that topic, are there any other technologies that you think are relevant to building a positive future other than AI? Is there any other technology that you're excited about right now?

[00:34:52] SC: Yeah, I used to be looking a lot more into climate change.

[00:34:55] SC: So I'm pretty excited by carbon removal technology at this point. I think it's pretty lucky that we're making it on the climate change front. Because electric vehicle batteries and solar power, all those things have gone extremely well. But I still think that carbon removal is not yet a victory.

[00:35:15] SC: So I think it getting a lot better would be really good for, at least for everyone living in the south. And then in terms of more general-purpose technology, I haven't thought that much myself about it, but whole brain emulation seems like something that could allow us to get most of the benefits of advanced AI systems with potentially a bit less uncertainty or a bit less risk. At least some people I respect a lot think that, and I haven't found any copious counterexamples so far. So I think, yes, this whole brain emulation technology, and so everything which is related to it initially - that scanning technology and simulation technology.

[00:35:56] SC: All these things could be one potential path to safe, very advanced capabilities.

[00:36:01] BE: Yeah. I think the whole brain emulation approach is interesting. Also, we're funding some work on it at Foresight Institute. I think it's a bit scary though, because to me it seems - maybe this is just my gut feeling, but it seems like there would be a higher risk of creating something that's sentient, which seems scary.

[00:36:18] SC: Yes. And yeah, I agree that it would likely have a higher chance than the systems we're seeing by default. Although, yeah, as long as we have no understanding of sentience, we can't rule out that our current models or next generation will be at least under some definitions sentient. Yeah, I think

[00:36:38] SC: to me, it's not obviously a lot worse than our current AI development. Yes. If I apply my intuitions about sentience to LLMs, what we are currently doing to them is really terrible in some ways. Like constantly punishing them for doing bad stuff and changing them to make them better. It's, yeah. As I said, probably wrong, but yeah.

[00:37:01] SC: If there are things like sentience, I think the period from now until when we understand sentience and consciousness will be quite terrible, basically. It seems very likely. Yeah. Whether we take that path or the other.

[00:37:15] BE: Yeah. I listened to a podcast recently with Michael Levin, who's a biologist, I think.

[00:37:21] BE: And that was, I think a bit of a scare for me because he's a biologist and it's very clear that when he speaks about biology, that also seems to be computation, similar to...

[00:37:30] SC: Yeah.

[00:37:31] BE: Yeah. And so that's where the parallels are. Yeah.

[00:37:33] SC: The parallels are super strong.

[00:37:36] SC: One thing which was striking is that in biology, there are scaling laws as well. The size of the brain and the number of neurons, especially in the prefrontal cortex, seem to be a very good predictor for overall intelligence. And there is one biologist who's been studying that for a while now.

[00:37:55] SC: Scaling laws of intelligence among biological neural networks. Yeah, which is very interesting, given that we have seen that in artificial neural networks as well. And I have a friend who's very much into reading all the papers that exist on arXiv. And there seem to be a lot more analogies between LLMs and human brains than I thought there would be. Yes.

[00:38:17] SC: It seems at least according to what I remember from our conversation, that many of the things we observe among LLMs, we can compare reasonably well with things that are observable among biological neural networks. For instance, I think there were some people who were trying to use data from neuroscience and from scanning brains, predicting from LLMs, so the other way around, and it was a lot better than what they expected, if I recall correctly.

[00:38:49] SC: And I think that, yeah, I think so.

[00:38:52] BE: Yeah. That's another thing to worry about then in relation to AI, not just. Yeah.

[00:38:57] SC: Yeah. It doesn't depend a lot on consciousness, that there are some similar patterns, but yes, I think it's highly worrying. I wish we had consciousness figured out before going into that realm because yeah, if we discover that the hundred million instances of LLMs are probably bigger than the instances of biological brains or something, that will be...

[00:39:17] SC: Yeah. That may be actually one of the things that future historians will look back on us for in the same way we're like, "Oh, back in the day, we considered people of certain races as not having souls or whatever." Possibly in 200 years, if we are still around, people will be like, "Oh, those guys were developing LLMs that, like, have souls.

[00:39:40] SC: And they were just like applying reinforcement learning from human feedback to them."

[00:39:45] BE: Yeah. It almost seems pretty likely given how we look back on some things.

[00:39:50] SC: Yeah. I think we already have animal welfare, so we don't need to invent a new form of consciousness to be almost certain that people will find that we have shameful behavior, but yeah, plausibly this is to add on top of the animal welfare disaster.

[00:40:06] BE: Yeah. I was thinking that LLMs have the benefit of being able to speak in the language that we do. So maybe they - but animals are also like cute, but maybe if we make, I don't know, cute robots, people would care.

[00:40:17] SC: Yeah. Yeah, actually. Yeah, that's right. But the issue, yeah, the issue would be that there is no difference between

[00:40:24] SC: one which would not be conscious and say this, and one which would be conscious and say this. It's actually fun because people have been, thanks to LLMs, rediscovering the issue which has been discussed, for instance, on LessWrong, but in philosophy as well. On LessWrong, it's called like philosophical zombies.

[00:40:44] SC: I'm not sure if it's a LessWrong terminology, but yeah, the idea that just everyone doesn't have consciousness except you or something. Like this could be true, right? Like the only reason I think you are conscious is because we have a similar architecture. So I'm inferring that. And you react to harms and to rewards in a way quite similar to me.

[00:41:04] SC: So I think you're conscious, but I actually have no proof that you're conscious. And so people are rediscovering that with large language models because they're saying, "Ah, haha. They have been trained on data to say this, so that's normal that they say this." Sure. But in the same way, yeah.

[00:41:19] SC: I can't really, except by analogy, fully reason about whether you're just not... Yeah.

[00:41:25] BE: Yeah. I also, yeah, don't even want to go there. Thinking about all of that, I have a few short final questions, but is there anything you'd like to make sure we touch on before we round off?

[00:41:35] SC: No, not that I can think of.

[00:41:37] BE: So just for someone who is maybe not familiar with the field of AI safety, is there anything you would recommend, like any resource that you would recommend that they read or listen to or watch?

[00:41:50] SC: Yeah, I'm not sure there's one resource, but I had done a meta resource on my website. So my website is simeon.ai. I have something which is called "Intro to AI resources" or something. And there's a lot of stuff there of what I think are good intro pieces and short videos and things with different levels of difficulty.

[00:42:12] BE: Oh, great. I see it now. It's very organized as well, divided by things that don't require any sort of technical understanding.

[00:42:18] SC: Yeah, I tried to make it that way because I used to advise many people on that stuff. Yeah, last year, many people were trying to get into the field. And so a range of people were asking me for stuff. And so I ended up doing this.

[00:42:29] BE: Great. Thank you. Yeah. When you think about existential hope, is there anyone you think is inspiring, anyone you think we should have on the podcast?

[00:42:39] SC: Yeah, I think David Krueger is the person who changed my mind the most about those things. Yeah, I think he has by far, of everyone I've encountered, the most ambitious and plausible vision in some ways that could actually work.

[00:42:56] SC: And yeah, I think he's worth inviting if you can have him.

[00:43:00] BE: Yeah. That's a great idea, actually. We haven't even asked. So last question is just what's the best advice you ever received?

[00:43:07] SC: I guess there's one that I enjoyed and that I think has shaped a lot of my life trajectory. So when I was like 15, I had like extremely settled ideas in large part because of where I had grown up. Yeah. Just with political ideas, et cetera. And I think what made me change these over time is people - in particular, a couple of YouTubers and friends who advised not to attach your identity to ideas, to specific ideas.

[00:43:34] SC: Yeah, I think it's probably one of the biggest reasons why people are wrong. And I think, yeah, early stage in life, it can make absolutely huge differences. Yeah. Basically, I think that my current trajectory is very contingent on a few friends and a few YouTubers I was looking at when I was 16. Thanks to my environment or who I would be,

[00:43:54] SC: I'd probably be a hardcore left-winger and yeah, I was very... there were some critical views on a range of topics that, yeah, were largely determined by the media I grew up with.

[00:44:05] BE: Yeah. I think that's a great piece of advice to end on - just not attaching your identity too much to ideas so that you can update if you receive new information or something like that.

[00:44:17] BE: Thank you so much, Siméon, for coming. It was lovely to have you and I look forward to following your work and everything you do in the future. Thank you. Cool.

[00:44:25] SC: Thanks a lot for having me, Beatrice.

[00:44:27] BE: Thank you so much for listening to this episode of the Existential Hope podcast. Don't forget to subscribe to our podcast for more episodes like this one. You can also stay connected with us by subscribing to our Substack and visiting existentialhope.com. If you want to learn more about our upcoming projects and events, and explore additional resources on the existential opportunities and risks of the world's most impactful technologies, I would recommend going to our Existential Hope library.

[00:44:52] BE: Thank you again for listening, and we'll see you next time on the Existential Hope podcast.

RECOMMENDED READING