Many projects are dedicated to identifying the threats to human existence but very few offer paths for what to aim for instead. In light of humanity’s countless challenges, pessimism, negativity, and fatalism about the future are likely traps. Instead of lowering our expectations to fit reality, we need to keep in mind that nothing is stopping us from changing reality to meet our hopes and dreams. Here we want to explore what it means to act from a place of existential hope about the future, rather than a place of existential angst. - Stewart Brand
- What is intelligence? - Luke Muehlhauser. An attempt at defining intelligence.
- 71 Definitions of Intelligence - Shane Legg. Surveys definitions of intelligence collected from dictionaries, psychologists, and AI researchers.
- Possible Minds - John Brockman. Collection of essays about different mind architectures enabled by technology.
- Society of Mind - Marvin Minsky. Classic on how intelligence, at the level of both the individual and society, arises from the interaction of many simple agents that are not themselves intelligent.
- What is Intelligence? - Kurzgesagt. Intro-level explainer video defining intelligence.
Artificial (General) Intelligence
- Malicious Use of AI Report - Miles Brundage et al. Report on various risks arising from near-term and longer-term progress in AI, and potential policy and technology approaches to address those risks.
- Slaughterbots - Stop Autonomous Weapons. Fictional short video on the dangers of lethal autonomous weapons.
- Information Security Concerns for AI & The Long-term Future - Jeff Ladish. Introduces information security as a crucial problem that is currently undervalued by the AI safety community.
- Teachable Moment Dual Use - Lawfare Podcast. Interviews two scientists who created an AI-powered molecule generator that could design thousands of new biochemical weapons within hours.
- AGI Ruin: A List of Lethalities - Eliezer Yudkowsky. Forty-three reasons that make Yudkowsky pessimistic about our world being able to solve AGI safety. A longer list of dangers can be found in Rationality: From AI to Zombies - Eliezer Yudkowsky. Especially My Naturalist Awakening, That Tiny Note of Discord, Sympathetic Minds, Sorting Pebbles Into Correct Heaps, No Universally Compelling Arguments, The Design Space of Minds-in-General, Detached Lever Fallacy, Ethical Injunctions, Magical Theories, Fake Utility Functions.
- The Basic AI Drives - Steve Omohundro. On fundamental drives that may be inherent in any artificially intelligent system and their dangers.
- Orthogonality Thesis - Nick Bostrom. On why greater intelligence need not go together with alignment to human values: intelligence and final goals can vary independently.
- AI Alignment & Security - Paul Christiano. On how the relationship between security and alignment concerns is underappreciated.
- Eliciting Latent Knowledge - Paul Christiano, Ajeya Cotra, Mark Xu. On how to train models to elicit knowledge of off-screen events that is latent to the models.
- Artificial Intelligence, Values and Alignment - Iason Gabriel. On philosophical considerations in relation to AI alignment, proposing to select principles that receive widespread reflective endorsement.
- Reading Guide for the Global Politics of Artificial Intelligence - Allan Dafoe. A Dropbox link to a reading list about the geopolitical implications of advanced artificial intelligence.
- Smart Policies for Artificial Intelligence - Miles Brundage, Joanna Bryson. Argues that it is not too early for policy considerations around increasingly advanced AIs and what we can learn from previous policy failures.
- A Survey of Artificial General Intelligence Projects for Ethics, Risk, and Policy - Seth Baum. A now slightly dated map of AI and AI risk projects.
- Public Policy Desiderata in the Development of Machine Superintelligence - Nick Bostrom, Allan Dafoe, Carrick Flynn. An extensive introduction to how advanced AI affects various policy domains.
- Deciphering China’s AI Dream - Jeffrey Ding. Analysis of China’s mission to lead the world in AI, including implications for international cooperation.
- Racing to the Precipice: A Model of Artificial Intelligence Development - Stuart Armstrong, Nick Bostrom, Carl Shulman. Game-theoretic analysis of factors influencing race dynamics in AI development, such as available information or the number of competitors.
- AI And The Future of Defense - Stephan De Spiegeleire, Matthijs Maas, Tim Sweijs. Summarizes AI history and approaches, and defense history and approaches, before combining both topics to point out novel concerns.
- Unilateralist’s Curse: The Case for a Principle of Conformity - Anders Sandberg, Nick Bostrom, Tom Douglas. On the dangers of individual actors racing ahead in AGI development.
- Superintelligence: Coordination & Strategy - Roman Yampolskiy, Allison Duettmann. Collection of papers on challenges and strategies for ensuring a cooperative development of advanced AI.
- Daemon - Daniel Suarez. On the disastrous near-term implications of AI.
- Autonomous - Annalee Newitz. On dangers of biotechnology, AI and the intersection of both.
- After On - Rob Reid. On near-term risks of AI-infused social media.
- That Alien Message - Eliezer Yudkowsky. Short story illustrating, by analogy, how a smarter-than-human intelligence could outthink the beings that created and confine it.
- Reframing Superintelligence, Intelligence Distillation, QNRs: Toward Language Models for Machines - Eric Drexler. Provides an alternative approach to AGI in which agents are a class of service-providing products, rather than a singleton-like engine of progress in themselves.
- Building Safe AI - Andrew Trask. Describes how federated learning could be leveraged to build an AI system that can produce insights based on the encrypted data of two mutually suspicious parties without itself gaining access to the data or leaking any information about its own algorithms.
- The Long-Tail Problem in AI and How Autonomous Markets Can Solve Them - Ali Yahya. On how decentralized autonomous hiveminds can incentivize local knowledge for problem-solving in a way superior to top-down singleton-like systems, potentially creating a viable alternative.
- Blockchain-based ML Market Places - Fred Ehrsam. On how crypto marketplaces powered by ML and opt-in privacy can create decentralized alternatives to top-down AIs.
- Safe AGI via DLT - Kristen Carlson. On how distributed ledger technologies and related innovations may help create a safe ecosystem for the development of AGI.
- Pluralism Through Personalized AIs - Steve Omohundro. On how personal AI assistants can support human goals and cooperation across them.
- Open Source Game Theory is Weird - Andrew Critch. Introduces open source game theory.
- Gaming the Future - Allison Duettmann, Mark S. Miller, Christine Peterson. Introduces a decentralized approach to advanced AI that focuses on improved problem-solving arising from the cooperation of individual agents. Also: Civilization as Relevant Superintelligence, or Welcome New Players: AIs | Steve Omohundro & Trent McConaghy - Foresight Institute.
- Decentralized Approaches to AI Presentations | Robin Hanson, Eric Drexler, Mark S. Miller 1, Decentralized Approaches to AI Discussion | Robin Hanson, Eric Drexler, Mark S. Miller 2 - Foresight Institute. Panel discussion among the three major proponents of a decentralized approach to AI.
- Debate - Geoffrey Irving. Proposes an AI safety technique that relies on various agents debating each other, judged by a human.
- Using GPT-Eliezer against ChatGPT Jailbreaking - Stuart Armstrong, Rebecca Gorman. On using ChatGPT to detect prompts that are dangerous to feed into other AI chatbots, such as ChatGPT.
- Understand - Ted Chiang. Short story on the promise and peril of dramatically enhanced human intelligence.
- GPT-3 Fiction - Gwern. Fiction written by GPT-3.
- AI Aftermath Scenarios - Max Tegmark. Surveys twelve potential long-term scenarios arising from AI, classified according to different utopian or dystopian ideals.
- The Three-Body Problem - Cixin Liu. Sci-fi classic on alien contact.
Staying Up to Date
Organizations on the Cause
- Guide to Working in AI Policy - Miles Brundage. On potential roles, requirements, and organizations in AI policy.
- AI Safety Camp - Connects you with an experienced research mentor to collaborate on their open problem during intensive co-working sprints – helping you test your fit for a potential career in AI safety research.
- MIRI (Machine Intelligence Research Institute) - MIRI's artificial intelligence research is focused on developing the mathematical theory of trustworthy reasoning for advanced autonomous AI systems.
- OpenAI - An AI research and deployment company, founded as a non-profit, that aims to develop AI in such a way as to benefit humanity as a whole.
- DeepMind - Especially the DeepMind Ethics & Society Unit. See also: The Mind of Demis Hassabis for an overview of the thinking of DeepMind’s co-founder.
- Center for Human-Compatible AI - CHAI's goal is to develop the conceptual and technical wherewithal to reorient the general thrust of AI research towards provably beneficial systems.
- Anthropic - A research company that’s working to build reliable, interpretable, and steerable AI systems.
- Future of Humanity Institute - A multidisciplinary research institute at the University of Oxford working on existential risk.
- Future of Life Institute - A volunteer-run research and outreach organization in the Boston area that works to mitigate existential risks facing humanity, particularly existential risk from advanced artificial intelligence.
- GovAI - Builds a global research community dedicated to helping humanity navigate the transition to a world with advanced AI.
- Leverhulme Centre for the Future of Artificial Intelligence - A global community to ensure that AI benefits all of humanity.
- AI Objectives Institute - The objective of this organization is to help humanity pick better objectives for AI systems, markets, and other large-scale optimization processes.
- Aligned AI - A benefit corporation dedicated to solving the alignment problem – for all types of algorithms and AIs, from simple recommender systems to hypothetical superintelligences.
- Ought - A product-driven research lab that develops mechanisms for delegating open-ended thinking to advanced machine learning systems.
- OpenMined - Helps each member of society answer their most important questions by empowering them to learn from data owned and governed by others.
- Foresight Institute - Supports the beneficial development of high-impact technologies to make great futures more likely.
- AI Startups in SF - Overview of AI startups based in San Francisco.