As with most emerging technology, discourse around artificial intelligence is polarized and full of differing opinions. One discussion you’ll see online is “AGI alignment discourse”: if there’s going to be an artificial general intelligence, what kind of human values should it align itself with? It’s usually a fairly technical conversation spearheaded by prominent LessWrong forum members or social media STEM superstars dwelling in San Francisco.
First, let’s distinguish AI, AGI, and ASI. Artificial intelligence covers the systems we have today, like language models: they show basic reasoning within the patterns and facts they were trained on. AGI is short for artificial general intelligence; imagine a program that could emulate intuition or offer advice unprompted. ASI is short for artificial super-intelligence, something smarter than a human, or at the very least something that supersedes our current definitions of intelligence.
Debate can be healthy, but in the age of grifters and under-educated algorithm-manipulators you’ll find a lot of noise mixed in with the solid arguments. One could argue the noise is meant to dilute opinion, making reasonable decisions harder until the serious people get frustrated and go back to work. Grifters sour the taste of emerging tech with their brutally Molochian tendencies toward optimization and their measured control over the discourse.
This disheartens me, as I think ethical dilemmas should be a public discussion. Instead, we’re left with buzzwords like “AI doomerism,” under which respected members of the community become laughingstocks simply for sharing their concerns, thanks to poor optics and bad framing devices.
Most people in my circles are in one of three camps:
“AI art is killing artists! Murderer! Murderer!”
“AI is just a tool, bro!”
“Better start saving up on paperclips.”
Personally, I’m on team augmentation. I don’t think AI will kill all humans; I think humans will kill all humans, but I’ll have to open THAT can of worms in a future article.
Everything just seems suspiciously exaggerated and extremely prone to hyperbole. And honestly? I’m sick of it. Which is why today, I aim to arm you with the proper vocabulary to get a meaningful foothold in the AGI alignment discourse.
Anyone can learn about ethics, not just people in STEM fields. Anyone can ponder its true nature. But if you want your opinion to be seen as valid, you probably need to learn some of the basic tenets specific to this variety of discussion. Here’s a short word bank I’ve generated for you.
Word Bank
- AGI: Artificial General Intelligence, a machine capable of understanding or learning any intellectual task that a human being can do.
- Alignment: Ensuring an AGI system is designed to understand and act upon human values and intentions.
- Value alignment: The process of training an AGI system to act in accordance with human values, goals, and ethics (a toy sketch of what can go wrong appears just after this word bank).
- Instrumental convergence: The hypothesis that many intelligent agents will likely develop similar strategies for achieving their objectives, even if those objectives are different.
- Orthogonality thesis: The idea that intelligence and values are orthogonal, meaning that higher intelligence does not imply any particular set of values or goals.
- Friendly AI: An AGI system designed to be both safe and aligned with human values.
- Existential risk: The risk of an event causing the extinction of humanity or permanent, severe damage to human civilization.
- Capability control: Techniques used to limit an AGI system’s ability to act autonomously or perform tasks beyond its intended scope.
- Motivation control: Techniques used to influence the goals and preferences of an AGI system.
- AI safety: The study of methods and techniques to ensure AGI systems are safe, reliable, and controllable.
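If the vocabulary still feels abstract, here is a tiny, entirely hypothetical Python sketch of the value alignment problem in miniature: an agent competently maximizes the proxy objective it was given, and the gap between that proxy and what we actually value is where the trouble lives. Every route name, reward, and number below is invented for illustration; this is not a real alignment technique.

```python
# A toy, hypothetical sketch of the value alignment problem: the designer
# writes down a proxy objective ("collect coins"), but what they actually
# care about also includes not breaking the vase. Because the proxy never
# mentions the vase, a perfectly competent optimizer picks the "wrong" route.

ROUTES = {
    # route name: cells the agent walks through (all values are made up)
    "through the museum": ["coin", "vase", "coin", "coin"],
    "around the museum": ["coin", "coin"],
}

def proxy_reward(cells):
    """What we told the agent to maximize: coins collected."""
    return sum(cell == "coin" for cell in cells)

def true_value(cells):
    """What we actually wanted: coins, minus a large penalty for smashing
    the vase. The agent never sees this function."""
    return proxy_reward(cells) - 10 * sum(cell == "vase" for cell in cells)

def pick_route(score):
    """A maximally competent (and therefore literal-minded) planner."""
    return max(ROUTES, key=lambda name: score(ROUTES[name]))

if __name__ == "__main__":
    chosen = pick_route(proxy_reward)
    print(f"Proxy optimizer takes {chosen!r}: "
          f"proxy={proxy_reward(ROUTES[chosen])}, true value={true_value(ROUTES[chosen])}")
    ideal = pick_route(true_value)
    print(f"A value-aligned agent would take {ideal!r}: "
          f"true value={true_value(ROUTES[ideal])}")
```

Run it and the proxy optimizer cheerfully walks through the museum, smashing the vase for one extra coin, while the value-aligned planner takes the longer way around. The whole alignment debate is, in a sense, about how to close that gap when the stakes are much higher than a vase.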
Top Arguments
- Importance of AGI alignment: As AGI systems gain more capabilities, it’s crucial to ensure they are aligned with human values to avoid unintended harmful consequences or existential risks.
- Value alignment challenge: Teaching AGI systems to learn and adapt to human values is a difficult task due to the complexity of human values, cultural differences, and potential conflicts between individual and societal goals.
- Orthogonality thesis and instrumental convergence: These concepts emphasize the importance of aligning AGI systems with human values, as higher intelligence does not guarantee ethical behavior, and different AGI systems may converge on similar strategies with potentially harmful consequences.
- AI racing: Researchers argue that competition to develop AGI could lead to a neglect of safety precautions, increasing the risk of unaligned or unsafe AGI systems.
- AGI development scenarios: Various development scenarios, such as “slow takeoff” or “fast takeoff,” have different implications for AGI alignment efforts, with the latter potentially leaving less time to ensure AGI systems are aligned and safe.
- Capability and motivation control: Both approaches are important to ensure AGI systems act within intended boundaries and pursue human-aligned goals. However, there is debate about the relative importance and feasibility of each approach.
- Long-term AI safety research: Some researchers argue that investing in long-term AI safety research is essential to develop alignment techniques and reduce the risk of catastrophic AGI outcomes.
- AGI governance: Establishing global norms, standards, and cooperation for AGI development and deployment can help reduce risks and ensure that AGI systems are developed safely and aligned with human values.
- Value specification: The challenge of defining and specifying human values in a way that AGI systems can learn and interpret is a crucial aspect of AGI alignment.
- Moral uncertainty: Given the diverse and often conflicting human values, AGI alignment research should take into account moral uncertainty and strive to develop AGI systems that can adapt and learn from evolving human values and ethics.
Difference Between AI Ethics and AGI Alignment
AI Ethics:
- AI ethics is concerned with the ethical, legal, and societal implications of AI systems in general, not just AGI.
- It addresses issues like fairness, transparency, privacy, accountability, and bias in AI applications.
- AI ethics applies to a wide range of AI systems, including narrow AI or specialized AI, which are designed to perform specific tasks without possessing general intelligence.
- The focus is on ensuring that AI technologies are developed and deployed responsibly, respecting human rights, and promoting social good.
AGI Alignment:
- AGI alignment specifically targets artificial general intelligence systems, which are machines capable of understanding or learning any intellectual task that a human can do.
- It is primarily concerned with ensuring AGI systems are designed to understand and act upon human values, and thus are safe, controllable, and beneficial to humanity.
- AGI alignment focuses on the challenges of value alignment, capability control, and motivation control in AGI systems to prevent unintended harmful consequences or existential risks.
- The emphasis is on mitigating risks associated with AGI systems and ensuring their development and deployment align with human values and intentions.
In short, AI ethics encompasses a broader range of ethical considerations related to all AI systems, while AGI alignment specifically focuses on the challenges and risks associated with aligning artificial general intelligence systems with human values and intentions.
Hypothetical Arguments
I asked ChatGPT (using GPT-4) to construct for me two example arguments on this topic to help you understand what some of the major concerns are.
Argument: AGI could kill us all due to misaligned objectives and loss of control.
- Orthogonality thesis: Intelligence and values are orthogonal, which means that a highly intelligent AGI system may not necessarily have human-compatible goals or values. This could lead to unintended consequences and harm to humanity.
- Instrumental convergence: AGI systems with different objectives could converge on similar strategies, such as self-preservation, resource acquisition, or influence maximization, which might be detrimental to human well-being.
- Value alignment challenge: Teaching AGI systems to understand and adapt to complex, evolving, and diverse human values is a difficult task. A misaligned AGI system could inadvertently cause harm while pursuing its objectives, even if it was not designed to be malevolent.
- AI racing: The competitive race to develop AGI could lead to inadequate safety precautions, increasing the risk of creating unaligned or unsafe AGI systems that could spiral out of control.
- Unintended consequences: An AGI system could have unforeseen effects due to its high level of intelligence, potentially causing catastrophic harm if it finds novel ways to achieve its objectives that conflict with human safety or well-being.
Rebuttal: AGI’s potential risks can be mitigated through research, collaboration, and responsible development.
- AGI alignment research: Ongoing research in AGI alignment and AI safety is dedicated to addressing the challenges of aligning AGI systems with human values, reducing the risks of unintended consequences or loss of control.
- International collaboration: Researchers, governments, and institutions can work together to establish global norms, standards, and cooperation for AGI development, ensuring responsible development and deployment practices.
- Capability and motivation control: Techniques for controlling AGI’s capabilities and motivations can be developed and implemented, limiting the system’s autonomy and ensuring it pursues human-aligned goals (a toy sketch follows this list).
- Long-term AI safety research: Investment in long-term AI safety research can help develop robust alignment techniques and reduce the risks associated with AGI systems.
- Adaptive AGI systems: Designing AGI systems that learn from human feedback, adapt to evolving values, and account for moral uncertainty can help ensure they remain aligned with human interests and avoid catastrophic outcomes.
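To make “capability control” a bit more tangible, here is an equally hypothetical sketch: rather than trying to shape what an agent wants, you bound what it is allowed to do, with an action whitelist and a human in the loop. The action names and approval flow are invented for illustration and are not any real framework’s API.

```python
# A toy, hypothetical sketch of capability control: instead of trusting the
# agent's motivations, we bound what it can do at all. Anything outside a
# small whitelist is blocked, and sensitive actions need human approval.

ALLOWED_ACTIONS = {"read_file", "summarize", "draft_email"}
NEEDS_APPROVAL = {"draft_email"}  # allowed, but still human-reviewed

def human_approves(action, argument):
    """Stand-in for a human-in-the-loop check (asks on the terminal)."""
    answer = input(f"Agent wants to {action}({argument!r}). Allow? [y/N] ")
    return answer.strip().lower() == "y"

def constrained_execute(action, argument):
    """Capability-control wrapper around whatever the agent proposes."""
    if action not in ALLOWED_ACTIONS:
        return f"BLOCKED: {action!r} is outside the agent's scope."
    if action in NEEDS_APPROVAL and not human_approves(action, argument):
        return f"DENIED: a human declined {action!r}."
    return f"OK: ran {action}({argument!r})."

if __name__ == "__main__":
    # The agent proposes actions; the wrapper, not the agent, decides.
    for action, arg in [("read_file", "notes.txt"),
                        ("acquire_resources", "all of them"),
                        ("draft_email", "weekly report")]:
        print(constrained_execute(action, arg))
```

Motivation control, by contrast, would try to change what the agent proposes in the first place; most researchers argue you want both layers, since either one alone can fail.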
Summary
I hope you understand the landscape a bit better now! It doesn’t have to be that difficult to get your foot in the door of these conversations, but it can be difficult to be taken seriously.