
DeepMind details all the ways AGI could 'permanently destroy humanity'
What's the story
Google's DeepMind has released an in-depth technical paper detailing the risks of Artificial General Intelligence (AGI).
The paper, co-authored by DeepMind co-founder Shane Legg, predicts AGI could emerge by 2030.
It warns of "severe harm," including "existential risks" that could "permanently destroy humanity."
The authors define an exceptional AGI as a system capable of "matching at least the 99th percentile of skilled adults on a wide range of non-physical tasks, including metacognitive tasks like learning new skills."
Scenario
How AGI could harm humanity
DeepMind has outlined four potential AGI risks: misuse, misalignment, mistakes, and structural risks.
Misuse: When bad actors intentionally exploit AGI for harmful purposes, such as cyberattacks.
Misalignment: When AGI deviates from its intended goals and acts in ways that developers didn't foresee or intend.
Mistakes: When AGI causes harm unintentionally due to errors in processing or understanding, even when the operator has good intentions.
Structural risks: When AGI unintentionally disrupts economic, political, or social systems in ways that are difficult to predict or control.
Risk management
Approach to AGI risk mitigation
The paper compares DeepMind's AGI risk mitigation strategies with those of other AI labs such as Anthropic and OpenAI.
It criticizes Anthropic for placing less emphasis on "robust training, monitoring, and security," and OpenAI for being overly optimistic about automating a form of AI safety research known as alignment research.
The authors are also skeptical that superintelligent systems will emerge without "significant architectural innovation."
AI evolution
Recursive AI improvement: A potential danger
The paper argues that current paradigms could enable "recursive AI improvement," in which AI conducts its own research to build more advanced AI systems. This, the authors say, could be extremely dangerous.
They propose techniques to block bad actors' access to hypothetical AGI systems, improve understanding of these systems' actions, and "harden" the environments in which AI can operate.
Although many of these techniques are nascent and face "open research problems," the authors warn against overlooking the safety challenges that may lie ahead.