How OpenAI aims to fend off risks posed by AI
OpenAI, the company behind the widely popular chatbot ChatGPT, has established an internal group to examine and address potential "catastrophic risks" associated with AI models. The group, called "Preparedness," will "track, evaluate, forecast, and protect" against future AI system threats, including their capacity to manipulate and deceive humans, for example in phishing attacks, and to generate harmful code.
The team will assess AI risks including nuclear hazards
The Preparedness team will investigate a range of risk categories related to AI models, including "chemical, biological, radiological, and nuclear" hazards as well as "autonomous replication," the act of an AI replicating itself. The company is also interested in examining "less obvious" aspects of AI risk and has launched a contest to gather risk-study ideas from the public. The top ten entries will receive $25,000 (about Rs. 20.8 lakh) and a position at Preparedness.
Preparedness as an AI safety SWAT team
The Preparedness group will function as an AI safety SWAT team, conducting thorough assessments of OpenAI's cutting-edge AI models. The group will "red team" OpenAI's own AI systems to proactively identify weaknesses. It will also be responsible for developing a "risk-informed development policy" (RDP) that will outline OpenAI's strategy for creating AI model evaluations and monitoring tools. The team will be led by Aleksander Madry, the director of MIT's Center for Deployable Machine Learning.
OpenAI previously announced a team to manage 'superintelligent' AI forms
The announcement of the Preparedness team coincides with a major UK government summit on AI safety. It also follows OpenAI's earlier declaration that it would create a team to research, guide, and manage emerging "superintelligent" AI. Both OpenAI CEO Sam Altman and Ilya Sutskever, OpenAI's chief scientist and co-founder, believe that AI surpassing human intelligence could emerge within the next decade. Such AI may not necessarily be benevolent, which is why they see a need to investigate ways to limit and control it.