Anthropic recruits ex-OpenAI safety chief to lead new 'Superalignment' team
Jan Leike, a prominent AI researcher who recently resigned from OpenAI, has joined rival firm Anthropic. At Anthropic, Leike will lead a new "Superalignment" team dedicated to AI safety and security. The team's work will center on "scalable oversight," "weak-to-strong generalization," and automated alignment research. Leike's move comes after he publicly criticized OpenAI's approach to AI safety.
Leike's role at Anthropic
In his new role, Leike will report directly to Jared Kaplan, Anthropic's Chief Science Officer. As he builds out his team, Anthropic researchers currently working on scalable oversight will transition to report to him. Scalable oversight refers to techniques for keeping the behavior of large-scale AI systems predictable and desirable. The mission of Leike's new team mirrors that of OpenAI's recently dissolved Superalignment team, which he previously co-led.
Anthropic's commitment to AI safety
Anthropic has often positioned itself as more safety-focused than OpenAI. The company's CEO, Dario Amodei, was once VP of research at OpenAI and reportedly split with the company over a disagreement about its direction, namely its growing commercial focus. Amodei brought several ex-OpenAI employees with him to launch Anthropic, including OpenAI's former policy lead, Jack Clark.