Microsoft's new AI model pulled from internet over missed 'toxicity testing'
Last week, Microsoft researchers released WizardLM 2, a powerful open-source language model. However, because a mandatory "toxicity testing" step was skipped in the release process, the model was quickly pulled from the internet. During its brief availability, several users managed to download and redistribute the model on GitHub and Hugging Face. The creators of WizardLM 2 had hailed it as Microsoft's "next generation state-of-the-art large language model."
WizardLM 2 was trained with 'synthetic' data
The distinctive aspect of WizardLM 2 is its training method. Rather than relying solely on human-created data, the model was trained on "synthetic" data generated by other AI systems. The researchers behind the project are confident this approach will yield more robust AI models as human-produced data becomes a less viable resource for large language model (LLM) training. The technique marks a significant shift in how future AI models may be developed.
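To make the idea concrete, here is a minimal sketch of a synthetic-data pipeline: a "teacher" model rewrites a seed instruction into a harder one and then answers it, yielding a machine-generated training example. The `generate()` function below is a hypothetical stub standing in for a real teacher-LLM call; the details of WizardLM 2's actual pipeline have not been published.

```python
# Sketch of synthetic instruction-data generation. generate() is a
# hypothetical stub for a teacher LLM; real pipelines would call an
# actual model API here.

def generate(prompt: str) -> str:
    """Hypothetical stand-in for a teacher-LLM call (canned outputs)."""
    canned = {
        "Write a harder variant of: Explain recursion.":
            "Explain recursion and contrast it with iteration, with examples.",
        "Answer: Explain recursion and contrast it with iteration, with examples.":
            "Recursion is when a function calls itself on a smaller input; "
            "iteration repeats work with loops instead.",
    }
    return canned.get(prompt, "(model output)")

def synthesize_pair(seed_instruction: str) -> dict:
    """Evolve a seed instruction into a harder one, then answer it,
    producing one synthetic (instruction, response) training example."""
    evolved = generate(f"Write a harder variant of: {seed_instruction}")
    answer = generate(f"Answer: {evolved}")
    return {"instruction": evolved, "response": answer}

dataset = [synthesize_pair("Explain recursion.")]
print(dataset[0]["instruction"])
```

Run at scale with a real teacher model, loops like this produce large instruction-tuning datasets with no human annotation in the loop.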
WizardLM 2 performed exceptionally well in evaluations
WizardLM 2's performance was evaluated with MT-Bench, an automated benchmark for large language models. The results showed it holding its own against advanced models such as GPT-4-Turbo and Claude 3. While LLM evaluation is far from an exact science, these findings suggest the team had built a formidable model.
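The core of this kind of automated evaluation can be sketched simply: a judge model scores each candidate's answers to benchmark questions, and scores are averaged per model. The `judge_score()` function below is a hypothetical toy stub; in MT-Bench itself the judging is done by a strong LLM such as GPT-4 over multi-turn questions.

```python
# Rough sketch of MT-Bench-style automated evaluation. judge_score()
# is a toy stand-in for an LLM judge that rates answers on a 1-10 scale.

from statistics import mean

def judge_score(question: str, answer: str) -> float:
    """Hypothetical judge: longer, more substantive answers score higher."""
    return min(10.0, 1.0 + len(answer.split()) / 5)

def evaluate(model_answers: dict) -> dict:
    """Average judge scores per model over (question, answer) pairs."""
    return {name: mean(judge_score(q, a) for q, a in qa)
            for name, qa in model_answers.items()}

scores = evaluate({
    "wizardlm-2": [("What is overfitting?",
                    "Overfitting is when a model memorizes training noise "
                    "instead of learning generalizable patterns.")],
    "baseline":   [("What is overfitting?", "A model error.")],
})
print(scores)
```

Replacing the heuristic with a real LLM judge is what makes such leaderboard comparisons possible without human raters, though judge bias means the numbers are indicative rather than definitive.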
Microsoft's commitment to safety
The launch and subsequent recall of WizardLM 2 were announced via the X account @WizardLMAI. On April 16, the account posted an apology: "We accidentally missed an item required in the model release process - toxicity testing." The team reassured followers that the test was being run as quickly as possible and that the model would be re-released at the earliest opportunity. The episode underscores Microsoft's stated commitment to ensuring its AI models meet the required safety standards.