
Meta unveils Llama 4—its most advanced family of AI models
What's the story
Meta has unveiled a new series of AI models under its Llama family, called Llama 4.
The family includes three models: Llama 4 Scout, Llama 4 Maverick, and the still-in-training Llama 4 Behemoth.
These models were trained on large amounts of unlabeled text, image, and video data to give them broad visual understanding.
Inspiration
Models inspired by Chinese AI lab DeepSeek
Reportedly, the development of the Llama 4 series was accelerated by the success of open models from Chinese AI lab DeepSeek.
DeepSeek's models have shown performance on par with, or even surpassing, Meta's previous flagship Llama models.
This led Meta to investigate how DeepSeek managed to cut costs associated with running and deploying sophisticated models like R1 and V3.
Accessibility
Now available on multiple platforms
The Llama 4 Scout and Maverick models are now available on Llama.com and via Meta's partners, including AI development platform Hugging Face. The Behemoth model, however, is still in training.
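For developers, the snippet below is a minimal, illustrative sketch of what pulling one of the released checkpoints from Hugging Face might look like with the transformers library; the repository name is an assumption, so check the official model card for the published identifier and access requirements.

```python
# Hedged sketch: loading a Llama 4 checkpoint through Hugging Face transformers.
# "meta-llama/Llama-4-Scout" is an assumed repository name -- consult the official
# model card for the real identifier; the weights are gated behind Meta's license.
from transformers import pipeline

chat = pipeline("text-generation", model="meta-llama/Llama-4-Scout")
result = chat("Summarize the Llama 4 announcement in one sentence.", max_new_tokens=60)
print(result[0]["generated_text"])
```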
Additionally, Meta has refreshed its AI-powered assistant across apps like WhatsApp, Messenger, and Instagram to use the new Llama 4 models in 40 countries.
Licensing
Llama 4 models face licensing restrictions
The license for the Llama 4 models has also sparked some controversy among developers.
Users and companies based in the EU are prohibited from distributing these models due to governance requirements imposed by the region's AI and data privacy laws.
Meanwhile, companies with over 700 million monthly active users must seek a special license from Meta, which can be granted or denied at Meta's discretion.
Architecture
Utilizing mixture of experts architecture
The Llama 4 series is also Meta's first set of models to use a mixture of experts (MoE) architecture. This makes them more efficient to train and to run when answering queries.
For example, Maverick has 400 billion total parameters spread across 128 "experts," but only 17 billion of them are active for any given query.
Similarly, Scout has 17 billion active parameters, 16 experts, and 109 billion total parameters.
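To make the total-versus-active parameter idea concrete, here is a toy mixture-of-experts layer in PyTorch: each token is routed to only one of several expert networks, so only a fraction of the layer's weights are used per query. The layer sizes and routing scheme are invented for illustration and do not reflect Llama 4's actual implementation.

```python
# Illustrative sketch only: a toy MoE layer showing why "active" parameters
# are far fewer than "total" parameters. Sizes are made up, not Llama 4's.
import torch
import torch.nn as nn

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, num_experts=16, top_k=1):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # scores each expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])
        self.top_k = top_k

    def forward(self, x):                     # x: (num_tokens, d_model)
        weights, picks = self.router(x).softmax(dim=-1).topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = picks[:, k] == e       # tokens routed to expert e at slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = ToyMoELayer()
tokens = torch.randn(8, 64)
print(layer(tokens).shape)  # torch.Size([8, 64]); each token used only 1 of 16 experts
```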
Performance
Maverick and Scout excel in various tasks
According to internal testing by Meta, Maverick beats models such as OpenAI's GPT-4o and Google's Gemini 2.0 on some coding, reasoning, multilingual, long-context, and image benchmarks.
Meanwhile, Scout shines in tasks such as document summarization and reasoning over large codebases thanks to its massive context window of up to 10 million tokens.
Scout can operate on a single NVIDIA H100 GPU, whereas Maverick needs an NVIDIA H100 DGX system or equivalent.
Top-tier
Take a look at Behemoth model's performance
Meta's upcoming Behemoth model will demand even more powerful hardware.
The company says it features 288 billion active parameters, 16 experts, and nearly two trillion total parameters.
Internal benchmarks show Behemoth outperforming GPT-4.5, Claude 3.7 Sonnet, and Gemini 2.0 Pro (though not 2.5 Pro) in various STEM-focused evaluations, including math problem solving.
However, none of the three models is a "reasoning" model like OpenAI's o1 and o3-mini. Reasoning models verify their answers and usually provide more reliable responses to questions.
Response
Responding to contentious questions
Meta has tuned all its Llama 4 models to refuse to answer "contentious" questions less often.
The company claims these models respond to "debated" political and social topics that previous Llama models wouldn't touch.
"[Y]ou can count on [Lllama 4] to provide helpful, factual responses without judgment," a Meta spokesperson told TechCrunch, hinting at a shift toward more open discussions on controversial subjects.
These adjustments follow criticism from some White House allies who claim AI chatbots are overly politically "woke."