Meet LLaMA: Meta's large language model for generative AI researchers

By Akash Pandey

Feb 25, 2023

03:36 pm

What's the story

Meta has followed OpenAI's ChatGPT and Google's Bard into the generative AI fray. The company's new language model, LLaMA, is heating up the AI arms race. It has been marketed as a more responsible alternative to the AI chatbots from the Microsoft-OpenAI alliance and Google. LLaMA has been released under a noncommercial license "to maintain the integrity and prevent misuse."

Context

Why does this story matter?

The battle for dominance in the AI technology space began last year when OpenAI's ChatGPT captured the world's attention, prompting tech giants like Alphabet Inc and Baidu to introduce their own offerings. Now Meta, which has long been at the forefront of AI research and development, has introduced LLaMA to ensure that it is not left behind in the supercharged generative AI market.

Details

What is LLaMA?

According to Meta, LLaMA (Large Language Model Meta AI) is a "foundational, 65-billion-parameter large language model." It has been developed to support researchers in the field of AI and to help solve problems with AI language models. LLaMA is a "smaller, more performant" model that will allow researchers who don't have access to large amounts of infrastructure to study AI models, claims Meta's official blog.

Scenario

The model uses "far less" computing power than other LLMs

Like other large language models (LLMs), LLaMA takes a sequence of words as input and predicts the next word, repeating the process to generate text. LLMs are typically trained on enormous amounts of text so that they can summarize information and create content. However, Meta says LLaMA requires "far less" computing power than earlier models, and that it can outperform competitors that have many more parameters.
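The predict-and-append loop described here can be illustrated with a toy sketch. This is not LLaMA's method: the example swaps in simple word-pair (bigram) counts for the neural network, works on whole words rather than the sub-word tokens real models use, and the function names and sample corpus are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.split()
    follow = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follow[prev][nxt] += 1
    return follow


def generate(follow, prompt, max_new_words=5):
    """Greedy generation: repeatedly append the most frequent next word."""
    out = prompt.split()
    for _ in range(max_new_words):
        candidates = follow.get(out[-1])  # .get avoids creating empty entries
        if not candidates:
            break  # the last word was never followed by anything in training
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)


# Hypothetical toy corpus, used only for illustration.
corpus = "the llama eats grass and the llama eats hay"
table = train_bigram(corpus)
print(generate(table, "the", max_new_words=2))
```

Each generated word is fed back in as the context for the next prediction, which is the sense in which the text is created "recursively."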

Prowess

The LLaMA model with 13 billion parameters can beat GPT-3

LLaMA is a set of foundation language models with 7B, 13B, 33B, and 65B parameters. The LLaMA 65B and 33B models were trained on 1.4 trillion tokens, whereas LLaMA 7B was trained on one trillion tokens. According to Meta, LLaMA 13B can outperform OpenAI's GPT-3 (175B) despite being more than ten times smaller, and LLaMA 65B is comparable to DeepMind's Chinchilla-70B and Google's PaLM-540B models.
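As a rough check on these figures, the size ratio and the number of training tokens per parameter can be worked out from the nominal numbers quoted above. This is a back-of-the-envelope sketch that assumes the rounded sizes (7B, 13B, and so on) are exact; the variable names are illustrative.

```python
# Nominal parameter counts quoted in the article (assumed exact for this sketch).
GPT3_PARAMS = 175e9
LLAMA_13B_PARAMS = 13e9

# How many times smaller LLaMA 13B is than GPT-3.
size_ratio = GPT3_PARAMS / LLAMA_13B_PARAMS
print(f"GPT-3 is about {size_ratio:.1f}x the size of LLaMA 13B")

# Training tokens and parameter counts for the models whose token counts are given.
training_runs = {
    "LLaMA 7B": (1.0e12, 7e9),
    "LLaMA 33B": (1.4e12, 33e9),
    "LLaMA 65B": (1.4e12, 65e9),
}

# Tokens seen per parameter: smaller models see far more data relative to their size.
tokens_per_param = {
    name: tokens / params for name, (tokens, params) in training_runs.items()
}
for name, ratio in tokens_per_param.items():
    print(f"{name}: about {ratio:.1f} training tokens per parameter")
```

By this arithmetic, GPT-3 is roughly 13.5 times the size of LLaMA 13B, which is consistent with the "over ten times smaller" claim.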

Availability

LLaMA will be available to government, civil, and academic bodies

LLaMA will be available under a non-commercial license to researchers and entities affiliated with government, civil bodies, or academia. "We hope that releasing LLaMA to the research community will accelerate the development of large language models, improve their robustness, and mitigate known issues like toxicity/bias," said Meta's blog. The company intends to publish bigger models, trained on more extensive pre-training datasets, in the future.