Meet LLaMA: Meta's large language model for generative AI researchers
Meta has followed OpenAI's ChatGPT and Google's Bard into the generative AI fray. The company's new language model, LLaMA, is heating up the AI arms race. Meta is marketing it as a more responsible alternative to the AI chatbots from the Microsoft-OpenAI alliance and Google. LLaMA has been released under a noncommercial license "to maintain the integrity and prevent misuse."
Why does this story matter?
The battle for dominance in AI began last year when OpenAI's ChatGPT captured the world's attention, prompting tech giants like Alphabet Inc and Baidu to introduce their own offerings. Now Meta, which has long been at the forefront of AI research and development, has introduced LLaMA to ensure that it is not left behind in the supercharged generative AI market.
What is LLaMA?
According to Meta, LLaMA (Large Language Model Meta AI) is a "foundational, 65-billion-parameter large language model." It has been developed to support AI researchers and to help address known problems with AI language models. LLaMA is a "smaller, more performant" model that will allow researchers who lack access to large amounts of infrastructure to study AI models, claims Meta's official blog.
The model uses "far less" computing power than other LLMs
Like other large language models, LLaMA takes a sequence of words as input and predicts the next word, repeating the process to generate text. LLMs are usually trained on enormous amounts of text so that they can summarize information and create content. However, Meta says LLaMA consumes "far less" computing power than earlier models and can outperform competitors that have more parameters.
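To make the "predict the next word, then repeat" loop concrete, here is a minimal sketch in Python. It is a toy word-pair counter, not LLaMA or anything Meta has published: the corpus, the function names `train_bigram` and `generate`, and the rule of always picking the most frequent follower are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, prompt, max_new_words=5):
    """Extend the prompt one word at a time using the last word as context."""
    words = prompt.split()
    for _ in range(max_new_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # the last word was never followed by anything in training
        next_word = candidates.most_common(1)[0][0]
        words.append(next_word)  # the output becomes part of the next input
    return " ".join(words)


# Hypothetical toy corpus, purely for illustration.
corpus = "the model reads text and the model predicts the next word"
model = train_bigram(corpus)
print(generate(model, "the", max_new_words=3))
```

A real LLM replaces the word-pair counts with a neural network that looks at the whole preceding sequence, but the outer loop, in which each predicted word is appended and fed back in, has the same shape.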
The LLaMA model with 13 billion parameters can beat GPT-3
LLaMA is a set of foundation language models with 7B, 13B, 33B, and 65B parameters. The LLaMA 65B and 33B models were trained on 1.4 trillion tokens, whereas the LLaMA 7B model was trained on one trillion tokens. According to Meta, LLaMA 13B can outperform OpenAI's GPT-3 (175B) on most benchmarks while being over ten times smaller, and the 65B LLaMA model is comparable to DeepMind's Chinchilla-70B and Google's PaLM-540B models.
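The "over ten times smaller" figure can be checked with simple arithmetic on the parameter counts quoted above. The short Python sketch below does only that; the dictionary and the helper name `size_ratio` are illustrative, and the numbers compare model size only, not benchmark performance.

```python
# Parameter counts in billions, as quoted in this article.
params_billion = {
    "LLaMA 13B": 13,
    "LLaMA 65B": 65,
    "GPT-3": 175,
    "Chinchilla-70B": 70,
    "PaLM-540B": 540,
}


def size_ratio(larger, smaller):
    """Return how many times more parameters `larger` has than `smaller`."""
    return params_billion[larger] / params_billion[smaller]


print(f"GPT-3 vs LLaMA 13B: {size_ratio('GPT-3', 'LLaMA 13B'):.1f}x")
print(f"PaLM-540B vs LLaMA 65B: {size_ratio('PaLM-540B', 'LLaMA 65B'):.1f}x")
```

By this count GPT-3 has roughly 13 times as many parameters as LLaMA 13B, which is consistent with the "over ten times smaller" claim.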
LLaMA will be available to government, civil, and academic bodies
LLaMA will be available under a noncommercial license to researchers and organizations affiliated with government, civil society, or academia. "We hope that releasing LLaMA to the research community will accelerate the development of large language models, improve their robustness, and mitigate known issues like toxicity/bias," said Meta's blog. The company intends to publish bigger models, trained on more extensive pre-training datasets, in the future.