Democratizing AI: US-based start-up releases ChatGPT-like models for free
Cerebras Systems, a Silicon Valley-based artificial intelligence chip start-up, has released ChatGPT-like models for research and commercial use, free of charge. A total of seven models, trained on the company's AI-focused supercomputer called Andromeda, have been released, ranging from a 111-million-parameter language model to a larger 13-billion-parameter one.
Why does this story matter?
Artificial intelligence is one of the biggest trends of 2023, for promising and not-so-promising reasons alike. OpenAI has already pocketed billions from Microsoft, while Google, Meta, and others rush to attract investors of their own. Cerebras, by contrast, has released a handful of AI models to the general public to democratize the technology. These models are said to "use less energy than any existing public models."
ChatGPT is far larger than Cerebras's AI models
OpenAI's conversational chatbot ChatGPT is built on a model with 175 billion parameters, which are essentially the values a Large Language Model (LLM) updates as it learns. In comparison, Cerebras's top AI model has 13 billion parameters. The parameter count of the latest GPT-4 is rumored to be around 100 trillion. Both GPT-4 and GPT-3.5 were trained on Microsoft's Azure AI supercomputer.
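To make the notion of a "parameter" concrete, here is a minimal sketch of how parameter counts arise in a simple fully connected network. The layer sizes are purely illustrative and do not correspond to any Cerebras or OpenAI model; the point is only that every weight and bias is one learnable value the model adjusts during training.

```python
def count_parameters(layer_sizes):
    """Count the learnable values (weights and biases) in a
    fully connected network.

    A layer with n_out neurons fed by n_in inputs contributes
    n_in * n_out weights plus n_out biases, all of which the
    model updates as it learns.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# Toy network: 512 inputs -> 1024 hidden units -> 512 outputs.
print(count_parameters([512, 1024, 512]))  # 1050112 learnable values
```

Real LLMs reach billions of parameters the same way, just with many more layers (attention blocks, embeddings, and so on) and much wider ones.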
Smaller language models can be used on phones
Smaller language models can run on phones or smart speakers, while the bigger ones require PCs or servers, as per the company. But larger models are not necessarily better. "There's been some interesting papers published that show that (a smaller model) can be accurate if you train it more," said Karl Freund, a chip consultant.
Most AI models are trained on NVIDIA's chips
"There is a big movement to close what has been open-sourced in AI...it's not surprising as there's now huge money in it," said Andrew Feldman, CEO and founder of Cerebras. "The excitement in the community, the progress we've made, has been in large part because it's been so open." Most of the current AI models are trained on NVIDIA's chips.
The biggest model took over a week to train
While such training typically takes up to a couple of months, Feldman said the biggest model took just over a week to train, owing to the architecture of the Cerebras system. The start-up uses its own chips for training AI models, but models trained on Cerebras machines can also be used on NVIDIA systems, he added.