Musk's xAI cooks up world's most powerful AI training cluster
Elon Musk's start-up, xAI, has commenced training its next artificial intelligence (AI) model at a new supercomputing facility in Memphis, Tennessee. The billionaire entrepreneur confirmed the development on X. "Nice work by xAI team, X team, NVIDIA & supporting companies getting Memphis Supercluster training started," Musk posted. The company is using 100,000 liquid-cooled NVIDIA H100 GPUs for the training run, which Musk says makes it the most powerful AI training cluster in the world.
xAI utilizes NVIDIA's H100 GPUs for AI training
The next version of xAI's chatbot, Grok, is being trained on 100,000 liquid-cooled NVIDIA H100s connected over a single RDMA fabric. Musk praised the setup, stating "With 100k liquid-cooled H100s on a single RDMA fabric, it's the most powerful AI training cluster in the world!" NVIDIA's H100 GPUs are designed for large-scale AI training workloads, which demand substantial energy and computing power.
New supercomputing facility raises environmental concerns
The supercomputing facility, dubbed the "Gigafactory of Compute," is located in a former manufacturing building in Memphis. However, environmental groups have raised concerns about its energy and water consumption. The Memphis Community Against Pollution and two other groups warned that the facility creates a significant "energy burden." They stated, "xAI is also expected to need at least one million gallons of water per day for its cooling towers."
Community fears over energy and water supply
The CEO of Memphis Light, Gas, and Water estimated that xAI's facility may draw up to 150 megawatts (MW) of electricity at peak, roughly the power needed to supply 100,000 homes. Memphis City Council member Pearl Walker voiced the community's concerns last week, stating "People are afraid. They're afraid of what's possibly going to happen with the water and they are afraid about the energy supply." The environmental groups urged xAI to invest in a wastewater reuse system.
Comparison with tech giants
Despite its size, xAI's Memphis facility may not be the biggest computing facility globally. Tech giants such as Microsoft, Google, and Meta also operate large data centers for training and running their AI models, and Meta CEO Mark Zuckerberg has pledged to buy 350,000 NVIDIA H100s this year. Musk had previously announced plans to release Grok 2 next month, but it remains unclear whether the new cluster will be used to train it.