AI breakthrough? Google's new training method uses 10x less power
Google's AI research lab, DeepMind, has developed a new AI training method called JEST (joint example selection), which is claimed to be 13 times faster and 10 times more energy-efficient than existing techniques. Unlike traditional methods that score individual data points, JEST selects the most "learnable" subsets of data. The technique optimizes training data by using a smaller model to evaluate data quality, which then guides the training of a larger model.
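To make the idea concrete, here is a minimal, hypothetical sketch of that selection loop in Python. Every name in it is illustrative rather than taken from DeepMind's paper, and it simplifies the actual method: real JEST scores whole sub-batches jointly using contrastive batch losses, whereas this version ranks examples independently by a learnability score.

```python
# Illustrative sketch only: JEST-style data selection, simplified to
# per-example top-k ranking. Assumes per-example losses are available
# from both models; all names and toy values here are hypothetical.
import torch

def learnability(loss_learner: torch.Tensor, loss_reference: torch.Tensor) -> torch.Tensor:
    # High score = the large learner still finds the example hard, while
    # a small reference model pretrained on a well-curated dataset finds
    # it easy -- i.e., the example is "learnable" and worth training on.
    return loss_learner - loss_reference

def select_examples(loss_learner: torch.Tensor, loss_reference: torch.Tensor, k: int) -> torch.Tensor:
    # Keep the k most learnable examples from a larger candidate pool.
    scores = learnability(loss_learner, loss_reference)
    return torch.topk(scores, k).indices

if __name__ == "__main__":
    torch.manual_seed(0)
    pool_size = 1024   # candidate examples drawn from raw web-scale data
    keep = 128         # filtered sub-batch used for one training step
    # Stand-ins for per-example losses produced by the two models:
    loss_big_learner = torch.rand(pool_size) * 5.0
    loss_small_reference = torch.rand(pool_size) * 5.0
    chosen = select_examples(loss_big_learner, loss_small_reference, keep)
    print(f"Training on {len(chosen)} of {pool_size} candidates this step.")
```

The key design choice is the score itself: data the big learner still gets wrong, but which a curated-data reference model handles easily, is treated as the most valuable to train on, so compute is spent only on the most informative fraction of each candidate pool.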
JEST's success relies on high-quality training data
The effectiveness of the JEST method depends heavily on the quality of its training data. DeepMind's researchers emphasized in their paper that the "ability to steer the data selection process toward the distribution of smaller, well-curated datasets" is crucial to JEST's success. Without a high-quality, human-curated dataset to anchor that process, the method's bootstrapping technique breaks down, posing challenges for amateur AI developers.
JEST's timely development amid rising AI power demands
The development of the JEST method comes as discussions about the high power demands of artificial intelligence intensify in the tech industry and among world governments. In 2023, AI workloads consumed about 4.3 GW, nearly equivalent to Cyprus's annual power consumption, and Google's own greenhouse gas emissions have soared by 50% since 2019 amid its AI expansion. Notably, a single ChatGPT request uses about 10 times more power than a Google search, underscoring the urgency of more energy-efficient approaches like JEST.
Uncertainty surrounds adoption of JEST in AI industry
Whether major players in the AI industry will adopt the JEST method remains uncertain. Training a large model like GPT-4o reportedly cost $100 million, and future models expected to cost up to $1 billion are already in development. With firms under pressure to cut costs, some hope that JEST-style methods will maintain current training productivity at a much lower power draw, while others believe companies will continue to prioritize hyper-fast training output.