Google's AI matches human brilliance in solving Math Olympiad problems

By Akash Pandey

Jul 26, 2024

10:05 am

What's the story

Google DeepMind, a leading artificial intelligence (AI) research team, has developed an AI system that can solve intricate mathematical problems. The system, which combines multiple AI models, solved four of the six problems from this year's International Mathematical Olympiad (IMO), a global competition that tests the mathematical abilities of high school students. That performance was strong enough to place it among the top quartile of contestants, equivalent to winning a silver medal.

Evolution

New AI system builds on previous models

The new system is an advancement of an earlier model, AlphaGeometry, which was unveiled in January and could solve geometry problems from the IMO at a level comparable to top high school students. The updated system combines a new model, AlphaProof, with an improved version of AlphaGeometry, named AlphaGeometry 2. This combination allows the system to tackle various types of mathematical problems and develop sophisticated solutions.

Performance analysis

Showcasing potential despite limitations

Despite its impressive performance, the new system is not without flaws. It failed to find solutions for two questions and took three days to solve one problem. In contrast, human competitors sit two sessions of four-and-a-half hours each, with three questions per session. However, Google DeepMind researchers view this as a step toward more powerful AI models capable of planning and reasoning about complex tasks.

Technical details

Utilizing multiple neural networks

The AlphaProof system, unlike other well-known AI models, involves multiple neural networks performing different functions. A large language model (LLM), in this case Google's Gemini model, translates text-based mathematical problems into a formal mathematical language. The formalized problem is then passed to a different AI model, AlphaZero, which suggests proof steps. Each suggested step is checked in Lean, a proof assistant and programming language for formal mathematics: a valid step compiles, and an invalid one is rejected.
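To give a sense of what "compiling in Lean" means, here is a minimal sketch in Lean 4. It is an illustration only, not output from AlphaProof: the theorem name `add_comm_example` is made up for this example, and the statement is far simpler than any IMO problem. The point is that Lean accepts the file only if every proof step type-checks.

```lean
-- Illustrative only: a toy statement, not an IMO problem or AlphaProof output.
-- The theorem says that addition of natural numbers is commutative.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  -- One proof step: appeal to the library lemma Nat.add_comm.
  -- If this step were wrong, Lean would report an error instead of compiling.
  exact Nat.add_comm a b
```

Because the checker either accepts or rejects a proof, a system that proposes many candidate steps can rely on Lean to filter out the invalid ones automatically.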

Progress report

Significant improvement in geometry

AlphaGeometry 2, another component of the system, is used for geometry problems. This updated version can solve 83% of IMO geometry problems, compared to just 53% for its predecessor. In one instance, AlphaGeometry 2 solved a highly complex geometry problem in just 19 seconds, demonstrating the system's improved efficiency and capability.

Expert opinion

Google's AI tools impress top mathematicians

Pushmeet Kohli, who heads Google DeepMind's AI for science division, sees AlphaProof and AlphaGeometry 2 primarily as tools for assisting mathematicians. Timothy Gowers, a director of research in mathematics at the University of Cambridge and a past winner of the Fields Medal, reviewed the proofs produced by these models and was impressed. He noted that some problems required deep thinking and the discovery of "a sort of magic key" that makes an unsolvable problem solvable.

Twitter Post

Take a look at Gowers's post

Google DeepMind have produced a program that in a certain sense has achieved a silver-medal peformance at this year's International Mathematical Olympiad. 🧵https://t.co/DIcsYXUv97
Timothy Gowers (@wtgowers), July 25, 2024

Future prospects

Hybrid AI system opens new possibilities

This significant achievement by Google DeepMind marks a milestone in the development of AI systems. It showcases the potential of combining different AI approaches to create more capable hybrid systems. Google has hinted that this technology could eventually be integrated into commercial products like its Gemini AI tools, expanding the practical applications of AI in various industries.