Will Google's Gemini disrupt the AI space?
Google has launched Gemini, its most advanced artificial intelligence (AI) model to date, aimed at rivaling OpenAI's GPT-4. The multimodal system can work with several formats, including text, images, video, and audio. Available in three versions (Nano, Pro, and Ultra), Gemini is expected to revolutionize how consumers, developers, and businesses work with AI. Google claims Gemini is significantly more powerful than GPT-4. Let's see how the two AI models stack up against each other.
First, take a look at the three tiers of Gemini
The Ultra is "the largest and most capable model for highly complex tasks." The Pro is the "best model for scaling across a wide range of tasks," and Nano is built for on-device tasks. Gemini Ultra can understand and generate high-quality code in languages such as Python, Java, C++, and Go. Google claims Gemini Ultra excels in several coding benchmarks, including HumanEval, an industry standard for evaluating performance on coding tasks, and Natural2Code, an internal dataset sourced from authors rather than the web.
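To give a sense of what these benchmarks measure: HumanEval presents the model with a Python function signature and docstring, then checks whether the generated function body passes hidden unit tests. The sketch below is modeled on the first problem in the public HumanEval set; the completion and test cases shown are illustrative, not Gemini's actual output.

```python
from typing import List

# HumanEval-style task: the model is given only the signature and
# docstring, and must generate the function body.
def has_close_elements(numbers: List[float], threshold: float) -> bool:
    """Check if any two numbers in the list are closer to each other
    than the given threshold."""
    # One completion a model might produce: compare every pair.
    for i, a in enumerate(numbers):
        for b in numbers[i + 1:]:
            if abs(a - b) < threshold:
                return True
    return False

# Scoring runs held-out unit tests against the generated body.
assert has_close_elements([1.0, 2.0, 3.9, 4.0, 5.0], 0.3) is True
assert has_close_elements([1.0, 2.0, 3.0], 0.5) is False
```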
Gemini Ultra surpasses human experts on some benchmarks
According to Google, Gemini Ultra outperformed the previous state of the art on 30 of 32 widely used academic benchmarks, spanning tasks like understanding images, audio, and video, as well as math. One standout benchmark where Gemini led the pack was MMLU (massive multitask language understanding), which covers 57 subjects, testing knowledge and problem-solving across fields like math, physics, history, law, medicine, and ethics. Google says Gemini hit 90%, becoming the first model to surpass human experts, while GPT-4 scored 86.4% on the same test.
GPT-4 vs. Gemini Ultra: Text understanding capabilities
Gemini took the lead in the Big-Bench Hard (multistep reasoning) and DROP (reading comprehension) benchmarks, scoring 83.6% and 82.4% respectively, against GPT-4's 83.1% and 80.9%. It also excelled in grade-school arithmetic (GSM8K) with 94.4% and challenging math problems (MATH) with 53.2%, edging out GPT-4's 92% and 52.9% on the same tests. However, GPT-4 posted a whopping 95.3% on HellaSwag (commonsense reasoning for everyday tasks), beating Gemini's 87.8%.
GPT-4 vs. Gemini Ultra: Image, video, and audio capabilities
In multimodal assessments, Gemini showed superior image understanding, outscoring GPT-4 across all image benchmarks. In video tasks, Gemini stood out in English video captioning (VATEX) and video question answering (Perception Test MCQA). For audio, where Google benchmarked Gemini against OpenAI's Whisper models rather than GPT-4, Gemini excelled in automatic speech translation with a BLEU score of 40.1 versus 29.1. It also led in automatic speech recognition with a word error rate of 7.6% against 17.6%; since word error rate counts mistakes, the lower figure is the better one.
Gemini Pro now powers Google's Bard AI chatbot
Starting December 6, Google upgraded Bard with a fine-tuned version of Gemini Pro, giving the chatbot more advanced reasoning, planning, and understanding capabilities. Gemini Pro outperformed GPT-3.5 in six of eight benchmarks, which Google says makes Bard superior to leading free chatbots. Google is also bringing Gemini Nano to the Pixel 8 Pro. In the coming months, Gemini will be available in more than 170 countries across Google products and services like Search, Ads, Chrome, and Duet AI.
Gemini Ultra's release
In the near future, Google plans to fold Gemini Ultra into a new Bard Advanced version and is experimenting with Gemini in Search to make results faster. Starting December 13, developers and enterprise customers can access Gemini Pro via the Gemini API in Google AI Studio or Google Cloud Vertex AI. Android developers will also be able to build with Gemini Nano on the Pixel 8 Pro.
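For developers, getting started will look something like the sketch below. It uses Google's google-generativeai Python client with an API key from Google AI Studio; the environment variable name and the prompt are placeholder choices for illustration, not part of Google's announcement.

```python
# Minimal sketch of calling Gemini Pro through the Gemini API,
# assuming the google-generativeai client (pip install google-generativeai)
# and an API key created in Google AI Studio.
import os

import google.generativeai as genai

# Authenticate with the API key (placeholder environment variable).
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# "gemini-pro" is the model tier exposed to developers at launch.
model = genai.GenerativeModel("gemini-pro")

# Request a plain-text completion and print it.
response = model.generate_content("Summarize what makes Gemini multimodal.")
print(response.text)
```

Enterprise customers on Vertex AI would use the Vertex SDK instead, but the request shape is similar: pick a model, send a prompt, read back the generated content.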