Google Gemini can now answer questions about PDFs on screen

By Akash Pandey

Dec 22, 2024

06:26 pm

What's the story

Google has added a new capability to its Files app, powered by its AI assistant, Gemini. With the update, Gemini can detect when a PDF is open on your screen and answer questions about the file's contents. This is the latest in a series of context-aware capabilities being rolled out in Gemini to enhance user interaction with digital files.

User guide

How to use the new feature?

To use the feature, users have to open a PDF in the Files by Google app and enable Gemini. A button saying "Ask about this PDF" will show up, which users can tap to ask specific questions about the file's contents. The experience is similar to chatting with a conversational AI like ChatGPT. According to The Verge, the feature is being rolled out to Gemini Advanced subscribers.

Information retrieval

Enhancing user interaction with digital files

The new feature changes the way users engage with their files. For instance, when you open a PDF, say a research paper or report, you can simply ask Gemini questions like "What's the summary of this document?" or "Can you explain this section?" In reply, Gemini delivers detailed summaries or clarifications, much as a personal assistant who had read the file would.

Expanded functionality

Gemini's context-aware capabilities extend beyond PDFs

The PDF recognition capability comes as part of Google's push to make Gemini more context-aware across different media. Earlier, Gemini let users ask questions about web pages and YouTube videos. Now, it can interpret content on a device's screen, opening up new possibilities for mobile users. For apps or files that don't yet support Gemini's context-aware capabilities, the assistant can help by taking a screenshot of the screen and offering to answer questions based on it.