Google Books is cataloging AI-generated content
Google Books has been found to be cataloging low-quality, AI-generated books. This could affect the Google Ngram Viewer, a tool researchers use to analyze word usage over time. The books were identified using the same method previously used to uncover AI-generated Amazon product reviews, academic papers, and online articles.
Dozens of AI-generated books identified on Google Books
The AI-generated books were identified by searching for the phrase "As of my last knowledge update," which is associated with answers generated by ChatGPT and Gemini. The search turned up dozens of books on Google Books containing that phrase. Some of them discuss topics such as ChatGPT, machine learning, and AI and seem to be human-written, but most appear to be generated by artificial intelligence.
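The phrase search described above can be sketched in a few lines of Python. This is only an illustration and not the tooling used in the investigation, which relied on Google Books' own search. The names `TELLTALE_PHRASES`, `contains_telltale_phrase`, and `flag_candidates` are hypothetical, and the input is assumed to be a mapping from book title to a text excerpt.

```python
import re

# Hypothetical phrase list; the article names only this one phrase.
TELLTALE_PHRASES = ["As of my last knowledge update"]


def contains_telltale_phrase(text, phrases=TELLTALE_PHRASES):
    """Return True if any phrase occurs in text, ignoring case and extra whitespace."""
    # Collapse line breaks and repeated spaces so a phrase split across lines still matches.
    normalized = re.sub(r"\s+", " ", text).lower()
    return any(phrase.lower() in normalized for phrase in phrases)


def flag_candidates(books):
    """Given a dict of title -> text excerpt, return the titles that contain a phrase."""
    return [title for title, text in books.items() if contains_telltale_phrase(text)]
```

A match only marks a book as a candidate for review. As noted above, some human-written books about ChatGPT quote the phrase, so each hit would still need to be read by a person.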
Examples of AI-generated books on Google Books
Two examples are Bears, Bulls, and Wolves: Stock Trading for the Twenty-Year-Old by Tristin McIver and Maximize Your Twitter Presence: 101 Strategies for Marketing Success by Shu Chen Hou. Both were published in 2024 and read like ChatGPT-generated text, offering superficial analysis of complex topics. The latter even appears to have been outdated at the time of publication because it was generated with an old version of ChatGPT.
Concerns over Google Books indexing AI-generated text
Gary Price, a librarian and editor of the Library Journal's infoDOCKET, expressed surprise that Google may not be aware of the nature of the content being indexed in Google Books search. He suggested that labeling such content as AI-generated would benefit both Google and its users. These AI-generated books are similar to those found on Amazon, and many appear on both Amazon and Google Play Books.
Potential impact on Google Ngram Viewer
One potential issue with Google Books cataloging AI-generated text is that the text could later be included in the Google Ngram Viewer. The tool charts how often words appear, year by year, in published books that Google has scanned, with coverage dating back to 1500. However, a Google spokesperson confirmed that none of the flagged AI-generated books are currently informing Ngram Viewer results.
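To make the concern concrete, here is a minimal sketch of the kind of per-year word frequency that such a tool charts. It is not Google's pipeline. The function name `yearly_frequency`, the whitespace tokenization, and the `(year, text)` input format are all assumptions made for illustration.

```python
from collections import Counter


def yearly_frequency(corpus, word):
    """Return {year: share of that year's tokens equal to word}.

    corpus is an iterable of (year, text) pairs; tokens are split on whitespace.
    """
    totals = Counter()  # total tokens seen per year
    hits = Counter()    # occurrences of the target word per year
    target = word.lower()
    for year, text in corpus:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        hits[year] += tokens.count(target)
    # Skip years with no tokens to avoid dividing by zero.
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}
```

Because each year's value is a share of all tokens published that year, a large volume of machine-generated books in one year would shift the results for that year, which is the risk described above.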
Google's commitment to a high-quality Ngram Viewer
Google said it is committed to ensuring that the Ngram Viewer remains a high-quality resource and will evaluate its approach as the world of book publishing evolves. Alex Hanna, director of research at the Distributed AI Research Institute (DAIR), expressed concern that if AI-generated books start affecting Ngram Viewer results in the future, the tool may become unusable.