Why Reddit won't allow search engines to index its content
Reddit has begun blocking every search engine except Google from indexing its content. The move follows an update last month to Reddit's robots.txt file, which uses the Robots Exclusion Protocol to tell crawlers what they may access, aimed at preventing unauthorized data scraping by AI bots. Google, the lone exception, signed a $60 million annual contract with Reddit earlier this year.
Impact on search engines and AI companies
The robots.txt update was initially seen as a measure against AI firms like Perplexity, which kept scraping Reddit's content despite requests not to. It is now evident that the change also affects search engines. Google is currently the only search engine permitted to crawl Reddit and show its content in results. Searches for Reddit content on rival engine Bing return no results, as 404 Media and Engadget have confirmed.
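To illustrate how a robots.txt file can admit one crawler while turning all others away, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules in `EXAMPLE_ROBOTS_TXT`, the helper `check_crawlers`, and the example.com URL are hypothetical and chosen only for illustration; they are not Reddit's actual robots.txt, and Reddit may enforce its policy by other means.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT Reddit's actual robots.txt.
EXAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def check_crawlers(robots_txt, user_agents, url):
    """Return a dict mapping each user agent to whether the rules let it fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in user_agents}


if __name__ == "__main__":
    results = check_crawlers(
        EXAMPLE_ROBOTS_TXT,
        ["Googlebot", "bingbot", "DuckDuckBot"],
        "https://www.example.com/r/some_forum/",
    )
    for agent, allowed in results.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Under these assumed rules, only the crawler identifying itself as Googlebot is allowed. The protocol is advisory: the file states the site's wishes, and it is up to each crawler to honor them, which is why a scraper can simply ignore it.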
New policy affects DuckDuckGo, Mojeek, and more
The privacy-focused search engine DuckDuckGo initially displayed some Reddit links without descriptions but has since removed even those. The note accompanying these links read, "We would like to show you a description here but the site won't allow us." Colin Hayhurst, CEO of the lesser-known "no-tracking" search engine Mojeek, told 404 Media that Reddit is "killing everything for search but Google."
Reddit's history of blocking data scraping
Reddit's stance against AI companies scraping its data has been apparent over the past year. Last year, under CEO Steve Huffman, the company began charging for API access, leading to the shutdown of third-party apps like Christian Selig's Apollo. Despite protests from moderators and forum users, the company lost only a negligible number of users, and only temporarily. Reddit's decision to block search engines is seen as a strategy to safeguard its data and create an additional revenue stream.