This Google-funded start-up allegedly stole YouTube videos for AI training
Runway, an AI start-up backed by Google, is facing accusations of using pirated content and unauthorized YouTube videos to train its Gen-3 Alpha video generation tool. The allegations emerge from a leaked internal document obtained by 404 Media, reportedly shared by an ex-employee of Runway. The document outlines plans to categorize and tag content from over 3,900 YouTube channels including major media giants like Disney and Netflix, as well as popular creators like Casey Neistat and Marques Brownlee (MKBHD).
Gen-3 Alpha draws attention amid controversy
The Gen-3 Alpha video generation tool, developed by Runway, gained significant attention last month for its ability to generate nearly photorealistic clips. The company stated that the tool was "trained jointly on videos and images," but did not disclose the data source. Runway has not confirmed the authenticity of the leaked spreadsheet, and it previously claimed to use "curated, internal datasets" for training. However, 404 Media managed to create convincing videos of well-known YouTube personalities using the tool.
Alleged use of proxies and massive web crawler
Runway reportedly covered its tracks by using proxies to avoid being blocked by YouTube. "The channels in that spreadsheet were a company-wide effort to find good quality videos to build the model with," an unnamed former employee told 404 Media. "This was then used as input to a massive web crawler which downloaded all the videos from all those channels, using proxies to avoid getting blocked by Google," the employee added.
Intellectual property concerns in AI training
This isn't the first instance of an AI company facing scrutiny for using copyrighted material without the necessary licenses. Earlier this year, OpenAI CTO Mira Murati admitted in a Wall Street Journal interview that she was unsure whether training data for the company's upcoming Sora video generator included videos from Instagram, YouTube, or Facebook. The New York Times later reported that OpenAI had bypassed corporate policies to evade copyright laws, using tools to transcribe YouTube videos for training its AI chatbots.
YouTube CEO warns against violation of platform's terms
YouTube CEO Neal Mohan has cautioned AI companies that using YouTube videos to train AI models would constitute a "clear violation" of the platform's terms of use. The issue of intellectual property infringement remains a significant hurdle in the development of generative AI, particularly with models capable of generating entire videos. Runway, valued at $1.5 billion, raised $141 million in funding last year from investors including YouTube owner Google, NVIDIA, and Salesforce.