Facebook uses public Instagram images to train image recognition AI
At its ongoing F8 developer conference in California, Facebook has detailed how it is using Instagram images to train its image recognition artificial intelligence (AI) algorithms. The company is using billions of public Instagram photos, already annotated with hashtags, to teach software to automatically recognize objects in images without humans having to label each one individually. Facebook says its image recognition models outperform the previous state of the art on industry benchmarks.
Training AI to better identify objects in images
Facebook's deep learning models achieved 85.4% accuracy on the ImageNet image recognition benchmark. "We've produced state-of-the-art results that are 1 to 2% better than any other system on the ImageNet benchmark," Mike Schroepfer, Facebook's Chief Technology Officer, said. Instead of relying "on hand-curated, human-labeled data sets," Facebook uses Instagram images that users have already tagged with hashtags as weakly supervised training data for its image recognition models.
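The approach described here is a form of weakly supervised pretraining: hashtags stand in for curated labels, so the pretraining task becomes multi-label hashtag prediction. Below is a minimal sketch of that idea in Python/PyTorch; the hashtag vocabulary, model choice, and dummy batch are illustrative assumptions, not Facebook's actual system.

```python
import torch
import torch.nn as nn
from torchvision import models

# Illustrative hashtag vocabulary; the real system spans thousands of hashtags.
HASHTAGS = ["#dog", "#goldenretriever", "#food", "#pizza", "#plant"]
NUM_TAGS = len(HASHTAGS)

# Pretraining model: a standard CNN backbone with a multi-label hashtag head.
backbone = models.resnet50()
backbone.fc = nn.Linear(backbone.fc.in_features, NUM_TAGS)

# Hashtag prediction is multi-label (one photo can carry several hashtags),
# so a sigmoid/BCE objective is used instead of softmax cross-entropy.
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(backbone.parameters(), lr=0.1, momentum=0.9)

def pretrain_step(images, tag_targets):
    """One weakly supervised pretraining step on hashtag labels.

    images:      (N, 3, 224, 224) float tensor
    tag_targets: (N, NUM_TAGS) multi-hot tensor built from each photo's hashtags
    """
    logits = backbone(images)
    loss = criterion(logits, tag_targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch standing in for public Instagram photos and their hashtags.
images = torch.randn(8, 3, 224, 224)
tags = torch.randint(0, 2, (8, NUM_TAGS)).float()
print(pretrain_step(images, tags))
```

After pretraining on hashtag prediction, a backbone like this would typically be fine-tuned and evaluated on a target task such as ImageNet classification, which is where the reported 85.4% figure applies.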
Facebook is only extracting object-based data at the moment
In practice, this means the models can identify dog breeds, plants, food, and other such objects in images without anyone having explicitly labeled them. The training relies on the hashtags users have already attached to their photos. The company says it is not analyzing user behavior based on the contents of the photos.
Large-scale hashtag prediction model: Sorting what is relevant
Facebook's largest test so far used 3.5 billion Instagram images spanning 17,000 hashtags. The pre-training system identifies which hashtags are relevant and learns to prioritize specific hashtags over general ones. In this way, pre-training turns noisy, user-applied hashtags at this scale into labels accurate enough to use automatically. Facebook expects the resulting image recognition models to be broadly useful across its products.
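One way to read "prioritizing specific hashtags over general ones" is as a reweighting problem: common, generic tags (say, #love) carry less information about image content than rare, specific ones (say, #goldenretriever). The following is a hedged sketch of inverse-frequency weighting in Python; the tag counts and the square-root exponent are illustrative assumptions, not Facebook's published recipe.

```python
from collections import Counter

# Illustrative hashtag frequencies across a photo corpus (assumed numbers).
tag_counts = Counter({
    "#love": 900_000_000,         # very general, weakly tied to image content
    "#dog": 40_000_000,
    "#pizza": 5_000_000,
    "#goldenretriever": 800_000,  # specific, strongly tied to image content
})

def tag_weight(tag, counts, power=0.5):
    """Down-weight frequent (general) hashtags relative to rare (specific) ones.

    Inverse-frequency weighting with a tunable exponent; the exponent value is
    an assumption for illustration, not a published hyperparameter.
    """
    return 1.0 / (counts[tag] ** power)

weights = {t: tag_weight(t, tag_counts) for t in tag_counts}
total = sum(weights.values())
sampling_probs = {t: w / total for t, w in weights.items()}

for tag, p in sorted(sampling_probs.items(), key=lambda kv: -kv[1]):
    print(f"{tag:<20} sampling prob {p:.4f}")
```

Under a scheme like this, a photo tagged #goldenretriever contributes proportionally more to the hashtag-prediction objective than one tagged only #love, which captures the intuition of sorting out which hashtags are actually relevant.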
Privacy implications
Facebook says it is only using public data to train its AI models. Still, users should be aware that the photos they post become part of a dataset used to build deep learning models, not just material for targeted advertising.
Facebook to up its moderation game after the Cambridge Analytica scandal
These AI systems will primarily help Facebook scale its moderation efforts. "We often had to rely on reactive reports. We had to wait for something bad to be spotted by someone and do something about it," Schroepfer said. Now, AI moderation can help the company preemptively screen for abuse, terrorist propaganda, nudity, violence, spam, and hate speech on the platform.
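Mechanically, proactive screening amounts to running classifiers over uploads and routing anything above a confidence threshold to review before a user reports it. Here is a minimal sketch of that flow, assuming a generic classifier and illustrative category names and thresholds; none of these values come from Facebook's actual systems.

```python
# Categories and thresholds are illustrative assumptions, not Facebook's policy values.
THRESHOLDS = {
    "nudity": 0.90,
    "violence": 0.85,
    "terrorist_propaganda": 0.80,
    "hate_speech": 0.85,
    "spam": 0.95,
}

def screen_upload(category_scores):
    """Proactively flag an upload for human review.

    category_scores: dict mapping category name -> model confidence in [0, 1],
    produced by whatever classifiers are in place (assumed here, not specified
    in the article). Returns the categories that exceed their review threshold.
    """
    return [c for c, score in category_scores.items()
            if score >= THRESHOLDS.get(c, 1.0)]

# Example: scores from a hypothetical classifier for one uploaded image.
flags = screen_upload({"nudity": 0.97, "violence": 0.10, "spam": 0.30})
if flags:
    print("Route to review queue:", flags)  # preemptive, no user report needed
else:
    print("No proactive flags")
```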