27-year-old entrepreneur is helping Google, Microsoft democratize AI in India
In the quiet streets of Agara, a small village nestled in the lush landscapes of southwest Bangalore, a remarkable transformation is unfolding. At the heart of this transformation is a 27-year-old innovator, Manu Chopra, whose vision is saving India's vernacular languages from obscurity and empowering those who have long been overlooked. Chopra is reshaping the landscape of artificial intelligence (AI) and technology by bridging the language gap that has kept millions marginalized for years.
Karya: The catalyst of change
Karya, the brainchild of Chopra, was founded in 2021. It stands as a beacon of hope in the ever-changing world of AI. In an era dominated by generative AI like ChatGPT, Karya demonstrates the potential of technology harnessed for a noble purpose. This innovative start-up has united a hidden global workforce from countries like India, Kenya, and the Philippines. Their mission? To collect, annotate, and label text, voice, and image data in diverse local languages.
Empowering lives through Karya
Meet Preethi P., a living example of the transformation Karya has brought. Previously a skilled tailor, Preethi used to earn less than Rs. 100 a day mending and stitching clothes. Today, she is one of 70 workers from Agara helping to build a dataset: she reads sentences aloud in her native Kannada into Karya's app on a phone. In just three days, Preethi earned Rs. 4,500, over four times her usual monthly income as a tailor.
Karya's mission is fuelled by the demand for data
With the rising enthusiasm for generative AI, tech companies' appetite for data has become insatiable. According to NASSCOM, India is expected to host almost a million data annotation workers by 2030. Karya differentiates itself by paying its workers up to 20 times the minimum wage and ensuring the delivery of high-quality Indian-language data. Chopra, a computer engineer educated at Stanford, firmly believes that underpaid data labor is an industry failure that must be rectified.
Major tech companies are partnering with Karya
Major tech companies have often outsourced data tasks to cheaper contractors overseas. But now, they are partnering with Karya to address the challenge of finding high-quality data that caters to diverse non-English speaking users. Microsoft, the Bill & Melinda Gates Foundation, and Alphabet Inc.'s Google have joined forces with Karya. "Tech companies want the data, accent and all," Chopra told Bloomberg. "You cough, they want that in the speech - it represents natural language."
Bill & Melinda Gates Foundation is collecting 'gender intentional' datasets
The Bill & Melinda Gates Foundation has engaged over 30,000 educated young women to work with Karya to collect "gender intentional" datasets in six Indian vernacular languages. This means ensuring that words like "pilot," "scientist," or "boss" aren't exclusively associated with men, and that professions like "nurse" and "flight attendant" aren't assumed to be female-only.
Microsoft and Google are also leveraging Karya's services
Microsoft Research India's Saikat Guha, who used Karya's content for a project to help those with visual disabilities find jobs, praised Karya's data quality. "If you pay workers fairly, they are more invested in their work, and the end result is better data," he says. Meanwhile, Google, with the help of Karya and other local partners, is gathering speech data in 85 Indian districts. In the future, it plans to build a generative AI model for 125 Indian languages.
Bridging the linguistic gap
AI models have heavily leaned on English-language internet data, leaving non-English-speaking users underserved. In India alone, nearly a hundred crore potential users are eager to embrace AI tools across various sectors, but the reality is that over 70 Indian languages, each spoken by over a million people, have zero digital representation. Karya aims to bridge this gap in India. Its app is designed to function even without internet access and offers voice support for those with limited literacy.
Manu Chopra's purpose-driven journey
For Chopra, the goal isn't just to improve the supply of data but to fight poverty. He grew up in an impoverished neighborhood called Shakur Basti in West Delhi. He won a scholarship to study at an elite school, where he was bullied because his classmates said he "smelled poor." He graduated in computer science from Stanford University in 2017 and began working on Karya, with the vision to use technology to tackle poverty.