Meta releases open-source version of Google's podcast generator
Meta has released NotebookLlama, an open-source take on the podcast-generation feature in Google's NotebookLM. The project uses Meta's own Llama models for most of its processing tasks. Like its Google counterpart, the tool generates conversational, podcast-style summaries from uploaded text files.
NotebookLlama's approach to podcast creation
NotebookLlama first converts a file, such as a PDF of a news article or blog post, into a transcript. It then adds "more dramatization" and interruptions to that transcript and feeds the result to open text-to-speech models. The voices in NotebookLlama's samples sound noticeably robotic and sometimes talk over each other at unexpected moments, but Meta's team believes more advanced models can solve these problems.
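The stages described above (file to transcript, dramatization, then text-to-speech) can be pictured as a simple chain of steps. The Python sketch below is only an illustration of that flow, not NotebookLlama's actual code: every class and function name is hypothetical, and the Llama and text-to-speech calls are replaced with stand-in stubs so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical names throughout; this is not NotebookLlama's real API.
Turn = Tuple[str, str]  # (speaker, spoken line)


@dataclass
class PodcastPipeline:
    extract_text: Callable[[str], str]              # file path -> raw text
    write_transcript: Callable[[str], List[Turn]]   # raw text -> dialogue turns
    dramatize: Callable[[List[Turn]], List[Turn]]   # add interruptions and asides
    synthesize: Callable[[Turn], bytes]             # one turn -> audio bytes

    def run(self, path: str) -> List[bytes]:
        """Run the stages in order and return one audio clip per turn."""
        text = self.extract_text(path)
        turns = self.write_transcript(text)
        turns = self.dramatize(turns)
        return [self.synthesize(turn) for turn in turns]


# Stand-in stubs: a real system would call a PDF parser, an LLM, and a TTS model.
def fake_extract(path: str) -> str:
    return f"text extracted from {path}"


def fake_transcript(text: str) -> List[Turn]:
    return [("Host", f"Today we look at: {text}"), ("Guest", "Tell me more.")]


def fake_dramatize(turns: List[Turn]) -> List[Turn]:
    # Insert a single interruption after the first turn.
    return turns[:1] + [("Guest", "Wait, hold on!")] + turns[1:]


def fake_tts(turn: Turn) -> bytes:
    speaker, line = turn
    return f"{speaker}: {line}".encode("utf-8")


def build_demo_pipeline() -> PodcastPipeline:
    return PodcastPipeline(fake_extract, fake_transcript, fake_dramatize, fake_tts)
```

Calling `build_demo_pipeline().run("article.pdf")` returns one byte string per dialogue turn. Keeping each stage as a separate callable means a weak component, such as the text-to-speech step, could be swapped out without touching the rest.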
Meta researchers acknowledge room for improvement in NotebookLlama
The Meta researchers behind NotebookLlama acknowledge the tool's limitations on its GitHub page. They wrote, "The text-to-speech model is the limitation of how natural this will sound." They also suggested an alternative approach to writing the podcast: having two agents debate the topic and draft the outline, rather than relying on a single model.
AI-generated podcasts face 'hallucination problem'
Several projects have tried to replicate NotebookLM's podcast feature, but none has solved the 'hallucination problem' common to generative AI, in which a model presents fabricated information as fact. NotebookLlama is no exception, so its podcasts can include made-up details. This remains an area for improvement in future iterations of the tool.