Meet LongWriter: The AI that can write 10,000-word texts
What's the story
A team of artificial intelligence (AI) researchers at China's Tsinghua University and Zhipu AI has developed a groundbreaking large language model (LLM) named LongWriter.
This innovative LLM is capable of generating text outputs up to 10,000 words long, significantly surpassing the output length of existing models.
The researchers' findings are detailed in a paper available on the arXiv preprint server.
Limitation challenge
Overcoming the limitations of existing LLMs
Current LLMs, despite their ability to process inputs of up to 100,000 tokens, often struggle to generate outputs longer than about 2,000 words.
The researchers attribute this limitation to the fact that these models are typically trained on short documents.
To overcome this hurdle and enhance output length, they introduced modifications and used longer documents for training the new model.
Training process
LongWriter's training and performance
The team initially trained a 9-billion-parameter LLM on a conventional dataset composed primarily of documents shorter than 2,000 words.
As anticipated, this model could only generate texts up to 2,000 words.
To extend its output length, they used a pipeline called AgentWrite, which breaks a long writing task into smaller subtasks, to create a new dataset named "LongWriter-6k," comprising 6,000 documents ranging from 2,000 to 32,000 words in length.
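The researchers' paper describes AgentWrite as splitting one long writing task into smaller subtasks that are written one after another. The idea can be sketched roughly as below. This is a simplified illustration, not the team's code: the function names (plan_sections, write_long, stub_generate) and the 500-word section size are assumptions made for the example, and the plan here is a fixed split of the word budget rather than an outline drafted by the model itself.

```python
from typing import Callable, Dict, List


def plan_sections(instruction: str, target_words: int,
                  words_per_section: int = 500) -> List[Dict]:
    """Split a target length into section-sized subtasks (hypothetical planner)."""
    # Ceiling division: enough sections to cover the target length.
    n_sections = max(1, -(-target_words // words_per_section))
    # Spread the word budget evenly; the first `extra` sections get one more word.
    base, extra = divmod(target_words, n_sections)
    return [
        {"index": i + 1, "task": instruction,
         "word_budget": base + (1 if i < extra else 0)}
        for i in range(n_sections)
    ]


def write_long(instruction: str, target_words: int,
               generate: Callable[[str], str],
               words_per_section: int = 500) -> str:
    """Write a long text section by section, feeding earlier text back in."""
    plan = plan_sections(instruction, target_words, words_per_section)
    parts: List[str] = []
    for step in plan:
        prompt = (
            f"Task: {instruction}\n"
            f"Write section {step['index']} of {len(plan)} "
            f"in about {step['word_budget']} words.\n"
            "Text so far:\n" + "\n\n".join(parts)
        )
        # `generate` stands in for any LLM call that maps a prompt to text.
        parts.append(generate(prompt))
    return "\n\n".join(parts)


def stub_generate(prompt: str) -> str:
    """Placeholder for a real model call (assumption for this sketch)."""
    return "placeholder section text"


if __name__ == "__main__":
    guide = write_long("Write a tourist guide to China", 10_000, stub_generate)
    print(len(guide.split("\n\n")), "sections written")
```

Swapping stub_generate for a call to a real model would turn the sketch into a working plan-then-write loop; pairing each instruction with the stitched-together output would give one long training example of the kind collected in a dataset like LongWriter-6k.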
Output enhancement
Enhanced output and potential applications
Upon training the modified LLM with the new dataset, the researchers discovered that it could produce documents approximately 10,000 words long.
The team found these longer documents to be coherent and applicable in various contexts.
They have released the code for their model as open source on GitHub, and demonstrated its capabilities by producing a 10,000-word tourist guide for people traveling in China.