ChatGPT to get upgraded voice mode next week

By Akash Pandey

Jul 26, 2024

05:40 pm

What's the story

OpenAI is set to introduce an upgraded "Voice Mode" for its GPT-4o model in ChatGPT.

The announcement was made by OpenAI CEO Sam Altman, who confirmed the feature will be available in a limited "alpha" release for ChatGPT Plus subscribers starting next week.

GPT-4o, launched in May as OpenAI's new flagship AI model, brings significant improvements to ChatGPT's spoken responses.

AI evolution

Current voice mode limitations and upcoming improvements

The existing version of Voice Mode in ChatGPT, available across both free and paid tiers, has been noted for its limitations.

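Voice conversations with ChatGPT currently have average latencies of 2.8 seconds (GPT-3.5) and 5.4 seconds (GPT-4) because of a three-step pipeline: one model transcribes audio to text, GPT-3.5 or GPT-4 generates a text reply, and a third model converts that text back to audio.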

OpenAI has acknowledged that this process loses information, since the main model cannot directly observe tone, multiple speakers, or background noise.

GPT-4o aims to address these issues through end-to-end training across text, vision, and audio.

AI advancement

GPT-4o model: A leap forward in conversational AI?

The GPT-4o model is designed to significantly reduce latency and improve results by processing all inputs and outputs with the same neural network.

This end-to-end training approach enhances the model's ability to handle interruptions and manage group conversations effectively.

Additionally, the GPT-4o model can filter out background noise and adapt to tone changes during conversations, promising a more natural and efficient conversational experience for users.