
OpenAI, the leading artificial intelligence (AI) firm, unveiled its generative large language model (LLM) GPT-4o at its Spring Update event in May.
GPT-4o is an 'omni' model with capabilities such as speech-to-speech assistance and voice guidance. Because it is natively multimodal, it can understand speech directly without first converting it into text.
OpenAI is now set to begin a limited 'alpha' roll-out of its Advanced Voice feature.
What is the GPT-4o Advanced Voice feature?
GPT-4o Advanced Voice is comparable to Moshi, the voice model recently unveiled by the French AI lab Kyutai. It can create custom character voices, generate sound effects while telling a story, and even act as a live translator.
This feature makes GPT-4o both faster and more accurate as a voice assistant, allowing it to pick up on tone and vocal nuances during a conversation. It also acts as a patient language teacher, with the ability to correct pronunciation and help improve accents.
Release date
The feature will roll out to an initial group of users in the coming weeks, according to OpenAI CEO Sam Altman. While he acknowledged that users have been waiting for access for a long time, he said safety testing is necessary. The feature will therefore reach trusted users first, with wider availability expected in the autumn.