ChatGPT has received a big update and can now “speak” using up to 5 different voices. It is one of the most important improvements that the OpenAI chatbot has received since its launch.
ChatGPT is about to change forever. OpenAI has announced an update to its artificial intelligence chatbot, ChatGPT, which will now be able to speak and hold conversations out loud with users. It is a drastic evolution that takes the platform to another level, since it will no longer be limited to text interactions.
Audio support will come to ChatGPT through its iOS and Android applications. Those who access the new version will be able to chat directly with the AI chatbot and choose its voice from 5 different options.
As explained by OpenAI, this ChatGPT feature is based on a new text-to-speech model that generates spoken responses from text and an audio sample of just a few seconds. This, combined with the work of professional voice actors, makes interactions with the artificial intelligence feel more human.
The developers further explained that the new version of ChatGPT uses the Whisper speech recognition system to transcribe users' spoken questions into text. In the following post on X (Twitter), you can see how a spoken conversation with the chatbot works.
Use your voice to engage in a back-and-forth conversation with ChatGPT. Speak with it on the go, request a bedtime story, or settle a dinner table debate.
It is important to mention that ChatGPT’s voice options will not be activated by default. Those who wish to chat with the chatbot through this channel must enable it manually in Settings > New features. Once this is done, they will be able to select what type of voice they want to give to the chatbot.
Please note that the voice features will roll out gradually, so not all users of the ChatGPT mobile apps will be able to use them from day one. OpenAI says the feature will reach ChatGPT Plus and Enterprise subscribers first, over the course of the next two weeks.
The company indicated that it would limit this technology to conversations with the chatbot within the apps to avoid malicious use. “New technology, capable of creating realistic synthetic voices from just a few seconds of real speech, opens the doors to many creative and accessibility-focused applications. However, these capabilities also present new risks, such as the possibility that malicious actors impersonate public figures or commit fraud,” said Sam Altman.
It is worth mentioning that, with this update, ChatGPT can not only listen and speak but also see. This means that users of the iOS and Android apps will be able to interact with the chatbot using photos. Thus, for example, they will be able to take an image and ask the AI for help to carry out a particular task.
An interesting detail is that, beyond capturing the photo itself, users will be able to “draw” on parts of it so that the AI focuses on them.
According to OpenAI, this new feature is powered by a multimodal platform that is based on both GPT-3.5 and GPT-4.
“Like other ChatGPT features, the vision is to help you with your daily life. It does this best when it can see what you see,” the developers note. However, the startup has also placed limits on this image-based feature. For example, it will not work when the photos are of people. “ChatGPT is not always accurate and these systems must respect the privacy of individuals,” OpenAI explained.