OpenAI Unveils Major ChatGPT Update with 'See, Hear and Speak' Capabilities
Rolf van Root/Unsplash
OpenAI has announced a major ChatGPT update that introduces voice comprehension, spoken responses, and image processing, a significant step toward more natural AI communication and a richer user experience.
One of the key highlights of the update is the ability for users to hold voice conversations with ChatGPT through the mobile app, choosing from five different synthetic voices to personalize the experience. The update also lets users share images with ChatGPT for analysis, including highlighting specific areas of an image for the model to focus on.
These features will roll out for paying users over the next two weeks, with voice functionality limited to iOS and Android devices. Image processing capabilities will be available on all platforms.
Major players like Microsoft, Google, and Anthropic are locked in an AI arms race, vying to incorporate generative AI technology into consumers' daily lives. Google recently announced a series of updates to its Bard chatbot, while Microsoft integrated visual search capabilities into Bing.
OpenAI has garnered substantial support from its partners, most notably Microsoft, which invested an additional $10 billion in the company earlier this year, the largest AI funding deal of the year. OpenAI's valuation stands between $27 billion and $29 billion, following a reported $300 million share sale in April. That funding round attracted prominent investors such as Sequoia Capital and Andreessen Horowitz.
Despite the exciting advancements, the introduction of AI-generated synthetic voices has raised concerns. While such voices make interaction feel more natural, they also heighten the risk of deepfakes, and both researchers and cyber threat actors have begun exploring how deepfakes can be used to undermine cybersecurity systems.
OpenAI addressed these concerns in its announcement, emphasizing that the synthetic voices were created in collaboration with voice actors rather than sourced from unidentified individuals. However, the release offered little detail on how OpenAI uses consumer voice inputs or safeguards user data.
OpenAI asserts that it does not retain audio clips and that these clips do not contribute to model improvements. However, the company acknowledges that transcriptions are considered inputs and may be used to improve its large language models.