ChatGPT can now see, hear, speak!

San Francisco: OpenAI, the company led by Sam Altman, has announced that it is rolling out new voice and image capabilities in ChatGPT, allowing the AI chatbot to see, hear and speak.

These capabilities offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about, the company said Monday in a statement.

“Voice mode and vision for chatGPT! really worth a try,” Altman posted on X.

The company said it is rolling out voice and images in ChatGPT to Plus and Enterprise users over the next two weeks.

“Voice is coming on iOS and Android (opt-in in your settings) and images will be available on all platforms,” said the Microsoft-backed company.

The new voice capability is powered by a new text-to-speech model, capable of generating human-like audio from just text and a few seconds of sample speech.

“We collaborated with professional voice actors to create each of the voices. We also use Whisper, our open-source speech recognition system, to transcribe your spoken words into text,” said OpenAI.

Image understanding is powered by multimodal GPT-3.5 and GPT-4. These models apply their language reasoning skills to a wide range of images, such as photographs, screenshots, and documents containing both text and images.

The new voice technology opens doors to many creative and accessibility-focused applications.

However, “these capabilities also present new risks, such as the potential for malicious actors to impersonate public figures or commit fraud,” the company noted.

“This is why we are using this technology to power a specific use case — voice chat. Voice chat was created with voice actors we have directly worked with,” it added.

Spotify is using this technology to pilot its Voice Translation feature, which helps podcasters expand the reach of their storytelling by translating podcasts into additional languages in the podcasters’ own voices.

“We’ve also taken technical measures to significantly limit ChatGPT’s ability to analyze and make direct statements about people since ChatGPT is not always accurate and these systems should respect individuals’ privacy,” said the company.

IANS
