ChatGPT rolls out voice and image prompts

Last Updated : 13 Oct, 2023

OpenAI is set to revolutionize the way we interact with artificial intelligence through an innovative update to ChatGPT. This upgrade allows users to engage with the AI bot not just through text input but also via voice commands and image uploads, promising a more immersive experience. These features will be available to subscribers within the next two weeks and soon to the wider public. What sets this development apart is the potential for greatly improved responses, thanks to enhanced underlying technology, aligning with the industry trend of utilizing Large Language Models (LLMs) for virtual assistants. OpenAI’s ChatGPT is expected to lead the way, offering a more natural and efficient means of interacting with AI, and setting up a new way of conversation that blends the familiarity of voice commands with the power of advanced language models.

ChatGPT-Image-and-Voice-Prompt-Launched-copy

When can you start using the new ChatGPT Features

ChatGPT features will first be available to ChatGPT Plus and Enterprise users within the next two weeks, with developers gaining access soon. To activate the Voice Feature, users will simply need to navigate to the ‘Settings’ menu within the ChatGPT mobile app, opt into voice conversations, and select their preferred voice by tapping the headphone button on the home screen. Notably, the Voice feature will roll out on an opt-in beta basis for ChatGPT app users, while Image Search will be seamlessly enabled by default across all platforms.

ChatGPT Deploying Image and Voice Capabilities

Note: The voice and images in ChatGPT will be available to Plus and Enterprise users over the next two weeks.

ChatGPT will now be able to answer in Five Different Voices

ChatGPT is taking a significant leap forward in user interaction by introducing a range of five separate voices that users can choose from based on their personal preferences. By teaming up with professional voice actors, OpenAI has carefully crafted each of these voices and relies on their proprietary Whisper speech recognition system to precisely translate spoken words into text.

ChatGPT can now generate Human Like Audio from Text

ChatGPT will now be able to transform plain text in just a few seconds of speech samples into highly human-like audio. This innovation will not only enhance the conversational experience but will also open up exciting possibilities for creative applications and improved accessibility across various domains.

Spotify to Collaborate with ChatGPT

Spotify has joined forces with the AI startup to facilitate the translation of podcasts into additional languages, preserving the authentic voice of the podcasters in the process. This collaboration showcases the versatility and real-world impact of ChatGPT’s voice capabilities in diverse settings.

ChatGPT will Answer Image Questions

Within ChatGPT, the image prompt works as a visual search tool, requiring users to select an image of the object they want by clicking on the camera icon adjacent to the text input bar. This prompts ChatGPT to perform an analysis of the image’s components to find out the user’s intent. To streamline the process, users have the option to include text queries alongside the image. For more correct responses, users can maintain the conversation by providing follow-up images.

How to use New Voice and Image Prompts in ChatGPT

Snap pictures of a Landmark Image while traveling and ask ChatGPT “What’s interesting about it”.
Snap pictures of a fridge and ask ChatGPT to figure out “what’s for dinner”.
Ask a bedtime story from ChatGPT.
Snap a picture of your device that is not working, mark it, and ask why it is not working
Ask ChatGPT to analyze work-related data from a complex graph.

How to Speak with ChatGPT

Step 1: Open ChatGpt

Step 2: Go to Settings

Step 3: Click on New Features on the Mobile App

Step 4: Enable Voice Conversations

Step 5: Tap on the Headphone Button

Step 6: Select your Preferred Voice

How to Speak with ChatGPT | Image Source: OpenAI

How to use the ChatGPT Image Prompt

Step 1: Open ChatGPT

Step 2: Tap the Photo Button and Choose an Image

Step 3: Tap on the Plus Button

Step 4: Use Drawing to Mark the Photo

Step 5: Speak or Type a Question

How to use the ChatGPT Image Prompt | Image Source: OpenAI

Conclusion

In conclusion, OpenAI’s commitment to building safe and beneficial Artificial General Intelligence (AGI) is evident in their gradual approach to releasing advanced AI models, particularly those involving voice and vision technologies. This careful strategy allows for ongoing improvements and risk mitigation refinements, and crucially, it ensures that users are well-prepared for the advent of more powerful AI systems in the future. The introduction of the new voice technology, capable of generating lifelike synthetic voices, brings forth a world of creative and accessibility-focused possibilities, while simultaneously necessitating vigilance against potential misuse. OpenAI’s responsible application of this technology, as seen in the collaboration with voice actors and companies like Spotify for Voice Translation, demonstrates their commitment to ethical and secure usage. Similarly, in the realm of image input, OpenAI’s proactive testing and collaboration with various stakeholders underscore their dedication to addressing the unique challenges posed by vision-based models, ultimately striving for responsible and beneficial AI deployment across diverse domains.

Suggest improvement

Top 10 ChatGPT Prompts for Teachers

Share your thoughts in the comments