Human or Machine? OpenAI's GPT-4o offers human-like experience with audio & video 

OpenAI recently unveiled a significant upgrade to its ChatGPT platform, introducing the GPT-4o model, where the “o” stands for “omni.” This major update enables the chatbot to interpret and engage with video and audio content in real time, leading to a more fluid and human-like conversational experience.

OpenAI introduced its newest AI model, GPT-4o, designed to be a more conversational and lifelike AI chatbot. This advanced model interprets both audio and video inputs and provides real-time responses.

In a series of demonstrations, OpenAI showcased the potential of GPT-4 Omni in enhancing users’ daily lives. The versatile AI model assists with tasks such as interview preparation, offering guidance on appearance and presentation. In another demonstration, GPT-4 Omni simulates a call to a customer service agent to secure a replacement iPhone for a user.

Further demonstrations reveal a range of capabilities offered by GPT-4 Omni. The AI model showcases its playful side, sharing dad jokes and engaging in lighthearted user interactions. 

It also demonstrates proficiency in the real-time translation of a bilingual conversation and serves as an unbiased referee in a rock-paper-scissors match between users, adding an element of fun and interaction. Moreover, when prompted, the AI model injects its responses with sarcasm.

Another demonstration showcases GPT-4 Omni’s capacity for empathy and engagement as it interacts with a user’s puppy for the first time.

“Well hello, Bowser! Aren’t you just the most adorable little thing?” the chatbot exclaimed.

“It feels like AI from the movies, and it’s still a bit surprising to me that it’s real,” said the firm’s CEO, Sam Altman, in a May 13 blog post.

“Getting to human-level response times and expressiveness turns out to be a big change.”

OpenAI announced in a post that a version of GPT-4 Omni limited to text and image inputs was released on May 13, with the full version of the model set to roll out in the coming weeks.

OpenAI revealed that its latest AI model, GPT-4o, caters to both premium and non-paying ChatGPT users and will also be accessible through OpenAI’s API.
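For developers, using the model would look like an ordinary chat completion request with the model name set to “gpt-4o.” The snippet below is a minimal sketch assuming the official OpenAI Python SDK (v1.x) and an OPENAI_API_KEY environment variable; the prompt text is illustrative only.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative prompt; any chat-style request works the same way.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain what the 'o' in GPT-4o stands for."}],
)
print(response.choices[0].message.content)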

In a recent statement, OpenAI explained that the “o” in GPT-4o derives from the term “Omni,” reflecting the company’s intention to move closer to a future where human-computer interactions become more seamless and instinctive.

GPT-4o simultaneously handles inputs from text, audio, and image sources, representing a considerable advancement over previous OpenAI models such as GPT-4, which often “loses a lot of information” when forced to multitask.
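To illustrate the mixed-input idea, the same chat completions call can carry message content made of text and image parts. This is a sketch under the same assumptions as above (OpenAI Python SDK v1.x); the image URL is a placeholder, and audio input is omitted since, per the announcement, the full audio capabilities were still to roll out in the following weeks.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical example: ask GPT-4o about an image alongside a text prompt.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is happening in this photo."},
                # Placeholder URL -- replace with a real, publicly reachable image.
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)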

According to OpenAI, GPT-4o sets itself apart from current AI models with its significantly improved capabilities in comprehending visual and auditory inputs. The advanced model detects a user’s emotional cues and breathing patterns.

It is also much faster and 50% cheaper than GPT-4 Turbo in OpenAI’s API.

OpenAI claims GPT-4o boasts an impressive response time to audio inputs, taking as little as 232 milliseconds and averaging 320 milliseconds. The company equates this rapid response rate to human reaction times in everyday conversations, highlighting the model’s enhanced ability to engage in natural dialogue.
