Today’s OpenAI live demo of ChatGPT-4o is utterly mind-blowing next-level Star Trek magic
ChatGPT-4o launch event
This is a summary of today’s launch event (May 13, 2024) for the latest ChatGPT-4o, a so-called omnimodal assistant (hence the “o” in “4o”) with the ability to talk and see via your phone’s camera, plus a new assistant app for your computer.
Some of the examples included, in no particular order:
Truly realtime, essentially lag-free voice conversation with the ability to interrupt
While the previous ChatGPT voice chat worked well, it was decidedly laggy and didn’t allow for a truly human conversational flow with interruptions. That flow is now fully realized.
Broad dynamic range of voices even within a single voice type
Each of the various voice types is now able to emulate different tones, emotions, and speaking styles, with such wide-ranging (and utterly pointless but fun) examples as “robotic” or “singing.”
Realtime voice and video empathy and emotional intelligence
Not only is the voice assistant able to interpret the emotion of your own voice, but if looking at you via your phone camera while speaking, it is able to accurately identify your emotional state based on your facial expressions.
Ability to use your phone camera in realtime while chatting with ChatGPT
You are now able to point your phone at a piece of paper while writing, for instance, mathematical equations, and ChatGPT will understand what you’re writing, as you’re writing, and discuss the topic as you go.
Computer assistant that can see what you’re working on for dynamic interaction
There is a new computer app (the demo showcased a macOS app) to which you can grant full screen-sharing permissions so that ChatGPT can follow along as you’re working, and with which you can engage in full, free-flowing, dynamic, human-like conversation. ⚠️ This raises serious privacy and security questions.
Realtime Star Trek communicator or Her’s Samantha language translation
No longer an unrealized figment of science fiction, the realtime, lag-free dynamic voice chat capability at last allows for truly realtime language translation between multiple speakers.
AI web search
Similar to Perplexity, ChatGPT can now do realtime Internet searches as a legitimate alternative to Google.
Thoughts
While the Star Trek analogy is certainly apt (think the Enterprise’s Main Computer, or our favorite android, Lt. Cmdr. Data), the truth is that this latest iteration of ChatGPT is more akin to the extraordinarily prescient film “Her,” which debuted a whopping 11 years ago and showcased Samantha, an on-phone, always-on (via an AirPod-like device) AI personal assistant. (Little surprise that Apple is rumored to be baking ChatGPT into its next release of iOS in the fall.)
The speed and quality improvements and the practical real-world value of true AI are absolutely astonishing and unmatched, in any domain or industry vertical, in human history.
Where things will be five to ten years hence lies somewhere within an unbounded realm of imagination scarcely touched upon even by science fiction. While Star Trek imagined voyaging “where no one has gone before,” the true journey, for better or worse, will not (yet) be out in the cosmos, but rather firmly on Earth, in our phones.