The most recent iteration of the technology behind OpenAI's AI chatbot ChatGPT has been made public.
GPT-4o is the name of the new model, which will be made available to all ChatGPT users, even those who are not subscribers.
It is faster than earlier models and has been programmed to sound chatty and sometimes even flirtatious in its responses to prompts.
The new version can read and discuss images, translate languages, and identify emotions from visual expressions. There is also memory so it can recall previous prompts.
It can be interrupted and it has an easier conversational rhythm - in the demo, there was no delay between asking it a question and receiving an answer.
During a live demo using the voice version of GPT-4o, it provided helpful suggestions for how to go about solving a simple equation written on a piece of paper - rather than simply solving it. It analysed some computer code, translated between Italian and English, and interpreted the emotions in a selfie of a smiling man.
Using a warm American female voice, it greeted its prompters by asking them how they were doing. When paid a compliment, it responded: “Stop it, you’re making me blush!”
It wasn’t perfect – at one point it mistook the smiling man for a wooden surface, and it started to solve an equation that it hadn’t yet been shown. This unintentionally demonstrated that there’s still some way to go before the glitches and hallucinations which make chatbots unreliable and potentially unsafe can be ironed out.
But what it does show us is the direction of travel for OpenAI, which I think intends GPT-4o to become the next generation of AI digital assistant, a kind of turbo-charged Siri or Hey, Google which remembers what it’s been told in the past and can interact beyond voice or text.