OpenAI’s latest reasoning AI models are seeing things

OpenAI’s brand new o3 and o4-mini AI models are cutting-edge in many ways. But here’s the catch: these models are seeing things, confidently making stuff up, and they do it more often than some of OpenAI’s older models.

Hallucinations have always been a tough nut to crack, even for the most advanced AI systems out there today. Historically, each new model has hallucinated a bit less than its predecessor. But with o3 and o4-mini, that trend appears to be reversing.

### More Hallucinations, More Problems
According to OpenAI’s internal tests, the new reasoning models hallucinate more frequently than the company’s older reasoning models, such as o1, o1-mini, and o3-mini, as well as its traditional non-reasoning models like GPT-4o.

### Why is This Happening?
The ChatGPT maker is scratching its head over why these models hallucinate more often. In its technical report, OpenAI admits that more research is needed to understand why hallucinations are on the rise as reasoning models scale up.

### The Numbers Don’t Lie
Testing revealed that o3 hallucinated in response to 33% of questions on PersonQA, OpenAI’s in-house benchmark for measuring the accuracy of a model’s knowledge about people, roughly double the rate of its predecessor reasoning models. And o4-mini did even worse, hallucinating 48% of the time.
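To make those percentages concrete, here is a minimal sketch of how a hallucination rate like this is typically computed: the share of benchmark questions whose answers contain at least one fabricated claim. The function and field names below are illustrative assumptions, not OpenAI’s actual evaluation code.

```python
# Hypothetical sketch: hallucination rate = fraction of benchmark questions
# whose answers were judged to contain at least one fabricated claim.
# The "has_fabricated_claim" field is an assumed label, e.g. from human or
# model-based grading of each answer.

def hallucination_rate(results: list[dict]) -> float:
    """results: one dict per benchmark question, each with a boolean
    'has_fabricated_claim' flag for its answer."""
    if not results:
        return 0.0
    hallucinated = sum(1 for r in results if r["has_fabricated_claim"])
    return hallucinated / len(results)

# Example: 33 flagged answers out of 100 questions -> 33%.
sample = [{"has_fabricated_claim": i < 33} for i in range(100)]
print(f"{hallucination_rate(sample):.0%}")  # prints 33%
```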

It’s a bit concerning that these models are making up events, actions, and information that never actually happened. But hey, maybe a little imagination isn’t always a bad thing, right?

Now, the race is on to find a solution before these hallucinations get even more out of control. Stay tuned for more updates on this trippy AI journey!
