OpenAI is trying to fix how AI works and make it more useful

OpenAI has done something nobody would have expected: it slowed down the process of giving you an answer in the hopes that it gets it right.

The new OpenAI o1-preview models are designed for what OpenAI calls hard problems: complex tasks in subjects like science, coding, and math. The new models are available through the ChatGPT service and through OpenAI's API. They're still in development, but the idea is promising.
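If you'd rather poke at the model through the API than through ChatGPT, here's a minimal sketch of what a call looks like, assuming the official `openai` Python SDK and the `o1-preview` model name from OpenAI's announcement; the exact parameters the preview model accepts may change while it's in development.

```python
# Minimal sketch: asking the o1-preview model a math word problem through
# OpenAI's Python SDK. Assumes the `openai` package is installed and an
# OPENAI_API_KEY is set in the environment. While in preview, the model may
# reject some familiar options (system messages, temperature, etc.).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {
            "role": "user",
            "content": "A train leaves at 3:15 PM traveling 60 mph. "
                       "How far has it gone by 4:45 PM? Show your reasoning.",
        }
    ],
)

print(response.choices[0].message.content)
```

The noticeable difference is the wait: the model spends extra "reasoning" work on the problem before it writes the visible answer, which is exactly the slowdown I'm talking about.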

I love the idea that one of the companies that made AI so bad is actually doing something to improve it. People think of AI as some sort of scientific mystery, but at its core, it's the same as any other complex computer software. There is no magic; a computer program accepts input and produces output based on how the software is written.

It seems like magic to us because we're used to seeing software output in a different way. When it acts human-like, it seems strange and futuristic, and that's really cool. Everyone wants to be Tony Stark and have conversations with their computer.

Unfortunately, the rush to release the cool, conversational type of AI has highlighted how bad it can be. Some companies call it a hallucination (not the fun kind, unfortunately), but no matter what label you put on it, the answers we get from AI are often hilariously wrong, or wrong in a more concerning way.

OpenAI says that its GPT-4o model was only able to solve 13% of the problems on a qualifying exam for the International Mathematics Olympiad. That's probably better than most people would score, but a computer should be able to do better at math. The new OpenAI o1-preview model solved 83% of the problems. That's a dramatic leap, and it highlights the effectiveness of the new models.

Thankfully, OpenAI is true to its name and has shared how these models "think." In an article about the reasoning capabilities of the new model, you can scroll to the "Chain-of-Thought" section to get a glimpse of the process. I found the Safety section particularly interesting because the model uses safety rails to make sure it's not telling you how to make homemade arsenic the way GPT-4 will (don't try to make homemade arsenic). Once the models are complete, this should help defeat the current tricks used to get conversational AI models to break their own rules.

Overall, the industry needed this. My colleague and Android Central managing editor Derrek Lee pointed out that it's interesting that when we want information instantly, OpenAI is willing to slow things down a bit, letting AI "think" to provide us with better answers. He's absolutely right. This feels like a case of a tech company doing the right thing even if the results aren't optimal. 

I don't think this will have any effect overnight, and I'm not convinced there is a purely altruistic goal at work. OpenAI wants its new LLM to be better at the tasks the current model does poorly. A side effect is a safer and better conversational AI that gets it right more often. I'll take that trade, and I'll expect Google to do something similar to show that it also understands that AI needs to get better.

AI isn't going away until someone dreams up something newer and more profitable. Companies might as well work on making it as great as it can be.

Jerry Hildenbrand
Senior Editor — Google Ecosystem

Jerry is an amateur woodworker and struggling shade tree mechanic. There's nothing he can't take apart, but many things he can't reassemble. You'll find him writing and speaking his loud opinion on Android Central and occasionally on Threads.
