Marketoonist: "AI Hallucinations and Reliability" cartoon

Weekly hand-drawn business cartoon from Marketoonist Tom Fishburne

Welcome back to Marketoonist, the cartoon I’ve been hand-drawing to poke fun at marketing and business nearly every week since 2002. Was this email forwarded to you? Please subscribe here.

AI Hallucinations and Reliability

The NYT published a fascinating article last month on the conundrum of AI accuracy and reliability. They found that even as AI models were getting more powerful, they generated more errors, not fewer.

In OpenAI’s own tests, their newest models hallucinated at higher rates than their previous models. One of their benchmarks, SimpleQA, is based on general questions. OpenAI found that o3, their most powerful model, hallucinated 51% of the time, up from 44% for their earlier o1 model.

In their PersonQA test, based on questions about public figures, the o3 model hallucinated 33% of the time, about double the rate of their earlier o1 model.

Some of this growing problem relates to the nature of reasoning systems, as AI works through more complex problems in multiple steps, compounding the errors of each step.
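
To make the compounding concrete, here is a back-of-the-envelope sketch in Python. The 95% per-step accuracy and the assumption that each step fails independently are illustrative choices, not figures from the NYT article or OpenAI’s tests, and the function name is hypothetical.

```python
def chain_accuracy(step_accuracy: float, steps: int) -> float:
    """Chance that every step in a multi-step chain is correct.

    Assumes (for illustration only) that each step is right with the same
    probability and that steps fail independently of one another.
    """
    if not 0.0 <= step_accuracy <= 1.0:
        raise ValueError("step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return step_accuracy ** steps


if __name__ == "__main__":
    # Hypothetical per-step accuracy of 95%, over chains of different lengths.
    for steps in (1, 5, 10, 20):
        accuracy = chain_accuracy(0.95, steps)
        print(f"{steps:>2} steps at 95% each: {accuracy:.0%} chance of no errors")
```

Under those assumptions, a 10-step chain comes out fully correct only about 60% of the time, and a 20-step chain about 36% of the time, even though each individual step looks reliable on its own.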

Amr Awadallah, former Google exec and CEO of Vectara, claims that hallucinations are just part of the nature of AI models. As he put it:

“Despite our best efforts, they will always hallucinate. That will never go away.”

Last month, I wrote about the “Garbage In, Garbage Out” challenge of AI systems. I quoted Greg Kihlström, who termed the outputs “confident nonsense.”

With AI adoption full steam ahead, business leaders urgently need to figure out how to work around “confident nonsense.” Yet 64% of marketing teams are adopting AI without an AI roadmap or strategy, according to the Marketing AI Institute.

Some are trying to solve the hallucination problem by adding multiple AI systems to fact-check each other. Yet with each AI model bringing its own baggage, I’ve heard this described as a “turtles all the way down” problem, which inspired this week’s cartoon.

I like how Pratik Verma, CEO of Okahu, framed the challenge:

“You spend a lot of time trying to figure out which responses are factual and which aren’t. Not dealing with these errors properly basically eliminates the value of AI systems, which are supposed to automate tasks for you.”

Keynote Speaking

I’m taking a keynote speaking break for the next two months, but I’m starting to plan some fun events for later this year, including the BAM Marketing Congress in Belgium.

As always, please let me know if you’d like to talk about any events you’re planning (or know of) that you think could be a good fit.

For an idea of my approach to keynotes, here’s a full 45-minute keynote from one of my favorite events last year, the Gartner CMO Symposium in Denver.

Cartoon From The Archives

Here’s a cartoon I drew on a similar dynamic in 2015:

Thank you for all of your support (and cartoon material)!

-Tom

P.S. If you like these marketoons, here are a few ways to help:

  1. Bring me into your company to speak

  2. License cartoons for presentations or more (if a picture tells a thousand words, a marketoon tells a thousand boring PowerPoint slides)

  3. Forward this newsletter to a friend with an invitation to subscribe: marketoonist.com/newsletter

  4. Buy my latest book

  5. Collaborate with me on cartoons for marketing, culture change, or thought leadership

  6. Just hit reply and say hello

About Marketoonist

Marketoonist is the thought bubble of me, Tom Fishburne. I first started drawing cartoons as a student for the Harvard Business School newspaper (not quite as well-known for humor as the Lampoon) and later started this newsletter from a General Mills cubicle in 2002. The cartoons have followed my career ever since. I poke fun at the ever-changing world of marketing and business because I believe that laughing at ourselves can help us do our best work.