The map

Are people fed up with chatbots?

Yes, sort of. Bad implementations damage brand reputation; good ones can improve it. What does this mean for DTC brands?

Justin Thompson | June 4, 2026 | 6 min read

Most CX leaders are not asking whether AI can answer customer questions. They know it can.

The harder question is whether customers will accept it, especially when something has gone wrong.

That is where the AI support conversation gets messy. The same technology can make one brand feel faster and another feel colder. It can take repetitive work out of the queue, or it can put a brittle layer between an already-frustrated customer and the person who can actually help.

So the useful question is not “are people fed up with chatbots?” exactly. It is: which chatbot experiences are customers rejecting, and which ones are they fine with?

Search data points the same way. On Google Trends' 0 to 100 interest scale, “speak to a human” peaked at 77 in April 2026, up from a baseline of around 30. Every phrase in the “get me past the bot” cluster bends upward from August 2025. People are not only annoyed in surveys. They are actively trying to route around AI in customer service.

But the public record is not one clean story. There are high-profile failures, and there are quieter deployments customers seem to like. Reading both is the closest thing to predictive data a DTC CX leader has.

The big rollouts to learn from

Klarna. Announced in February 2024 that an OpenAI-built assistant was doing the work of 700 CS agents in its first month. Fourteen months later, CEO Sebastian Siemiatkowski told Fortune that “cost was a too predominant evaluation factor” and that Klarna was investing in human support again.

Air Canada. A customer asked the airline’s chatbot about bereavement fares. The chatbot invented a refund policy. When Air Canada refused to honor it, the customer took them to British Columbia’s Civil Resolution Tribunal, which ruled against the airline in February 2024.

Cursor. The AI-coding-tool company’s own support bot invented a one-device policy in May 2025. There was no such policy. Subscribers churned and Fortune ran the story.

Chipotle (notable mention). Not reputational damage, just funny. In March 2026 users discovered Chipotle’s support bot, “Pepper,” would solve LeetCode and write Python. Someone shipped an OpenAI-compatible proxy so the internet could use Chipotle’s compute for free coding help. The brand had shipped an LLM in production without scope guardrails, and the internet noticed within weeks.

Image: Chipotle’s “Pepper” support bot answering a LeetCode coding question.

When the AI broke, customers did not blame the vendor. They blamed the brand. The Air Canada tribunal made it explicit: the chatbot’s misinformation was the airline’s.

What separates the failures from the wins

What separates a success from a viral failure isn’t budget or vendor. It’s whether the work needs human judgment.

Bank of America’s Erica. Live since 2018, past 2.5 billion interactions and 56 million users by August 2025. Erica handles banking actions customers were already doing in the app, just faster.

Lyft + Anthropic. Lyft reported in 2025 that integrating Claude cut average customer service resolution time by 87% on handled cases. Most of those cases are real-time dispatch problems that human teams could not scale to cover.

Sephora. Shade matching, virtual try-on, and product discovery, across a deployment that has run for nine years. Work no human was ever going to staff at that volume.

Quick low-stakes answers. Order status, store hours, return policy basics, password resets. Customers prefer the bot here because the alternative is waiting on hold for a human to read them a tracking number. No emotional load, no policy interpretation, no judgment call. The bot just has to be faster than the form.

The wins sit where humans weren’t going to do the work (banking self-serve, dispatch, retail-scale browsing) or didn’t need to (status lookups, hours, policy basics). The losses are complex cases the chatbot was never going to solve. Those leave the customer feeling they have to fight through the bot to reach a human.

What does this mean for DTC?

The lesson is not “avoid customer-facing AI.” It is to understand which side of the line your tickets sit on before you deploy.

Quick lookups (order status, tracking, store hours, returns policy basics) sit in the wins category. Deploy there and take the speed. The tickets customers escalate over (refund disputes, damaged orders, subscription cancellations, anything with emotional load) sit in the losing category. AI on those either fails publicly, as Klarna’s did, or routes most of the work to a human anyway.

The deployment that quietly works does both. The bot handles the easy mix. Humans handle the hard one. Most brands skip that diagnostic and then wonder why one blended deflection number does not match what customers are feeling.
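A minimal Python sketch of that diagnostic, assuming tickets already carry a category label. The category names, the `Ticket` class, and the `route` function are hypothetical, chosen to mirror the examples above rather than any vendor's API.

```python
from dataclasses import dataclass

# Hypothetical category labels, mirroring the examples in the text.
BOT_ELIGIBLE = {"order_status", "tracking", "store_hours", "returns_policy"}
HUMAN_ONLY = {"refund_dispute", "damaged_order", "subscription_cancellation"}


@dataclass
class Ticket:
    category: str
    customer_asked_for_human: bool = False


def route(ticket: Ticket) -> str:
    """Return "bot" or "human" for a ticket."""
    # Escape hatch first: an explicit request for a person always wins.
    if ticket.customer_asked_for_human:
        return "human"
    if ticket.category in BOT_ELIGIBLE:
        return "bot"
    # HUMAN_ONLY categories and anything unrecognised default to a human.
    return "human"
```

Defaulting unknown categories to a human is a deliberate choice in this sketch: a misrouted easy ticket costs a few minutes of agent time, while a misrouted hard ticket is the kind of failure described earlier.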

What are you optimising for?

Deflection rate is what vendors lead with. It is not customer experience. You can drive deflection up and CSAT down at the same time. Klarna did, then walked it back.
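One way to see the two numbers pull apart is to report deflection and CSAT per ticket segment instead of as one blended figure. A rough Python sketch follows; the field names (`segment`, `deflected`, `csat`) and the sample rows are invented for illustration and are not data from any of the sources.

```python
from collections import defaultdict


def segment_metrics(tickets):
    """Compute deflection rate and average CSAT per segment.

    tickets: iterable of dicts with keys 'segment' (str),
    'deflected' (bool) and 'csat' (1-5 score). All hypothetical.
    """
    totals = defaultdict(lambda: {"n": 0, "deflected": 0, "csat_sum": 0})
    for t in tickets:
        s = totals[t["segment"]]
        s["n"] += 1
        s["deflected"] += int(t["deflected"])
        s["csat_sum"] += t["csat"]
    return {
        seg: {
            "deflection_rate": s["deflected"] / s["n"],
            "avg_csat": s["csat_sum"] / s["n"],
        }
        for seg, s in totals.items()
    }


# Made-up sample: the bot deflects one hard ticket, and that
# customer rates the experience poorly.
sample = [
    {"segment": "easy", "deflected": True, "csat": 5},
    {"segment": "easy", "deflected": True, "csat": 4},
    {"segment": "hard", "deflected": True, "csat": 1},
    {"segment": "hard", "deflected": False, "csat": 4},
]
metrics = segment_metrics(sample)
```

In this toy data the blended deflection rate is 75%, which hides the fact that the hard segment is where satisfaction drops.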

Targeted deployments with clear escape hatches to a human work. Blanket rollouts don’t.

Sources

Klarna press release, February 27, 2024. Initial announcement: AI assistant handling two-thirds of CS chats, work of 700 FTEs in first month.
Fortune, May 9, 2025. Sebastian Siemiatkowski walk-back: “cost was a too predominant evaluation factor,” Klarna investing in human support again.
CBC, February 2024. Air Canada chatbot ruling: BC Civil Resolution Tribunal held airline liable for chatbot’s invented refund policy.
Fortune, May 2025. Cursor AI support bot inventing one-device policy, subscriber churn.
cyberpapiii/chipotlai-max on GitHub. March 2026 Chipotle “Pepper” support bot used as general LLM, OpenAI-compatible proxy shipped.
The Register, May 2026. 74% of firms have rolled back at least one customer-facing AI deployment in the last 12 months.
Bank of America newsroom, August 2025. Erica surpassed 2.5B interactions, 56M active users, seven years live.
Anthropic, 2025. Lyft + Claude integration cut average customer service resolution time by 87% on handled cases.
Sephora newsroom. Nine-year chatbot deployment, expansion to ChatGPT app.
Gorgias 2026 State of Conversational Commerce. 16,000 brands, 350M conversations. 86% of AI conversations eventually involve a human.
UJET 2026 Agentic Experience Orchestration white paper. 85% of consumers prefer human agents over AI (Metrigy).
Google Trends data (June 2026). “Speak to a human” peaked at 77 in April 2026; “live customer support” sustained 84-89 since December 2025; all rebellion terms inflect in August 2025.

Part of the AI in customer service: the map series
