Passed the Audit, Failed the Customer
A contact centre had been transcribing every call for months. Nobody was reading them. Everybody assumed somebody else was, or would, or had a plan to.

Every year, organisations accumulate more data than they know what to do with. The assumption – reassuring, expensive, and nearly always wrong – is that this is a technology problem. The warehouse is wrong. The dashboards aren't built. The pipeline is immature. A vendor is duly booked.
It is almost never a technology problem. Data does not arrive with a purpose attached; someone has to bring one, and the bringing is harder than the building.
This is the story of a mountain of transcripts that arrived in search of a question, and the mildly awkward answer that came back when somebody thought to ask one: that a substantial number of calls were passing the quality audit while quietly failing the customer. The two statements are not contradictory – which is the interesting bit.
A Mountain in Search of a Question
The contact centre had recently enabled automatic transcription. Around 4,200 calls and 400 chats a week, handled by eighteen agents, all of it transcribed, summarised, and filed somewhere.
The summaries were well received. You could read a conversation from Tuesday in fifteen seconds and look, to a colleague, as though you had been paying attention. This is not a small thing in an office, and nobody wanted to give it up.
The transcripts themselves were a different matter. They accumulated quietly on disk, gigabyte by gigabyte, and nobody could quite say what the point of keeping them was, beyond the conviction that surely something useful would emerge. "Surely something useful" is where most enterprise data projects go to die.
We were asked, in effect, to find that something – or, less flatteringly and more honestly, to establish whether there was a something to find.
For the first month there was no solution, because there was no problem. There was only a pile of conversations and the unromantic question: what, if anything, are these good for?
What the Transcripts Were Quietly Saying
When you read enough contact centre transcripts – and we read a great many – patterns appear that nobody has previously noticed, for the slightly depressing reason that nobody has previously bothered to look.
A customer calls to ask whether a branch is open on Saturday. The agent answers: open until two, have a lovely weekend. Ninety seconds. In the middle of those ninety seconds, the customer mentions – in the tone normally reserved for remarking on the weather – that they have two mortgages coming up for renewal. The agent does not offer to book a mortgage specialist. The customer does not ask. The call ends. Both parties hang up feeling that the exchange went well, because by the standards they were each keeping, it did.
A second call. A customer rings because their card has been declined. The agent explains, politely and correctly, that the card is at its limit; a payment would restore available credit; have a pleasant afternoon. No consolidation conversation. No suggestion of an alternative arrangement. No enquiry, even an idle one, about whether this is a one-off or the shape of a broader problem. Balance, goodbye.
Neither call was bad. Both agents were professional, accurate, and fast. A traditional quality audit – of which this organisation ran one per quarter, on a curated sample of a few dozen calls – would have scored them fine.
And that is the quietly interesting thing. Both calls passed the audit. Both calls failed the customer. These turn out to be descriptions of entirely different things, measured in entirely different ways, and the audit was only ever measuring one of them.
It is worth pausing on why the agents behaved as they did, because "they should have spotted it" is a comfortable explanation that happens not to be true.
Agents in a well-run contact centre have been trained, measured, and occasionally praised for not being pushy. Pivoting from "is my branch open Saturday?" to "shall I book you a mortgage appointment?" is the move of a telemarketer, not a helpful professional. The agent who resisted the pivot was not being negligent. They were being polite – and the metrics they were judged on (handle time, first-call resolution, customer satisfaction in the moment) were quietly rewarding the politeness. The audit wasn't so much overlooking missed opportunities as gently paying for them.
This is the part of the operation nobody measures, because measuring it requires you to notice something that did not happen. Quarterly audits ask whether the agent greeted the caller by name. They do not, and cannot, ask whether the agent spotted the mortgage. Leadership, listening to a small curated sample and receiving generally favourable reports, held a broadly favourable view of service quality. The view was not wrong, exactly. It was answering a different question.
An Auditor That Doesn't Sleep, and Can't Be Flattered
Month two was the build. Using the organisation's existing Microsoft and Azure stack, with a generative AI model doing the reading, we put in place a system that examined every transcript – not a sample, not a selection, every one – for three things (a minimal sketch of the check itself follows the list):
Completeness of service. Was the customer's stated need handled, and handled well?
Missed opportunities. Did the conversation contain signals – renewals, life events, financial stress, upcoming purchases – that warranted a second conversation and didn't receive one?
Captured opportunities. Where the agent had spotted and actioned a signal, the system credited them.
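For the technically curious, the per-transcript check is, in shape, roughly the sketch below. It is a minimal sketch rather than the production system: the endpoint, deployment name, prompt, and output schema are all illustrative, and it assumes the Azure OpenAI Python SDK rather than whatever orchestration the real build used.

```python
import json
from openai import AzureOpenAI  # openai v1 SDK with Azure support

# Illustrative endpoint, key handling, and API version - not the real ones.
client = AzureOpenAI(
    azure_endpoint="https://example.openai.azure.com",
    api_key="<redacted>",
    api_version="2024-02-01",
)

REVIEW_PROMPT = """You are reviewing a contact-centre transcript. Return JSON with:
- "complete": was the customer's stated need handled, and handled well? (true/false)
- "missed": signals (renewals, life events, financial stress, upcoming purchases)
  that warranted a second conversation and did not get one (list of strings)
- "captured": signals the agent spotted and actioned (list of strings)
Report only signals actually present in the transcript."""

def review_transcript(transcript: str) -> dict:
    """Run one transcript through the three-part review and parse the JSON verdict."""
    response = client.chat.completions.create(
        model="gpt-4o",  # an Azure deployment name, assumed here
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": REVIEW_PROMPT},
            {"role": "user", "content": transcript},
        ],
    )
    return json.loads(response.choices[0].message.content)
```

The interesting engineering is not in that function. It is in what happens to its output, which is where the three design choices below come in.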
That last item is not decorative. A system that only reports what you did wrong becomes, very quickly, a system that people route around. A system that also reports what you did well is one they can live with and, in time – not always gratefully, but genuinely – one they can use.
Three design choices made the rest of it work, and all three were really the same choice rephrased.
1. The agent hears about it first.
When the model identified a missed opportunity, the alert went to the agent who had handled the call, not to their manager. The agent then had a window to do what the call had not โ ring the customer back, make the referral, book the appointment, or flag the record if the system had misread the conversation. Only afterwards did the item move up the reporting chain.
The distinction may appear procedural. It is not. A system that ambushes people with their failures is experienced, quite correctly, as surveillance. The same system, giving them first sight of the same finding and letting them correct it, is experienced as coaching. The data is identical. The sequence is the entire difference. Most of what fails in these projects fails here, not in the model.
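The routing rule itself is almost embarrassingly small, which is rather the point: the design choice lives in the sequence, not the code. A minimal sketch, with a hypothetical Flag record and an illustrative two-day window (the real window is a policy decision, not shown here):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

AGENT_WINDOW = timedelta(days=2)  # illustrative; the real window is a policy choice

@dataclass
class Flag:
    call_id: str
    agent_id: str
    finding: str             # e.g. "mortgage renewal mentioned, no referral offered"
    raised_at: datetime
    resolved: bool = False   # agent rang back, made the referral, booked the appointment
    contested: bool = False  # agent says the model misread the conversation

def route(flag: Flag, now: datetime) -> str:
    """The agent sees the flag first; it moves up only once their window lapses."""
    if flag.resolved or flag.contested:
        return "closed_at_agent_level"   # never becomes a management item
    if now - flag.raised_at < AGENT_WINDOW:
        return "with_agent"              # visible only to the agent who took the call
    return "escalate"                    # unactioned and unchallenged: now it reports up
```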
2. The model is allowed to be wrong.
Language is ambiguous. "My son is starting university" may or may not be the opening of a meaningful financial conversation; context decides, and sometimes context is thin. Agents were given an unceremonious way to contest a flag, and aggregate results were adjusted for fairness. This turned the tool from a verdict into a conversation – which is what a coaching tool ought to be, and what a surveillance tool never is.
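The arithmetic of "adjusted for fairness" can be as plain as the sketch below. The function name and inputs are hypothetical; the hard part – deciding which disputes are upheld – is a human judgement and is not shown.

```python
def adjusted_miss_rate(flags_raised: int, flags_overturned: int, calls_reviewed: int) -> float:
    """Misses per call reviewed, with flags the agent successfully contested removed."""
    if calls_reviewed == 0:
        return 0.0
    return (flags_raised - flags_overturned) / calls_reviewed

# With made-up numbers: 1,680 flags over 4,200 calls, 120 overturned -> ~0.37
print(adjusted_miss_rate(1_680, 120, 4_200))
```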
3. The numbers travel upward on their own.
Individual coaching happened at the agent level. Trends – missed opportunities by theme, queue, week – rolled up into dashboards for the contact centre leader and thence into a short weekly summary that eventually found its way to executive leadership. This last step was not planned. It happened because the metric described something people wanted to know, and metrics of that sort have a habit of migrating upward whether anyone has authorised the migration. It is, in fact, a reliable test of whether a metric is any good: if it stays where you put it, it probably isn't one.
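The roll-up itself is unglamorous: a grouped count of upheld flags. Assuming, illustratively, a flat table with theme, queue, week, and contested columns, it is one pandas expression:

```python
import pandas as pd

def weekly_rollup(flags: pd.DataFrame) -> pd.DataFrame:
    """Upheld flags counted by theme, queue, and week - one row per dashboard cell."""
    upheld = flags[~flags["contested"]]
    return (
        upheld.groupby(["theme", "queue", "week"])
              .size()
              .reset_index(name="missed_opportunities")
    )
```

Nothing about the metric required new tooling, which is consistent with the thesis: this was never a technology problem.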
What Changed
At launch, the system flagged a missed opportunity or an incomplete service moment in roughly 40% of engagements.
This is not a number anyone expected. It is also not a number one would arrive at by listening to a hand-picked sample of calls and filling in a scorecard. It was the first time the organisation had seen service quality assessed across every conversation rather than a curated handful, and the gap between sample and reality turned out to be substantial.
Over the following six months, with agent-level feedback running continuously, the rate fell to under 5%, where it has jittered comfortably ever since. Some of that jitter is the model occasionally flagging something that wasn't really a miss; most of it is that agents got meaningfully better once they could see what they had been missing. Both explanations are flattering to somebody.
Meanwhile, the downstream business moved in ways that were harder to argue with:
Appointments booked, referrals made, and related conversions arising from contact centre engagements roughly doubled year on year.
Audit coverage went from a quarterly sample of a few dozen calls to 100% of calls and chats, at no additional human cost.
What began as an informal internal report became the contact centre's de facto service-quality metric. The shift toward being a formal one happened when executive leadership noticed the doubling in downstream business and concluded, reasonably enough, that anything driving that number deserved its own row on a dashboard. It is being operationalised as a departmental KPI next year.
Two months from "we have transcripts" to the first agent alerts. Six months from the first alerts to an informal executive metric. A year from start to a KPI in waiting. Most of that time was not technical.
The Thing That Is Easy to Miss
There is a reflex in modern organisations to treat new data as a technology opportunity – which dashboard, which warehouse, which vendor. It is almost never the right instinct. New data is, first, a question opportunity: it lets you ask things you previously could not.
The transcripts, sitting quietly on disk, were not themselves the insight. They were a permission slip. They made it possible, for the first time, to ask: what are we systematically not hearing? – and to get an answer that was neither flattering nor fatal, just honest.
The quarterly audit had told leadership that service was good. The audit was not wrong. It was answering a different question from the one that mattered. A sample of a few dozen calls confirms, with considerable reliability, whatever the organisation already believes about itself. An instrument that examines every conversation does not have that luxury – and is therefore harder to love, and more useful to have around.
Which is, in the end, the point about an auditor that cannot be tired out, cannot be flattered, and cannot be talked into a generous interpretation. It notices what is missing. It is easy to live with a contact centre where most of the calls go well. It is considerably harder to live with a contact centre where most of the calls go well and you know precisely which ones did not. The difference between those two states is almost entirely perceptual, and almost entirely the point.
100% call and chat coverage · Missed opportunities: ~40% → <5% · Downstream conversions roughly doubled year on year
Sitting on data you haven't found a use for yet?
Most data problems aren't storage problems. They're question problems. If your organisation is generating more transcripts, logs, tickets, or forms than it knows what to do with, a short conversation is usually enough to work out whether the answer is already inside โ and if so, how to get at it.
Book a process audit | Learn more: process automation | Contact us



