
VoC, CTQ and DPMO or Hallucinations?

  • James Markham
  • Jul 3
  • 2 min read

In early 2023, when we were building out FleetAI at Dentons, one of the incredibly innovative things we did was speak directly to the firm's clients :)


I know, right?


The early insight was invaluable and it highlighted that clients had essentially two concerns:


(1) Hallucinations - that the technology 'made stuff up'. That said, clients weren't overly worried, on the basis that the firm would check, and take responsibility for, the outputs before they hit the client's desk


(2) Data security - whilst less of an issue now, there was no way at the time for people to use ChatGPT whilst keeping data secure (esp. client data)


That Voice of the Customer (VoC) exercise enabled us to focus development and training efforts and deploy FleetAI, and the world orbited the sun a couple more times


Except I'm not sure, ~2 years later, that the world has really moved on as much as I thought it might have by now


When it comes to hallucinations, tech vendors are chasing ever-decreasing hallucination rates in synthetic benchmarks (notionally 0.7-2.2% on the April Vectara benchmarking)


But business generally (and legal specifically) still needs to step up and properly measure Critical to Quality (CTQ) defects, expressed as Defects per Million Opportunities (DPMO), within the wider business process
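For anyone who hasn't met DPMO before, the standard Six Sigma definition is worth writing out. The symbols below are just labels chosen for this sketch: D is the number of defects observed, U the number of units processed and O the number of defect opportunities per unit.

```latex
\[
\mathrm{DPMO} = \frac{D}{U \times O} \times 10^{6}
\]
% Illustration only, assuming each benchmark output counts as a
% single opportunity (U = 1{,}000 outputs, O = 1):
\[
0.7\% \;\Rightarrow\; \frac{7}{1000 \times 1} \times 10^{6} = 7{,}000
\qquad
2.2\% \;\Rightarrow\; \frac{22}{1000 \times 1} \times 10^{6} = 22{,}000
\]
```

On that (simplifying) assumption, the benchmark range quoted above sits at roughly 7,000 to 22,000 DPMO for the GenAI task alone, against the conventional Six Sigma target of 3.4 DPMO. Whether that matters depends on what counts as an opportunity in the wider process, which is the whole point.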


Setting GenAI hallucination rates within that broader measurement enables:


- assessment of where a hallucination is critical to quality, rather than incidental


- robust measurement of defects before and after the technology has been implemented at the business-process (not technology-task) level
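To make the process-level (rather than task-level) point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `StepTally` structure, the step names and all of the numbers are invented purely to exercise the arithmetic, and they are not drawn from FleetAI or any real process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepTally:
    """Hypothetical tally of CTQ defects for one step of a business process."""
    step: str
    units: int                       # items that passed through the step
    ctq_opportunities_per_unit: int  # ways a unit can fail a CTQ requirement
    ctq_defects: int                 # CTQ defects actually observed


def dpmo(defects: int, opportunities: int) -> float:
    """Defects per Million Opportunities."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return defects * 1_000_000 / opportunities


def process_dpmo(tallies: list[StepTally]) -> float:
    """DPMO across the whole process, not just one task within it."""
    defects = sum(t.ctq_defects for t in tallies)
    opportunities = sum(t.units * t.ctq_opportunities_per_unit for t in tallies)
    return dpmo(defects, opportunities)


# Invented numbers: the same two-step process before and after a GenAI rollout.
before = [StepTally("draft", 1_000, 5, 40), StepTally("review", 1_000, 2, 10)]
after = [StepTally("genai_draft", 1_000, 5, 25), StepTally("review", 1_000, 2, 18)]

print(f"before: {process_dpmo(before):.0f} DPMO")
print(f"after:  {process_dpmo(after):.0f} DPMO")
```

In these made-up figures the drafting step improves markedly while the review step gets worse, so the process-level gain is smaller than the task-level numbers alone would suggest. Only defects judged critical to quality are tallied; incidental ones are left out by design.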


By not doing this, we fail to:


(i) properly assess the impact of hallucination on business outcomes (see Klarna's call centre debacle, and the FRC's recent criticism of UK accounting firms not tracking the impact of AI tools on audit quality)


(ii) stop directing GenAI tools inappropriately at hallucination-sensitive processes without adequate compensating controls (e.g. the examples of hallucinated citations in court filings are more than a bit embarrassing at this stage)


(iii) spot more appropriate opportunities for the technology (e.g. compare the risks in "summarise the emails in my inbox" vs "prepare this court filing")


(iv) see the woods for the trees in measuring ever so finely the performance of one element (e.g. the GenAI component) of a process whilst ignoring the potentially much bigger impact on, and of, upstream and downstream tasks


If I were a betting man, I'd suggest it's unlikely that hallucination rates will reach (practically) zero


But rather than get hung up on them, we should recognise that the established Lean Six Sigma concepts of VoC, CTQ and DPMO are perfectly suited to implementing GenAI effectively within wider business processes


It's worth understanding these concepts, even if it is a rather heavy dose of acronyms...





© 2024. The Legal MBA is brought to you by Limitless Professional Limited, UK company 13850756. Suite 5, 5th Floor, City Reach, 5 Greenwich View Place, London, E14 9NN
