
Quality Beats Quantity in RCAs, But AI Lets You Have Both [Part 2]


Jan 15, 2026

RCA Quantity vs. Quality

In the first part of this blog, we established that RCA quality is usually the higher-leverage variable once you have baseline coverage on quantity, referencing empirical industry research to support the argument, and we laid out a practical definition of RCA quality. In this second part, we introduce a quantitative KPI to measure quality and show that, with AI support, you don't have to choose between quality and quantity. You can get the best of both worlds.

The practical operating model: tiered investigations plus quality scoring

If you want both learning coverage and rigor, a tiered system is still the most defensible structure:

  • Tier 1: serious injury and high-potential events
    Full cross-functional RCA, higher rigor, leadership review (AI co-pilot with human-in-the-loop; potentially use an RCA methodology like Haven’s AI-powered Multi-threaded analysis)

  • Tier 2: recordables and high-frequency events
    Structured investigation with streamlined causal analysis (also AI co-pilot with human-in-the-loop; potentially use an RCA methodology like Fishbone/6M, which is also AI-powered in Haven)

  • Tier 3: near misses and low-severity events
    Rapid learning reviews and trend capture (AI-automated to capture the learning with human-in-the-loop spot checking)

The key is adding a quality gate: a lightweight rubric or review standard that prevents Tier 1 and Tier 2 work from collapsing into surface-level narratives.

RCA Quality Score KPI

As we established in Part 1, industry research shows that the better-performing programs have stronger investigation capability, better methods, and stronger linkage from investigation to action.

If you want fewer incidents, the KPI is not the volume or rate of “RCAs completed.” That metric mostly tells you how fast your team can produce reports.

A prevention-grade RCA program behaves more like a conversion funnel:

Meaningful event → investigated at the right depth → produces strong controls → controls get implemented → controls are verified effective → recurrence drops

So the KPI set you want is one that measures conversion quality, not throughput.

  • Right Depth Coverage: percent of meaningful events investigated at the right tier

  • RCA Quality Score (RQS): distribution, median, percent below 55, percent above 85

  • Corrective Action Strength Index (CASI): scores corrective actions against the hierarchy of controls. Trend the percent of actions at the engineering or elimination level, and the percent of RCAs with at least one level-4 or level-5 action, taking into account their feasibility and potential impact (easier said than done; look for a future article on how Haven AI helps here).

  • Controls In Place On-Time: percent of controls in place by due date, plus time-to-control

  • Verification Effectiveness Rate: percent with verification plans, completed verification, and verified effective

  • Recurrence by Mechanism and Exposure: repeat event rate for targeted mechanisms normalized by exposure (leverage AI combined with mechanisms like Haven's Industry Knowledge Graph)
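As a rough illustration (not a Haven feature), the RQS distribution metrics listed above could be computed like this. The field name `rqs` and the sample scores are hypothetical; the thresholds (55 and 85) are the example cut points from this section:

```python
from statistics import median

def rqs_summary(scores):
    """Summarize an RCA Quality Score (RQS) distribution:
    median, percent below 55 (weak), percent above 85 (strong)."""
    n = len(scores)
    return {
        "median": median(scores),
        "pct_below_55": 100 * sum(s < 55 for s in scores) / n,
        "pct_above_85": 100 * sum(s > 85 for s in scores) / n,
    }

# Hypothetical monthly batch of RQS values for Tier 1/2 RCAs
scores = [42, 58, 61, 70, 72, 78, 81, 88, 90, 95]
print(rqs_summary(scores))
# → {'median': 75.0, 'pct_below_55': 10.0, 'pct_above_85': 30.0}
```

Trending these three numbers month over month tells you whether the program's floor is rising (fewer below 55) and whether its ceiling is being used (more above 85), which a simple average would hide.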

How AI helps: score at scale with human-in-the-loop governance

Once you define RQS and CASI, the next challenge is scale. Scoring every RCA manually can become yet another workload sink.

This is an ideal use case for AI as a “first-pass judge”:

  • AI does the initial scoring against your rubric:

    • evidence completeness signals

    • causal depth indicators (latent conditions, barrier analysis, human factors)

    • action strength classification (CASI)

    • presence and specificity of verification plans

  • Humans stay in control:

    • spot-check a statistically meaningful sample each month

    • automatically route low-scoring or high-risk RCAs for mandatory review

    • calibrate the rubric and AI scoring through periodic review sessions

A simple governance model looks like:

  • AI scores 100% of Tier 1 and Tier 2 RCAs

  • Humans review:

    • 100% of Tier 1 RCAs below a threshold (for example RQS <70)

    • all RCAs involving certain mechanisms (SIF precursors, critical control failures)

    • a random 10% to 20% sample for quality assurance and calibration
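The routing rules above can be sketched as a single predicate. The thresholds, mechanism labels, and sample rate are the example values from this section, not fixed product behavior:

```python
import random

# Example high-risk mechanism labels (hypothetical taxonomy)
HIGH_RISK_MECHANISMS = {"SIF precursor", "critical control failure"}

def needs_human_review(tier, rqs, mechanisms, sample_rate=0.15, rng=random.random):
    """Return True if an AI-scored RCA should be routed for human review."""
    if tier == 1 and rqs < 70:                      # low-scoring Tier 1
        return True
    if HIGH_RISK_MECHANISMS & set(mechanisms):      # high-risk mechanisms, any tier
        return True
    return rng() < sample_rate                      # random QA/calibration sample

# A strong Tier 2 RCA with no flagged mechanisms is reviewed only if it
# lands in the random sample (rng pinned here to make the example deterministic):
print(needs_human_review(2, 92, ["housekeeping"], rng=lambda: 0.9))
# → False
```

In practice the sample rate would be tuned so the monthly review set stays statistically meaningful, and disagreements between reviewers and AI scores would feed the calibration sessions mentioned above.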

This approach gives you two wins at once:

  • Consistency: every RCA is measured against the same standard

  • Efficiency: your experts spend time on coaching and high-risk reviews, not grading every document

And once scoring is consistent, you can finally manage the program like a prevention system:

  • raise coverage where it matters

  • raise the conversion rate from investigations to strong controls

  • and prove progress with metrics tied to recurrence.

Where AI fits: scaling both RCA quantity and quality

This is exactly the bottleneck that tools in the AI-powered RCA category (a category Haven is helping lead) are designed to address: modern industrial organizations need more learning coverage without sacrificing rigor, and they need more rigor without slowing to a crawl.

At a practical level, these tools map directly to the two constraints that create the quality-versus-quantity tradeoff: investigation friction (time) and investigation variability (inconsistency).

Increasing volume by removing investigation friction

AI-powered RCA tools increasingly support capabilities like:

  • guided, structured witness statements (instead of unstructured narratives)

  • automated evidence collection and organization (photos, documents, logs, policies)

  • timeline synthesis and event reconstruction

  • drafting standardized “what happened” summaries from source inputs

That matters because the biggest time sinks that limit RCA coverage are usually not the “thinking” steps; they are the document and data crunching steps:

  • chasing down incomplete narratives

  • manually building timelines

  • hunting for artifacts across systems

  • rewriting the same sections of the report every time

When you reduce that overhead, teams can investigate more events at the right tier, not just the top 1% to 2%, without immediately collapsing under workload.

Improving quality through consistency and reasoning support

The second constraint is variability. Even in mature programs, RCA quality can swing dramatically based on who ran it, how rushed they were, and whether the team had the right evidence on hand.

AI-powered RCA tools are increasingly positioned to improve:

  • investigation consistency (standardized questioning and completeness prompts)

  • causal reasoning rigor (surfacing latent contributors, barrier failures, and human factors)

  • linkage from causes to controls (ensuring corrective actions map to causal findings)

  • corrective action quality (nudging toward higher-order controls rather than training-only defaults)

  • verification planning (forcing explicit effectiveness checks instead of “close and move on”)

This matters because many “quality failures” are really process failures:

  • inconsistent questioning

  • missed contributing factors

  • weak linkage between causes and controls

  • corrective actions that default to training and reminders

  • lack of explicit effectiveness verification

A well-designed AI copilot like Haven makes it harder to skip steps, easier to compare to similar prior events, and easier to apply a consistent causal and control logic across investigators and sites.

Conclusion: with AI support, you do not need to choose between quantity and quality

Historically, safety leaders have been forced into a tradeoff:

  • Increase RCA volume and accept shallow, inconsistent investigations, or

  • Preserve RCA quality and accept minimal coverage.

With AI support, you do not need to choose one or the other.

AI efficiency gains increase the volume of investigations you can complete (and the percentage of incidents you can meaningfully learn from). AI reasoning, consistency, and data-crunching capabilities significantly improve quality by strengthening causal analysis, standardizing outputs, and tightening the linkage from causes to strong, verifiable corrective actions.

That is the opportunity: more learning, better learning, and a tighter prevention loop.


References and Further Readings

Industrial safety and learning-from-incidents:

  • Stemn et al., investigation report quality and safety performance (mining). (Springer)

  • Stemn et al., investigation maturity framework (mining). (MDPI)

  • Wachter & Yorio, accident investigation characteristics and safety performance (300+ establishments). (ResearchGate)

  • Jacobsson, Ek, & Akselsson, learning-from-incidents cycle assessment method (process industries). (ScienceDirect)

  • Drupsteen, Groeneweg, & Zwetsloot, bottlenecks in learning from incidents (evaluation stage). (PubMed)

Adjacent evidence on RCA outputs and sustainability:

  • Hibbert et al., recommendation strength and sustainability. (OUP Academic)

  • Martin-Delgado et al., systematic review on whether RCAs reduce recurrence. (PMC)

Haven Safety AI:

  • Haven platform overview. (Haven Safety)

  • Haven launch announcement and platform modules (havenSIGHT, havenEDGE, havenIMPACT). (Haven Launch)

See Haven in Action


Experience how AI-powered safety intelligence can transform your workplace. Book a demo to see our platform in action.