

Why Process Engineers Don’t Trust AI — and What Changes That

Moe Tanabian · April 25, 2026 · Intuigence AI

Accuracy scores don't build trust on a factory floor. This is a lesson that comes up repeatedly in conversations with process engineers who have encountered industrial AI systems — and it runs counter to most of how AI products are marketed. The implicit promise of a high-accuracy system is that you should trust it because it is usually right. The problem is that "usually right" is not a useful operating instruction for an engineer standing at a halted line with a shift supervisor asking for a restart time.

Trust in industrial AI is built through a different mechanism: explainability. The engineer does not need the AI to be right 95% of the time. They need to be able to verify whether the AI is right in this specific case, with this specific fault, in this specific production context. A system that shows its reasoning can be checked. A system that only shows its conclusion can only be trusted or dismissed — and experienced engineers will dismiss it until it has demonstrated its reasoning enough times to earn a baseline of credibility.

Moe Tanabian spent years working in process engineering roles at automotive and manufacturing operations before founding Intuigence. The trust problem was not academic — he encountered it from the inside, as the engineer on the floor. The question was never "is AI accurate enough to be useful?" The question was "can I check it fast enough to rely on it under production pressure?" That question shaped every design decision in how Intuigence AI surfaces its hypotheses and evidence.

What Engineers Don't Trust — and Why

In conversations with process engineering teams at Tier-1 plants in Michigan and Ohio, we consistently hear three categories of AI distrust, each with a distinct cause:

The black-box rejection: "It gave me a station and a percentage. I have no idea how it got there." This is the most common form of distrust, and it is entirely rational. An AI recommendation with no visible reasoning is indistinguishable from a random output to an engineer who cannot verify it. Even if the recommendation is correct, the engineer cannot know that without going to verify it manually — at which point, they have done the diagnostic work themselves and the AI has added a verification step, not removed a search step.

The one bad call: "It was wrong about the weld station last month and we chased it for an hour." A single high-confidence wrong recommendation can undo weeks of correct ones. Engineers are right to apply this asymmetry: a wrong recommendation costs an hour of misallocated diagnostic effort, while a correct one saves only a few minutes. The trust deficit from a wrong call accumulates faster than the trust credit from a correct one.

The general skepticism about AI in operational contexts: "This isn't a chatbot. I have 400 parts per hour on the line." This is a deeper orientation — a reasonable response to years of technology tools that were designed for office environments and deployed on factory floors without regard for the operational context. Engineers on Tier-1 lines have seen enough failed IT deployments to be skeptical of anything that arrives on a laptop with a PowerPoint deck.

Each of these trust deficits requires a different response. The black-box rejection requires transparency features. The bad-call asymmetry requires calibration honesty. The general skepticism requires earned credibility through consistent, verifiable performance over time.
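
The bad-call asymmetry can be made concrete with a little arithmetic. The sketch below assumes a correct recommendation saves about 5 minutes and a wrong one costs about 60. The 5-minute figure is an illustrative assumption (engineers describe it only as "a few minutes"), and `break_even_accuracy` is a hypothetical helper written for this example.

```python
def break_even_accuracy(minutes_saved_when_right: float,
                        minutes_lost_when_wrong: float) -> float:
    """Accuracy at which expected time saved equals expected time lost.

    Solves p * saved = (1 - p) * lost for p.
    """
    if minutes_saved_when_right <= 0 or minutes_lost_when_wrong < 0:
        raise ValueError("saved must be positive and lost must be non-negative")
    return minutes_lost_when_wrong / (
        minutes_saved_when_right + minutes_lost_when_wrong
    )


# Illustrative numbers only: 5 minutes saved per correct call (assumed),
# 60 minutes lost per wrong call (the "chased it for an hour" case).
p = break_even_accuracy(5, 60)
print(f"Break-even accuracy: {p:.1%}")
```

Under these assumed numbers, a recommendation that is followed without checking has to be right about 92% of the time just to break even on time. A recommendation that can be checked in a minute or two changes the math, because the cost of a wrong call shrinks toward the cost of the check.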

Transparency Feature 1: Fault Confidence Intervals

A confidence score such as "Station 14, 87% probability" is not a confidence interval. A confidence interval looks like "Station 14, probability range 74–91% based on 8 matching historical fault events." The difference matters to an engineer deciding how much investigation effort to allocate before verifying the AI's hypothesis.

A narrow confidence interval (e.g., 82–88%) means the evidence strongly and consistently points to the hypothesis. A wide interval (e.g., 51–79%) means the evidence is pointing in a direction but with meaningful uncertainty — perhaps because the fault pattern partially matches multiple known fault modes, or because the correlation chain involves a station the system has limited historical data on.

The engineer who sees a narrow interval goes to Station 14 expecting to confirm the hypothesis. The engineer who sees a wide interval goes to Station 14 planning to investigate with an open mind. These are different inspection strategies, and the confidence interval is the information that distinguishes them. A single number does not.
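
How an interval is computed depends on the model, and this article does not prescribe a method. As one minimal sketch, assume the confidence is simply the fraction of comparable historical events that matched the current pattern. A Wilson score interval then shows how the amount of history drives the width. The function name and the example counts are assumptions for illustration, and the numbers it prints are not meant to reproduce the example figures above.

```python
import math


def wilson_interval(matches: int, events: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if events <= 0 or not 0 <= matches <= events:
        raise ValueError("need events > 0 and 0 <= matches <= events")
    p_hat = matches / events
    denom = 1 + z**2 / events
    center = (p_hat + z**2 / (2 * events)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / events + z**2 / (4 * events**2)
    )
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Same observed match rate (75%), very different amounts of history.
for matches, events in [(6, 8), (60, 80)]:
    low, high = wilson_interval(matches, events)
    print(f"{matches}/{events} matches: {low:.0%} to {high:.0%}")
```

With only 8 events the interval is wide. With 80 events at the same match rate it is much narrower, which is the signal that tells the engineer whether to go confirm or go investigate.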

Transparency Feature 2: Trace Citation

Every hypothesis the system generates should cite the specific trace evidence that supports it. Not a summary — the actual tag names, the actual timestamps, the actual deviation values. This is the manufacturing equivalent of showing your work.

In practice, this means the engineer can read: "Hypothesis: Station 14 cylinder seal degradation. Evidence: ST14_CYL_A_POS_FB — overshoot pattern onset 14:22, deviation +12% from 30-day baseline. ENV_TEMP_ZONE3 — below 18°C threshold at 14:19. Pattern matches 6 of 8 historical seal degradation events on this cell type. Non-matching features: fault reset at 15:07 not consistent with prior events (5 of 6 prior events had no reset)."

That last sentence, noting where the current event does not match the historical pattern, is critical for trust-building. A system that cites supporting evidence without noting contradictory evidence is presenting a one-sided case for its hypothesis. An engineer who reads a note that flags an anomaly in the match will treat the recommendation more carefully. That is the correct calibration, and it builds trust faster than a system that only shows the supporting side.
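
A trace citation like the Station 14 example can be represented as a small record that keeps supporting and non-matching evidence side by side. This is a sketch under assumptions: the class and field names (`TraceEvidence`, `Hypothesis`, `non_matching`) are hypothetical and do not describe Intuigence's actual data model. The tag names and values are the ones from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class TraceEvidence:
    tag: str          # PLC/historian tag name
    observed_at: str  # timestamp as shown to the engineer, e.g. "14:22"
    detail: str       # what deviated, and by how much


@dataclass
class Hypothesis:
    station: str
    cause: str
    supporting: list[TraceEvidence] = field(default_factory=list)
    non_matching: list[str] = field(default_factory=list)

    def render(self) -> str:
        lines = [f"Hypothesis: {self.station} {self.cause}.", "Evidence:"]
        lines += [f"  {e.tag} @ {e.observed_at}: {e.detail}" for e in self.supporting]
        # Always print this section, even when empty, so "nothing
        # contradicts" is an explicit statement rather than an omission.
        lines.append("Non-matching features:")
        lines += [f"  {note}" for note in self.non_matching] or ["  none recorded"]
        return "\n".join(lines)


h = Hypothesis(
    station="Station 14",
    cause="cylinder seal degradation",
    supporting=[
        TraceEvidence("ST14_CYL_A_POS_FB", "14:22",
                      "overshoot onset, +12% vs 30-day baseline"),
        TraceEvidence("ENV_TEMP_ZONE3", "14:19", "below 18°C threshold"),
    ],
    non_matching=["fault reset at 15:07 (5 of 6 prior events had no reset)"],
)
print(h.render())
```

Making the non-matching section a required part of the rendered output, not an optional field, is the design choice that matters here: the engineer always sees whether contradicting evidence was looked for.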

Transparency Feature 3: Station-Level Evidence Panel

For a multi-station line, the engineer needs to see not just the top hypothesis but the full ranked candidate list with the evidence for each candidate. A top-ranked station at 87% confidence is more convincing when the engineer can also see that the second-ranked station is at 41% confidence — and can see why (the second-ranked station showed a deviation but its onset was 18 minutes after the downstream fault, which is more consistent with consequence than cause).

The station-level evidence panel is the mechanism that allows an engineer to do a rapid sanity check on the AI's reasoning without re-running the analysis themselves. They can scan the panel, see whether the rank ordering makes sense given what they know about the line, and identify any cases where their domain knowledge suggests the AI has missed a relevant factor.

This interaction pattern — engineer reads the AI's evidence, applies their own knowledge, confirms or adjusts — is the correct model for human-AI collaboration in fault diagnosis. The AI does the data correlation; the engineer does the contextual validation. Neither can do the other's job well. Both are necessary for the diagnosis to be reliably correct and reliably trusted.
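
The ranked panel described above can be sketched in a few lines. The code assumes each candidate carries a model confidence and the offset between its deviation onset and the fault time. `Candidate`, `evidence_panel`, the second station's name, and the Station 14 offset are hypothetical values chosen for illustration. Only the 87%, 41%, and 18-minute figures come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    station: str
    confidence: float        # 0..1, as reported by the model
    onset_offset_min: float  # deviation onset minus fault time, in minutes

    @property
    def note(self) -> str:
        # A deviation that starts after the fault is more likely a
        # consequence of it than its cause.
        if self.onset_offset_min > 0:
            return f"onset {self.onset_offset_min:g} min after fault: likely consequence"
        return f"onset {abs(self.onset_offset_min):g} min before fault: possible cause"


def evidence_panel(candidates: list[Candidate]) -> list[str]:
    """Return one display line per candidate, highest confidence first."""
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    return [
        f"{rank}. {c.station}  {c.confidence:.0%}  ({c.note})"
        for rank, c in enumerate(ranked, start=1)
    ]


panel = evidence_panel([
    Candidate("Station 9", 0.41, 18),    # hypothetical second-ranked station
    Candidate("Station 14", 0.87, -12),  # the -12 offset is assumed
])
print("\n".join(panel))
```

The timing note next to each score is what lets the engineer sanity-check the ordering at a glance, without re-running the correlation.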

The Accumulation Model of Trust

Process engineers on Tier-1 lines learn their environment through accumulation — repeated cycles of observation, prediction, and outcome verification. An engineer who has watched the same cell for two years knows from experience that "when Station 19's weld current drifts in the first hour of a Monday morning shift, the electrode tip is worn and will need replacement by noon." That knowledge was earned through dozens of confirmations of the pattern.

AI-assisted fault diagnosis earns trust through the same accumulation model, but the timescale is compressed if the system makes its reasoning visible. When an engineer can verify the AI's hypothesis in real time (go to the cited station, observe the cited deviation, confirm it matches the AI's description), each verification is a trust increment. Over weeks of consistent, verifiable recommendations, the baseline credibility builds to the point where the engineer's default posture is "probably right, will verify" rather than "probably wrong, will ignore."

We are not claiming this trust accumulation is guaranteed or automatic. Systems that have been wrong on high-confidence calls, or that have surfaced false evidence in their trace citations, will experience trust erosion that takes longer to reverse than the initial trust-building took to establish. The asymmetry of trust gain and trust loss in engineering contexts is real and should be respected in how AI systems are deployed. The right posture is to start with conservative confidence thresholds, show evidence for every recommendation, and let the track record build the credibility — not to promise accuracy scores upfront and let the first bad call undercut the deployment.
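
One way to read "conservative confidence thresholds" is to gate on the lower end of the confidence range and to require cited evidence before anything is shown as a recommendation. The sketch below is an assumption about how such a gate could look. The function name, the tier labels, and the 0.70 default are all hypothetical, not a documented product setting.

```python
def presentation_tier(lower_bound: float, evidence_count: int,
                      min_lower_bound: float = 0.70) -> str:
    """Decide how to present a hypothesis to the engineer."""
    if evidence_count == 0:
        return "withhold"        # nothing the engineer could check
    if lower_bound >= min_lower_bound:
        return "recommendation"  # even the pessimistic estimate clears the bar
    return "lead"                # shown, but flagged as uncertain


# Lower bounds taken from the narrow (82-88%) and wide (51-79%) examples.
print(presentation_tier(0.82, evidence_count=2))
print(presentation_tier(0.51, evidence_count=2))
```

Gating on the lower bound means a wide interval is presented as a lead to investigate even when its upper end looks strong, which keeps high-confidence labels reserved for cases most likely to survive verification.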

What Changes: Adoption Patterns

The adoption pattern for process engineers using AI-assisted fault diagnosis follows a recognizable arc. In the first two weeks, engineers verify every recommendation manually — they go to the recommended station, check the cited evidence, confirm or reject the hypothesis. The AI is functioning as a hypothesis generator, not a trusted advisor. This is healthy and expected.

By the end of the first month, if the recommendation accuracy on verifiable fault events has been strong and the evidence citations have been accurate, engineers begin allocating their investigation time differently. Instead of starting at the fault detection point and working upstream, they start at the AI's top-ranked station and work outward. The AI has earned the right to be the first place they look, not the last.

By the end of the third month, the interaction model has shifted: the engineer reviews the AI's evidence panel before going to the floor, uses it to decide what tools to bring and what to check first, and treats the AI's output as the shift briefing for a fault event rather than a separate verification task. The trace citation isn't something they check to validate the AI — it's the context they carry into the investigation.

IATF 16949 corrective action requirements are also better served by this model. The trace citation embedded in the AI's hypothesis becomes the documented evidence trail for the corrective action record — the connection between the fault event, the root cause analysis, and the work order that closes the event. The audit trail that quality management requires is a natural output of the transparency mechanism that engineers require for trust. Those two requirements, properly designed for, are the same requirement.

Trust in industrial AI is earned in verifications, not announcements. The engineers who are most skeptical of AI on factory floors are also the engineers who, given transparent evidence and honest confidence presentation, become the most productive users of it. Getting there requires building a system that treats verifiability as a core feature, not an afterthought.