We stand on the shoulders of giants. Now it is our turn.
"Every Cpk we compute, every control chart we plot, every FMEA we run — these are not bureaucratic checkboxes. They are acts of responsibility. Somewhere at the end of our supply chain is a person who will use what we make. They trust us, without knowing us, to have done the work properly."
Deming, Juran, Shewhart, Taguchi, Ishikawa — they spent lifetimes building the statistical and philosophical foundations of quality. Their tools are not old. They are permanent. Ours to use, teach, and pass forward.
This reference was built because quality knowledge should be accessible, precise, and free — not locked behind expensive textbooks or five-day seminars. Whether you are running a PFMEA at midnight, explaining Ppk to your manager, or diving deep into reliability analysis, this is for you.
Six Sigma & DPMO
Normal distribution, Z-values, DPMO formulas, the 1.5σ shift convention, battery anode moisture worked example, and Monte Carlo simulation. The statistical core of Six Sigma methodology.
Measurement System Analysis
S.W.I.P.E. error model, stability, bias & linearity, GR&R via X̄-R & ANOVA, torque wrench drift case study.
Quality Philosophy
Deming, Juran, Crosby, Ishikawa. PDCA, DMAIC, Lean frameworks, strategic planning, facilitation tools.
Quality Systems
ISO 9001 → IATF 16949 maturity, PPAP levels, special characteristics, 8D problem solving, escalation models, customer-specific requirements.
Statistical Process Control
Cp/Cpk/Ppk, chart selector decision tree, annotated out-of-control patterns, Western Electric rules.
DPMO & Capability Calculator
Enter LSL, USL, mean, sigma. Instantly compute DPMO, sigma level, Cpk, Cp, defect probability.
Reliability Engineering
MTBF/MTTR/Availability formulas, full bathtub curve SVG, Weibull β shape cards, series & parallel systems.
Statistical Distributions
Normal, Weibull, Exponential, Lognormal, Binomial, Chi-square, Poisson, t, F — formulas, properties, applications.
Military & Defense Standards
MIL-STD-1629A FMECA, MIL-HDBK-217F reliability prediction, ANSI Z1.4 sampling, AS9100D, AQAP-2110.
Applied Statistics
Hypothesis testing, confidence intervals, regression, ANOVA, chi-square — with quality engineering examples.
FMEA & RPN
DFMEA and PFMEA structure, S/O/D scales, live RPN calculator, action priority matrix.
Risk Management
ISO 31000 framework, risk matrix construction, bow-tie diagrams, failure mode prioritization.
Design of Experiments
Full factorial, fractional factorial, Taguchi orthogonal arrays, main effects, interaction plots, ANOVA.
Design for Six Sigma
DMADV roadmap, VOC to CTQ, concept selection, DOE optimisation, tolerance design, full worked example — from brief to production.
As we stand on the shoulders of giants, we have a responsibility to be better — to strive continuously for quality products reaching the customer. Every engineer carries the trust of the end user, someone they will never meet, who relies on the work being done properly.
Good engineering is not sufficient if competitors select their designs from better alternatives. The goal is to empower every quality engineer to make better decisions, ship better products, and uphold the responsibility we carry — to the customer, to the craft, and to those who came before us.
Six Sigma & DPMO
From normal distribution tails to defect probability — how sigma level, specification limits, and the 1.5σ long-term convention translate into real manufacturing quality targets.
Six Sigma Metrics Toolkit — DPU, DPO, DPMO, Yield & RTY
Before you can improve a process, you must be able to measure it precisely. Six Sigma uses a tightly connected family of metrics that scale from a single unit all the way to a million-opportunity benchmark. This tab gives you every formula, example, and visual you need.
① DPU — Defects Per Unit
The simplest defect metric — average number of defects found on each unit regardless of how many opportunities for failure each unit had. DPU of 0.15 means roughly 1 defect per 7 units.
Limitation: DPU ignores complexity. A complex PCB and a simple bracket both become "one unit." Use DPO for cross-process comparison.
② DPO — Defects Per Opportunity
Normalises the defect rate by the number of distinct ways a unit can fail. Enables fair comparison between processes of different complexity. An "opportunity" is any characteristic that could be measured and found defective.
Defining opportunities consistently is critical — too many opportunities dilute DPO; too few inflate it.
③ DPMO — Defects Per Million Opportunities
Scales DPO to a per-million basis, making tiny defect rates intuitive and industry-comparable. The Six Sigma world-class target is 3.4 DPMO — accounting for the 1.5σ long-term drift of a real process.
④ FPY — First Pass Yield
The percentage of units that complete a process step without any rework, repair, or scrap. FPY declining is often the first visible signal that hidden rework costs are accumulating. A plant can show high throughput but terrible FPY if rework is baked into the process.
⑤ RTY — Rolled Throughput Yield
RTY multiplies yields across all process steps. Even individually high-yield steps compound to a much lower overall throughput. This is the metric that exposes the true cumulative cost of a multi-step process and shows why Six Sigma targets perfection at each step.
Three steps each at ≥95% FPY combine to only 90.2% RTY. Nearly 1 in 10 units has a defect somewhere in the process. RTY forces the question: where is the quality loss occurring?
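The five metrics chain together directly. A minimal Python sketch of the chain (the unit, defect, and step-yield counts below are illustrative values, not figures from this reference):

```python
# Six Sigma unit metrics from raw counts — all input values are hypothetical.
units = 500             # units inspected
defects = 75            # total defects found across those units
opps_per_unit = 10      # defect opportunities defined per unit

dpu = defects / units                       # defects per unit (0.15 ≈ 1 defect per 7 units)
dpo = defects / (units * opps_per_unit)     # defects per opportunity
dpmo = dpo * 1_000_000                      # defects per million opportunities

# Rolled Throughput Yield: multiply first-pass yields across the process steps.
step_fpy = [0.98, 0.95, 0.97]
rty = 1.0
for fpy in step_fpy:
    rty *= fpy                              # 0.98 × 0.95 × 0.97 ≈ 0.903

print(f"DPU = {dpu:.3f}  DPO = {dpo:.4f}  DPMO = {dpmo:,.0f}  RTY = {rty:.1%}")
```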
⑥ Converting DPMO to Sigma Level (Z)
Sigma level (Z) is derived from DPO using the inverse normal CDF. Short-term Z always looks better — the 1.5σ shift accounts for long-term process drift. A process measured at 4.5σ from long-term data is reported as "6 Sigma" because the convention adds the 1.5σ shift back: 6σ short-term corresponds to 4.5σ long-term and 3.4 DPMO.
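A sketch of the conversion in Python, assuming SciPy is available (norm.ppf is the inverse normal CDF):

```python
from scipy.stats import norm

def dpmo_to_sigma(dpmo, shift=1.5):
    """Short-term sigma level reported for a long-term DPMO (one-sided convention)."""
    return norm.ppf(1 - dpmo / 1_000_000) + shift

def sigma_to_dpmo(sigma_st, shift=1.5):
    """Long-term DPMO implied by a short-term sigma level."""
    return (1 - norm.cdf(sigma_st - shift)) * 1_000_000

print(dpmo_to_sigma(3.4))      # ≈ 6.0
print(sigma_to_dpmo(6.0))      # ≈ 3.4
print(sigma_to_dpmo(4.5))      # ≈ 1,350  (4.5σ ST → 3.0σ LT)
```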
⑦ Quick Reference — Diagnostic Signals
More defects = lower process capability. Focus improvement on the highest DPMO step first.
Process spread is fine but the mean is off-center. Fix centering before reducing σ.
Compounded yield drop exposes hidden rework cost. Drill into which step has the lowest FPY.
Units leaving a step with defects silently inflate cost. FPY below 95% warrants immediate DMAIC attention.
How the Normal Distribution Creates DPMO
Every manufacturing process produces outputs that vary. When plotted, most processes follow a normal distribution — a symmetric bell curve where values cluster near the mean (µ) and tail off toward the extremes.
The specification limits define the acceptable range. Any output beyond LSL or USL is a defect. DPMO = the area of both red tails × 1,000,000.
Step-by-Step: µ and σ → DPMO
- 1. Standardize to Z
Z = (X − µ)/σ — converts any measurement to "how many standard deviations from the mean?" Z ~ N(0,1).
- 2. Find Z at each spec limit
Z_USL = (USL − µ)/σ and Z_LSL = (µ − LSL)/σ — the distance from the mean to each spec limit in σ units.
- 3. Compute both tail areas
p = [1 − Φ(Z_USL)] + [1 − Φ(Z_LSL)] — the red shaded areas on both sides of the curve.
- 4. Scale to DPMO
DPMO = p × 1,000,000. Each additional σ of margin cuts DPMO by one to three orders of magnitude.
DPMO is per opportunity. If one unit has 5 weld joints and each is one "opportunity," unit defect rate ≠ DPMO. Always define what "one opportunity" means before comparing across processes.
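The four steps above translate directly into code. A sketch assuming SciPy (the values plugged in are from the wall-thickness worked example further down):

```python
from scipy.stats import norm

def two_sided_dpmo(mu, sigma, lsl, usl):
    """Short-term DPMO of a normal process against two-sided spec limits."""
    z_usl = (usl - mu) / sigma            # σ-distance to the upper limit
    z_lsl = (mu - lsl) / sigma            # σ-distance to the lower limit
    p = (1 - norm.cdf(z_usl)) + (1 - norm.cdf(z_lsl))   # both tail areas
    return p * 1_000_000

print(two_sided_dpmo(2.500, 0.00833, 2.450, 2.550))     # ≈ 0.002 (centred)
print(two_sided_dpmo(2.5125, 0.00833, 2.450, 2.550))    # ≈ 3.4  (after +1.5σ drift)
```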
🔑 Key Definitions
DPMO
Defects Per Million Opportunities — normalizes defect rates for fair comparison across different process complexities.
Φ(z)
Standard normal CDF — cumulative area under the bell curve to the left of z. Tail = 1 − Φ(z).
Sigma Level (Z)
Distance from process mean to nearest spec in standard deviations. Higher = better quality.
True 6σ Centered
Two-sided DPMO ≈ 0.002. Roughly 1 defect per 507 million opportunities.
Plastic Housing Wall Thickness: 2.450 – 2.550 mm
A precision injection-moulded housing for an electronic sensor. The design team has set a tight wall-thickness specification to ensure structural integrity and correct fit. LSL = 2.450 mm, USL = 2.550 mm — a bilateral tolerance of ±0.050 mm. Your task: determine whether the current process is capable, and what happens when it drifts.
Production data from 200 parts shows: µ = 2.500 mm (centred), σ = 0.00833 mm. A process audit later reveals mean drift to µ = 2.5125 mm — a +1.5σ shift typical of long-term process behaviour.
Step A — Compute σ Required for True 6σ
σ = (USL − µ) / 6 = 0.050 / 6 = 0.00833 mm
Step B — Centred Process (Short-term, µ = 2.500 mm)
Z_USL = (2.550 − 2.500) / 0.00833 = 0.050 / 0.00833 = 6.000
Z_LSL = (2.500 − 2.450) / 0.00833 = 0.050 / 0.00833 = 6.000
DPMO = 0.002 | Sigma level = 6.0σ (ST)
Step C — After +1.5σ Drift (µ = 2.5125 mm)
Z_USL = (2.550 − 2.5125) / 0.00833 = 0.0375 / 0.00833 = 4.500
Z_LSL = (2.5125 − 2.450) / 0.00833 = 0.0625 / 0.00833 = 7.500
DPMO = 3.4 | Sigma level = 4.5σ (LT)
Step D — Capability Indices
Cp = (USL − LSL) / 6σ = 0.100 / 0.050 = 2.000
Cpk (centred) = min(Z_USL, Z_LSL) / 3 = 6.0 / 3 = 2.000
Cpk (drifted) = min(Z_USL, Z_LSL) / 3 = 4.5 / 3 = 1.500
Even a well-designed 6σ process accumulates drift over time. This is why Six Sigma reports two separate numbers: short-term Cp/Cpk (from a tightly controlled study) and long-term Ppk (from production data including all sources of variation). Always specify which you are reporting.
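The Step B–D figures can be checked with a few lines of Python (spec limits and σ taken from the example above):

```python
lsl, usl = 2.450, 2.550
sigma = 0.00833

def cp_cpk(mu):
    cp = (usl - lsl) / (6 * sigma)                 # potential: ignores centring
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)    # actual: distance to nearest limit
    return round(cp, 2), round(cpk, 2)

print(cp_cpk(2.500))     # centred: (2.0, 2.0)
print(cp_cpk(2.5125))    # drifted: (2.0, 1.5)
```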
📋 Process Summary
| Parameter | Value |
|---|---|
| Feature | Wall thickness |
| LSL | 2.450 mm |
| USL | 2.550 mm |
| µ (centred) | 2.500 mm |
| σ (at 6σ) | 0.00833 mm |
| Cp | 2.000 |
| Cpk (centred) | 2.000 |
| DPMO (centred) | 0.002 |
| µ after +1.5σ drift | 2.5125 mm |
| ZUSL (drifted) | 4.500 |
| Cpk (drifted) | 1.500 |
| DPMO (drifted) | 3.4 |
🔑 What This Tells You
- Cp = 2.0 — the tolerance window is twice what the process spread needs. Excellent potential.
- Cpk = 2.0 (centred) — the process is hitting its potential. World class.
- Cpk = 1.5 (drifted) — still very capable, but DPMO jumped from ~0 to 3.4.
- This is why control charts matter — to catch drift before it escalates.
The 1.5σ Shift — Why "3.4 DPMO at 6σ"?
The famous 3.4 DPMO figure comes from a single assumption: real-world processes drift by approximately 1.5σ over the long term due to tool wear, raw material shifts, and environmental changes.
Example — spec limits 0 to 500, σ = 41.667, mean drifted +1.5σ from 250 to 312.5:
Z_USL = (500 − 312.5) / 41.667 = 4.500 → p_USL ≈ 3.4×10⁻⁶
Z_LSL = (312.5 − 0) / 41.667 = 7.500 → p_LSL ≈ 3.186×10⁻¹⁴ (negligible)
DPMO ≈ 3.4
Cp vs Cpk — The Critical Distinction
Cp: Process potential. Ignores mean position. "Could it fit if centered?"
Cpk: Actual capability. Accounts for mean position. Cpk ≤ Cp always.
Ppk: Long-term performance. Includes all variation sources including drift.
Rule: Large Cp−Cpk gap = process is capable but off-center. Fix centering first before trying to reduce σ. If Cp ≥ 1.33 but Cpk < 1.33, the problem is mean position, not spread.
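For reference, the index definitions in standard notation (Cp and Cpk use the within-subgroup σ estimate; Ppk uses the overall sample standard deviation):

```latex
C_p = \frac{USL - LSL}{6\hat{\sigma}_{within}}
\qquad
C_{pk} = \min\!\left(\frac{USL - \bar{x}}{3\hat{\sigma}_{within}},\ \frac{\bar{x} - LSL}{3\hat{\sigma}_{within}}\right)
\qquad
P_{pk} = \min\!\left(\frac{USL - \bar{x}}{3\hat{\sigma}_{overall}},\ \frac{\bar{x} - LSL}{3\hat{\sigma}_{overall}}\right)
```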
⚖️ ST vs LT Sigma
| ST Z | LT Z | LT DPMO |
|---|---|---|
| 3σ | 1.5σ | 66,807 |
| 4σ | 2.5σ | 6,210 |
| 5σ | 3.5σ | 233 |
| 6σ | 4.5σ | 3.4 |
Sigma Level ↔ DPMO Reference
The sigma-DPMO relationship is exponential — each additional sigma level cuts DPMO by one to three orders of magnitude. The visual below makes this concrete.
| Sigma (Z) | 1-sided DPMO | 2-sided DPMO | LT DPMO (+1.5σ) | Defect % | Yield % |
|---|---|---|---|---|---|
| 1σ | 158,655 | 317,311 | 697,672 | 31.73% | 68.27% |
| 2σ | 22,750 | 45,500 | 308,537 | 4.55% | 95.45% |
| 3σ | 1,350 | 2,700 | 66,807 | 0.270% | 99.73% |
| 4σ | 31.67 | 63.34 | 6,210 | 0.0063% | 99.9937% |
| 5σ | 0.287 | 0.573 | 233 | 0.000057% | 99.99994% |
| 6σ | 0.000987 | 0.00197 | 3.4 | 2.0×10⁻⁷% | 99.9999998% |
| 7σ | 1.28e-6 | 2.56e-6 | 0.019 | ~0 | ~100% |
Monte Carlo Simulation
Monte Carlo generates thousands of random N(µ,σ) samples and counts how often they fall outside spec limits. It validates analytical DPMO and teaches tail probability concepts visually — especially useful for non-normal processes.
Simulation Results (N = 400,000)
| Case | µ | Defects (N=400K) | Est. DPMO | Analytical |
|---|---|---|---|---|
| Centered 6σ | 250 | 0 | 0.000 | 0.00197 |
| +1.5σ Shifted | 312.5 | 2 | 5.000 | 3.398 |
Zero defects in 400,000 samples at true 6σ is correct, not a bug. You'd need ~500 million samples to reliably observe a single 6σ defect. For extreme sigma levels, analytical methods are far more practical than simulation.
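A minimal NumPy sketch of the simulation; the spec limits 0 and 500 are inferred from the µ = 250 / 312.5 and σ = 41.667 figures used in the 1.5σ-shift example, and the defect counts vary from run to run:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 400_000
lsl, usl = 0.0, 500.0        # assumed spec limits consistent with µ = 250, σ = 41.667
sigma = 41.667

for label, mu in [("Centred 6σ", 250.0), ("+1.5σ shifted", 312.5)]:
    samples = rng.normal(mu, sigma, N)
    defects = np.count_nonzero((samples < lsl) | (samples > usl))
    print(f"{label}: {defects} defects → estimated DPMO = {defects / N * 1e6:.3f}")
```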
When Simulation Beats Analytical Methods
- When the process distribution is non-normal (skewed, bimodal, truncated)
- When multiple interacting dimensions or GD&T stackups are involved
- When teaching the effect of mean shift, σ reduction, or spec change visually
🎲 Required Sample Size
| Sigma | Need N ≥ |
|---|---|
| 3σ | 37,000 |
| 4σ | 1.6M |
| 5σ | 175M |
| 6σ | 507M |
Rule of thumb: N ≥ 10/p for reliable estimation. Use analytical at 5σ+.
DMAIC — The Five-Phase Process Improvement Roadmap
DMAIC is the backbone of every Six Sigma project. It takes a problem through five sequential phases — each with specific tools and deliverables — to arrive at a sustainable solution that eliminates root cause rather than treating symptoms.
Starts with COPQ/Pareto analysis to identify and prioritise the problem. SIPOC diagram scopes the project boundaries (7–8 key process steps). Ends with a signed charter containing problem statement, goal, scope, estimated savings, team, and timeline.
A Y=F(X) process map identifies all inputs and outputs. FMEA quantifies risk by RPN. Gage R&R validates measurement before collecting capability data. The phase ends with a confirmed baseline sigma level (Cpk) and an accepted measurement system.
Hypothesis tests (t-test, ANOVA) compare means between conditions. Correlation and regression reveal input-output relationships. 5-Whys and Ishikawa structure the cause-and-effect thinking. The phase ends with statistically validated root causes.
Design of Experiments (DOE) maps the relationship between input factors and output responses, finding optimal operating conditions. Solutions are piloted before full rollout. The Improve phase ends with a statistically significant improvement in the baseline metric.
SPC charts monitor the improved process in real-time. A Control Plan documents what to measure, how often, and what action to take on signals. Updated FMEA, process maps, and SOPs transfer ownership back to the process team. Project savings are calculated and reported.
DMAIC is not always needed. If a problem already has a known solution and action plan, it is an implementation project — just execute the plan. DMAIC is reserved for problems where the root cause is genuinely unknown.
Splitting the DMAIC — Four Focused Paths to Improvement
Full DMAIC training covers dozens of tools across five phases. Research into successful projects shows that four common paths account for the vast majority of real improvements. Each path has a clear objective, a targeted tool set, and a repeatable sequence. Matching the right path to the right problem dramatically increases success rate.
Goal: Achieve stable, predictable, capable output (Cpk ≥ 1.33)
This is the heart of classic Six Sigma — SPC was its original tool. Key insight: don't start with the control chart. First validate the measurement system, then characterise the process, then chart it. Starting with charts on an unvalidated measurement system is a very common and costly mistake.
Goal: Increase machine/process uptime and throughput
Targets machine breakdowns and availability losses. Asset Utilization (AU) waterfall charts identify the top loss categories. Component matrices link failure modes to parts. Weibull analysis predicts failure timing and drives condition-based maintenance strategy.
Goal: Eliminate the 8 wastes — TIMWOOD + Skills
Value Stream Mapping reveals waste across the flow. 5S eliminates inventory and motion waste. Kanban controls overproduction. QFD aligns specs to customer need — often revealing specs that are unnecessarily tight (over-processing waste). The Lean path is the newest and most popular, but it has pitfalls that can trap unwary teams if used for the wrong problem type.
Goal: Drive defect frequency to zero
Defects are things that shouldn't be there at all — unlike variability, there is no optimal level other than zero. This is the widest, most common DMAIC path. Its tools are simple and accessible to anyone at any belt level: Pareto to prioritise, Fishbone/5-Whys to find cause, Poka-Yoke to prevent recurrence, and standardisation to sustain.
Path selection principle (Quick, 2019): Management sets the goal and links it to KPIs. Teams never choose their own projects — projects without management linkage lose resources to crises. The need should drive the method, just as form follows function.
COPQ & Project Selection — Linking Six Sigma to Business Results
Every Six Sigma project must be tied to real business cost — otherwise it competes with day-to-day operations and loses. The Cost of Poor Quality (COPQ) framework ensures projects are prioritised by financial impact, not by seniority or gut feeling.
Six Sigma projects must be identified from internal failure and external failure categories first — these directly impact bottom-line results. Prevention spending typically returns 3–5× its cost by reducing the failure categories. "Gating the defect" — catching quality issues in-house before they reach the customer — is a fundamental discipline.
Multi-Level Pareto — Drilling to Project Scope
A single Pareto identifies the biggest problem category. A second-level Pareto drills into that category. If a problem appears in the top 3 at both levels — by frequency and cost — it is the ideal project candidate.
Project Charter — The Contract Between Team and Management
- ✓ Problem statement (what, where, when, magnitude)
- ✓ Measurable goal with deadline
- ✓ Scope — start & end point, in/out of scope
- ✓ Team members & roles
- ✓ Estimated savings from COPQ analysis
- ✓ Timeline with phase gate milestones
- ✓ Management signature (resource commitment)
- ✗ Choosing the hardest problem (years-old issue)
- ✗ Selecting an already-approved capital project
- ✗ No link to financial impact or KPIs
- ✗ Team members choose their own projects
- ✗ Scope too broad — "reduce all defects"
- ✗ No management sign-off or resource commitment
- ✗ Renaming existing firefighting as a DMAIC project
Selection rule (Shankar, ASQ 2009): Start from external failure costs, then internal failure costs. Problems with the highest combined frequency and cost across multiple Pareto levels are the ideal candidates. The data dictates priority — not management preference or the loudest voice in the room.
Measurement System Analysis
Before trusting process data, trust your measurement system. MSA quantifies how much observed variation is process — and how much is just the gauge. Every PPAP, every SPC chart, every capability index depends on getting this right first.
MSA Variation Taxonomy — The Complete Tree
Every measurement you take contains two fundamentally different kinds of variation. Understanding their structure is the foundation of all MSA work. The tree below shows the complete decomposition — from total observed variation down to each individual error source.
Accuracy vs Precision — The Core Distinction
Accuracy: How close measurements are to the true reference value. Accuracy errors are consistent — they shift every reading in the same direction. A perfectly precise gauge can still be completely inaccurate.
Precision: How close repeated measurements are to each other. Precision errors are random — they scatter results around some central value. High precision doesn't guarantee accuracy; a precise gauge can be precisely wrong.
The Five MSA Error Components Explained
The systematic offset from true value
Bias is the difference between the observed average measurement and the reference/true value for the same part. A gauge with positive bias reads high consistently; negative bias reads low. It is measured by comparing the gauge average against a known reference standard (master part).
Cause: Worn gauge, incorrect calibration, wrong reference standard, elastic deformation of gauge or part.
Bias that changes across the measurement range
Linearity asks: "Is bias the same at low values as at high values?" A gauge may read accurately near 5mm but overread near 25mm. Linearity is assessed by measuring multiple reference parts spread across the full operating range and plotting bias vs. reference value. The slope of the regression line is the linearity error.
Cause: Gauge not calibrated across full range, non-linear amplifier response, mechanical wear concentrated at one end of travel.
Bias drift over time
Stability (also called drift) measures whether the gauge's accuracy changes over time. A stable gauge produces the same average reading on a reference part measured today, next week, and next month. It is assessed by measuring a master part periodically and charting the averages on an Individuals (XmR) control chart. An out-of-control point signals a stability problem.
Cause: Thermal drift, electrical component aging, mechanical wear, contamination, re-calibration interval too long.
Within-operator scatter — Equipment Variation
Repeatability is the variation obtained when one operator measures the same part multiple times under the same conditions. It represents the fundamental noise floor of the instrument — the best the gauge can possibly do. AIAG calls this Equipment Variation (EV). Even with a perfect operator technique, a poor gauge yields high repeatability error.
Reduces with: Gauge overhaul, reducing environmental noise, better fixturing, increased resolution. This is the component that can ONLY be improved by instrument upgrade.
Between-operator scatter — Appraiser Variation
Reproducibility is the variation in measurement averages obtained by different operators measuring the same part with the same gauge. It captures differences in technique, fixture loading, data reading habits, and environmental sensitivity. AIAG calls this Appraiser Variation (AV). High AV tells you training and procedure standardisation is the priority — not a new gauge.
Reduces with: Operator training, written measurement procedures (SOP), better fixtures, fixture gauging to remove human positioning variation.
AIAG Priority Rule: Always resolve the accuracy problems (stability, bias, and linearity) before running a GR&R study. A biased or drifting gauge will corrupt your GR&R data. Recalibrate first, then study precision.
Three Methods to Quantify GR&R (Precision)
See the GR&R — 3 Methods and EMP Method tabs for full worked examples using the AIAG 4th Edition reference dataset (3 operators × 10 parts × 3 trials).
The Fundamental MSA Equation
Every measurement you take is the sum of two things: what the process actually produced, and the noise your gauge added. MSA separates them.
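Stated as an equation, this is the decomposition the rest of the MSA material relies on:

```latex
\sigma^{2}_{\text{observed}} \;=\; \sigma^{2}_{\text{process}} \;+\; \sigma^{2}_{\text{measurement system}},
\qquad
\%GRR \;=\; 100\cdot\frac{\sigma_{\text{measurement system}}}{\sigma_{\text{observed (TV)}}}
```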
S.W.I.P.E. — The Five Error Sources (AIAG 4th Ed.)
AIAG Mandatory Sequence — Never Skip
Discrimination — The 10-to-1 Rule
The 4th Edition updated this rule: instrument discrimination must be at most 1/10 of the process variation (σ × 6), not 1/10 of the tolerance. This reflects the philosophy of process-focused quality — the process, not the spec, drives measurement requirements.
| ndc | Ability | Use case |
|---|---|---|
| 1 | Go/no-go only | Cannot distinguish values. Control only if large Cp and flat loss function. |
| 2–4 | Coarse estimation | Semi-variable control only. Cannot reliably estimate process parameters. |
| ≥ 5 | Adequate | Can be used with variables control charts. AIAG minimum requirement. |
| ≥ 10 | Excellent | Full analytical resolution. No discrimination concerns. |
Deming's Funnel / Tampering Warning (AIAG 4th Ed. Ch. I-B): A measurement system with large variation causes operators to adjust processes that don't need adjustment. Autocompensation that adjusts by the last result (Rule 2) adds variation — the exact opposite of its intent. Never adjust a stable process based on a single measurement.
🔑 Key Definitions (AIAG 4th Ed.)
Bias
Difference between observed average and reference value. Systematic error. Assessed by t-test: H₀: bias=0 at α=0.05.
Stability (Drift)
Change in bias over time. Tracked with X̄&R control charts on a reference part. Must be confirmed FIRST.
Linearity
Change in bias over the operating range. Regression: slope=0 (H₀) tested at α=0.05. 5 parts covering full range.
Repeatability (EV)
One appraiser, same part, same gage. Equipment Variation. Within-system error.
Reproducibility (AV)
Different appraisers, same gage, same part. Appraiser Variation. Between-system error.
GR&R
GRR² = EV² + AV². The combined measurement system capability estimate.
Measurement Uncertainty
Different from MSA. MSA = understand sources. Uncertainty = range expected to contain true value. True = Observed ± U.
Bias Study — Independent Sample Method
Tests H₀: bias = 0. The calculated average bias is evaluated to determine if it could be due to random sampling variation — or if there is a true systematic offset that needs recalibration.
Step-by-Step Procedure
- 1. Establish Reference Value
Send part to metrology lab or measure n≥10 times with higher-order instrument. Average = reference value. Choose a part near mid-range of production variation.
- 2. Collect Measurements
Measure the same part n≥10 times under normal conditions by the lead operator.
- 3. Check Repeatability First
%EV = 100[σ_r / TV]. If %EV is large, fix repeatability before continuing — bias test assumes acceptable repeatability.
- 4. Compute t-statistic
t = bias / σ_b where σ_b = σ_r / √n. Reject H₀ if |t| > t(α/2, n−1). Default α = 0.05.
- 5. Check CI Contains Zero
bias ± t(0.025, n−1) × σ_b. If zero is within CI → bias is acceptable.
Worked Example — AIAG MSA 4th Ed. (p.90–91)
Reference value = 6.00. n = 15 readings by lead operator. Expected process variation (σ) = 2.5.
Deviations from the reference (observed − 6.00): −0.2, −0.3, −0.1, −0.1, 0.0, +0.1, 0.0, +0.1, +0.4, +0.3, 0.0, +0.1, +0.2, −0.4, 0.0
x̄ = 6.0067 (15 readings)
Bias = 6.0067 − 6.00 = +0.0067
t = Bias / (σ_r / √n) = 0.12
tcrit(14 df, α=0.05) = 2.145
|t| < tcrit → Bias is not significant
Result from AIAG 4th Ed.: The bias is statistically acceptable. Zero falls within the 95% CI of (−0.1107, +0.1241). The measurement system can proceed to GR&R study.
Common Causes of Non-Zero Bias
- Instrument needs calibration (most common)
- Worn instrument, equipment, or fixture
- Worn or damaged master; error in master
- Instrument made to wrong dimension
- Instrument measuring the wrong characteristic
- Instrument correction algorithm incorrect
📋 Bias Study Summary
| Parameter | Value |
|---|---|
| Reference Value | 6.000 |
| X̄ (15 readings) | 6.0067 |
| Bias | +0.0067 |
| σ_r (repeatability) | 0.2120 |
| %EV | 8.5% |
| t_stat | 0.122 |
| t_critical (α=0.05) | 2.145 |
| 95% CI lower | −0.1107 |
| 95% CI upper | +0.1241 |
| Zero in CI? | YES ✓ |
| Verdict | ACCEPTABLE |
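A short Python sketch reproducing the bias study from the deviations listed above (SciPy assumed; the repeatability estimate here is the plain sample standard deviation, which matches the 0.2120 quoted in the summary):

```python
import numpy as np
from scipy import stats

# Deviations from the 6.00 reference (observed − reference), from the worked example above.
dev = np.array([-0.2, -0.3, -0.1, -0.1, 0.0, 0.1, 0.0, 0.1,
                 0.4,  0.3,  0.0, 0.1, 0.2, -0.4, 0.0])

bias = dev.mean()                         # ≈ +0.0067
sigma_r = dev.std(ddof=1)                 # ≈ 0.2120
sigma_b = sigma_r / np.sqrt(len(dev))     # standard error of the bias
t_stat = bias / sigma_b                   # ≈ 0.12
t_crit = stats.t.ppf(0.975, df=len(dev) - 1)   # 2.145 (14 df, α = 0.05 two-sided)

lo, hi = bias - t_crit * sigma_b, bias + t_crit * sigma_b   # ≈ (−0.111, +0.124)
print(f"bias = {bias:+.4f}, t = {t_stat:.3f}, 95% CI = ({lo:.4f}, {hi:.4f})")
print("ACCEPTABLE — zero inside CI" if lo <= 0 <= hi else "SIGNIFICANT BIAS")
```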
Linearity Study
Linearity = bias that changes with the size of the part being measured. A gage may be perfectly accurate at one point in its range and badly biased at another. Tests if the slope of bias vs. reference value equals zero.
How to Conduct (AIAG 4th Ed.)
- 1. Select 5 Parts Across Full Range
Choose g ≥ 5 parts whose measurements, due to process variation, cover the full operating range of the gage.
- 2. Establish Reference Values
Have each part measured by layout inspection. Confirm the gage's operating range is fully covered.
- 3. Measure m ≥ 10 Times Each
One operator, same gage, random order (to prevent recall bias).
- 4. Regression Analysis
Fit bias = a × reference + b. Test H₀: a=0 (no linearity) AND H₀: b=0 (no constant bias). Both must pass.
Worked Example — AIAG MSA 4th Ed. (Table III-B 4)
| Part | Ref. Value | Avg Bias | Verdict |
|---|---|---|---|
| 1 | 2.00 | +0.507 | Large positive bias |
| 2 | 4.00 | +0.144 | Moderate bias |
| 3 | 6.00 | +0.083 | Near zero |
| 4 | 8.00 | −0.300 | Negative bias |
| 5 | 10.00 | −0.614 | Large negative bias |
a (slope) = −0.1429
b (intercept) = 0.8373
→ Linearity is significant
R² = 0.3266
AIAG conclusion: This measurement system has a linearity problem. The bias starts large and positive at small part sizes and switches to large negative at large sizes. The gage must be recalibrated across its full range before use. Cannot be used for product/process analysis in this condition.
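A sketch of the regression test with SciPy. Note that the AIAG study regresses all individual bias readings; fitting only the five part-average biases from the table above (as done here for brevity) gives slightly different coefficients than the quoted values:

```python
import numpy as np
from scipy import stats

ref = np.array([2.0, 4.0, 6.0, 8.0, 10.0])                  # reference values
avg_bias = np.array([0.507, 0.144, 0.083, -0.300, -0.614])  # average bias per part

fit = stats.linregress(ref, avg_bias)
print(f"slope a = {fit.slope:.4f}, intercept b = {fit.intercept:.4f}, p(slope=0) = {fit.pvalue:.4f}")
# A slope significantly different from zero indicates a linearity problem across the gage range.
```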
Graphical Pass/Fail Rule
Plot bias vs reference value with best-fit line and confidence bands. For linearity to be acceptable, the "bias = 0" horizontal line must lie entirely within the confidence bands of the fitted regression line. If the zero line exits the bands at any point — linearity problem exists regardless of numerical results.
📊 Linearity vs Bias at a Glance
Constant bias can be corrected by recalibration. A linearity error requires hardware or software modification across the full operating range.
Stability Study — Change in Bias Over Time
A stable gauge gives the same bias today as it did last month. Stability must be confirmed with X̄&R control charts on a reference part before any GR&R study begins — an unstable system produces meaningless GR&R results.
Procedure
- 1. Select Reference Part
Near mid-range of production variation. Establish reference value from lab/higher-order system. May want masters at low, mid, and high range — separate charts for each.
- 2. Periodic Measurement
Measure the reference part n=3–5 times per period. Weekly or daily, depending on expected drift rate. Plan ≥20 subgroups before final assessment.
- 3. X̄&R Control Charts
Plot and analyze. Look for: trends, shifts, out-of-control signals, cycles. No specific %Stability index — analysis is through control chart interpretation.
- 4. Pass / Fail
Stable = no OOC signals, no trends. Unstable = any OOC signal, trend, or systematic drift. Do not proceed to GR&R until stable.
AIAG Example — Stability Study Data (Figure III-B 1)
Control limits: UCLx̄ = 6.11 | LCLx̄ = 5.72
UCLR = 0.73 | LCLR = 0
All points within limits → measurement system stable
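A generic sketch of the X̄-R limit calculation behind a stability chart, using subgroup size 5 and standard Shewhart constants; the reference-part readings below are hypothetical, not the data behind the AIAG figure:

```python
import numpy as np

# Hypothetical periodic checks of one reference part, 5 readings per period.
subgroups = np.array([
    [5.9, 6.0, 6.1, 5.9, 6.0],
    [5.8, 6.0, 6.0, 6.1, 5.9],
    [6.0, 6.1, 5.9, 6.0, 6.0],
    [5.9, 5.9, 6.1, 6.0, 6.1],
])
A2, D3, D4 = 0.577, 0.0, 2.114        # control chart constants for subgroup size n = 5

xbar = subgroups.mean(axis=1)
r = subgroups.max(axis=1) - subgroups.min(axis=1)
xbarbar, rbar = xbar.mean(), r.mean()

print(f"X̄ chart: UCL = {xbarbar + A2 * rbar:.3f}, LCL = {xbarbar - A2 * rbar:.3f}")
print(f"R chart:  UCL = {D4 * rbar:.3f}, LCL = {D3 * rbar:.3f}")
# Stable = no out-of-control points, trends, or shifts on either chart across ≥20 subgroups.
```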
Stability vs Other MSA Properties
| Property | Varies with | Study design | Chart type | Order |
|---|---|---|---|---|
| Stability | TIME | Same part, time changes | X̄&R over time | ① First |
| Linearity | RANGE | Different parts, same time | Regression plot | ② Second |
| Bias | — | Same part, single session | Histogram + CI | ③ Third |
| GR&R | — | Multiple parts, appraisers, trials | X̄&R or ANOVA | ④ Last |
No specific %Stability threshold exists in AIAG 4th Ed. The manual explicitly states: "Other than normal control chart analyses, there is no specific numerical analysis or index for stability." Pass/fail is entirely based on control chart interpretation. The torque wrench example from our other module uses a calculated percentage — that is a customer-specific metric, not an AIAG standard.
🔧 Why Stability First?
If the bias is changing over time while you conduct a GR&R study, your results are meaningless. The study will reflect a snapshot of a moving target — not the true long-term measurement system capability.
GR&R on an unstable system = wasted effort. Calibrate, investigate, and restore stability first.
- Wear in measurement equipment
- Damaged or worn standard/master
- Temperature / humidity cycling
- Electronic drift in sensors
- Spring fatigue (torque wrenches)
- Contamination or lubricant buildup
GR&R Study Methods — X̄-R Method
The X̄-R method (Average and Range) is the automotive industry standard: 3 appraisers × 10 parts × 2–3 trials, randomised order. Cannot detect appraiser-by-part interaction, but well understood and widely accepted for PPAP.
Complete AIAG Example (Table III-B 15/16)
Appraisers F = 0.424 (p = 0.661)
Interaction F = 0.434 (p = 0.850)
σ²reproducibility = 0.00456
σ²GRR = 0.04463
σ²parts = 0.17020
GRR Acceptance Zones
Three Accepted Methods
Uses ranges from pairs of measurements. Provides only combined GRR — cannot separate EV from AV. Not acceptable for PPAP submission. Used for quick initial screening to see if a formal study is warranted.
GRR = R̄ / d₂* where d₂* depends on sample size and number of subgroups.
3 appraisers × 10 parts × 2–3 trials, random order. Uses control chart constants K1, K2, K3 to separate EV and AV. Cannot estimate appraiser-by-part interaction. Most common in PPAP packages.
Most statistically powerful. Handles any experimental setup. Detects appraiser-by-part interaction — a source X̄-R method misses. Decomposes: Parts, Appraisers, Interaction, Equipment. AIAG recommends this method when a computer is available.
What GRR Diagnostics Tell You
| Finding | Root Cause | Action |
|---|---|---|
| EV large vs AV | Instrument problem | Maintenance, redesign, fix clamping |
| AV large vs EV | Appraiser technique differs | Retrain, clarify procedure, add fixture |
| Interaction significant | Appraisers handle parts differently | Standardise measurement procedure |
| ndc = 1 or 2 | Poor discrimination | Upgrade gauge resolution |
📋 Study Results (AIAG Example)
| Source | StdDev | %TV |
|---|---|---|
| EV (Repeat.) | 0.202 | 17.6% |
| AV (Reprod.) | 0.230 | 20.1% |
| GRR Total | 0.306 | 26.7% |
| PV (Parts) | 1.104 | 96.4% |
| TV | 1.146 | 100% |
| ndc | ≈ 5 (borderline) | |
At 26.7% GRR, this system is in the "may be acceptable" zone. AIAG says decision should be based on application importance and cost.
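A sketch of the Average & Range arithmetic behind these results. The intermediate statistics (average subgroup range, range of appraiser averages, range of part averages) were computed from the AIAG reference dataset shown in the EMP tab below; the K constants are the AIAG 4th Ed. values for 3 trials, 3 appraisers, and 10 parts:

```python
import math

r_bar  = 0.3417    # average range of the 30 appraiser×part subgroups
x_diff = 0.4446    # range of the three appraiser averages
r_p    = 3.511     # range of the ten part averages
K1, K2, K3 = 0.5908, 0.5231, 0.3146
n_parts, n_trials = 10, 3

EV  = r_bar * K1                                                        # repeatability
AV  = math.sqrt(max((x_diff * K2) ** 2 - EV ** 2 / (n_parts * n_trials), 0.0))
GRR = math.sqrt(EV ** 2 + AV ** 2)
PV  = r_p * K3                                                          # part-to-part
TV  = math.sqrt(GRR ** 2 + PV ** 2)
ndc = int(1.41 * PV / GRR)                                              # distinct categories

print(f"EV={EV:.3f} AV={AV:.3f} GRR={GRR:.3f} PV={PV:.3f} TV={TV:.3f}")
print(f"%GRR = {100 * GRR / TV:.1f}%  ndc = {ndc}")    # ≈ 26.7%, ndc = 5
```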
ANOVA Method — Same Data, Better Results
ANOVA on the same 3×10×3 dataset detects whether appraiser-by-part interaction is significant — something the X̄-R method simply cannot see. When interaction is non-significant, results are pooled into the equipment term.
The ANOVA Table (AIAG Table III-B 7)
| Source | DF | SS | MS | F | Significant? |
|---|---|---|---|---|---|
| Appraiser | 2 | 3.1673 | 1.58363 | 34.44 | Yes (α=0.05) |
| Parts | 9 | 88.3619 | 9.81799 | 213.52 | Yes (α=0.05) |
| Appraiser×Part | 18 | 0.3590 | 0.01994 | 0.434 | NO — pooled |
| Equipment | 60 | 2.7589 | 0.04598 | — | — |
| Total | 89 | 94.6471 | — | — | — |
Pooled MSequipment = (SSinteraction + SSrepeatability) / (dfint + dfrep)
= (0.3590 + 2.7589) / (18 + 60) ≈ 0.0400
σ²repeatability = MSequipment = 0.04007
ANOVA vs X̄-R: Side-by-Side
| Method | EV | AV | GRR | %GRR | Interaction |
|---|---|---|---|---|---|
| X̄-R Method | 0.202 | 0.230 | 0.306 | 26.7% | Cannot detect |
| ANOVA | 0.200 | 0.227 | 0.302 | 27.9% | 0 (not significant) |
Results are very close — this is expected when interaction is non-significant. ANOVA gives slightly more accurate estimates due to better partitioning. The key ANOVA advantage is detecting the interaction term.
When does interaction matter? If the interaction term were significant (parallel lines on interaction plot = no interaction; crossing lines = interaction), it would indicate different appraisers handle different parts inconsistently — a training or fixture problem specific to certain part geometries.
📊 Graphical Tools — ANOVA
Interaction Plot
Appraiser avg per part vs part number. Parallel lines = no interaction. Crossing lines = interaction present.
Error Charts
Individual deviations from reference. Appraiser A: positive bias. Appraiser C: negative bias (from AIAG example).
Whiskers Chart
High/low/average per part per appraiser. Reveals inconsistent appraisers across different part sizes.
Residual Plot
Fitted vs residual values. Check for randomness — any pattern suggests model inadequacy.
EMP Method — Evaluating the Measurement Process
The EMP methodology, developed by Dr. Donald J. Wheeler, goes beyond a simple pass/fail percentage. It uses control charts to validate the study, computes variance components (not standard deviations), and classifies your measurement system as a First, Second, Third, or Fourth Class Monitor — giving you actionable intelligence about what the gauge can actually do in production.
Source: All three GR&R methods below use the AIAG 4th Edition reference dataset — 3 operators (A, B, C) × 10 parts × 3 trials each = 90 measurements total. This allows direct comparison of methods on identical data.
The AIAG Reference Dataset (Table 1)
| Op. | Trial | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| A | 1 | 0.29 | −0.56 | 1.34 | 0.47 | −0.80 | 0.02 | 0.59 | −0.31 | 2.26 | −1.36 |
| A | 2 | 0.41 | −0.68 | 1.17 | 0.50 | −0.92 | −0.11 | 0.75 | −0.20 | 1.99 | −1.25 |
| A | 3 | 0.64 | −0.58 | 1.27 | 0.64 | −0.84 | −0.21 | 0.66 | −0.17 | 2.01 | −1.31 |
| B | 1 | 0.08 | −0.47 | 1.19 | 0.01 | −0.56 | −0.20 | 0.47 | −0.63 | 1.80 | −1.68 |
| B | 2 | 0.25 | −1.22 | 0.94 | 1.03 | −1.20 | 0.22 | 0.55 | 0.08 | 2.12 | −1.62 |
| B | 3 | 0.07 | −0.68 | 1.34 | 0.20 | −1.28 | 0.06 | 0.83 | −0.34 | 2.19 | −1.50 |
| C | 1 | 0.04 | −1.38 | 0.88 | 0.14 | −1.46 | −0.29 | 0.02 | −0.46 | 1.77 | −1.49 |
| C | 2 | −0.11 | −1.13 | 1.09 | 0.20 | −1.07 | −0.67 | 0.01 | −0.56 | 1.45 | −1.77 |
| C | 3 | −0.15 | −0.96 | 0.67 | 0.11 | −1.45 | −0.49 | 0.21 | −0.49 | 1.87 | −2.16 |
EMP Variance Component Formulas
Like ANOVA, EMP works in variances (not standard deviations). Subgroups are each operator×part combination (e.g., A-Part1 = {0.29, 0.41, 0.64}). The average range R̄ drives all calculations.
R̄ = 0.3417 (avg range)
K₁ = 0.5908 (3 trials)
EV = 0.2019
x̄diff = 0.4446, K₂ = 0.5231 (3 appraisers)
AV = 0.2297
EMP Variance Results (Table 6)
| Component | Variance | % of Total |
|---|---|---|
| Repeatability | 0.0407 | 3.1% |
| Reproducibility | 0.0531 | 4.1% |
| R&R (GRR) | 0.0938 | 7.2% |
| Product (Part-to-Part) | 1.216 | 92.8% |
| Total | 1.310 | 100.0% |
The Intraclass Correlation Coefficient (ρ)
This is EMP's key metric — the ratio of part variance to total variance. It tells you what fraction of observed variation is real product signal vs. gauge noise.
Wheeler's Four Monitor Classes — Interpreting ρ
| ρ Range | Class | Signal Reduction | Chance Detect ±3σ Shift | Track Process? | %R&R / AIAG |
|---|---|---|---|---|---|
| 0.8 – 1.0 | First Class ★ | <10% | >99% (Rule 1) | Up to Cp₈₀ | 0–20% · Acceptable |
| 0.5 – 0.8 | Second Class | 10–30% | >88% (Rule 1) | Up to Cp₅₀ | 20–50% · Marginal |
| 0.2 – 0.5 | Third Class | 30–55% | >91% (Rules 1–4) | Up to Cp₂₀ | 50–80% · Unacceptable |
| 0.0 – 0.2 | Fourth Class | >55% | Rapidly Vanishing | Unable to Track | 80–100% · Unacceptable |
Adapted from EMP III (Evaluating the Measurement Process), Donald J. Wheeler, SPC Press, 2006.
Our example result: ρ = 0.928 → First Class Monitor. This means less than 10% reduction in process signal, better than 99% chance of detecting a ±3σ shift with Rule 1, and the measurement system can track process improvements all the way to Cp₈₀. The gauge is excellent for SPC use.
All Three Methods Side-by-Side (Same Data)
| Source | Avg & Range: Std Dev | Avg & Range: %TV (σ) | ANOVA: Variance | ANOVA: %TV (σ²) | EMP: Variance | EMP: %TV (σ²) |
|---|---|---|---|---|---|---|
| Repeatability | 0.202 | 17.61% | 0.0400 | 3.39% | 0.0407 | 3.1% |
| Reproducibility | 0.230 | 20.04% | 0.0515 | 4.37% | 0.0531 | 4.1% |
| R&R | 0.306 | 26.68% | 0.0914 | 7.76% | 0.0938 | 7.2% |
| Part-to-Part | 1.104 | 96.37% | 1.086 | 92.24% | 1.216 | 92.8% |
| Total | 1.146 | — | 1.178 | 100% | 1.310 | 100% |
Why the Average & Range method is misleading: Standard deviations are not additive (σ_total ≠ σ_parts + σ_ms), so the % column doesn't sum to 100% and is mathematically incorrect for decision-making. The 26.68% R&R figure from the Avg & Range method on this same data looks "marginal" under AIAG criteria, while ANOVA and EMP correctly show 7–8% — clearly acceptable. Bottom line: use ANOVA or EMP.
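A self-contained NumPy sketch of the ANOVA decomposition on the reference dataset (Table 1 above), pooling the non-significant interaction into the equipment term; it reproduces the ANOVA column of the comparison table to rounding:

```python
import numpy as np

# AIAG 4th Ed. reference dataset: 3 appraisers × 10 parts × 3 trials.
data = {
    "A": [[0.29, 0.41, 0.64], [-0.56, -0.68, -0.58], [1.34, 1.17, 1.27], [0.47, 0.50, 0.64],
          [-0.80, -0.92, -0.84], [0.02, -0.11, -0.21], [0.59, 0.75, 0.66], [-0.31, -0.20, -0.17],
          [2.26, 1.99, 2.01], [-1.36, -1.25, -1.31]],
    "B": [[0.08, 0.25, 0.07], [-0.47, -1.22, -0.68], [1.19, 0.94, 1.34], [0.01, 1.03, 0.20],
          [-0.56, -1.20, -1.28], [-0.20, 0.22, 0.06], [0.47, 0.55, 0.83], [-0.63, 0.08, -0.34],
          [1.80, 2.12, 2.19], [-1.68, -1.62, -1.50]],
    "C": [[0.04, -0.11, -0.15], [-1.38, -1.13, -0.96], [0.88, 1.09, 0.67], [0.14, 0.20, 0.11],
          [-1.46, -1.07, -1.45], [-0.29, -0.67, -0.49], [0.02, 0.01, 0.21], [-0.46, -0.56, -0.49],
          [1.77, 1.45, 1.87], [-1.49, -1.77, -2.16]],
}
y = np.array([data[k] for k in "ABC"])      # shape: (appraisers a=3, parts p=10, trials r=3)
a, p, r = y.shape
grand = y.mean()

# Two-way crossed ANOVA sums of squares
ss_app  = p * r * ((y.mean(axis=(1, 2)) - grand) ** 2).sum()
ss_part = a * r * ((y.mean(axis=(0, 2)) - grand) ** 2).sum()
ss_tot  = ((y - grand) ** 2).sum()
ss_rep  = ((y - y.mean(axis=2, keepdims=True)) ** 2).sum()        # within-cell (equipment)
ss_int  = ss_tot - ss_app - ss_part - ss_rep

ms_app, ms_part = ss_app / (a - 1), ss_part / (p - 1)
ms_int, ms_rep  = ss_int / ((a - 1) * (p - 1)), ss_rep / (a * p * (r - 1))

# Interaction is not significant here (F = ms_int / ms_rep ≈ 0.43), so pool it with repeatability.
ms_pooled = (ss_int + ss_rep) / ((a - 1) * (p - 1) + a * p * (r - 1))
var_repeat = ms_pooled
var_reprod = (ms_app - ms_pooled) / (p * r)
var_parts  = (ms_part - ms_pooled) / (a * r)
var_grr    = var_repeat + var_reprod

print(f"σ²_repeat={var_repeat:.4f} σ²_reprod={var_reprod:.4f} σ²_GRR={var_grr:.4f} σ²_parts={var_parts:.4f}")
# ≈ 0.040, 0.051, 0.091, 1.086 — matching the ANOVA column above
```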
Which Method to Use?
Only use if hand calculations are required with no software. Always convert to variance before interpreting. Not recommended.
AIAG-preferred. Detects operator×part interaction. Best for automated environments and PPAP submissions. Use this by default.
Adds control chart validation and the Monitor Class framework. Use when you want to understand what the gauge can actually do for process control.
Attribute Measurement System Analysis
Attribute gauges produce finite categories (pass/fail, good/bad, or colour grades). Standard GR&R methods don't apply — instead AIAG uses Cohen's Kappa for agreement and Effectiveness for decision accuracy.
Cross-Tabulation and Cohen's Kappa
Kappa measures inter-rater agreement beyond what chance alone would produce.
0.7 ≤ κ < 0.9 → Acceptable
κ < 0.7 → Inadequate — investigate
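A small sketch of the kappa calculation from a cross-tabulation; the 2×2 pass/fail table below is hypothetical, not the AIAG example data:

```python
import numpy as np

def cohens_kappa(table):
    """Cohen's kappa from a square cross-tab of two appraisers' categorical calls."""
    t = np.asarray(table, dtype=float)
    total = t.sum()
    p_observed = np.trace(t) / total                                  # actual agreement
    p_chance = (t.sum(axis=0) * t.sum(axis=1)).sum() / total ** 2     # agreement expected by chance
    return (p_observed - p_chance) / (1 - p_chance)

xtab = [[45, 5],     # rows: appraiser A (pass/fail), columns: appraiser B
        [3, 97]]
print(f"kappa = {cohens_kappa(xtab):.2f}")   # ≈ 0.88
```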
AIAG Attribute MSA Example (Table III-C 3)
| Pair | Kappa | Verdict |
|---|---|---|
| Appraiser A vs B | 0.86 | Good agreement |
| Appraiser B vs C | 0.79 | Good agreement |
| Appraiser A vs C | 0.78 | Good agreement |
| Appraiser | κ vs Reference | Effectiveness | Miss Rate | False Alarm | Verdict |
|---|---|---|---|---|---|
| A | 0.88 | 84% | 6.3% | 4.9% | Marginal |
| B | 0.92 | 90% | 6.3% | 2.0% | Borderline |
| C | 0.77 | 80% | 12.5% | 8.8% | Unacceptable |
Effectiveness Acceptance Criteria (Table III-C 6)
| Decision | Effectiveness | Miss Rate | False Alarm Rate |
|---|---|---|---|
| Acceptable | ≥ 90% | ≤ 2% | ≤ 5% |
| Marginal | 80–89% | ≤ 5% | ≤ 10% |
| Unacceptable | < 80% | > 5% | > 10% |
Important AIAG caution: A 90% agreement rate on a process with Pp=1.0 doesn't mean 90% of bad parts are caught. Bayes' Theorem must be applied — the probability a rejected part is truly bad depends on the underlying defect rate. At very low defect rates, most "rejected" parts are actually false alarms.
Signal Detection Approach (for %GRR)
When variable reference data is available, the gray zone width between the last universally-accepted and first universally-rejected part estimates 6σ_GRR:
dUSL = width of the gray zone at the USL (from the last part accepted by all appraisers to the first part rejected by all); dLSL = same calculation at LSL
d = average of dUSL and dLSL
GRRboundary = d / 5.15 (5.15σ = 99% spread)
📊 Attribute MSA Summary
No single appraiser in the AIAG example met ALL three criteria simultaneously. This is the key finding — a system-level decision is needed.
Kappa > 0.75
All pairs met this. Appraisers agree with each other well.
Effectiveness
Only B reached ≥90%. A and C are marginal/unacceptable.
Miss Rate
All three had 6.3%+ miss rate, exceeding the ≤2% threshold. Training needed.
How GRR Distorts Your Cp — AIAG Appendix B
The most important and most overlooked MSA insight: your observed Cp is always lower than your actual process Cp because measurement error inflates the observed variation. Appendix B of AIAG MSA 4th Ed. gives the exact formula.
%AV = 100 × (AV / TV)
%GRR = 100 × (GRR / TV)
%PV = 100 × (PV / TV)
%AV = 100 × (AV / Tol)
%GRR = 100 × (GRR / Tol)
TV = √(GRR² + PV²)
What This Means in Practice
A high GRR makes your process capability look worse than it really is. This has real consequences: a process may be denied production approval because of its measurement system, not because of the process itself.
Critical insight: At GRR=60%, a process with an actual Cp of 1.30 is observed at only 1.04 (barely capable), and at GRR=70% it appears incapable at 0.93. High GRR doesn't just understate a capable process — it can make a genuinely capable one look incapable. Always investigate GRR before concluding a process is incapable.
📊 Appendix B Table — Observed vs Actual Cp
Actual Cp = 1.30, GRR varies (process-based)
| GRR % | Cp_obs (process) | Cp_obs (tolerance) |
|---|---|---|
| 10% | 1.29 | 1.29 |
| 20% | 1.27 | 1.26 |
| 30% | 1.24 | 1.20 |
| 40% | 1.19 | 1.11 |
| 50% | 1.13 | 0.99 |
| 60% | 1.04 | 0.81 |
| 70% | 0.93 | 0.54 |
| 90% | 0.57 | never |
At GRR=50%, tolerance-based Cp_obs drops to 0.99 — looks incapable even though actual Cp=1.30!
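The process-based column of the table can be reproduced from the variance decomposition σ²_obs = σ²_actual + σ²_msa. A sketch (the tolerance-based column follows a different Appendix B relation not reproduced here):

```python
import math

def observed_cp(actual_cp, grr_pct):
    """Observed Cp when %GRR is expressed against total observed (process) variation."""
    # GRR = σ_msa / σ_obs and σ²_actual = σ²_obs − σ²_msa  =>  σ_actual = σ_obs·√(1 − GRR²)
    return actual_cp * math.sqrt(1 - (grr_pct / 100) ** 2)

for grr in (10, 30, 50, 70, 90):
    print(f"GRR {grr:>2}% → observed Cp = {observed_cp(1.30, grr):.2f}")
# 1.29, 1.24, 1.13, 0.93, 0.57 — matching the process-based column above
```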
Measurement Tools, Destructive & Non-Destructive Testing
Before you can analyse measurement system variation, you need to select the right measurement tool and understand its capabilities. The Rule of 10 governs tool selection; destructive and NDT methods determine what kind of testing is possible.
Measurement Tools — Precision Hierarchy
| Tool | Least count / Resolution | Principle | Typical use |
|---|---|---|---|
| Scale / Tape Measure | 1 mm or 0.5 mm | Direct linear measurement against graduated scale | Rough dimensions, layout |
| Vernier Caliper | 0.1 mm or 0.05 mm | Main scale + vernier scale alignment | OD/ID/depth/step measurements |
| Micrometer | 0.01 mm | Screw thread advancement per revolution | Shaft/bore diameters, wall thickness |
| Gage Blocks (Slip Gauges) | 0.001 mm (1 μm) | Precision ground blocks — stacked by "wringing" (molecular adhesion up to 330 N pull force) | Calibration reference, setting instruments |
| Optical Comparator | Depends on magnification | Magnified silhouette projected on screen — dimensions measured against prescribed limits | Complex profiles, thread forms, gear teeth |
Rule of 10 (10:1 Rule): The measuring instrument resolution should divide the tolerance into at least 10 parts. Example: tolerance = ±0.05mm (range = 0.10mm) → minimum instrument resolution = 0.01mm → Digital Vernier (0.01mm) is acceptable; tape measure (1mm) is not. Calibration instruments should be 10× better than the measuring instrument.
Destructive Testing
Destructive tests damage or destroy the test piece. Used when the test must measure failure — cannot be used for 100% inspection. Drives the need for acceptance sampling.
Tensile Test: Stress-Strain curve analysis. Pulls the specimen to failure.
- Stress = Force / Area (Pa = N/m²)
- Strain = ΔLength / Length (unitless)
- Measures: UTS, yield strength, elongation, Young's modulus
- Curve shapes: ductile steel, brittle (concrete/carbon fibre), non-ferrous
Impact Test (Charpy / Izod): Measures notch toughness — ability to absorb energy during fracture. A pendulum swings and strikes a notched specimen.
- Result: energy absorbed (Joules)
- Critical for low-temperature applications
- Identifies brittle-ductile transition temperature
Fatigue Test: Applies cyclic loading until failure. Most engineering failures are fatigue-related.
- Determines S-N curve (stress vs cycles to failure)
- Identifies endurance limit (some steels)
- Critical for rotating machinery, aircraft structures
Non-Destructive Testing (NDT)
NDT methods inspect materials and components without causing damage — enabling 100% inspection for critical items. Each method has specific capabilities and limitations.
| Method | Principle | Detects | Applicable materials |
|---|---|---|---|
| Radiography (X-ray / Gamma) | Radiation passes through material; defects absorb differently and show on film/detector | Internal voids, porosity, inclusions, weld defects | Most materials — metals, composites, castings |
| Ultrasonic Testing (UT) | Sound waves >20 kHz transmitted into material; reflections from defects detected | Internal defects, thickness measurement, delaminations | Metals, composites, welds |
| Magnetic Particle (MT) | Magnetic field applied; field leaks at surface/near-surface defects; magnetic particles accumulate | Surface and near-surface cracks | Ferromagnetic materials ONLY (steel, iron) |
| Liquid Penetrant (PT) | Dye penetrant drawn into surface cracks by capillary action; developer reveals defects | Surface-breaking defects only | Any material — magnetic AND non-magnetic |
| Hardness Testing | Indenter pressed into surface; hardness = resistance to indentation (Vickers HV, Brinell HB, Rockwell HR) | Material hardness, heat treatment verification | Most solid materials |
Crossed vs Nested GR&R Studies
Crossed design: Each operator measures every part. The parts are re-measured multiple times. Enables separation of EV (repeatability) and AV (reproducibility) components.
- ✓ Standard AIAG GR&R method
- ✓ Used for non-destructive measurements
- ✓ Provides separate EV, AV, and interaction estimates
- ✓ Typical design: 3 operators × 10 parts × 2 replicates
Nested design: Each operator measures a different set of parts — typically because the measurement destroys the part. Parts are nested within operators; cannot be measured by more than one operator.
- ✓ Used for destructive tests (tensile, hardness, chemical)
- ✓ Cannot separate repeatability from part-to-part variation within operator
- ⚠️ Reproducibility is confounded with part variation
- ✓ Requires more parts than crossed design
Quality Philosophy
The foundational reference for quality engineering. Covers the evolution of quality, the philosophies of every major quality pioneer, continuous improvement frameworks, strategic planning, facilitation tools, customer relations, supplier management, and barriers to quality improvement.
Evolution of Quality & the Philosophies That Shaped It
Quality management evolved from pure inspection through statistical control, quality assurance, and total quality management into today's business excellence frameworks. Each pioneer contributed a distinct, testable philosophy that forms the foundation of modern quality engineering.
W. Edwards Deming — 14 Points & System of Profound Knowledge
Deming taught that 85–94% of quality problems are caused by the system itself — not the workers. His message to Japan in the 1950s transformed their manufacturing. His framework rests on four areas of Profound Knowledge: appreciation for a system, knowledge about variation, theory of knowledge, and psychology.
| # | Point | Core idea | Quality engineering implication |
|---|---|---|---|
| 1 | Create Constancy of Purpose | Long-term commitment to improvement; customer focus; invest in innovation & training | Drives design for reliability, not just today's spec compliance |
| 2 | Adopt the New Philosophy | Management must lead change; be prepared for transformation | Quality is not a department — it is a system responsibility |
| 3 | Cease Dependence on Mass Inspection | Build quality into the process; inspection is too late & too costly | Prevention > detection; PFMEA before production, not rework after |
| 4 | End Lowest-Price Purchasing | Move toward single suppliers on long-term trust; multiple suppliers = more variability | Supplier qualification programs, approved vendor lists |
| 5 | Improve Constantly and Forever | PDCA; reduce variation; engage all employees | SPC, DMAIC, continuous capability improvement |
| 6 | Institute Training on the Job | People must know how to do their job; training includes tools and improvement methods | Calibration training, GR&R awareness, SPC chart reading |
| 7 | Institute Leadership | Supervisors are coaches, not police; understand processes | Process owners empowered to stop the line on defects |
| 8 | Drive Out Fear | Mutual respect; workers feel valued and can flag problems freely | Open reporting of defects; psychological safety for quality escalation |
| 9 | Break Down Barriers | Cross-functional teams; internal customer concept; common vision | APQP teams, design-manufacturing-quality integration |
| 10 | Eliminate Slogans & Posters | Slogans assume people cause problems — the system does | Fix the process, not the person; root cause analysis not blame |
| 11 | Eliminate Numerical Quotas | Quotas without a plan are demoralising; substitute leadership | Capability targets backed by process improvement plans |
| 12 | Remove Barriers to Pride | Abolish annual merit rating that creates competition; recognise craftsmanship | Team-based quality improvement rewards over individual rankings |
| 13 | Institute Education & Self-Improvement | Workers learn new skills to face future challenges | Statistical literacy training; professional development |
| 14 | Take Action — Transform | Transformation is everybody's job; cultural change starts at the top | Quality culture deployment through management commitment |
Deming's Chain Reaction: Improve quality → costs decrease (less rework, fewer mistakes) → productivity improves → capture the market → stay in business → provide more jobs. The chain begins with quality, not with cost-cutting.
Joseph Juran — The Quality Trilogy & Fitness for Use
Juran defined quality as fitness for use — not conformance to specification. He emphasised top management involvement, project-by-project improvement, and the Pareto principle (vital few vs. useful many). His Quality Control Handbook (1951) remains the definitive reference.
Quality Planning: Preparing to meet quality goals. Identify customers, determine their needs, develop product/process features that respond to those needs, establish quality goals.
Quality Control: Meeting quality goals during operations. Evaluate actual performance, compare to goals, act on the difference. The ongoing process of holding the gains — SPC, inspection, audits.
Quality Improvement: Breaking through to unprecedented levels of performance. Project-by-project — select the project, organise the team, diagnose causes, implement remedies, hold the gains.
| # | Juran's 10 Steps to Quality Improvement |
|---|---|
| 1 | Build awareness of the need and opportunity for improvement |
| 2 | Set goals for improvement |
| 3 | Organise to reach the goals (establish a quality council, identify problems, select projects) |
| 4 | Provide training |
| 5 | Carry out projects to solve problems |
| 6 | Report progress |
| 7 | Give recognition |
| 8 | Communicate results |
| 9 | Keep score of improvements achieved |
| 10 | Maintain momentum by making annual improvement part of the regular systems and processes |
Philip Crosby — Four Absolutes & Quality is Free
Crosby defined quality as conformance to requirements — not goodness or elegance. His 1979 book Quality is Free argued that the cost of poor quality always exceeds the cost of preventing defects. His message to management: the system causes non-conformance, and prevention — not appraisal — is the correct system.
- Definition: Quality is conformance to requirements — not elegance. Do It Right the First Time (DIRFT).
- System: The system of quality is prevention, not appraisal. An error that doesn't exist can't be missed.
- Standard: The performance standard is zero defects — a management standard, not a motivational slogan.
- Measurement: Quality is measured by the Price of Non-Conformance — cost of doing things wrong.
Price of Conformance (POC): All expenses necessary to make things right. Quality functions, prevention efforts, quality education, audits.
Price of Non-Conformance (PONC): All expenses involved in doing things wrong — fixing problems, correcting orders, rework, scrap, warranty claims, customer returns.
Walter A. Shewhart — Father of Statistical Quality Control
Shewhart invented the control chart in 1924 at Western Electric's Hawthorne Works and introduced the PDCA (Plan-Do-Check-Act) cycle. He was the first to distinguish between common cause (chance) variation and special cause (assignable) variation — the foundational insight behind all SPC.
- 📈 Invented the control chart (1924) — X̄-R, p, c, u charts
- 🔄 Developed the PDSA/PDCA cycle (Shewhart Cycle — later popularised by Deming)
- 📊 Distinguished common cause (system) from special cause (assignable) variation
- 📖 Published Economic Control of Quality of Manufactured Product (1931)
Common Cause (Chance): Inherent in the process. Many small, independent sources. Stable and predictable. Only the system (management) can reduce it.
Special Cause (Assignable): An identifiable, specific source outside the system. Intermittent and unpredictable. Operators and engineers can find and fix these.
Pioneer Philosophy Quick-Reference
| Pioneer | Quality defined as | Primary framework | Key exam trigger word |
|---|---|---|---|
| Deming | Reduction of variation; customer satisfaction | 14 Points, System of Profound Knowledge, PDCA | "Common cause / special cause", "chain reaction" |
| Juran | Fitness for use | Quality Trilogy (Planning, Control, Improvement), 10 Steps | "Fitness for use", "project-by-project", "vital few" |
| Crosby | Conformance to requirements | Four Absolutes, Zero Defects, PONC/POC | "Conformance to requirements", "zero defects", "prevention" |
| Shewhart | Statistical control | Control charts, PDCA cycle, common/special cause | "Control chart", "assignable cause", "PDSA" |
| Taguchi | Minimum loss to society | Loss function, robust design, parameter/tolerance design | "Loss function", "nominal is best", "signal-to-noise" |
| Ishikawa | Total quality through all employees | Cause-and-effect diagram, QC circles, 7 tools | "Fishbone", "cause-and-effect", "QC circles" |
Continuous Improvement Frameworks
Five major CI frameworks every quality engineer needs to understand — how they relate, where they differ, and when to apply each.
Lean — Eliminate Waste, Maximise Flow
Lean originated with Ford's mass production principles (1910s) and was systematised into the Toyota Production System (TPS) in the 1950s. James Womack, Daniel Roos, and Daniel Jones documented it for the West in The Machine That Changed the World (1990). Lean identifies eight types of waste (DOWNTIME) and organises the entire enterprise around delivering value at the rate demanded by the customer.
- Value: Specify what creates value from the customer's perspective — not the producer's.
- Value Stream: Map all steps in the process chain; eliminate non-value-adding steps.
- Flow: Make value-creating steps flow without interruption, batching, or waiting.
- Pull: Produce only what is needed by the customer — short-term response to demand rate (takt time).
- Perfection: Continuously pursue elimination of all waste; the process never ends.
- ✓ Reduced waste (DOWNTIME: Defects, Overproduction, Waiting, Non-utilised talent, Transport, Inventory, Motion, Extra processing)
- ✓ Improved quality and customer satisfaction
- ✓ Reduced inventory and cycle time
- ✓ Flexible manufacturing capability
- ✓ Safer workplace and improved employee morale
Six Sigma — Reduce Variation to Near-Zero Defects
Motorola developed Six Sigma in 1987, raising quality standards dramatically. AlliedSignal (now Honeywell), GE, Dow Chemical, DuPont, Whirlpool, and IBM adopted it in the mid-1990s, proving its cross-industry applicability.
- Identify what is critical to quality (CTQ) from the customer's perspective
- Drive DPMO down; measure defects per million opportunities
- Minimise deviation of the mean from the nominal target value
- Tighten the standard deviation; narrow the process spread
Theory of Constraints (TOC) — Focus on the Weakest Link
Introduced by Eliyahu Goldratt in The Goal (1984). TOC holds that every system has exactly one constraint limiting overall throughput at any given time. Improving a non-constraint does not improve the system — only improving the current constraint does.
| TOC Step | Action | Key principle |
|---|---|---|
| 1. Identify | Find the current constraint — the weakest link in the chain | Physical, Policy, Paradigm, or Marketplace constraints |
| 2. Exploit | Squeeze maximum performance from the constraint using existing resources — no new investment yet | Don't waste constraint capacity on anything non-essential |
| 3. Subordinate | Align all other activities to support the constraint's pace | A non-constraint running faster than the constraint builds WIP, not output |
| 4. Elevate | If the constraint persists after exploiting and subordinating, invest to break it | Add capacity, change the process, redesign |
| 5. Repeat | Once broken, a new constraint will emerge — return to step 1 | Continuous improvement is never finished |
Total Quality Management (TQM)
TQM is a management approach to achieving customer satisfaction through every person in the organisation working to continuously improve products, processes, and services. Unlike Six Sigma (project-focused) or Lean (waste-focused), TQM is a cultural philosophy. Most quality awards (Baldrige, EFQM, Deming Prize) are grounded in TQM principles.
- 🎯 Customer focus — internal and external customers
- 🔄 Continuous improvement (Kaizen) — forever and ever
- 👥 Total employee involvement — every person owns quality
- 📊 Process approach — manage activities as interconnected processes
- 🤝 Supplier partnerships — extend quality into the supply chain
| Framework | Primary focus | Methodology |
|---|---|---|
| Lean | Waste elimination, flow | Value stream mapping, 5S, Kaizen |
| Six Sigma | Variation reduction, defects | DMAIC, statistical analysis |
| TOC | Throughput, bottleneck | 5 focusing steps, drum-buffer-rope |
| TQM | Culture, customer satisfaction | Quality awards, customer surveys |
| SPC | Process stability and capability | Control charts, capability studies |
Strategic Planning, Deployment & Information Systems
Strategic planning aligns the quality function with organisational goals — covering planning frameworks, deployment tools, and performance measurement including the Balanced Scorecard, leading vs lagging indicators, and project management techniques.
Strategic Planning — VMOSA Framework
- Vision: The dream — what the organisation aspires to become in the long term
- Mission: What the organisation does and why it exists — the purpose statement
- Objectives: How much of what — specific, measurable goals to achieve the mission
- Strategies: How — broad approaches used to achieve each objective
- Action Plans: Who will do what by when — the specific tasks assigned to specific people
Balanced Scorecard — Kaplan & Norton
Developed by Robert Kaplan and David Norton, the Balanced Scorecard translates strategy into four perspectives of performance measurement — preventing over-reliance on financial metrics alone. Quality professionals use it to frame the value of quality investments in language executives understand.
Financial: How do we look to shareholders? Revenue growth, profitability, cost reduction, ROI. Quality metric: Cost of Poor Quality (COPQ) as % of sales revenue.
Customer: How do customers see us? Satisfaction scores, NPS, on-time delivery, defect rates in the field, warranty claims per unit.
Internal Process: What must we excel at internally? Process yield, Cpk levels, first-pass yield, defect rate, audit outcomes, cycle time.
Learning & Growth: Can we continue to improve and create value? Training hours, certifications (ASQ, IASSC), employee engagement, suggestion rate, new quality tools adopted.
Leading vs Lagging Indicators
| Type | Definition | Characteristics | Quality examples |
|---|---|---|---|
| Lagging Indicators | Post-event (output) measures — what has already happened | Easy to measure, historically accurate, but cannot prevent what already occurred | DPMO, defect rate, warranty returns, customer complaints, scrap cost, Cpk |
| Leading Indicators | Predictive (input) measures — early signals of future performance | Difficult to identify and validate; harder to measure; not guaranteed predictors | Training hours, PFMEA completion %, process audit scores, SPC chart compliance, supplier qualification status |
Best practice: Use a mix of both. Lagging indicators tell you what happened; leading indicators tell you where you're heading. A dashboard with only lagging metrics is a rearview mirror — add leading metrics to steer the process proactively.
Stakeholder Identification & Analysis
ISO 9001:2015 clause 4.2 requires organisations to determine interested parties and their requirements. Stakeholder analysis maps each party by their level of interest and power/influence, then defines the appropriate engagement strategy.
- 👔 Internal: Owners, managers, employees, partners
- 🏭 Supply chain: Suppliers, sub-tier suppliers
- 🛒 Market: Customers, end users
- 🏛️ External: Regulators, industry associations, media, local community
Quality Information System (QIS)
A QIS is the data-centric infrastructure of the quality management function — the systems used to collect, store, analyse, and report quality-related data across the organisation.
- 📋 Design reviews and change records
- 🔍 Audit findings and corrective actions
- ⚠️ Non-conformances and dispositions
- 🔧 Repairs, returns, warranty claims
- 😊 Customer satisfaction surveys
- 📊 Test reports, certificates, performance data
- ✓ Identifies priorities for improvement investment
- ✓ Tracks performance of quality initiatives and ROI
- ✓ Enables competitor performance benchmarking
- ✓ Breaks silos — all departments access the same quality data
- ✓ Supports fact-based decision making at every level
Team Dynamics, Leadership & Facilitation Tools
Effective quality improvement requires high-performing teams — covering team types, the Tuckman model of team development, team roles, and the facilitation tools used in quality projects.
Team Types
| Team Type | Description | Quality context |
|---|---|---|
| Functional | Members from same department/function with similar expertise | Quality lab team, inspection team, calibration group |
| Cross-Functional | Members from multiple departments working on a shared goal | APQP team, PFMEA team, 8D corrective action team |
| Virtual | Geographically dispersed team relying on technology to collaborate | Global supplier quality teams, multi-site audit teams |
| Self-Managed | Team with authority to set own goals, methods, and schedules | Autonomous production cells with built-in quality responsibility |
| Quality Circles | Voluntary groups of front-line workers meeting regularly to identify and solve quality problems — introduced by Ishikawa | Shop-floor improvement groups, Kaizen circles |
Tuckman Model of Team Development
Bruce Tuckman's five-stage model (1965, extended 1977) describes the predictable journey teams undergo from formation to high performance. Understanding which stage a team is in allows a leader or facilitator to apply the right intervention.
Forming: Members first come together; polite, uncertain about roles and goals; depend on leader for direction
Storming: Conflict emerges; teamwork harder than expected; power struggles; important not to suppress but navigate
Norming: Team moves beyond storming; norms established; collaboration improves; roles clarified
Performing: High performance; team is self-directing; interdependent; focused on goals
Adjourning: Task complete; team disbands; celebrate achievements, capture lessons learned
Team Roles — Leader, Facilitator, Coach, Members
| Role | Primary responsibilities | Key distinction |
|---|---|---|
| Leader | Provides direction; clarifies roles; establishes ground rules; ensures goal completion; conducts meetings; assigns tasks | Has formal authority and accountability for the team's output |
| Facilitator | Helps the team understand its objective and how to achieve it; guides process without dictating content | No formal authority to make decisions — leads by process, not position |
| Coach | One-to-one support after training; first point of contact for issues; uses GROW model | Develops individuals; not the same as a trainer (one-to-many) |
| Members | Participate actively in meetings; perform assigned tasks; contribute ideas in brainstorming | Own the work; team's subject matter experts |
GROW Coaching Model: Goal — what does the team/individual want to achieve? Reality — what is the current state and what challenges exist? Obstacles — what is stopping progress? Way forward — what specific steps will be taken and by when?
Facilitation Tools
Brainstorming: Group or individual technique to generate ideas spontaneously for a specific problem. Quantity over quality — defer all judgment during generation.
- 1. Focus on quantity — more ideas = more options
- 2. Withhold criticism — no evaluation during generation
- 3. Welcome unusual ideas — wild ideas often spark practical ones
- 4. Combine and improve — build on others' ideas (1+1=3)
Nominal Group Technique (NGT): Structured process for problem identification, solution generation, and group decision-making. Prevents dominant voices from controlling the output.
- Introduction and explanation of the problem
- Silent individual generation of ideas (written)
- Round-robin sharing — one idea per person per turn
- Group discussion and clarification
- Voting and ranking to reach group decision
Multivoting: Used after brainstorming generates a long list — reduces and narrows the list using group consensus without endless debate.
Each member selects their top N ideas and ranks them (e.g. top 5, scored 5 down to 1). Scores are summed — highest total = group priority. Repeat until a manageable shortlist remains.
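As a minimal sketch of the tallying arithmetic (the members, ideas, and rankings below are invented for illustration):

```python
from collections import defaultdict

# Hypothetical multivoting round: each member ranks their top 3 ideas,
# scored 3 (best) down to 1. Names and ideas are invented for the example.
ballots = {
    "Ana":   ["poka-yoke fixture", "extra training", "new gauge"],
    "Bilal": ["poka-yoke fixture", "new gauge", "5S the cell"],
    "Chen":  ["extra training", "poka-yoke fixture", "5S the cell"],
}

scores = defaultdict(int)
for ranked_ideas in ballots.values():
    top_n = len(ranked_ideas)
    for position, idea in enumerate(ranked_ideas):
        scores[idea] += top_n - position   # rank 1 -> 3 points, rank 3 -> 1 point

# Highest total = group priority; repeat with the shortlist if it is still too long.
for idea, total in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{total:2d}  {idea}")
```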
Force Field Analysis: Identifies and maps the forces driving change against the forces resisting it. Developed by Kurt Lewin. Used in change management and improvement planning.
Driving forces (strengthen these): customer demand for fewer defects, competitive pressure, lower downtime, increased sales opportunity. Restraining forces (weaken these): initial investment cost, fear of new technology, habit/inertia.
Conflict Resolution — Thomas-Kilmann Model
| Style | Concern for Self | Concern for Others | When to use |
|---|---|---|---|
| Competing | High | Low | Safety emergencies; critical quality hold decisions; when you know you're right |
| Collaborating | High | High | Complex quality problems requiring buy-in from all parties; best long-term solution matters |
| Compromising | Medium | Medium | When a temporary solution is needed; when both parties have equally valid goals |
| Avoiding | Low | Low | When the issue is trivial; when more information is needed before engaging |
| Accommodating | Low | High | When preserving the relationship matters more than the outcome; when you're wrong |
Customer Relations & Supplier Management
Quality professionals must manage both directions of the value chain — understanding and capturing customer requirements, and ensuring suppliers deliver conforming product and services reliably.
Supplier Lifecycle Management
With mid-to-large corporations spending ~50% of revenue on purchased goods and services, supplier management is critical to organisational success. The Supplier Lifecycle Management framework is a structured, end-to-end approach to managing suppliers transparently, mitigating risk, reducing costs, and building long-term partnerships.
Identify → Shortlist → Prequalify → Bidders list → RFP/RFQ → Evaluate → Award. Includes sub-tier supplier identification.
Set performance expectations; process reviews; evaluations against KPIs (cost, quality, schedule, responsiveness); improvement plans; exit strategies.
Tier suppliers: Non-approved → Approved → Preferred → Certified → Partnership → Disqualified. Classification drives audit frequency and oversight level.
Develop strategic customer-supplier partnerships; shared improvement initiatives; joint development; supply chain resilience strategies.
Supplier Selection Process
| Step | Activity | Key considerations |
|---|---|---|
| 1. Identify | Find potential suppliers; new suppliers may offer cost or quality advantage; promote local suppliers | Market research, industry directories, referrals |
| 2. Shortlist | Screen to avoid late delivery, poor quality, non-responsive suppliers | Market reputation, public information, financial health |
| 3. Prequalify | Assess financial stability, capacity, quality certifications (ISO 9001), client approvals | On-site surveys, questionnaires, certificate verification |
| 4. Bidders List | Maintain a qualified list to avoid repeating prequalification each time | Approved Vendor List (AVL) maintenance |
| 5. Request Bids | RFP — buyer states preferences, bidder explains how they'll meet them. RFQ — buyer provides exact spec, bidder quotes a price | Choose RFP when requirements are not fully defined |
| 6. Evaluate Bids | Score against pre-determined criteria: price, quality, schedule, commercial terms, financial stability, production capability, HSE responsibility | Weighted scoring matrix; multi-person evaluation team |
| 7. Award | Place Purchase Order with selected supplier | Contractual quality requirements, inspection criteria, escalation process |
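Step 6 typically uses a weighted scoring matrix. Below is a minimal sketch of one way to combine criterion weights with evaluator scores; the weights, criteria, and supplier names are illustrative assumptions, not a prescribed scheme:

```python
# Illustrative weighted scoring matrix for bid evaluation.
# Weights sum to 1.0; scores run from 1 (poor) to 5 (excellent).
weights = {
    "price": 0.30, "quality": 0.30, "schedule": 0.15,
    "financial_stability": 0.10, "capacity": 0.10, "hse": 0.05,
}

bids = {
    "Supplier A": {"price": 4, "quality": 3, "schedule": 5,
                   "financial_stability": 4, "capacity": 3, "hse": 4},
    "Supplier B": {"price": 3, "quality": 5, "schedule": 4,
                   "financial_stability": 5, "capacity": 4, "hse": 5},
}

def weighted_score(scores: dict) -> float:
    """Sum of weight x score over all evaluation criteria."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)

for supplier, scores in sorted(bids.items(), key=lambda kv: weighted_score(kv[1]), reverse=True):
    print(f"{supplier}: {weighted_score(scores):.2f}")
```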
Supplier Performance Monitoring Parameters
- Cost: under/over budget variance; cost savings achieved; cost-reduction proposals
- Quality: incoming defect rate (PPM); returns and failures; corrective action closure rate
- Delivery: on-time delivery %; shortage incidents; lead time vs committed
- Responsiveness: response time to queries; flexibility to order changes; escalation engagement
Risk Management, Business Continuity & Barriers to Quality
Risk — ISO 31000 Definition & Framework
Risk = the effect of uncertainty on objectives (ISO 31000). An effect is a deviation from the expected — positive (opportunity) or negative (threat). A risk that has already occurred is reclassified as an issue. Risk is characterised by its potential consequences and the likelihood of occurrence.
| Risk Management Step | Activity | Quality tool |
|---|---|---|
| 1. Identify Risks | List all potential threats and opportunities that could affect objectives | FMEA, HAZOP, brainstorming, risk register |
| 2. Prioritise Risks | Score by probability × impact; focus resources on high-priority risks | Risk matrix (5×5), RPN in FMEA |
| 3. Mitigation Control | Define actions to reduce probability and/or impact of each risk | Control plans, poka-yoke, redundancy |
| 4. Mitigation Effectiveness | Monitor whether controls are working; update risk register | KPIs, audits, leading indicators tracking |
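One simple way to implement step 2 is to score each risk as probability × impact on 1-to-5 scales and band the product into matrix zones. The thresholds and register entries below are a common 5×5 convention and invented examples, not an ISO 31000 requirement:

```python
def risk_rating(probability: int, impact: int) -> tuple[int, str]:
    """Score a risk on 1-5 probability and impact scales and band the product.
    Band thresholds (low <5, medium <10, high <15, else critical) are a common
    5x5 matrix convention, not mandated by ISO 31000."""
    score = probability * impact
    if score < 5:
        band = "low"
    elif score < 10:
        band = "medium"
    elif score < 15:
        band = "high"
    else:
        band = "critical"
    return score, band

# Illustrative risk register entries: (description, probability, impact)
register = [
    ("Key supplier insolvency", 2, 5),
    ("Gauge calibration lapse", 3, 3),
    ("Port closure delays inbound material", 4, 4),
]

for name, p, i in sorted(register, key=lambda r: r[1] * r[2], reverse=True):
    score, band = risk_rating(p, i)
    print(f"{score:2d} ({band:8s}) {name}")
```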
Business Continuity Plan (BCP): A system of prevention and recovery for potential threats to the organisation. Covers extreme, existential scenarios.
Common threats: Fire, flood, earthquake, strike, war, power outage, cyber attack, terrorist attack.
Contingency Plan: A plan for outcomes other than the expected — less extreme than a BCP. Covers probable disruptions.
Examples: Supplier bankruptcy, price/currency fluctuation, component discontinuation, key personnel departure.
Resilience: The capacity to rapidly adapt and recover from internal or external disruptions. IBM identifies six building blocks of resilience:
Recovery · Hardening · Redundancy · Accessibility · Diversification · Autonomic Computing
Supply Chain Risk Categories
| Where | Risk category | Examples |
|---|---|---|
| At Supplier | Natural causes | Flood, earthquake, wildfire destroying plant or inventory |
| At Supplier | Man-made causes | Strike, fire, civil unrest, quality failure, management change |
| At Supplier | Economic causes | Insolvency, sub-supplier failure, currency collapse, credit freeze |
| In Transit | Natural or man-made | Port closure, transport strike, customs hold, damage in transit |
| On Receipt | Quality or reputational | Defective product, counterfeit parts, labelling errors, regulatory non-compliance |
Barriers to Quality Improvement
Understanding why quality improvement initiatives fail is as important as knowing how to run them. Recognising these barriers, and the countermeasures that address them, is a core quality engineering skill.
- 🔀 Confusion over the definition of quality — when quality means different things to different stakeholders, initiatives fragment
- 👤 Lack of leadership — quality improvement without visible management commitment fails at the first obstacle
- ⏳ Short-term thinking — quality ROI is often long-term; pressure for immediate financial results kills improvement programs
- 📊 Lack of data — unable to quantify the magnitude of the problem or the benefit of fixing it
- 🎓 Insufficient qualified people — quality improvement requires statistical literacy and tool expertise (Black Belt, quality engineering, etc.)
- ✓ Align on a single, clear quality policy — signed by top management and communicated to all
- ✓ Visibly involve senior leaders in quality reviews, audits, and improvement projects
- ✓ Link quality metrics to the Balanced Scorecard to give them financial language
- ✓ Build a QIS to capture and surface data that quantifies the cost of poor quality
- ✓ Invest in CQT/Black Belt certifications; develop internal quality competency
ASQ Code of Professional Ethics — Three Pillars
Integrity and honesty: Be truthful in all professional interactions. Accurately represent qualifications, certifications, and affiliations. Offer services only within areas of genuine competence. Make decisions in an objective, factual manner.
Responsibility, respect, and fairness: Hold paramount the safety, health, and welfare of individuals and the public. Treat others fairly, courteously, with dignity, and without discrimination. Act in a socially responsible manner.
Safeguarding information and avoiding conflicts of interest: Protect confidential information; never use it for personal gain. Disclose and avoid real or perceived conflicts of interest. Give credit where due; do not plagiarise. Obtain and document permission to use others' intellectual property.
Classification of Quality Characteristics
Understanding what quality means to different stakeholders — from product performance to service interactions — is foundational to the quality engineer's Body of Knowledge. Three frameworks define quality characteristics at different levels of abstraction.
Garvin's 8 Dimensions of Product Quality
David Garvin (Harvard, 1987) proposed that quality is multi-dimensional — a product can be high quality on one dimension and poor on another. This prevents organisations from optimising a single metric at the expense of overall customer value.
| # | Dimension | Definition | Quality engineering relevance |
|---|---|---|---|
| 1 | Performance | Primary operating characteristics — does the product do what it should? | CTQ characteristics, functional specifications, Cpk targets |
| 2 | Features | Secondary supplementary attributes that enhance the basic function | Voice of Customer (QFD), feature vs cost trade-offs |
| 3 | Reliability | Probability that the product performs its intended function over time without failure | MTBF, Weibull analysis, bathtub curve, reliability testing |
| 4 | Conformance | Degree to which a product meets pre-established standards and specifications | Cpk, DPMO, attribute inspection, MIL-STD-1916 |
| 5 | Durability | Useful life of the product before replacement is preferable to repair | Accelerated life testing, design for reliability |
| 6 | Serviceability | Speed, courtesy, competence, and ease of repair | MTTR, design for maintainability, spare parts availability |
| 7 | Aesthetics / Style | How the product looks, feels, sounds, tastes, or smells — subjective | Visual inspection standards, appearance audits, colour matching |
| 8 | Perceived Quality | Reputation and image — what the customer believes based on brand and word of mouth | Customer satisfaction surveys, NPS, warranty claim rates |
Key relationships: Reliability = MTBF/failure rate. Conformance = meets spec/Cpk. Serviceability = MTTR/maintainability. Perceived quality = customer perception/surveys.
SERVQUAL — Service Quality Dimensions
Parasuraman, Zeithaml, and Berry (1985) identified 10 service quality dimensions that customers use to evaluate service. These were later consolidated into 5 dimensions — the RATER model.
Original 10 SERVQUAL Dimensions
| # | Dimension |
|---|---|
| 1 | Reliability |
| 2 | Responsiveness |
| 3 | Competence |
| 4 | Access |
| 5 | Courtesy |
| 6 | Communication |
| 7 | Credibility |
| 8 | Security |
| 9 | Understanding the customer |
| 10 | Tangibles |
Consolidated to 5 — The RATER Model
Reliability: The ability to perform the promised service dependably and accurately
Assurance: Knowledge and courtesy of employees; their ability to convey trust and confidence
Tangibles: Appearance of physical facilities, equipment, personnel, and communication materials
Empathy: Provision of caring, individualised attention to customers
Responsiveness: Willingness to help customers and provide prompt service
Lean Deep-Dive — Waste, Metrics, SMED & Visual Controls
Lean is built on one fundamental idea: waste exists in all processes at all levels. Eliminating waste is the key to successful lean implementation and the most effective way to increase profitability without capital investment.
Muda, Mura & Muri — The Three Types of Waste
Type I Muda (Incidental): Non-value-added tasks that seem necessary — business conditions must change to eliminate them (e.g. regulatory inspections).
Type II Muda (Pure Waste): Non-value-added tasks that can be eliminated immediately — no business justification.
Mura exists when workflow is out of balance or workload is inconsistent. Creates alternating overloading and underloading.
SMED reduces Mura by enabling smaller batch sizes and more frequent changeovers — smoothing out production flow.
Muri (overburden), for people: too heavy a mental or physical burden — leads to quality errors, injuries, and absenteeism.
For machines: running beyond designed capacity — leads to breakdowns and quality deterioration.
8 Types of Muda — DOWNTIME
The original Toyota Production System identified 7 types of muda. Western lean practitioners added an 8th — under-utilised staff (knowledge, talent, and creativity). The acronym DOWNTIME covers all eight (the older TIMWOOD acronym covers the original seven):
| Letter | Waste | Definition | Example |
|---|---|---|---|
| D | Defects | Sorting, rework, repetition, or making scrap | Welding defects requiring re-weld; wrong labels requiring replacement |
| O | Overproduction | Producing too much, too early, and/or too fast | Printing 1,000 brochures when only 200 are needed |
| W | Waiting | People or parts waiting for a work cycle to finish | Operator idle while machine cycles; material waiting in queue |
| N | Non-utilised talent | Failure to exploit employees' knowledge, skills, and creativity | Asking assembly workers to follow instructions without seeking their improvement ideas |
| T | Transportation | Unnecessary movement of people or parts between processes | Moving parts from one building to another before assembly |
| I | Inventory | Materials parked and not having value added to them | Raw material sitting in a warehouse for 3 weeks |
| M | Motion | Unnecessary movement of people or parts within a process | Operator walking 15m to get tools that could be stored at the workstation |
| E | Extra Processing | Processing beyond what the customer requires or demands | Polishing a surface that will be hidden; generating reports nobody reads |
Standard Work
Standard Work means doing work in a standard way — one best-known method, followed consistently by all people for that task. It is the foundation of quality, safety, and continuous improvement.
- ✓ All people perform one task in one way only
- ✓ Eliminates variation caused by different methods
- ✓ Makes abnormalities immediately visible
- ✓ Improvements lead to revised standard work — the PDCA cycle applied to work methods
- ✓ Not "the boss's way" — the best-known way, documented and agreed
- Standard Work Chart: Shows sequence of tasks, times, and movement in a cell layout
- Job Instruction Sheet: Step-by-step WI with quality checkpoints and safety notes
- Time Observation Sheet: Records actual vs takt time — identifies bottlenecks
Process Flow Metrics — Takt, Cycle, Lead Time & Throughput
| Metric | Definition | Formula | Worked example |
|---|---|---|---|
| WIP (Work In Progress) | Partially finished goods in the process waiting for completion | — | 50 units partially assembled on the production floor |
| WIQ (Work In Queue) | Material at a workstation waiting to be processed (subset of WIP) | — | 12 units waiting in the queue at Process 3 (the bottleneck) |
| Touch Time | Time material is actually being worked on — excludes moving and waiting | — | 30-minute cycle time; 8 min actual machining → touch time = 8 min |
| Takt Time | Time available to produce one unit to meet customer demand | Takt = Net time / Demand | 40 hrs/week, 10 units/week → Takt = 4 hrs/unit. With 1 hr breaks: Net = 35 hrs → Takt = 3.5 hrs/unit |
| Cycle Time | How long it takes to complete a specific task from start to finish — for one process step | CT = 1 / Throughput | If takt = 3.5 hrs, and Process 3 takes 3.5 hrs → Process 3 is balanced. If it takes 4 hrs → bottleneck. |
| Lead Time | Total time from work requested to work delivered — includes all waiting and processing time | LT = WIP / Throughput | WIP = 50 units, Throughput = 10 units/day → Lead Time = 50/10 = 5 days |
| Throughput Rate | Average number of units processed per time unit | TR = 1 / Cycle Time | Cycle time = 20 min → TR = 3 units/hr → 24 units/8hr shift |
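The formulas in the table chain together naturally. A minimal sketch using the worked-example numbers, assuming a 5-day week with 1 hr of breaks per day (so 40 gross hours become 35 net hours):

```python
# Process-flow metrics from the table above, using the worked-example numbers.
gross_hours = 40.0
break_hours = 5.0            # 1 hr/day x 5 days (assumed split of the 5 break hours)
demand_units = 10            # customer demand per week
wip_units = 50               # work in progress on the floor
throughput_per_day = 10      # units completed per day (for the Little's Law example)

net_available_time = gross_hours - break_hours
takt_time = net_available_time / demand_units        # hr per unit
lead_time_days = wip_units / throughput_per_day      # Little's Law: LT = WIP / Throughput

cycle_time_hr = 20 / 60                              # a 20-minute task
throughput_rate = 1 / cycle_time_hr                  # units per hour

print(f"Takt time       : {takt_time:.1f} hr/unit")      # 3.5
print(f"Lead time       : {lead_time_days:.1f} days")    # 5.0
print(f"Throughput rate : {throughput_rate:.0f} units/hr -> {throughput_rate * 8:.0f} per 8-hr shift")
```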
SMED — Single-Minute Exchange of Die
SMED is a lean methodology for rapidly converting a manufacturing process from running one product to running the next. Developed by Shigeo Shingo. "Single-Minute" means less than 10 minutes (single digit) — not literally 1 minute.
- ✓ Reduced inventory (smaller economic batch sizes)
- ✓ Increased machine utilisation despite more changeovers
- ✓ Elimination of setup errors
- ✓ Reduced defect rates (less scrap at startup)
- ✓ Reduces Mura — balances production line
- Separate internal from external setup operations (internal = machine must stop; external = can be done while machine runs)
- Convert internal to external setup
- Standardise function, not shape
- Use functional clamps or eliminate fasteners altogether
- Use intermediate jigs
- Adopt parallel operations
- Eliminate adjustments
- Mechanisation
Visual Controls — Andon & Jidoka
| Type | Question answered | Examples |
|---|---|---|
| Identification | What is it? | Labels, colour-coded bins, part numbers |
| Informational | What is the current status? | Andon lights, production boards, KPI dashboards |
| Instructional | How should the task be performed? | WI posted at workstation, standard work charts |
| Planning | What is the plan? | Kanban boards, production schedules, Gantt charts |
Andon: A visual control device that indicates the status of a machine, line, or process at a glance (typically a green / amber / red light or signal board).
Jidoka: The ability to stop work (machine or line) when a problem is detected. Prevents defects from being passed downstream and ensures immediate corrective action. The Andon system is the device that activates Jidoka by signalling the problem.
OEE — Overall Equipment Effectiveness
OEE measures how effectively a manufacturing operation is utilised, combining availability, performance, and quality into a single metric. World-class OEE is generally considered to be ≥85%.
| Component | Formula | Measures |
|---|---|---|
| Availability | Run Time / Planned Production Time | Unplanned downtime losses |
| Performance | Actual Output / Max Possible Output | Speed losses and minor stoppages |
| Quality | Good Parts / Total Parts Produced | Defects and rework losses |
Planned production time: 480 min
Downtime: 60 min
Run time: 420 min
Availability: 420/480 = 87.5%
Ideal cycle time: 1 min/part
Actual output: 400 parts (420 possible)
Performance: 400/420 = 95.2%
Good parts: 380 of 400
Quality: 380/400 = 95.0%
OEE = 87.5% × 95.2% × 95.0% = 79.1%
World-class benchmark: Availability ≥90%, Performance ≥95%, Quality ≥99.9% → OEE ≥85%
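The worked example above reduces to a few lines of arithmetic. A minimal sketch (the 480 min planned time is implied by the 420/480 availability calculation):

```python
def oee(planned_min, downtime_min, ideal_cycle_min, total_parts, good_parts):
    """Overall Equipment Effectiveness = Availability x Performance x Quality."""
    run_time = planned_min - downtime_min
    availability = run_time / planned_min
    performance = (ideal_cycle_min * total_parts) / run_time
    quality = good_parts / total_parts
    return availability, performance, quality, availability * performance * quality

# Numbers from the worked example: 480 min planned, 60 min down,
# 1 min/part ideal cycle, 400 parts produced, 380 good.
a, p, q, overall = oee(480, 60, 1.0, 400, 380)
print(f"Availability {a:.1%}, Performance {p:.1%}, Quality {q:.1%}, OEE {overall:.1%}")
# -> Availability 87.5%, Performance 95.2%, Quality 95.0%, OEE 79.1%
```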
Root Cause Analysis — Finding the Real Problem
Most organisations fix the same problems over and over. Root cause analysis (RCA) breaks that cycle by asking why until the true source of a problem is found — then eliminating it permanently. Based on ASQ sources including Andersen & Fagerhaug and Duke Okes.
Only a system-level cause — a change to the way the organisation operates — truly prevents recurrence. Physical cause fixes are necessary but not sufficient.
The Cause Hierarchy — Drilling Down
Every visible problem sits at the top of an iceberg. Below it are layers of cause. Most organisations only fix the visible tip.
Physical cause: The tangible, material thing that failed or caused the event. Also called direct, immediate, or proximate cause. Fixing it is necessary — but only solves this occurrence.
Human cause: Human error, forgetfulness, or lack of skill. Critical: don't stop here. Ask what system failed to support the human. Blame eliminates people, not problems.
System (root) cause: A policy, procedure, training gap, or organisational decision that created the conditions for the failure. This is the root cause. Fixing it changes how the organisation operates — preventing the whole class of problem.
The 6-Step RCA Process — The Story Arc
Think of RCA as a detective story. You start with a crime scene (the event), gather evidence (causes), interrogate witnesses (data), find the culprit (root cause), and change the system so it can never happen again.
The 5 Whys — A Worked Example
Developed at Toyota as part of the TPS. The idea: keep asking "why" until you reach the system-level cause. Five iterations is a guideline — stop when you reach something that can be permanently changed.
RCA Toolbox — The Right Tool for Each Step
Fishbone (Ishikawa) Diagram
Organises possible causes into 6M categories: Machine, Method, Material, Man, Measurement, Mother Nature. The "spine" points to the problem; "bones" are cause categories.
5 Whys
Ask "why" repeatedly until a system-level cause is reached. Simple, fast, and effective for straightforward problems. For complex issues, use a Cause-and-Event Tree.
Pareto Chart
Ranks causes by frequency or cost. Reveals the vital few from the trivial many. The 80/20 principle — 20% of causes typically create 80% of problems.
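A minimal sketch of the underlying tally, with invented defect counts, showing how the cumulative percentage exposes the vital few:

```python
# Illustrative defect tallies; the cumulative % identifies the "vital few"
# causes covering roughly 80% of occurrences.
defects = {"scratches": 12, "misaligned label": 9, "short shot": 86,
           "flash": 31, "contamination": 7, "wrong colour": 3}

total = sum(defects.values())
cumulative = 0.0
print(f"{'Cause':<18}{'Count':>6}{'Cum %':>8}")
for cause, count in sorted(defects.items(), key=lambda kv: kv[1], reverse=True):
    cumulative += count
    print(f"{cause:<18}{count:>6}{cumulative / total:>8.0%}")
```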
Cause-and-Event Tree
A hierarchical diagram showing connections between causes at different levels. Used to prune possible causes, reveal compound causes, and trace pathways from event back to root.
Impact / Effort Matrix
Plot each potential solution on a 2×2 grid: impact (high/low) vs effort (high/low). Quick wins sit in high-impact, low-effort. Avoid low-impact, high-effort.
Force Field Analysis
Lists forces driving the change against forces restraining it. Helps teams plan how to amplify driving forces and reduce resistance before implementation begins.
8 Mistakes That Kill an RCA
"We fixed the jam" — without asking why the jam happened or why it wasn't prevented.
Stopping at the physical cause — "we replaced the part." The system that allowed it to fail is unchanged.
"Operator error" is never a root cause. It is always a prompt to ask: what system failed to prevent or catch the human error?
"The process is slow." Without specifics — what, where, how often, at what cost — the team will solve different problems.
Teams jump to "I think it's X" before mapping the process or collecting evidence. Confirmation bias sets in.
Many problems have multiple independent causes — fixing one doesn't eliminate the other. Each branch needs its own "why" chain.
Implementing a solution at full scale without testing it first. If it doesn't work, the cost and disruption are multiplied.
Without returning to the Step 1 metrics after implementation, you never know if the root cause was truly found and fixed.
Quality Systems
QMS certification, PPAP/APQP, special characteristics, 8D problem-solving with hard deadlines, and supplier performance management — the complete automotive supply chain quality framework.
Quality, Cost & Delivery — Zero Defect is Not Aspirational
Every supplier QMS must deliver green-rated performance across QCD. These are operational standards with zero tolerance on Safety & Regulation requirements.
Zero-Defect Core Objectives
Zero defective parts shipped to the customer. No acceptable defect rate — the target is absolute prevention, not statistical tolerance.
Safety and Regulation characteristics carry absolute zero tolerance. No sampling plan, no concession, no deviation permitted.
Zero Incidents per Billion — the field performance target for safety-critical systems. Drives design robustness requirements upstream.
Supplier Self Assessment (SSA) fully compliant. Maintained green status on the OEM Supplier Scorecard across quality, delivery, and responsiveness metrics.
QMS Certification Progression
AIAG Core Tools (APQP, PPAP, FMEA, MSA, SPC) — All Five Required in Every Supplier QMS
Maintain quality records — retrievable and legible — for the life of the program. Applies to sub-suppliers.
Non-conforming product records retained for trend analysis per AIAG / ISO 9001 / IATF 16949.
PPAP — Proving Production is Ready Before It Starts
PPAP is the supplier's formal proof that the production process can consistently make conforming parts at the quoted rate. It is not a one-time paperwork exercise — it is evidence of process understanding. Level 3 is the default: PSW + complete 18-element data package.
The 18 PPAP Elements — What Every Package Must Contain
The AIAG PPAP manual (4th edition) defines 18 elements. Which elements are required for submission depends on the Level (1–5) — but the supplier must generate all elements internally regardless of what is submitted to the customer.
1. Design Records: All drawings (CAD/2D), specifications, and engineering change documents. If supplier owns design: DFMEA required. Customer-owned design: drawings provided by customer.
2. Authorised Engineering Change Documents: All open engineering changes not yet incorporated into the design record. Must show written customer authorisation. Includes ECNs, deviation permits, and waivers.
3. Customer Engineering Approval: Written approval from the customer engineering activity — typically a signed prototype or pre-production buy-off. Required before production tooling is committed.
4. Design FMEA (DFMEA): Required when supplier owns the design. Documents all potential failure modes of the design and their effects. Severity, Occurrence, and Detection ratings. Must be live — not a snapshot.
5. Process Flow Diagram: Step-by-step flow of the entire production process — from incoming material through shipping. Must match the Control Plan and PFMEA. Includes all operations, inspections, and rework loops.
6. Process FMEA (PFMEA): Risk analysis of the manufacturing process — not the design. Documents how each process step can fail, its effect on the product, and controls in place. Drives the Control Plan. RPN threshold typically ≤100.
7. Control Plan: Three phases required: Prototype, Pre-Launch, and Production. Documents every control method for each characteristic — measurement method, frequency, sample size, reaction plan. The living document of process control.
8. Measurement System Analysis (MSA) Studies: GR&R studies for all gauges measuring CCs and SCs. Typically 3 operators × 10 parts × 2 trials. %GRR <10% preferred; <30% conditionally acceptable; >30% — gauge must be improved before PPAP.
9. Dimensional Results: Full balloon-drawing inspection of a minimum 6 parts (or per customer requirement). Every characteristic on the print — not just CCs. Results shown in table format with nominal, tolerance, and actual measured values.
10. Material & Performance Test Results: Test results for all material specifications (tensile, hardness, chemical composition) and functional performance tests (fatigue, pressure, thermal cycling). Must include lab certification and traceability to production material.
11. Initial Process Studies: SPC data from the PPAP production run for all CCs and SCs. Minimum 25 subgroups / 100 data points. Cpk ≥ 1.67 required for initial study. If not achieved: 100% inspection mandatory until Cpk improves.
12. Qualified Laboratory Documentation: Scope of accreditation for all labs performing tests (internal or external). ISO/IEC 17025 accreditation preferred. Must show the tests performed are within the lab's accredited scope.
13. Appearance Approval Report (AAR): Required only for parts with appearance specifications (colour, texture, gloss, surface finish). Customer sign-off on physical colour/texture masters. AAR is a separate customer approval — not a dimensional check.
14. Sample Production Parts: Typically 6 production parts from the PPAP run (or per customer CSR). Must be from production tooling, at production rate, using production materials. Not prototype or pre-production parts.
15. Master Sample: One part signed off by both supplier and customer. Retained at the supplier (or customer if required) as the reference standard for appearance, dimensions, and functional acceptance criteria throughout the programme.
16. Checking Aids: All part-specific gauges, fixtures, jigs, and templates used for inspection. Must be documented and calibrated. Checking aid drawings and calibration records submitted where required by the customer.
17. Customer-Specific Requirements: Any additional requirements from the OEM Customer Specific Requirements (CSRs). Each OEM publishes their own CSR supplement — e.g. GM BIQS, Ford Q1, Stellantis Supplier Quality. These override the standard PPAP manual where they conflict.
18. Part Submission Warrant (PSW): The cover document — supplier's declaration that the submitted parts meet all requirements and the package is complete. Signed by authorised supplier representative. No PPAP is valid without a signed PSW. This is Element 18 and the final gating document.
PPAP Submission Levels — What You Send vs What You Keep
The Level defines what is physically submitted to the customer. All 18 elements must be generated and retained at the supplier site regardless of level.
| Level | What is Submitted to Customer | When Used |
|---|---|---|
| 1 | PSW only (warrant only, no data) | Non-critical, commodity parts; customer waives data submission |
| 2 | PSW + limited supporting data + samples | Low-risk parts; customer selects specific elements to review |
| 3 | PSW + complete data package (all 18 elements) | Default level — used unless customer specifies otherwise |
| 4 | PSW + other requirements as defined by customer | Customer specifies exactly what additional data is required beyond PSW |
| 5 | PSW + complete package reviewed at supplier's manufacturing site | New suppliers, new processes, high-risk parts — customer sends team to supplier |
| Characteristic | Study Type | Min Cpk |
|---|---|---|
| Critical Characteristic (CC) | Initial PPAP | ≥ 1.67 |
| CC / SC | Ongoing production | ≥ 1.33 |
| Below target | Any | 100% inspect |
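A hedged sketch of how these thresholds translate into a disposition check; the sample data and helper names are illustrative, and a formal initial study would use at least 25 subgroups / 100 readings with a subgroup-based sigma estimate:

```python
import statistics

def cpk(values, lsl, usl):
    """Process capability index from a sample (overall sigma used for brevity;
    a formal initial study estimates sigma from rational subgroups)."""
    mean = statistics.mean(values)
    sigma = statistics.stdev(values)
    return min((usl - mean) / (3 * sigma), (mean - lsl) / (3 * sigma))

def ppap_disposition(cpk_value, initial_study=True):
    """Apply the thresholds from the table above."""
    target = 1.67 if initial_study else 1.33
    if cpk_value >= target:
        return f"Cpk {cpk_value:.2f} >= {target}: acceptable"
    return f"Cpk {cpk_value:.2f} < {target}: 100% inspection until capability improves"

# Illustrative measurements for a characteristic toleranced 9.90-10.10 mm
sample = [10.01, 9.98, 10.03, 9.99, 10.02, 10.00, 9.97, 10.04, 10.01, 9.99]
print(ppap_disposition(cpk(sample, lsl=9.90, usl=10.10), initial_study=True))
```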
All changes require minimum 90 days advance notice and written approval before implementation.
A new PPAP with PSW is required before serial production resumes after any approved change.
Triggers: manufacturing location change · material change · design change · tooling inactive 12+ months · sub-supplier change
PPAP is the output; APQP is the process that generates it. The APQP deliverables (design records, FMEAs, process flow diagrams, control plans, and measurement system and capability studies) directly populate the 18 elements.
Special Characteristics — CC / SC / IC
Must appear on all supplier Process Flow Diagrams, FMEAs, and Control Plans. Identified by symbols on engineering drawings.
Critical Characteristic
Affects government regulation compliance or safety. Any deviation could endanger the end user.
- Process performance studies + ongoing monitoring per Control Plan
- Cpk > 1.33 (initial: 1.67) or 100% inspection
- 100% automatic control + poka-yoke + SPC
Significant Characteristic
Important for customer satisfaction. Affects fit, functionality, durability, or processing.
- Process performance studies + ongoing monitoring
- Cpk > 1.33
- 100% automatic control + poka-yoke + SPC
Important Characteristic
Identified by expert knowledge as important product/process parameter for quality performance.
- Process performance studies at initial and subsequent part submissions only
8D Problem Solving
Structured 8-step approach — find and eliminate the systemic weakness that allowed the problem to occur, not just fix the symptom.
D1: Problem Description & Team
Define the problem with data. Assemble cross-functional team with relevant expertise. Launch immediately.
D2: Problem Definition
Quantify with data. Is/Is-Not analysis. Define what is wrong, where, when, how much.
D3: Containment Actions
Protect the customer immediately. Document D3 actions and verify their effectiveness.
D4: Root Cause Analysis
Identify root cause for occurrence AND non-detection. Use 5-Why, fishbone, fault tree.
D5: Define Corrective Actions
Select best permanent corrective action. Define implementation plan with owners & dates.
D6: Implement & Verify
Confirm actions implemented. Provide evidence (photos, data, updated documents). Verify effectiveness with data.
D7: Prevent Recurrence
Update FMEAs, Control Plans, Process Flow, work instructions, training. Apply lessons to similar processes.
D8: Official Closure
Confirm effectiveness, remove containment, officially close, recognize the team, file the report.
Must communicate at D3, D5, and D8. When D8 takes >90 days, weekly reviews with the SQR are expected. Written response required for all chargebacks, even disputed ones.
Supplier Performance Evaluation
Expectation: zero (0) defects. Performance tracked across KPI categories for volume allocation, global expansion, and future business decisions.
Scorecard KPIs
Response Time Requirements
| Milestone | Deadline | Deliverable |
|---|---|---|
| Initial response | 24 hours | Problem description + team launch |
| D3 Containment | 48 hours | Containment actions confirmed in place |
| D5 Root Cause | 10 working days | Root cause + corrective action plan |
| D6 Implementation | 30 working days | Actions confirmed + supporting evidence |
| D8 Closure | ≤ 90 days | Official 8D closed & filed |
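A small sketch of turning these milestones into due dates from the SNCR issue date; the date and the simple working-day helper are illustrative, and public holidays are ignored for brevity:

```python
from datetime import date, timedelta

def add_working_days(start: date, days: int) -> date:
    """Advance a number of working days (Mon-Fri); public holidays ignored for brevity."""
    current = start
    while days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:      # 0-4 = Monday to Friday
            days -= 1
    return current

sncr_issued = date(2024, 3, 1)         # illustrative SNCR date
milestones = {
    "Initial response (24 h)":   sncr_issued + timedelta(days=1),
    "D3 containment (48 h)":     sncr_issued + timedelta(days=2),
    "D5 root cause (10 wd)":     add_working_days(sncr_issued, 10),
    "D6 implementation (30 wd)": add_working_days(sncr_issued, 30),
    "D8 closure (<= 90 days)":   sncr_issued + timedelta(days=90),
}

for name, due in milestones.items():
    print(f"{name:<28} due {due.isoformat()}")
```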
🚨 Escalation Model
Level 1: NCR Tracking
Non-conformances tracked, action plans monitored.
Level 2: Increased Oversight
Weekly reviews, SQR direct involvement.
Level 3: Special Status
Customer notification, audit scheduled.
Level 4: Business Hold
No new business awards. Potential disqualification.
Glossary
APQP (Advanced Product Quality Planning): Structured methodology defining steps to ensure products satisfy customers. Covers design/development, process design, product/process validation, and feedback/corrective action.
PPAP (Production Part Approval Process): Defines requirements for production part approval, including bulk materials. Confirms that customer requirements are understood and that the process can consistently produce conforming product at the quoted rate.
PSW (Part Submission Warrant): Authorises serial production. Contains supplier/part info, required documentation, and disposition. An approved PSW is required before the first serial production shipment.
FMEA (Failure Mode and Effects Analysis): Proactive risk management tool. Identifies potential failure modes, their effects, and causes. DFMEA required when supplier owns product design. PFMEA covers process failures. RPN = Severity × Occurrence × Detection.
SNCR / SCB: SNCR (Supplier Non-Conformance Report) issued when plant receives out-of-spec material — triggers an 8D. SCB (Supplier Charge Back) recovers costs: extra freight, line stoppages, rework, sort, scrap, travel, recalls.
REACH / SVHC: Suppliers must screen ECHA publications at least twice per year. Submit Article 33 information to customers if products contain SVHC above 0.1% w/w. Safety Data Sheets required per Art. 31 EU REACH Regulation.
FIFO (First In, First Out): Inventory practice ensuring oldest stock is shipped first. Prevents obsolete material reaching the customer. Mandatory for all suppliers. Shelf-life limits must be monitored and respected at all times.
ISO 9001:2015 — The Complete Quality Management System Standard
ISO 9001:2015 is the world's most widely adopted quality management system standard. It has evolved from the prescriptive 20-element model of the early editions into a risk-based, process-driven framework built on Annex SL's High Level Structure, enabling integration with ISO 14001, ISO 45001, and other management system standards.
ISO 9001 Revision History
| Year | Edition | Key change |
|---|---|---|
| 1987 | 1st issue | First international QMS standard — prescriptive 20-element model |
| 1994 | 2nd issue | Minor updates, maintained 20-element structure |
| 2000 | 3rd issue | Major restructure — process approach introduced, 8 sections |
| 2008 | 4th issue | Clarifications only — no new requirements added |
| 2015 | 5th issue (current) | Annex SL structure, risk-based thinking, no Quality Manual required, no Management Representative, no Preventive Action clause |
Key Changes: 2008 → 2015
| ISO 9001:2008 term | ISO 9001:2015 term |
|---|---|
| Products | Products and services |
| Documentation / Records | Documented information |
| Work environment | Environment for the operation of processes |
| Purchased product | Externally provided products and services |
| Supplier | External provider |
ISO 9001:2015 — 10-Section Structure (PDCA)
The ten clauses are: 1 Scope, 2 Normative References, 3 Terms and Definitions, 4 Context of the Organisation, 5 Leadership, 6 Planning, 7 Support, 8 Operation, 9 Performance Evaluation, 10 Improvement.
Document Control (ISO 9001:2015 §7.5) & Configuration Management (ISO 10007)
Documentation Hierarchy
| Level | Document type | Contains | ISO 9001:2015 |
|---|---|---|---|
| 1 | Quality Manual | System overview, scope, policy | No longer mandatory |
| 2 | Procedures | High-level process overview — multi-discipline, no detailed "how" | "Documented information" |
| 3 | Work Instructions | Step-by-step "how the work is done" | Retain as evidence |
| 4 | Forms / Records | Empty = document; filled = record | Protected from alteration |
§7.5.3 requires documented information to be: available and suitable for use when needed, and adequately protected. Control activities include distribution, version control, storage, retention, and disposition.
Configuration Management (ISO 10007:2017)
Configuration management ensures product integrity over time by systematically controlling changes to the interrelated functional and physical characteristics of a product.
| Step | Activity |
|---|---|
| Identification | Define and label all configuration items (part numbers, revision levels) |
| Change Control | Formal review and approval before any change is implemented |
| Status Accounting | Record and report on the current state of all configuration items |
| Audit | Verify actual product matches documented configuration baseline |
Example: Product version A = Part A rev 0 + Part B rev 1 + Part C rev 7. Version B = Part A rev 0 + Part B rev 2 + Part C rev 7. Change control ensures version B is formally released before production switches.
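A minimal sketch of the version example above: representing each baseline as a part-to-revision map makes the change-control diff explicit (the helper function is illustrative):

```python
# Configuration baselines as part -> revision maps, following the example above.
version_a = {"Part A": 0, "Part B": 1, "Part C": 7}
version_b = {"Part A": 0, "Part B": 2, "Part C": 7}

def baseline_diff(old: dict, new: dict) -> list[str]:
    """List every configuration item whose revision changed between baselines."""
    changes = []
    for item in sorted(set(old) | set(new)):
        if old.get(item) != new.get(item):
            changes.append(f"{item}: rev {old.get(item)} -> rev {new.get(item)}")
    return changes

# Status accounting: version B must be formally released before production switches.
for change in baseline_diff(version_a, version_b):
    print(change)        # Part B: rev 1 -> rev 2
```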
ISO 9001 Certification Chain
Core vs Support Processes
Core processes: Must be performed and have significant direct impact on the organisation's success and ability to meet customer requirements. Examples: order processing, product design, manufacturing, delivery, customer service.
Support processes: Enable the core processes but do not directly create value for the external customer — typically HR, IT, maintenance, purchasing, and training.
ISO 9001:2015 explicitly requires a process approach. Processes are defined by their inputs, outputs, interrelationships, and alignment with the strategic plan.
(Process model: inputs → process → outputs, with a feedback loop from outputs back to the inputs.)
Quality Audits — Complete Reference
ISO 19011:2018 provides guidelines for auditing management systems. Audits are systematic, independent, documented processes for obtaining evidence and evaluating it objectively to determine the extent to which audit criteria are fulfilled.
Audit Types — Two Classification Systems
By Scope
| Type | Scope | Purpose |
|---|---|---|
| System Audit | Comprehensive — multiple processes and their interactions | Overall QMS conformance |
| Process Audit | One specific process, activity, or function | Compare actual process to documented requirements |
| Product Audit | A specific product or batch | Assess "fitness for use" — does product meet design requirements? |
By Party
| Party | Conducted by | When |
|---|---|---|
| 1st Party | Internal — organisation audits itself. Auditors have no vested interest in the area audited. | Ongoing internal improvement |
| 2nd Party | Customer — audits its supplier before or after awarding a contract | Supplier qualification, surveillance |
| 3rd Party | Independent audit organisation — free from any conflict of interest in the customer-supplier relationship | ISO certification, regulatory compliance |
| Special type | Description |
|---|---|
| Registration Audit | Third-party audit to obtain ISO 9001 (or other standard) certification |
| Compliance Audit | Confirms conformance to a specific standard or procedure. Differs from improvement audits — focuses on evidence of conformance, not performance improvement. |
Audit Participants — Roles & Responsibilities
| Role | Definition | Key responsibilities |
|---|---|---|
| Client | Organisation or person requesting the audit | Initiates audit · Defines purpose and scope · Provides resources · Receives report · Determines distribution · Decides on actions |
| Lead Auditor | Auditor responsible for leading the audit team | Develops and communicates audit plan · Assigns roles · Chairs opening and closing meetings · Ensures team stays on track · Issues report and follow-up |
| Auditor | Person who conducts the audit | Understands purpose and scope · Plans audit · Collects and analyses evidence · Reports findings · Follows up actions |
| Auditee | Organisation or individual being audited | Informs staff · Provides resources and escorts · Shows objective evidence · Cooperates · Determines and initiates corrective actions |
| Technical Expert | Person who provides specific knowledge or expertise to the audit team | Supports auditors with specialist knowledge — not an auditor themselves |
| Observer | Accompanies the audit team but does not audit | May be a trainee auditor or a regulatory observer — no active role in the audit |
| Guide | Person appointed by the auditee to assist the audit team | Facilitates access, escorts, helps with logistics — does not influence audit findings |
The Audit Process — Six Stages
Audit Report — 7 Quality Characteristics
Accurate: Free from errors and distortions — purpose clearly communicated
Objective: Fair, impartial, and unbiased — evidence-based conclusions only
Clear: Easy to understand, logical flow — no ambiguous language
Concise: Straight to the point — no unnecessary detail or padding
Constructive: Helps the client improve — practical, actionable recommendations
Complete: Includes all relevant facts — nothing important omitted
Timely: Well-timed to enable decisions on recommendations — not delayed
Correction — fix the immediate problem
Corrective Action — eliminate the root cause
Preventive Action — prevent potential future issues
Effectiveness is verified, possibly in a subsequent audit.
Cost of Quality & Quality Training
Cost of Quality — The Four Categories
Management understands the language of money. Quantifying the cost of quality justifies spending on prevention and improvement activities, and sets measurable targets. Every pound/dollar spent on prevention reduces the much larger internal and external failure costs.
Prevention costs: Money spent to prevent defects from occurring in the first place. The highest-ROI category — every £1 spent on prevention saves £10–£100 in failure costs.
- • Quality planning and system development
- • Education and training (SPC, FMEA, statistical methods)
- • Design reviews and FMEA
- • Supplier reviews and qualification
- • Quality system audits
- • Process planning and capability studies
Appraisal costs: Money spent on inspecting and testing to detect defects. Necessary but non-value-adding — the goal is to reduce the need for appraisal by improving prevention.
- • Test and inspection (receiving, in-process, final)
- • Supplier acceptance sampling
- • Product audits
- • Calibration of measurement equipment
Internal failure costs: Cost of defects discovered before the product reaches the customer. Painful but preferable to external failures.
- • In-process scrap and rework
- • Troubleshooting and repair
- • Design changes caused by quality problems
- • Extra inventory to buffer poor yields
- • Re-inspection and retest of reworked items
- • Downgrading (selling at lower price)
External failure costs: The most expensive category — defects discovered after delivery. Includes not just direct costs but reputational damage and lost future business.
- • Sales returns and allowances
- • Service level agreement penalties
- • Complaint handling and investigation
- • Warranty field labour and parts
- • Recalls
- • Legal claims and litigation
- • Lost customers and business opportunities
Visible COPQ (above the waterline): rejection, rework, repair, inspection costs — easily measured
Invisible COPQ (iceberg below waterline): lost sales, excess inventory, additional controls and procedures, complaint investigation, legal fees, customer dissatisfaction — hard to quantify but often much larger
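Expressed in the language of money, the roll-up is simply COQ = prevention + appraisal + internal failure + external failure, often reported as a percentage of sales. A minimal sketch with invented figures:

```python
# Illustrative annual cost-of-quality roll-up (all figures invented for the example).
costs = {
    "prevention":       {"training": 40_000, "design reviews": 25_000, "supplier audits": 15_000},
    "appraisal":        {"inspection": 120_000, "calibration": 18_000},
    "internal_failure": {"scrap": 210_000, "rework": 95_000},
    "external_failure": {"warranty": 260_000, "returns": 70_000, "complaint handling": 30_000},
}
sales_revenue = 12_000_000

category_totals = {category: sum(items.values()) for category, items in costs.items()}
coq = sum(category_totals.values())
copq = category_totals["internal_failure"] + category_totals["external_failure"]

for category, total in category_totals.items():
    print(f"{category:<18}{total:>10,}")
print(f"Total COQ : {coq:,} ({coq / sales_revenue:.1%} of sales)")
print(f"COPQ      : {copq:,} ({copq / sales_revenue:.1%} of sales)")
```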
Optimum Quality Cost Model
Classical model: Assumed that improving quality beyond a certain level leads to increasing total costs — that there was an "optimal" defect rate where prevention and appraisal costs balanced failure costs. This model suggested that 100% quality was too expensive.
Modern model: Quality improvement consistently leads to cost reduction — there is no point of diminishing returns. Higher quality means fewer failures, less rework, less inspection, less warranty. Crosby's "Quality is Free" thesis is supported by this model.
Quality Training — ADDIE Model
The ADDIE model is the standard instructional design framework for developing quality training programmes. It provides a systematic approach to ensure training is effective, relevant, and measurable.
Analyse: Learning environment, learners' existing knowledge, needs analysis, gap assessment
Design: Learning objectives, exercises, content structure, lesson planning, media selection
Develop: Create and assemble the content, materials, and resources
Implement: Deliver the curriculum — method of delivery, testing procedures, actual training
Evaluate: Collect feedback, measure outcomes, refine the programme
Kirkpatrick Model — 4 Levels of Training Effectiveness
Donald Kirkpatrick's four-level model (1959, still the industry standard) provides a framework for evaluating whether training actually achieves its intended purpose. Levels build on each other — you must satisfy Level 1 before Level 2 matters, and so on.
| Level | Name | What is measured | How measured | Quality context |
|---|---|---|---|---|
| 1 | Reaction | The degree to which participants find the training favourable, engaging, and relevant to their jobs | Post-training surveys, smile sheets, immediate feedback forms | Did quality engineers find the SPC training useful and applicable to their work? |
| 2 | Learning | The degree to which participants acquired the intended knowledge, skills, attitude, confidence, and commitment | Pre/post knowledge tests, skill demonstrations, simulations | Can engineers now correctly calculate Cpk and interpret control chart signals? |
| 3 | Behaviour | The degree to which participants apply what they learned when back on the job | Observation on the job, supervisor assessments, 90-day follow-up | Are engineers actually using SPC charts and reacting to out-of-control signals? |
| 4 | Results | The degree to which targeted outcomes occur as a result of the training and the support package | Business metrics — scrap rate, Cpk improvement, DPMO reduction, COPQ reduction | Has the quality of shipped products improved as a result of the SPC training programme? |
Most organisations only measure Level 1 (satisfaction surveys) and stop there. True training effectiveness requires measuring Level 4 business results — which is the only way to justify the training investment. For quality engineers, the ROI metric is usually COPQ reduction.
Product & Process Control — Material, Nonconformance & HACCP
Section IV of the quality engineer's Body of Knowledge covers the practical controls applied during production — from hazard analysis through material identification, segregation, nonconformance handling, and corrective action.
Documentation Hierarchy — Quality System Pyramid
Level 1 (Quality Manual): System overview, scope, policy. Not mandatory under ISO 9001:2015 but still widely used.
Level 2 (Procedures): High-level process overview — multi-discipline, does not include detailed "how". Answers WHAT and WHO.
Level 3 (Work Instructions): Step-by-step detail of how work is performed. SOPs describe a process; WIs describe a task within a process.
Level 4 (Forms / Records): Empty form = document. Filled-in form = record. Records provide evidence of compliance and must be protected from unintended alteration.
HACCP — Hazard Analysis Critical Control Point
HACCP is a systematic preventive approach to food safety. It identifies physical, chemical, and biological hazards in production processes and establishes key limits to reduce these risks. The underlying goal: preventing problems from occurring is better than correcting them after the fact. The term "Critical Control Point" (CCP) is widely borrowed beyond food — it refers to any point where failure of the SOP could cause harm to customers or the business.
| # | HACCP Principle | What it means |
|---|---|---|
| 1 | Hazard Analysis | Identify all potential hazards (biological, chemical, physical) at each process step |
| 2 | CCP Identification | Determine which steps are Critical Control Points — where control is essential to prevent/eliminate a hazard |
| 3 | Critical Limits | Establish the maximum/minimum values (e.g. minimum cooking temperature) that must be met at each CCP |
| 4 | Monitoring Procedures | Define how and how often each CCP will be monitored to ensure critical limits are met |
| 5 | Corrective Actions | Specify actions to take when monitoring indicates a CCP is not under control |
| 6 | Verification Procedures | Confirm the HACCP system is working effectively — audits, testing, record reviews |
| 7 | Record Keeping | Maintain documentation of monitoring, deviations, corrective actions, and verification activities |
- 🌡️ Thermal processing — cooking temperature/time
- ❄️ Chilling — storage temperature control
- 🧪 Testing ingredients for chemical residues
- ⚖️ Product formulation control
- 🔩 Testing product for metal contaminants
A CCP is the "stop sign" of the process — the point where if the control fails, the hazard reaches the customer. Not every process step is a CCP; only those where control is critical to safety or product integrity.
Material Identification, Status & Traceability (ISO 9001:2015 §8.5.2)
Identification: ability to determine that the specified material grade and size are being used at every stage.
PMI (Positive Material Identification) — mandatory physical test for critical materials (e.g. alloy verification for pressure vessels, pipelines)
Status: material must be clearly labelled with its current disposition status (e.g. accepted, on hold / quarantine, rejected).
Traceability: ability to identify a specific item throughout its life and link it to its Mill Test Report (MTR). Covers: origin of materials and parts, processing history, distribution and location after delivery.
ISO 9001:2015 §8.5.2 requirement: the organisation shall control unique identification of outputs when traceability is required, and retain documented information to enable traceability.
Material Segregation & Classification
Physical separation of materials to prevent mixing, cross-contamination, or unintended use. Key segregation categories:
- ✓ Pass / Fail separation at inspection
- ⏳ Quarantine area — material pending review decision
- 🏷️ Different material classes (e.g. Carbon Steel vs Stainless Steel — must never mix)
| Term | ISO 9000:2015 definition |
|---|---|
| Nonconformity | Non-fulfilment of a requirement. Broader term — includes any deviation from spec, process, or standard. |
| Defect | Nonconformity related to an intended or specified use. Defects adversely affect the functionality of the product. All defects are nonconformities, but not all nonconformities are defects. |
Use "nonconformity" in contractual/legal contexts (safer). Use "defect" only when the functionality impact is confirmed.
Nonconforming Outputs — ISO 9001:2015 §8.7
§8.7 requires that nonconforming outputs be identified and controlled to prevent unintended use or delivery. The organisation must take action based on the nature and effect of the nonconformity — including after delivery.
| §8.7 Disposition option | What it means |
|---|---|
| a) Correction | Rework, repair, or reprocess to make the output conform |
| b) Segregation / Containment / Return / Suspend | Physically separate, return to supplier, or stop provision of service |
| c) Inform the customer | Notify the customer that nonconforming product may have been delivered |
| d) Accept under concession | Release with customer or relevant authority authorisation — documented deviation |
After correction, conformity must be re-verified before release. All dispositions must be documented (retain the documented information).
Corrective Action — ISO 9001:2015 §10.2
When a nonconformity occurs (including a complaint), the organisation must react and take corrective action to eliminate the root cause:
| Step | Requirement |
|---|---|
| a) | React to the nonconformity — contain, correct immediately |
| b) | Evaluate need to eliminate root cause(s) — to prevent recurrence |
| c) | Implement any needed action |
| d) | Review the effectiveness of the corrective action taken |
| e) | Update risks and opportunities if necessary |
| f) | Make changes to the QMS if necessary |
Correction vs Corrective Action: Correction fixes the immediate problem (rework). Corrective Action eliminates the root cause (process change) to prevent it recurring. Only CA prevents future occurrences.
Corrective Action Process — Problem Solving Steps
| Step | Activity |
|---|---|
| 1. Problem Identification | Define and quantify the problem clearly — what, where, when, how often, how much |
| 2. Failure Analysis | Analyse the failure — what failed and how. Reproduce the failure if possible. |
| 3. Root Cause Analysis | Identify the true root cause — use 5-Why, Fishbone, or fault tree. Address the system, not just the symptom. |
| 4. Problem Correction | Implement the corrective action — change the process, design, procedure, or training to eliminate the root cause |
| 5. Recurrence Control | Implement controls to prevent recurrence — update FMEA, control plan, WI, training records |
| 6. Verification of Effectiveness | Confirm the CA worked — monitor KPIs, check DPMO, audit the new process. Close only when effectiveness is confirmed. |
- 🔒 Error proofing / Poka-Yoke
- 🛡️ Robust Design (Taguchi parameter design)
- 📋 QMS — ISO 9001:2015
- 📊 FMEA — proactive risk identification
- 🏭 Lean thinking — 5S, standard work
Correction vs CA vs PA (ISO 9000:2015): Correction = fix this defect now. Corrective Action = eliminate the cause so it doesn't recur. Preventive Action = eliminate the cause of a potential (not yet occurred) problem.
Seven Basic Quality Tools
Introduced by Kaoru Ishikawa in the 1960s, these seven tools form the foundation of quality problem-solving. All Quality Circle members are trained to use them. Together they move a team from raw data collection through root cause identification to ongoing process monitoring.
Check Sheet
A structured data-collection form used to manually tally and record the number of observations of specific events. It is the first tool applied — it creates the raw data that feeds every other tool.
Cause-and-Effect Diagram
Also called Fishbone or Ishikawa diagram. Graphically displays the relationship between an effect (the problem) and all possible causes, organised by the 6M categories.
Histogram
A bar chart displaying the distribution of measurements — the bars touch (continuous data). Quickly reveals the centre, spread, and shape of the data, providing clues to reducing variation.
Pareto Chart
Bars in descending order of magnitude with a cumulative percentage line. Based on the Pareto Principle (80/20 rule): approximately 80% of problems come from 20% of causes.
Scatter Diagram
A plot of one variable against another on an X-Y graph. Reveals the strength and direction of a relationship between two variables. Leads into regression analysis in DMAIC Analyse phase.
Control Chart
A line graph of measurements over time with statistically derived UCL and LCL. The most powerful of the 7 tools. Distinguishes common cause from special cause variation — tells the operator when to act and when to leave the process alone.
Stratification
Breaking data down into meaningful sub-categories (machine, shift, material, operator, time period) so patterns that are hidden in the combined data become visible.
Seven Management & Planning Tools
The Seven Management and Planning Tools (7MP / New Seven Tools) complement the Basic 7 by handling qualitative, language-based, and planning data. Where the Basic 7 analyse numbers, the 7MP tools organise ideas, reveal relationships, and plan complex activities. They are particularly powerful in the early stages of DMAIC (Define/Measure) and for strategic planning.
Affinity Diagram (KJ Method)
Organises a large number of ideas, opinions, or facts into natural groupings by affinity (similarity). Developed by Japanese anthropologist Kawakita Jiro (KJ). Ideal after brainstorming when you have 20–200+ ideas to make sense of.
Tree Diagram
Breaks down a broad goal into progressively finer levels of detail. Reveals all the activities, tasks, and sub-tasks that must be accomplished to achieve the objective. Also used to show hierarchical structures.
PDPC — Process Decision Program Chart
Identifies what could go wrong in a plan and develops countermeasures before problems occur. Similar to FMEA for project plans. Starts with a tree diagram and adds risk branches with labelled countermeasures: O = practical, X = impractical.
Matrix Diagram
Shows the relationship between two or more groups by arranging them in rows and columns with relationship symbols at intersections. Multiple shapes: L-shaped (2 groups), T-shaped (3 groups), Roof-shaped (1 group vs itself — used in House of Quality).
Interrelationship Digraph
Analyses cause-and-effect relationships between multiple factors in a complex situation. Unlike fishbone (one effect), the digraph handles multiple interconnected causes and effects simultaneously — ideal for chronic, systemic quality problems.
Prioritisation Matrix
Compares and ranks choices against weighted criteria to select the best option objectively. Removes subjectivity from project selection, supplier choice, or design decisions. Each criterion has a weight (sum to 1.0), and each option is rated 1–5 against each criterion.
Activity Network Diagram (CPM / PERT)
Manages tasks in sequence to identify the critical path, bottlenecks, and float (slack). The Critical Path Method (CPM) finds the longest sequence of dependent tasks — delays on the critical path delay the entire project.
Statistical Process Control
SPC is manufacturing's early-warning system — detecting real process shifts before they become defects, while distinguishing true signals from random noise.
Cp, Cpk, Pp, Ppk — The Capability Family
Capability indices answer two separate questions: "Can the process fit within spec?" (Cp) and "Is it actually centred there?" (Cpk). The gap between them is your centering loss.
Cpk Acceptance Thresholds
Large Cp − Cpk gap? Fix centering first — not spread reduction. If Cp ≥ 1.33 but Cpk < 1.33, the process is capable of meeting spec but is running off-target. Adjust the mean before spending on variation reduction.
📋 Cpk Targets by Char. Type
| Characteristic | Min Cpk | Initial |
|---|---|---|
| CC (Critical) ongoing | ≥ 1.33 | ≥ 1.67 |
| SC (Significant) ongoing | ≥ 1.33 | ≥ 1.50 |
| General process control | ≥ 1.33 | ≥ 1.33 |
| Cpk | DPMO (approx) |
|---|---|
| 1.00 | 2,700 |
| 1.33 | 63 |
| 1.50 | 6.8 |
| 1.67 | 0.57 |
| 2.00 | 0.002 |
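The Cpk-to-DPMO figures above follow directly from the normal tail probability. Below is a minimal Python sketch (scipy assumed available) for a centred two-sided specification with no long-term shift; the function name is illustrative.

```python
from scipy.stats import norm

def dpmo_from_cpk(cpk):
    """Two-sided defect rate for a centred process (Cp = Cpk):
    each specification limit sits 3*Cpk sigmas from the mean."""
    z = 3.0 * cpk                          # distance to either spec limit, in sigma units
    return 2.0 * norm.sf(z) * 1_000_000    # both tails, scaled to per-million

for cpk in (1.00, 1.33, 1.50, 1.67, 2.00):
    print(f"Cpk {cpk:.2f} -> {dpmo_from_cpk(cpk):,.3g} DPMO")
```

Note that the 1.33 row of the table assumes Cpk = 4/3 exactly, so the printed value for 1.33 lands slightly above 63 DPMO.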
Control Chart Selection Guide
The right chart depends on two things: data type (measured value vs pass/fail count) and subgroup size. Using the wrong chart gives misleading signals.
X̄ chart monitors process mean (location); R chart monitors within-subgroup spread. Uses constants A₂, D₃, D₄ from standard tables. Best for rational subgroups of 2–8. Most common in PPAP control plans and IATF 16949 production environments.
Used when one measurement per cycle is all that's available: slow processes, destructive testing, chemical batches, daily lab results. Less sensitive to small shifts than X̄–R. Moving Range tracks point-to-point variation.
p-chart: proportion defective (variable subgroup size). np-chart: count defective (constant n). Both use the binomial distribution. Foundation of attribute acceptance sampling plans.
c-chart: total defect count per unit (constant inspection area). u-chart: defects per unit (variable inspection area). Both based on the Poisson distribution. Examples: scratches per panel, solder defects per board, paint runs per door.
Western Electric / Nelson Out-of-Control Rules
These 8 patterns on a control chart each indicate a special cause of variation — something changed in the process. Any single rule triggering is sufficient grounds for investigation. Each chart below shows the rule in isolation with real-scale control chart zones.
| # | Rule | Signal Condition | What it usually means |
|---|---|---|---|
| 1 | Beyond ±3σ | 1 point outside control limits | Sudden shift, special event, measurement error |
| 2 | 9 same side | 9 consecutive points all above or all below CL | Process mean shift, new lot, operator change |
| 3 | 6 trend | 6 consecutive points all increasing or all decreasing | Tool wear, gradual drift, raw material degradation |
| 4 | 14 alternating | 14 consecutive points alternating up/down | Two processes alternating, overadjustment/tampering |
| 5 | 2 of 3 beyond ±2σ | 2 of 3 consecutive beyond ±2σ same side | Process shift starting, material lot change |
| 6 | 4 of 5 beyond ±1σ | 4 of 5 consecutive beyond ±1σ same side | Systematic bias, gradual drift |
| 7 | 15 within ±1σ | 15 consecutive points all within ±1σ of CL | Stratification — mixed streams in subgroups, limits too wide |
| 8 | 8 beyond ±1σ both sides | 8 consecutive points outside ±1σ (above and below) | Bimodal / mixture of two process distributions |
Common cause vs special cause. Control charts separate random noise (common cause — inherent system variation) from assignable events (special cause — investigate and fix). Reacting to common cause variation is tampering — it adds variation. Rules 1 and 2 are the most practically important: use them always. Rules 5–8 add sensitivity but also false alarms — apply them when the cost of missing a shift is high.
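As an illustration of how the two always-on rules might be screened in software, here is a minimal Python sketch. It assumes the centreline and sigma were estimated from a stable reference period; the function names and the simulated data are illustrative, not taken from any particular SPC library.

```python
import numpy as np

def rule1_beyond_3sigma(x, centre, sigma):
    """Rule 1: any single point more than 3 sigma from the centreline."""
    z = (np.asarray(x) - centre) / sigma
    return np.where(np.abs(z) > 3)[0]            # indices of violating points

def rule2_nine_same_side(x, centre, run=9):
    """Rule 2: `run` consecutive points all on the same side of the centreline."""
    side = np.sign(np.asarray(x) - centre)
    hits = []
    for i in range(len(side) - run + 1):
        window = side[i:i + run]
        if np.all(window > 0) or np.all(window < 0):
            hits.append(i + run - 1)             # index where the run completes
    return hits

# Illustrative data: a small mean shift after sample 10
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0, 1, 10), rng.normal(1.2, 1, 15)])
print(rule1_beyond_3sigma(data, centre=0, sigma=1))
print(rule2_nine_same_side(data, centre=0))
```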
Capability Analysis — The Complete Framework
Capability analysis answers a single fundamental question: can this process reliably produce output that meets customer requirements? It does so by fitting a statistical model to process data and estimating the probability of producing nonconforming product — now and in the future. Before any capability number is trustworthy, however, three conditions must hold: the process must be stable, the data must be approximately normal, and there must be enough observations for the statistics to carry real precision. Failing any one of these makes the resulting Cpk figure meaningless.
Two types of capability study: A single-variable capability analysis evaluates one CTQ characteristic against its specification limits. A before/after capability comparison determines whether a process improvement project produced a measurable, statistically confirmed improvement in capability — not just noise.
① Process Stability — The First Gate
Capability statistics estimate a future defect rate, not just a historical snapshot. That projection is only valid if the process is operating in a stable, predictable state — meaning only common-cause variation is present and no special causes are inflating or shifting the output. A capability study on an unstable process produces a number that describes a process that no longer exists.
The eight Western Electric stability tests are available for variables control charts, but using all eight simultaneously drives up the false-alarm rate. Research comparing sensitivity and false-alarm behaviour identified three tests that give the best balance for capability pre-screening:
Test 1: Point Beyond Control Limits
Signals when any single point lies more than 3 standard deviations from the centreline. Universally recognised as the primary out-of-control signal. False alarm rate: 0.27% — the baseline for all other tests. Applied to all chart types: I-MR, X̄-R/S.
Test 2: 9 Consecutive Points, One Side
Signals when 9 successive points all fall on the same side of the centreline. Simulation showed that combining Test 2 with Test 1 reduces the average subgroups needed to detect a 0.5σ mean shift from 154 to just 57 — a 63% improvement in detection speed. Applied to I-chart and X̄-chart only.
Test 7: 12–15 Points Within ±1σ
Signals when an unusual number of consecutive points cluster within ±1σ of the centreline — the opposite of what Test 1 catches. This pattern reveals stratification: multiple distinct populations mixed into a single subgroup (e.g. two machines sampled together). Used only on the X̄-chart when limits are estimated from data.
| k = (Subgroups × 0.33) | Points in a row required for Test 7 signal | What it means |
|---|---|---|
| k < 12 | 12 points | Use fixed minimum — too few subgroups for adaptive rule |
| 12 ≤ k ≤ 15 | Integer ≥ k | Adaptive: scale with data volume for balanced sensitivity |
| k > 15 | 15 points | Cap at maximum — prevents excessive false alarms with large datasets |
Tests 3, 4, 5, 6, and 8 are excluded from pre-capability screening. Tests 3 (trends) and 4 (alternating) add no unique detection power over Tests 1+2. Tests 5, 6, and 8 don't isolate special cause patterns common enough to justify their false-alarm cost. For the R, S, and MR charts (spread charts), only Test 1 is applied — extreme spread points are the only practically relevant signal.
② Normality Testing — The Anderson-Darling Approach
Standard capability indices (Cp, Cpk, Pp, Ppk) are derived from the normal distribution. They convert a Z-score — the number of standard deviations between the process mean and the nearest specification limit — into a defect probability using the normal CDF. If the process data doesn't follow a normal distribution, those Z-to-DPMO conversions are wrong, and every capability index based on them is wrong.
The Anderson-Darling (AD) test is the preferred normality test for capability pre-screening. Compared to other goodness-of-fit tests, the AD test has higher statistical power — especially in the tails of the distribution, which is precisely where capability defects occur. The concern that the AD test becomes overly strict with large samples is not supported by simulation evidence: across sample sizes from 500 to 10,000, and across normal populations with varying spreads, the Type I error rate consistently tracks the target significance level (≈5% at α=0.05).
The AD test accumulates squared deviations between the empirical CDF and the theoretical normal CDF with extra weight given to the tails. Since nearly all capability defects occur in the tails, this weighting is exactly what's needed. The Kolmogorov-Smirnov test applies equal weight throughout the distribution — it can miss tail problems that dominate capability.
The Box-Cox power transformation x → (x^λ − 1)/λ (with ln x used when λ = 0) can often convert a moderately skewed distribution into an approximately normal one. The optimal λ is found by maximum likelihood. Once the transformed data passes the AD test, capability indices are computed on the transformed scale, with the specification limits transformed in the same way; the resulting defect-rate estimate applies to the original process.
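Both steps are available in scipy.stats: `anderson` for the AD check and `boxcox` for the maximum-likelihood power transformation. A minimal sketch on simulated right-skewed data (illustrative only).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
raw = rng.lognormal(mean=0.0, sigma=0.4, size=500)   # skewed process data, illustrative

# Anderson-Darling on the raw data: compare the statistic to the 5% critical value
ad_raw = stats.anderson(raw, dist='norm')
print("AD statistic (raw):", round(ad_raw.statistic, 3),
      "5% critical value:", ad_raw.critical_values[2])

# Box-Cox transformation with maximum-likelihood lambda
transformed, lam = stats.boxcox(raw)
ad_t = stats.anderson(transformed, dist='norm')
print("lambda:", round(lam, 3), "AD statistic (transformed):", round(ad_t.statistic, 3))
```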
AD Test Simulation Evidence — Type I Error & Power
Extensive simulation work confirmed two important properties of the AD test for capability analysis contexts:
| Property | What was tested | Result | Practical meaning |
|---|---|---|---|
| Type I Error | 5,000 samples from normal populations (σ=0.1 to 70) at n=500 to 10,000 | ≈ 5% rejection rate at α=0.05 — consistent across all sample sizes and dispersions | The AD test does not become overly strict with large datasets — a common practitioner concern that simulation disproved |
| Power (correct rejection) | 5,000 samples from 17 non-normal distributions (t, Laplace, Uniform, Beta, Gamma, Weibull, etc.) | ≈ 100% rejection for nearly all distributions at n≥500 | If your data isn't normal, the AD test will detect it — with one exception |
| Power exception | Beta(3,3) at n<1000; Weibull(4,4) at n<3000 | Power < 100% for small samples | These distributions are visually indistinguishable from normal — a normal capability model provides a good approximation and produces reliable estimates |
③ How Much Data Do You Actually Need?
The required sample size depends on two things: the true capability of your process and the precision you need from the estimate. These are connected — at high sigma levels, even rough estimates of Z (±15%) translate into a range of DPMO values that is practically acceptable. At lower sigma levels, the same ±15% range spans thousands of DPMO, which may be unacceptable for decision-making.
The AIAG SPC reference manual recommends at least 25 rational subgroups and a minimum of 100 total measurements. Independent simulation work generating 10,000 benchmark-Z estimates at each sample size confirmed this guidance:
| Confidence Level | Precision margin | Target Z > 3 (typical capable process) | Target Z ≈ 2.5 (marginal process) |
|---|---|---|---|
| 90% | ±15% of true Z | ~100 observations | ~103 observations |
| 90% | ±10% of true Z | ~175 observations | ~215 observations |
| 90% | ±5% of true Z | ~650 observations | ~750 observations |
| 95% | ±15% of true Z | ~150 observations | ~175 observations |
| 95% | ±10% of true Z | ~200 observations | ~250 observations |
The 100-observation rule explained: With 100 measurements from a process where Z>3, you can be 90% confident that your computed benchmark Z lies within ±15% of the true Z. For a truly 6σ process (Z=4.5 long-term), that confidence interval spans roughly Z=3.8 to Z=5.2. Increasing to 175 observations tightens this to ±10%. For most industrial go/no-go decisions, 100 measurements is sufficient; for precise capability reporting in supply chain audits, target 175+.
Why Precision Matters More at Lower Sigma Levels
| True Z | True DPMO | ±15% precision → Z range | DPMO range at ±15% | Practical impact |
|---|---|---|---|---|
| 4.5σ | 3.4 | 3.83 – 5.18 | 0.1 – 66 DPMO | Acceptable — the difference between 0.1 and 66 defects per million is rarely decision-critical |
| 3.0σ | 1,350 | 2.55 – 3.45 | 280 – 5,390 DPMO | Significant — a 19× DPMO range makes pass/fail decisions unreliable |
| 2.5σ | 6,210 | 2.13 – 2.88 | 1,990 – 16,600 DPMO | Unacceptable for reporting — increase sample size to ≥200 before drawing conclusions |
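The DPMO ranges in the table can be re-derived from the one-sided normal tail; a small sketch, assuming DPMO is taken as the tail area beyond the benchmark Z.

```python
from scipy.stats import norm

def dpmo(z):
    return norm.sf(z) * 1_000_000        # one-sided tail beyond the benchmark Z

for z_true in (4.5, 3.0, 2.5):
    z_lo, z_hi = 0.85 * z_true, 1.15 * z_true
    print(f"Z = {z_true}: true {dpmo(z_true):,.1f} DPMO, "
          f"+/-15% range {dpmo(z_hi):,.2f} to {dpmo(z_lo):,.0f} DPMO")
```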
The Recommended Capability Study Sequence
- 1
Define CTQ and specification limits
Confirm LSL/USL are customer-driven, not internally tightened. Incorrect spec limits make all downstream analysis meaningless.
- 2
Validate the measurement system (GR&R)
If the gauge R&R exceeds 30% of process variation, the capability index will be systematically underestimated. Fix measurement before measuring capability.
- 3
Plot an I-MR or X̄-R/S control chart
Run Tests 1, 2, and 7 (for X̄). Remove special causes before continuing. Do not compute Cpk from an unstable process.
- 4
Test for normality with Anderson-Darling
If p<0.05, attempt Box-Cox transformation. If transformation fails, use non-normal capability methods (e.g. Weibull capability, non-parametric percentile approach).
- 5
Collect at least 100 observations
Fewer than 30: do not report Cpk. 30–99: flag as preliminary. 100+: acceptable for capability reporting. 175+: preferred for formal PPAP submission.
- 6
Compute and report Cp, Cpk, Pp, Ppk
Report confidence intervals alongside the point estimates. A Cpk of 1.35 with a 95% CI of [1.10, 1.62] tells a very different story than just "1.35".
- 7
Interpret with AIAG thresholds — but don't stop there
Cpk ≥ 1.67 (initial CC), Cpk ≥ 1.33 (ongoing). Always pair the capability index with a probability plot, histogram overlay, and DPMO estimate. Never report a number without context.
Before/After Capability Comparison — Verifying Improvement
A before/after capability comparison is used at the end of a DMAIC Improve phase to confirm that an improvement action produced a real, statistically significant improvement — not a random fluctuation. The same three prerequisites apply to both datasets independently. Key considerations:
- ✓ Both datasets from stable processes (independently verified)
- ✓ Both datasets passing normality (or same transformation applied)
- ✓ Minimum 100 observations in each group
- ✓ Same measurement system used for both (GR&R unchanged)
- ✓ Statistical significance test on Cpk difference (use non-central F)
- ✗ "After" data collected during unstable trial run (Hawthorne effect)
- ✗ Sample sizes too small to detect a meaningful Cpk improvement
- ✗ Gauge R&R changed between before and after studies
- ✗ Declaring success from point estimates alone — use confidence intervals
- ✗ Not waiting long enough for the "after" data to represent steady-state
Summary rule: Stability → Normality → Sufficient data. In that order, with no shortcuts. A Cpk computed without verifying all three is a number without a foundation. Compute it if you must, but flag it clearly as unvalidated and treat it as indicative only — never as a basis for a PPAP approval or a customer capability commitment.
Nelson Rules — All 8 Rules with Probabilities & Causes
Nelson Rules (closely related to the Western Electric / Shewhart zone rules) detect special cause variation. Each rule has a known false-alarm probability — this is the probability it triggers even when the process is in statistical control (i.e., common cause variation only).
| # | Pattern | False alarm probability | Probable special cause |
|---|---|---|---|
| 1 | 1 point more than 3σ from centreline | (1−0.9973) = 0.0027 | New operator, wrong setup, measurement error, out-of-spec material |
| 2 | 7 points in a row on the same side of the centreline | (0.5)⁷ = 0.0078 | Process mean has shifted — setup change, tool wear, material batch change |
| 3 | 7 points in a row all increasing or all decreasing | ≈ 0.0017 | Trend — tool wear, gradual deterioration, temperature drift |
| 4 | 14 points in a row alternating up and down | ≈ 0.0002 | Over-control / tampering — operator adjusting too frequently |
| 5 | 2 out of 3 consecutive points more than 2σ from centreline (same side) | ≈ 0.003 | New operator, wrong setup — similar to Rule 1 but detects smaller shifts |
| 6 | 4 out of 5 consecutive points more than 1σ from centreline (same side) | ≈ 0.005 | Small sustained shift in process mean |
| 7 | 14 points in a row within 1σ from centreline (either side) | (0.68)¹⁴ = 0.0045 | Process improvement, reduced variation, or stratified sampling mixing two distributions |
| 8 | 8 points in a row more than 1σ from centreline (either side) | (1−0.68)⁸ = 0.0001 | Mixture of two processes — two machines, two shifts, or two operators being combined |
The control chart is divided into zones from the centreline outward:
- Zone C — within 1σ of centreline (≈68% of points here)
- Zone B — between 1σ and 2σ from centreline (≈27%)
- Zone A — between 2σ and 3σ from centreline (≈4.3%)
- Beyond 3σ — outside control limits (≈0.27%)
- ✓ Rule 1 (beyond 3σ) is always the most obvious — the simplest special cause signal
- ✓ Rule 2 (7-in-a-row same side) is the most common exam scenario — a mean shift
- ✓ Rule 4 (alternating 14 points) = over-control. The fix is to stop adjusting.
- ✓ Rule 7 (hugging centreline) = artificially low variation, often from stratified subgroups mixing two processes
- ✓ False alarm rate multiplies with each rule added — more rules = more false alarms
Process Capability Indices — Complete Reference
Capability indices quantify how well a process fits within its specification limits. The family of indices (Cp, Cpk, Pp, Ppk) each answer a slightly different question. Knowing when to use which index — and the conditions that must be met — is a core skill in day-to-day quality engineering practice.
Short-Term Capability Indices (Within σ)
| Index | Formula | What it measures | Limitation |
|---|---|---|---|
| Cp | Cp = (USL − LSL) / (6·σwithin) | Potential capability — how wide the spec is relative to the process spread. Ignores centring. | A high Cp with a poorly centred process will still produce defects |
| CpL | CpL = (X̄ − LSL) / (3·σwithin) | Lower capability — distance from mean to lower spec in σ units | One-sided; use when only a lower limit matters |
| CpU | CpU = (USL − X̄) / (3·σwithin) | Upper capability — distance from mean to upper spec in σ units | One-sided; use when only an upper limit matters |
| Cpk | Cpk = min(CpL, CpU) | Actual short-term capability — accounts for both spread and centring. The most commonly used index. | If process is perfectly centred, Cpk = Cp |
| Cr | Cr = 1/Cp = 6σ/(USL−LSL) | Capability ratio — percentage of tolerance used by the process. Cr × 100 = % tolerance consumed. | Lower is better; Cr < 1.0 means Cp > 1.0 |
Long-Term Performance Indices (Overall σ)
| Index | Formula | Key difference from Cp/Cpk |
|---|---|---|
| Pp | Pp = (USL − LSL) / (6·σoverall) | Uses overall (total) standard deviation — includes all sources of variation over time (between-subgroup + within-subgroup) |
| PpL | PpL = (X̄ − LSL) / (3·σoverall) | Long-term indices; Ppk ≤ Cpk always. The gap between Cpk and Ppk indicates how much the process mean has drifted or shifted over time. |
| PpU | PpU = (USL − X̄) / (3·σoverall) | |
| Ppk | Ppk = min(PpL, PpU) |
| Short-term (within σ) | Long-term (overall σ) | |
|---|---|---|
| Potential (centring ignored) | Cp | Pp |
| Actual (centring included) | Cpk | Ppk |
If Cpk ≈ Ppk: the process is stable over time. If Cpk >> Ppk: the process has shifted or drifted — investigate between-subgroup variation.
| Cp / Cpk | Sigma level | Rejection rate |
|---|---|---|
| 1.00 | 3σ | 0.27% (2,700 ppm) |
| 1.33 | 4σ | 64 ppm |
| 1.67 | 5σ | 0.6 ppm |
| 2.00 | 6σ | 2 ppb |
Four conditions required: (1) sample represents population, (2) data is normally distributed, (3) process is in statistical control, (4) sample size is sufficient.
Short-Run SPC — Monitoring Low-Volume Production
A typical control chart needs 20–25 subgroups (≈100 data points) to establish reliable control limits. Short-run SPC solves the problem of low-volume or mixed-part production where insufficient data exists for traditional charts.
The Problem
When producing different-diameter items (e.g. 300mm, 400mm, 500mm) in small runs of 8 each, options are:
- ✗ 100% inspection — expensive
- ✗ First-off inspection only — misses process variation
- ✗ Last-off inspection — too late to react
- ✗ Separate chart per part — too little data per chart
- ✓ Short-run chart — plots all parts on one chart by transforming the data
Key Principle
Short-run SPC focuses on the process, not the product. By transforming raw measurements, parts with different nominal values can be plotted on a single chart — revealing process stability across multiple part numbers.
Only valid if the different part runs have similar variance. If variance differs significantly between parts, a Z-MR chart (standardised) is needed instead.
Two Short-Run Chart Methods
Method 1, Deviation from Nominal: subtract the nominal value for each run and plot the deviations on a standard I-MR chart.
Run A nominal = 300 → 302.6 − 300 = 2.6
Run B nominal = 500 → 504.2 − 500 = 4.2
Run C nominal = 400 → 400.5 − 400 = 0.5
→ Plot all differences on one I-MR chart
Method 2, Z-MR (standardised) chart: standardise each measurement using the run's own mean and standard deviation. The Z score is plotted — chart limits are always ±3 regardless of part.
UCL = +3, LCL = −3 always
CL = 0 always
→ All parts, all runs, one chart
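A minimal sketch of both transformations, assuming each run's nominal (and, for Z-MR, its own mean and standard deviation) is known; the part names and measurements are illustrative.

```python
import numpy as np

# Method 1: subtract each run's nominal, plot the deviations on one I-MR chart
runs = {
    "A (nominal 300)": (300.0, [302.6, 299.8, 300.9]),
    "B (nominal 500)": (500.0, [504.2, 498.7, 501.1]),
    "C (nominal 400)": (400.0, [400.5, 399.2, 401.3]),
}
deviations = {name: [x - nominal for x in xs] for name, (nominal, xs) in runs.items()}
print(deviations)

# Method 2: Z-MR, standardise with each run's own mean and std dev (limits always +/-3, CL = 0)
def z_scores(xs):
    xs = np.asarray(xs, dtype=float)
    return (xs - xs.mean()) / xs.std(ddof=1)

z = {name: np.round(z_scores(xs), 2) for name, (_, xs) in runs.items()}
print(z)
```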
Cpk vs Ppk — Real-World Scenario with Full Worked Example
Cpk and Ppk look similar on paper but measure fundamentally different things. Cpk measures what your process can do when it's running well. Ppk measures what it actually does over extended time — including every shift change, raw material lot, and seasonal temperature swing. The gap between them tells a story about process management, not just process performance.
The Fundamental Difference
Cpk: uses σWithin — estimated from within-subgroup variation only. Strips out the noise from subgroup-to-subgroup shifts. Represents the process at its best, as if operating under one stable short-term condition.
Ppk: uses σOverall — the plain sample standard deviation across all observations. Includes every source of variation: within-subgroup, between-subgroup, drift, shift, operator, raw material. The real-world performance index.
sp = √[Σ(xij−x̄i)² / Σ(ni−1)]
Pooled std dev across subgroups (default)
or: σ̂W = MR̄ / d₂(w)
Average moving range when n=1
s = √[ΣiΣj(xij−x̄)² / (n−1)]
Plain sample std dev, all data pooled
σ̂O ≥ σ̂W always
∴ Ppk ≤ Cpk always
Visual: Why Cpk > Ppk When the Process Drifts
Each narrow blue curve is a subgroup's short-term behaviour — tight, capable, well within spec. But when all three shifts combine into the long-term picture (red dashed curve), the overall spread is much wider. This is why Ppk ≤ Cpk always. The gap is not measurement error — it's process management information.
Worked Example — Automotive Fuel Injector Flow Rate
A fuel injector flow rate must meet LSL = 195 cc/min, USL = 205 cc/min (tolerance = 10 cc/min). You run a production study: 25 subgroups of n=5 collected over 3 production shifts across 5 days. The process uses Rbar to estimate σWithin.
Study summary: X̄ = 200.8 cc/min, overall sample standard deviation s = 1.62 cc/min.
σ̂W = R̄ / d₂(5) = 1.006 cc/min
σ̂O = s / c₄(5) = 1.62 / 0.9400 = 1.723 cc/min
CPU = (205 − 200.8) / (3 × 1.006) = 4.2 / 3.018 = 1.392
CPL = (200.8 − 195) / (3 × 1.006) = 5.8 / 3.018 = 1.922
Cpk = min(1.922, 1.392) = 1.39
PPU = (205 − 200.8) / (3 × 1.723) = 4.2 / 5.169 = 0.812
PPL = (200.8 − 195) / (3 × 1.723) = 5.8 / 5.169 = 1.122
Ppk = min(1.122, 0.812) = 0.81
When this process runs stably within a shift, it is capable — the machine can hit spec consistently. The process potential meets the standard for ongoing production (Cpk ≥ 1.33).
Over the 5-day study, the process is not capable. The large gap (Cpk − Ppk = 0.58) reveals significant between-shift or between-day variation — likely from warm-up drift, operator differences, or raw material lot variation.
The engineering decision: Do not report only Cpk. A customer seeing Cpk = 1.39 would approve the PPAP. But Ppk = 0.81 tells the real story — this process will produce field defects at rates far above what Cpk predicts. The correct action is to investigate the source of between-shift variation, fix it, then re-run the study with both indices reporting ≥ 1.33.
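The index calculations above are easy to reproduce; a small sketch that starts from the summary statistics already quoted in the study (X̄ = 200.8, σ̂W = 1.006, σ̂O = 1.723).

```python
LSL, USL = 195.0, 205.0
xbar = 200.8
sigma_within, sigma_overall = 1.006, 1.723   # from Rbar/d2 and s/c4 in the study above

def capability(mean, sigma, lsl, usl):
    cpu = (usl - mean) / (3 * sigma)         # distance to upper spec, in 3-sigma units
    cpl = (mean - lsl) / (3 * sigma)         # distance to lower spec, in 3-sigma units
    return cpl, cpu, min(cpl, cpu)

print("Cpk:", [round(v, 3) for v in capability(xbar, sigma_within, LSL, USL)])
print("Ppk:", [round(v, 3) for v in capability(xbar, sigma_overall, LSL, USL)])
# Cpk = min(1.922, 1.392) = 1.39 ; Ppk = min(1.122, 0.812) = 0.81
```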
Interpreting the Cpk–Ppk Gap
| Cpk vs Ppk Pattern | What it Means | Typical Root Cause | Action |
|---|---|---|---|
| Cpk ≈ Ppk (gap < 0.1) | Process is stable over time | No significant between-subgroup drift. What you see in a short run is what you get long-term. | Report Ppk to customer. No additional investigation needed. |
| Cpk moderately > Ppk (gap 0.1–0.3) | Some long-term drift present | Gradual tool wear, ambient temperature, material lot variation. Process is capable but not perfectly controlled. | Investigate between-subgroup sources. Tighten control plan. |
| Cpk significantly > Ppk (gap > 0.3) | Serious stability problem | Shift changes, operator methods, machine warm-up, batch material variation. Multiple distinct process streams being reported as one. | Do not submit this PPAP. Conduct MSE (Multi-Stream Evaluation). Stratify data by suspected source. |
| Ppk > Cpk | Unusual — investigate | Within-subgroup variation is inflated (e.g. too much between-part variation sampled in one subgroup — irrational subgrouping). | Review subgrouping strategy. Rational subgroups should represent only short-term common-cause variation. |
Confidence Intervals — Never Report a Point Estimate Alone
A Cpk of 1.33 computed from 30 samples has a very different meaning than the same value from 200 samples. Confidence intervals quantify this uncertainty. These formulas are from the Minitab capability analysis documentation.
Cp: Upper = Ĉp · √(χ²α/2,ν / ν), where ν = Σ(nᵢ−1) = k(n−1) for constant subgroup size
Cpk: Upper = Ĉpk + Zα/2 · √( 1/(9kn) + Ĉpk²/(2ν) ), where k = number of subgroups and n = average subgroup size
Pp: Upper = P̂p · √(χ²α/2, kn−1 / (kn−1))
Ppk: Upper = P̂pk + Zα/2 · √( 1/(9kn) + P̂pk²/(2(kn−1)) )
Lower bounds use the complementary χ² quantile and −Zα/2 in place of +Zα/2.
Applied to our example (Cpk = 1.39, k=25, n=5): The 95% CI for Cpk is approximately [1.15, 1.63]. This means we cannot be certain the true Cpk exceeds 1.33 — it might be as low as 1.15. This is why 125 observations is borderline for formal PPAP submission; aim for 175+ to tighten the CI.
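A sketch of the Cpk interval from the formula above, assuming pooled degrees of freedom ν = k(n−1); the exact ν depends on which σWithin estimator was used, so treat the output as approximate.

```python
from math import sqrt
from scipy.stats import norm

def cpk_ci(cpk, k, n, alpha=0.05):
    """Approximate confidence interval for Cpk (normal approximation)."""
    nu = k * (n - 1)                          # within-subgroup degrees of freedom (pooled)
    se = sqrt(1 / (9 * k * n) + cpk**2 / (2 * nu))
    z = norm.ppf(1 - alpha / 2)
    return cpk - z * se, cpk + z * se

print(cpk_ci(1.39, k=25, n=5))
# roughly (1.19, 1.59) with nu = k(n-1); a range-based nu gives a slightly wider interval
```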
σ Estimation Methods — Which Formula Does Your Software Use?
Cpk and Ppk use different sigma estimates, and within each, there are multiple methods depending on subgroup size and data structure. Understanding which formula applies to your situation prevents misinterpretation — especially when comparing indices across software platforms.
Overview — When Each Method Applies
| Sigma Type | Method | When Used | Used for |
|---|---|---|---|
| σWithin (short-term, used in Cp, Cpk) | Pooled Std Dev | Subgroup size n > 1 (default) | Cp, Cpk, UCL/LCL on X̄ chart |
| σWithin (short-term, used in Cp, Cpk) | Rbar (Average Range) | Subgroup size n > 1, alternative method | X̄-R charts — traditional method |
| σWithin (short-term, used in Cp, Cpk) | Sbar (Average Std Dev) | Subgroup size n > 1, preferred when n > 10 | X̄-s charts |
| σWithin (short-term, used in Cp, Cpk) | Average Moving Range (MR̄) | Subgroup size n = 1 (default) | I-MR charts — individual measurements |
| σOverall (long-term, used in Pp, Ppk) | Sample Std Dev | All scenarios | Pp, Ppk — always this formula |
σOverall — The Long-Term Standard Deviation
Always the plain sample standard deviation across all observations, corrected by the c₄ unbiasing constant. This is the denominator for Pp and Ppk.
s = √[ ΣiΣj(xij − x̄)² / (n − 1) ]
where n = total observations, x̄ = grand mean across all data
c₄(n) → 1 as n → ∞ (correction negligible for n > 50)
σOverall includes all sources of variation: within-subgroup + between-subgroup + drift + shift + any systematic effects. It is always ≥ σWithin, which is why Ppk ≤ Cpk always.
σWithin Method 1 — Pooled Standard Deviation (Default, n > 1)
The default method when subgroup size > 1. Pools variance across all subgroups, then applies the c₄ unbiasing constant. This is what Minitab and most SPC software use by default.
sp = √[ ΣiΣj(xij − x̄i)² / Σi(ni−1) ]
d = Σ(ni−1) + 1 (degrees of freedom)
When subgroup size is constant: sp = √(Σsi² / k), d = n − k + 1
σWithin Method 2 — Rbar (Average Range, n > 1)
The traditional control chart method — divides the average range by the d₂ constant. Used on X̄-R charts. Equivalent to pooled std dev when subgroup size is constant, but less efficient for unequal subgroup sizes.
R̄ = (R₁ + R₂ + ... + Rk) / k
Unequal subgroup sizes: uses weighted formula fi = [d₂(ni)]² / [d₃(ni)]²
σWithin Method 3 — Average Moving Range (Default, n = 1)
When individual measurements are collected (subgroup size = 1), within-subgroup variation is estimated from consecutive differences — the moving range. This is the I-MR chart approach.
MRi = |xi − xi−1| (for w=2, consecutive pairs)
MR̄ = (MR2 + MR3 + ... + MRn) / (n − w + 1)
σ̂W = MR̄ / d₂(2), with d₂(2) = 1.128. Median moving range variant: σ̂W = M̃R / d₄(w)
σWithin Method 4 — Sbar (Average of Subgroup Standard Deviations)
Used on X̄-s charts. More efficient than Rbar for large subgroup sizes (n > 10). Applies c₄ weighting per subgroup.
hi = [c₄(ni)]² / [1 − c₄(ni)²]
When subgroup size is constant: σ̂ = s̄ / c₄(n), s̄ = Σsi/k
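The four σWithin estimators can be compared side by side on the same subgrouped data. A minimal sketch using the d₂ and c₄ constants for subgroup size 5 from the tables below; the data is simulated purely for illustration.

```python
import numpy as np

d2_5, c4_5 = 2.326, 0.9400                      # constants for subgroup size n = 5 (tables below)

rng = np.random.default_rng(3)
subgroups = rng.normal(100, 2, size=(25, 5))    # 25 subgroups of n = 5, illustrative data
k, n = subgroups.shape

# Method 1: pooled standard deviation (default when n > 1)
ss_within = np.sum((subgroups - subgroups.mean(axis=1, keepdims=True)) ** 2)
sigma_pooled = np.sqrt(ss_within / (k * (n - 1)))   # divide by c4(d) if unbiasing is applied

# Method 2: Rbar (average range)
rbar = np.mean(subgroups.max(axis=1) - subgroups.min(axis=1))
sigma_rbar = rbar / d2_5

# Method 3: average moving range (normally for individuals; flattened here only to illustrate)
mrbar = np.mean(np.abs(np.diff(subgroups.ravel())))
sigma_mr = mrbar / 1.128                            # d2(2) = 1.128

# Method 4: Sbar (average of subgroup standard deviations)
sigma_sbar = np.mean(subgroups.std(axis=1, ddof=1)) / c4_5

print(round(sigma_pooled, 3), round(sigma_rbar, 3), round(sigma_mr, 3), round(sigma_sbar, 3))
```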
Unbiasing Constants — c₄ and d₂ Reference Table
These constants correct the bias in sigma estimates from small samples. c₄ is used with standard deviations; d₂ is used with ranges. Both approach 1 as sample size increases.
| n (subgroup size) | c₄ | Used in σ̂ = s/c₄ |
|---|---|---|
| 2 | 0.7979 | Pooled σ, Sbar |
| 3 | 0.8862 | |
| 4 | 0.9213 | |
| 5 | 0.9400 | Most common |
| 6 | 0.9515 | |
| 8 | 0.9650 | |
| 10 | 0.9727 | |
| 25 | 0.9896 | |
| ∞ | 1.0000 | Bias negligible |
| n (subgroup size) | d₂ | Used in σ̂ = R̄/d₂ |
|---|---|---|
| 2 | 1.128 | MR chart (w=2) |
| 3 | 1.693 | |
| 4 | 2.059 | |
| 5 | 2.326 | Most common |
| 6 | 2.534 | |
| 7 | 2.704 | |
| 8 | 2.847 | |
| 9 | 2.970 | |
| 10 | 3.078 |
Which Method Should You Use?
Subgroup size n = 1: Use I-MR chart. Default w=2. Average moving range divided by 1.128.
Subgroup size 2–9: Pooled std dev (default) or Rbar. Use X̄-R chart. Pooled is more efficient.
Subgroup size 10+: Sbar method. Use X̄-s chart. Range method loses efficiency at n > 9.
Source: All sigma estimation formulas on this page are from the Minitab Technical Support Document — Capability Analysis (Normal) Formulas: Capability Statistics (Default) and the Minitab Assistant White Paper on Capability Analysis. The c₄ and d₂ constants follow Montgomery (2001), Introduction to Statistical Quality Control, Wiley. These are the industry-standard formulas used in all major SPC software.
Quantitative Methods & Statistics
Hypothesis testing, confidence intervals, regression, ANOVA, probability distributions, and time-series analysis — the statistical toolkit every quality engineer needs to turn data into defensible decisions.
Data Types, Collection & Descriptive Statistics
Data Classification
| Category | Type | Characteristics | Examples |
|---|---|---|---|
| Qualitative (description-based) | Nominal | Categories only — no order, no arithmetic. Central tendency: Mode only. | Colour (Red/Blue), Pass/Fail, Product type |
| Qualitative (description-based) | Ordinal | Ordered categories — differences not meaningful. Central tendency: Mode, Median. | Good/Bad/Worst, 1–5 star rating, Likert scale |
| Quantitative (number-based) | Interval | Ordered + equal intervals — no true zero. All central tendency measures valid. | Temperature °C, Calendar year, IQ score |
| Quantitative (number-based) | Ratio | Ordered + equal intervals + true zero. All calculations valid. | Length, Mass, Volume, Time, Temperature K |
Continuous: Can take any value in a range. Measurements — length, height, time, temperature. More sensitive, fewer samples needed, but more expensive to collect.
Discrete: Countable, whole numbers only. Number of defects, number of students, yes/no outcomes.
Nominal → Ordinal → Interval → Ratio. Each level adds a property: Order → Equal intervals → True zero. You can always use a higher-level statistic on lower-level data but not vice versa.
Data Collection Plan
| Element | Content |
|---|---|
| Why collect? | Goal, objective, business question to answer |
| Operational Definition | Precise definition of what is being measured — avoids ambiguity between collectors |
| How much / how / where / when | Sample size, frequency, location, time windows |
| Type of data | NOIR scale — determines which statistics and charts are appropriate |
| Collection method | Manual (check sheet) or automatic (sensors, gages) |
| Past vs future data | Historical data may have biases; prospective data is preferred |
| Reliability | Is the measurement system capable? (MSA first) |
Transforming data to simplify calculations:
- • Add/Subtract: Mean shifts by the same amount. Standard deviation unchanged.
- • Multiply/Divide: Both mean and SD scale by the same factor.
- • Coding / truncation: Remove a repetitive prefix to simplify arithmetic (e.g. values of the form 0.55x can be coded by multiplying by 1,000 and subtracting 550). Reverse the transform to recover the original mean and SD.
- • Imputation: Replacing missing data with substituted values (e.g. row mean). Missing data introduces bias.
- • Benford's Law: In natural data sets, digit 1 appears as leading digit ~30% of the time; digit 9 <5%. Violations can indicate data fabrication or errors.
- • Integrity risks: Bias, lack of knowledge, boredom, rounding, intentional falsification
Descriptive statistics are not just mean, median, and mode. A complete descriptive summary explains center, spread, position, frequency, and shape. The goal is to answer five questions: Where is the data centered? How much does it vary? Where do observations sit within the distribution? How often do values occur? And does the shape suggest skewness, heavy tails, or outliers?
Example context: Below is one raw dataset of 30 process measurements. We use the same numbers to explain central tendency, dispersion, position, frequency, and shape — exactly the way descriptive statistics are reported in tools like Excel and Minitab.
How to Read Descriptive Statistics
1) Central Tendency — Where is the data centered?
Central tendency describes the “typical” value. The mean uses all observations and shifts toward extreme values. The median is the middle observation and is more stable when data is skewed. The mode is the most frequent value; for continuous measurements it is often estimated by grouping or rounding. In this dataset, the mean (50.05) is slightly above the median (49.00), which hints at a right tail pulling the average upward.
2) Dispersion — How spread out is the data?
Dispersion measures consistency. The range is the full width from minimum to maximum (18.4). The variance (18.59) uses squared deviation, while the standard deviation (4.31) expresses spread in the original units. The IQR (5.40) focuses on the middle 50% of the data and is less sensitive to outliers. A process can have a good mean but still be poor if dispersion is too large.
3) Position — Where do values sit inside the distribution?
Position measures rank. Quartiles divide the data into four parts: Q1 = 46.88, median = Q2 = 49.00, Q3 = 52.27. Percentiles give the value below which a chosen percentage falls. Here the 10th percentile is 45.84 and the 90th percentile is 55.14. These are extremely useful for reporting tails, customer risk, and threshold-based performance.
4) Frequency — How often do values occur?
Frequency tells you how observations are distributed across intervals. The histogram is the main visual tool: tall bars mean many observations in that region, short bars mean few. In descriptive output this idea also appears as counts, relative frequency, and cumulative frequency. Frequency is what turns raw numbers into an interpretable distribution.
5) Shape — Is the distribution symmetric, skewed, or heavy-tailed?
Shape goes beyond average and spread. Skewness (1.19) measures asymmetry: positive skew means a longer right tail, negative skew means a longer left tail, and zero means near-symmetry. Kurtosis looks at tail heaviness and outlier-proneness. The excess kurtosis here is 1.27: values above zero indicate heavier tails than normal, values below zero indicate lighter tails. Shape matters because non-normal shape changes how you interpret means, control limits, and capability.
Shape in Depth — Skewness & Kurtosis
Skewness measures asymmetry: how far the distribution leans. A value of 0 means perfect symmetry. Positive values indicate a long right tail (mean > median), negative values a long left tail (mean < median). In quality engineering, right skewness often signals occasional high-value outliers — tool wear, burst events, occasional defects. Kurtosis measures tail weight. Excess kurtosis = 0 means the tails match a normal distribution. Positive excess kurtosis (leptokurtic) means more extreme values occur than expected — critical for capability analysis because DPMO estimates derived from Cp/Cpk assume normality. In this dataset, skewness = 1.19 and excess kurtosis = 1.27 — both moderate, indicating a slightly heavier right tail and more occasional high outliers than a pure normal would predict.
Skewness — Three Distribution Shapes Compared
Skewness tells you which direction the data has a longer tail, and where the mean sits relative to the median and mode. Rule of thumb: |skewness| < 0.5 = approximately symmetric; 0.5–1.0 = moderate skew; >1.0 = strong skew.
Quality engineering rule: Always check both skewness and excess kurtosis before computing Cp/Cpk. If |skewness| > 1 or |excess kurtosis| > 2, consider non-normal capability analysis (Weibull, Johnson transformation, or percentile-based methods) instead of assuming normality.
Descriptive Statistics — Central Tendency
| Measure | Definition | Formula / Method | Properties |
|---|---|---|---|
| Mean (x̄) | Arithmetic average | x̄ = Σx / n | Affected by extreme values (outliers). Used for ratio/interval data. |
| Mode | Most frequently occurring value | Count occurrences; highest count wins | Only average valid for nominal data. A dataset can have multiple modes (bimodal). |
| Median | Middle value when sorted ascending | Odd n: middle value. Even n: average of two middle values. | Not affected by outliers. Preferred for skewed distributions. |
| Percentile | Value below which P% of data falls | i = P·n/100. If i whole: avg(i, i+1). If not: round up to next. | Q1=25th, Q2=50th (Median), Q3=75th percentile |
Descriptive Statistics — Variability
Simplest measure of spread. Sensitive to outliers. Example: (6, 9, 10, 11, 11, 14) → R = 14−6 = 8
Range of middle 50% of data. Robust to outliers. Example: Q3=11, Q1=9 → IQR = 2. Used in box-and-whisker plots.
s = √s²
Average squared deviation from mean (sample formula uses n−1 for unbiasedness). Example: data (98, 99, 100, 101, 102, 100) → s²=2, s=1.414
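The small worked examples above can be checked in a few lines; note that software quartile conventions differ slightly from the hand method, so the IQR may not match exactly.

```python
import numpy as np

range_data = np.array([6, 9, 10, 11, 11, 14])
print("Range:", range_data.max() - range_data.min())       # 8

q1, q3 = np.percentile(range_data, [25, 75])
print("IQR:", q3 - q1)   # 1.75 with NumPy's default interpolation; the hand method (Q1=9, Q3=11) gives 2

var_data = np.array([98, 99, 100, 101, 102, 100])
print("Variance:", var_data.var(ddof=1))                    # 2.0 (sample formula, n-1)
print("Std dev:", round(var_data.std(ddof=1), 3))           # 1.414
```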
Graphical Methods for Depicting Data
Each chart below is rendered from real sample data. Understanding the shape, landmarks, and interpretation of each is essential in quality engineering practice.
Histogram
What it shows: Frequency distribution — the shape, centre, and spread of continuous data. Values are grouped into bins; bar height = count in that bin. Bars touch (no gaps) because data is continuous.
Key features: Shape reveals distribution type — normal (bell), right-skewed, left-skewed, bimodal, or uniform. Overlay a normal curve to visually pre-check normality before running a Q-Q plot.
First look at any dataset. Identify modality, skew, and outliers before any statistical test. Required in DMAIC Measure phase.
Box-and-Whisker Plot
What it shows: The five-number summary — Min, Q1, Median, Q3, Max — in a single compact visual. The box spans Q1 to Q3 (the Interquartile Range, IQR). The line inside the box is the Median. Whiskers extend to Min and Max within 1.5×IQR. Points beyond whiskers are outliers.
Key formula: IQR = Q3 − Q1. Outlier threshold = Q3 + 1.5×IQR (upper) or Q1 − 1.5×IQR (lower).
Compare multiple distributions side by side. Instantly reveals skew, spread, and outliers. Use in MSA to compare operator variation.
Stem-and-Leaf Plot
What it shows: The full distribution of data while keeping every original value visible. Each data point is split: the stem = leading digit(s), the leaf = the last digit. Reading the leaves left-to-right on each row gives you a mini histogram rotated 90°.
Example data: 21, 24, 26, 28, 31, 33, 35, 37, 39, 41, 43, 46, 48, 52, 55, 58
Best for small datasets (n < 50). Reveals shape, outliers, and gaps — and unlike a histogram, you can read back every original data value.
Normal Probability Plot (Q-Q Plot)
What it shows: Whether your data follows a normal distribution. Data quantiles are plotted against theoretical normal quantiles. If the data is normal, all points fall on or very close to the diagonal reference line.
Interpretation: Points hugging the line ✓ normal. A consistent one-sided curve (banana shape) = skewed data. An S-shape bending at both ends = heavy or light tails. A single point far off-line = outlier. Use p-value > 0.05 (Anderson-Darling or Kolmogorov-Smirnov) to confirm at 95% confidence.
Required before running capability analysis (Cp/Cpk). Non-normal data must be transformed or analyzed with non-parametric methods.
Probability — Models, Rules & Distributions
Probability Models
Classical (theoretical) model: used when all outcomes are equally likely and can be counted theoretically. Example: P(rolling a 3) = 1/6. No experiment needed.
Empirical (relative-frequency) model: used when theoretical probability is unknown — estimate from observed data. Approaches true probability as n → ∞. Example: defect rate from production history.
Counting — Factorial, Permutations & Combinations
| Concept | Formula | Order matters? | Example |
|---|---|---|---|
| Factorial | n! = n×(n−1)×…×1. 0!=1 | — | 5! = 5×4×3×2×1 = 120 |
| Permutation | P(n,r) = n!/(n−r)! | Yes — order matters | 4-digit lock code with no repeated digits: P(10,4) = 5040 arrangements |
| Combination | C(n,r) = n!/[r!(n−r)!] | No — order irrelevant | Select 2 from 5 students — C(5,2) = 10 groups |
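Python's math module includes these counting functions directly (math.factorial, math.perm, math.comb); a quick check of the examples in the table.

```python
import math

print(math.factorial(5))     # 120
print(math.perm(10, 4))      # 5040 ordered arrangements (permutation)
print(math.comb(5, 2))       # 10 groups (combination)
```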
Key Probability Distributions — Summary Table
| Distribution | Type | Key parameters | Conditions / when to use | Mean | Variance |
|---|---|---|---|---|---|
| Normal | Continuous | μ, σ | Symmetric, bell-shaped. Central Limit Theorem. 68/95/99.7 rule. Z = (X−μ)/σ | μ | σ² |
| t (Student's) | Continuous | df = n−1 | Small samples (n<30) or unknown σ. Wider than normal; converges to normal as df→∞ | 0 | df/(df−2) |
| Chi-square (χ²) | Continuous | df = n−1 | Testing population variance; goodness of fit; independence in contingency tables. χ² = (n−1)s²/σ² | df | 2·df |
| F | Continuous | df₁, df₂ | Comparing two variances; ANOVA F-ratio = MS_between/MS_within. Always right-tailed. | df₂/(df₂−2) | — |
| Binomial | Discrete | n, p | Fixed n trials; 2 outcomes; constant p; independent. P(x) = C(n,x)·pˣ·(1−p)ⁿ⁻ˣ | np | np(1−p) |
| Bernoulli | Discrete | p | Binomial with n=1 (single trial). P(success) = p, P(failure) = 1−p | p | p(1−p) |
| Hypergeometric | Discrete | N, A, n | Sampling without replacement from finite population. Use instead of binomial when n > 5% of N. P(x) = C(A,x)·C(N−A,n−x)/C(N,n) | nA/N | — |
| Poisson | Discrete | μ | Rare events in fixed region. Mean = variance = μ. P(x;μ) = e⁻μ·μˣ/x! | μ | μ |
Confidence Intervals — Complete Reference
A confidence interval provides a range within which the true population parameter is believed to lie with a stated probability (confidence level). The width is controlled by sample size, standard deviation, and confidence level.
CI for Mean — z-based (σ known or n ≥ 30)
| Confidence | α | zα/2 |
|---|---|---|
| 90% | 0.10 | 1.645 |
| 95% | 0.05 | 1.96 |
| 99% | 0.01 | 2.576 |
100 random residents, x̄ = $42,000, σ = $5,000. Find 95% CI.
CI = 42,000 ± 1.96 × 500
CI = 42,000 ± 980
CI = $41,020 to $42,980
CI for Mean — t-based (σ unknown and n < 30)
Use t-distribution with (n−1) degrees of freedom. As n increases, t → z.
n=25, x̄=$42,000, s=$5,000. Find 95% CI. t0.025,24 = 2.064
CI = 42,000 ± 2,064
CI = $39,936 to $44,064
CI for Proportion
Conditions: np ≥ 5 AND n(1−p) ≥ 5 (to approximate binomial with normal)
n=100, 10 defective (p̂=0.10). Find 95% CI.
CI = 0.10 ± 1.96×√(0.10×0.90/100)
CI = 0.10 ± 0.06
CI = 0.04 to 0.16 (4% to 16%)
CI for Variance (Chi-square)
χ² is not symmetric — use two separate chi-square table values for the two tails.
n=25, s²=4. Find 90% CI for σ². χ²0.05,24=36.42, χ²0.95,24=13.848
Lower: (24×4)/36.42 = 2.64
Upper: (24×4)/13.848 = 6.93
90% CI for σ²: 2.64 to 6.93
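All four worked examples can be reproduced with scipy's distribution quantiles; a sketch using the same summary statistics (illustrative only).

```python
from math import sqrt
from scipy import stats

z = stats.norm.ppf(0.975)                        # 1.96 for a 95% two-sided interval

# Mean, z-based (sigma known): n = 100, xbar = 42,000, sigma = 5,000
moe = z * 5000 / sqrt(100)
print(42000 - moe, 42000 + moe)                  # ~41,020 to 42,980

# Mean, t-based (sigma unknown, n < 30): n = 25, xbar = 42,000, s = 5,000
t = stats.t.ppf(0.975, df=24)                    # 2.064
moe = t * 5000 / sqrt(25)
print(42000 - moe, 42000 + moe)                  # ~39,936 to 44,064

# Proportion: n = 100, 10 defective
p = 0.10
moe = z * sqrt(p * (1 - p) / 100)
print(round(p - moe, 2), round(p + moe, 2))      # ~0.04 to 0.16

# Variance (90% CI via chi-square): n = 25, s^2 = 4
lower = 24 * 4 / stats.chi2.ppf(0.95, df=24)     # 95th percentile = upper-tail chi2(0.05,24) = 36.42
upper = 24 * 4 / stats.chi2.ppf(0.05, df=24)     # 5th percentile = lower-tail chi2(0.95,24) = 13.85
print(round(lower, 2), round(upper, 2))          # ~2.64 to 6.93
```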
Hypothesis Testing — 38 Tests, 6 Families
Every hypothesis test follows the same 6-step logic. What changes is the test statistic and its distribution. Master the framework once — apply it to all 38 tests.
Universal 6-Step Framework — Every Test Uses This
- Step 1: State the null hypothesis H₀ and the alternative hypothesis H₁.
- Step 2: Choose the significance level α (typically 0.05).
- Step 3: Select the test statistic and its reference distribution.
- Step 4: Compute the test statistic from the sample data.
- Step 5: Compare it with the critical value, or compare the p-value with α.
- Step 6: State the conclusion in the language of the original problem.
Family ① (Tests on means): Use when your response is continuous and approximately normally distributed (or n ≥ 30, by the Central Limit Theorem). You are comparing one or more means. If normality is badly violated with small n, switch to Family ⑤ non-parametric alternatives.
Family ② (Post-hoc multiple comparisons): Run ONLY after a significant ANOVA F-test. ANOVA tells you at least one pair differs — post-hoc tests identify which pairs. Running them without a significant F first inflates Type I error and produces false positives.
Family ③ (Proportions & categorical data): Use when your data is categorical — pass/fail, defect type, yes/no, attribute data. You are counting frequencies or testing proportions, not measuring a continuous response. The test statistic follows a z or χ² distribution.
Family ④ (Tests on variances): Use when you need to test spread, not location. Required before independent t-tests (equal variance assumption), before ANOVA (homogeneity of variance), when comparing measurement system precision, or when a spec limit exists on process variability.
Family ⑤ (Non-parametric tests): Use when normality is badly violated with small n, data is ordinal (ranked), or outliers distort parametric tests. These tests rank the data instead of using raw values — they lose some power when normality holds, but are robust and honest when it doesn't.
| Situation | Parametric Test | Non-Parametric Alternative | What It Tests |
|---|---|---|---|
| 1 sample or paired, non-normal | 1-sample / paired t | Wilcoxon Signed-Rank | Median = target; or median difference = 0 |
| 2 independent groups, non-normal | Independent t | Mann-Whitney U | Same distribution / median in both groups |
| 3+ independent groups, non-normal | One-Way ANOVA | Kruskal-Wallis H | Same distribution across all groups |
| Repeated measures, non-normal | RM-ANOVA | Friedman Test | Same distribution across conditions |
| Direction of effect only | 1-sample t | Sign Test | P(positive change) = 0.5 |
| Monotonic relationship | Pearson r | Spearman ρ / Kendall τ | Rank correlation (not just linear) |
Family ⑥ (Association, regression & model diagnostics): Test relationships between variables (correlation, regression), validate model assumptions (normality, independence of residuals), and compare survival or reliability curves. These tests are prerequisites for and extensions of the parametric means tests in Family ①.
Normality Tests (4–6) — Prerequisite for Families ① and ④
Correlation, Regression & Time Series
Pearson Correlation Coefficient (r)
| r value | Interpretation |
|---|---|
| r = +1 | Perfect positive linear relationship |
| 0 < r < 1 | Positive correlation (as X increases, Y increases) |
| r = 0 | No linear relationship |
| −1 < r < 0 | Negative correlation (as X increases, Y decreases) |
| r = −1 | Perfect negative linear relationship |
A strong correlation between X and Y does not mean X causes Y. Both may be driven by a third variable (confounding). Example: ice cream sales and drowning rates are positively correlated — both caused by hot weather.
r² = proportion of variance in Y explained by X (0 to 1). If r=0.88 → r²=0.77 → 77% of variance in Y is explained by X. Remaining 23% is unexplained.
Fisher's Z Transformation — CI for Correlation
Since r is not normally distributed, a 3-step process is needed to find CI for the population correlation ρ:
- Convert r to z' (Fisher's transformation): z' = 0.5·[ln(1+r) − ln(1−r)]
- Build CI in z' space: SE = 1/√(N−3), then z'±zα/2·SE
- Back-transform CI limits from z' to r
Step 1: r = 0.88, N = 10 → z' = 0.5·ln[(1+0.88)/(1−0.88)] = 1.375
Step 2: SE = 1/√(10−3) = 0.378
CI = 1.375 ± 1.96×0.378
z' range: 0.635 to 2.11
Step 3: Back-transform → r: 0.56 to 0.97
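A sketch of the three steps with the same numbers (r = 0.88, N = 10), using numpy's arctanh/tanh for the Fisher transformation and its inverse.

```python
import numpy as np
from scipy.stats import norm

r, N = 0.88, 10
z_prime = np.arctanh(r)                   # Step 1: Fisher's z' = 0.5*ln[(1+r)/(1-r)] ~ 1.376
se = 1 / np.sqrt(N - 3)                   # Step 2: standard error ~ 0.378
lo, hi = z_prime + np.array([-1, 1]) * norm.ppf(0.975) * se
print(np.tanh([lo, hi]))                  # Step 3: back-transform -> roughly [0.56, 0.97]
```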
Regression & Time Series — Real Data + Visual Learning
This upgrade follows the NIST/SEMATECH engineering-statistics philosophy: graphics are not decoration, and modeling should never be separated from diagnostics. For regression, that means fit + residuals + structure checks. For time series, that means trend + seasonality + dependence before forecasting.
Real-data example: Anscombe Data Set I
NIST uses Anscombe's example to show why graphics are essential. We start with Data Set I, which behaves approximately linearly and is appropriate for a simple linear regression. The model is Y = β₀ + β₁X + ε. Least squares chooses the line that minimizes the sum of squared residuals.
What users should learn from this example
- Slope: change in Y for one-unit change in X.
- Intercept: fitted Y when X = 0.
- R²: how much of the Y variation is explained by X.
- Residuals: the model's errors — the real diagnostic layer.
When simple linear regression is appropriate
One response, one predictor, approximately linear relationship, no strong time-order dependence, and residual variation that is roughly constant.
What to check before trusting the model
Scatter plot shape, residual plot, unusual points, leverage/influence, and whether the physics actually supports a straight-line relationship.
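The sketch below fits Data Set I by least squares and reports slope, intercept, R², and the residuals; the x/y arrays are the commonly published Anscombe values and should be checked against your own source before reuse.

```python
import numpy as np

# Anscombe Data Set I (commonly published values)
x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
y = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])

slope, intercept = np.polyfit(x, y, 1)        # least squares: minimise the sum of squared residuals
residuals = y - (intercept + slope * x)       # the diagnostic layer — always plot these
r2 = 1 - np.sum(residuals**2) / np.sum((y - y.mean())**2)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, R² = {r2:.3f}")
# ≈ 0.50, 3.00, 0.67 — then inspect the residual plot before trusting the model
```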
Real-data example: Anscombe's Quartet
NIST uses Anscombe's quartet to prove a crucial lesson: four data sets can have nearly identical summary statistics and regression results, yet have completely different structures. That means numbers alone can hide the truth.
Why this example matters
- Users immediately understand why plots matter.
- It prevents blind trust in slope, r, and R².
- It visually explains linearity, outliers, curvature, and leverage.
What graphs reveal that summary statistics hide
Curvature, clusters, outliers, leverage points, unequal spread, and poor experimental design.
Best practice to teach users
Always look at the scatter plot first, then fit the model, then inspect residuals. Never reverse that order.
Real-data example: NIST monthly CO₂ concentrations
The NIST handbook uses monthly CO₂ concentrations from Mauna Loa as a sample time-series data set. Time-series data must be treated differently from ordinary regression data because the observations are ordered in time and can have trend, seasonality, and autocorrelation.
What users should learn from this example
- Trend: the long-term level is rising.
- Seasonality: there is a repeating annual pattern.
- Smoothing: moving averages reveal the underlying path.
- Modeling rule: identify the structure before forecasting.
Time-series workflow users should remember
Plot the series, check for trend, check for seasonality, check for dependence, smooth only to reveal structure, then choose a forecasting method.
When not to use ordinary regression alone
When data are collected over time and adjacent observations are related. Independence is no longer a safe assumption.
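A short sketch of the smoothing step only, run on a synthetic monthly series standing in for the CO₂ example (rising trend plus an annual cycle plus noise — the actual Mauna Loa values are not reproduced here): a centred 12-month moving average cancels the seasonal component and exposes the trend.

```python
import numpy as np

def moving_average(series, window=12):
    """Centred moving average: a 12-point window cancels an annual cycle."""
    return np.convolve(series, np.ones(window) / window, mode="valid")

months = np.arange(120)
series = (350 + 0.15 * months                       # long-term rising trend
          + 2.5 * np.sin(2 * np.pi * months / 12)   # repeating annual pattern
          + np.random.default_rng(1).normal(0, 0.3, 120))
print(moving_average(series)[:5].round(2))          # smoothed level, seasonality removed
```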
Reliability Engineering
Quantitative methods for predicting, measuring, and improving product reliability — from MTBF calculations to Weibull analysis and system configuration modeling.
Core Reliability Metrics
Six numbers tell the complete reliability story of any system. Understanding how they connect — and what levers you pull to improve each — is the foundation of reliability engineering.
A fleet of 10 pumps operated for 50,000 hours total. During this period, 5 failures were recorded with a total repair time of 20 hours. MTBF = 50,000/5 = 10,000 hr; MTTR = 20/5 = 4 hr; Availability = MTBF/(MTBF + MTTR) = 10,000/10,004 = 99.96%.
The Four Fundamental Functions — How They Derive from Each Other
Every reliability distribution is built from a single starting point: the probability density function f(t). All other functions follow by integration or differentiation. This is the NIST 8.1.6 framework — not four separate formulas, but one coherent system.
F(t) = ∫₀ᵗ f(u) du, with F(0) = 0 and F(∞) = 1
R(t) = 1 − F(t) = ∫ₜ^∞ f(u) du = exp[−H(t)]
h(t) = f(t)/R(t) = −d[ln R(t)]/dt, where H(t) = ∫₀ᵗ h(u) du
Every reliability distribution is fully specified by its hazard function h(t). The shape of h(t) determines the failure behaviour — decreasing, constant, or increasing — which maps directly to the three phases of the bathtub curve.
Hazard Rate h(t) — Three Shapes, Three Stories
The hazard function h(t) is the most informative reliability curve. Its shape tells you what kind of failure mechanism is at work and what action to take.
Key Distributions — Formula Sets
Two distributions cover the majority of reliability engineering problems. Know their hazard shapes and when to use each.
Exponential: F(t) = 1 − e^(−λt), R(t) = e^(−λt), h(t) = λ (memoryless)
Weibull: F(t) = 1 − e^(−(t/η)^β), h(t) = (β/η)(t/η)^(β−1), MTTF = η·Γ(1+1/β)
Quick Reference — Model Selection Guide
| Model | R(t) Formula | h(t) Shape | β (Weibull) | Use When | Typical Applications |
|---|---|---|---|---|---|
| Exponential | e^(−λt) | Constant ─ | β = 1 | Random, memoryless failures | Electronics, software, random events |
| Weibull (β<1) | e^(−(t/η)^β) | Decreasing ↘ | 0.5–0.9 | Infant mortality, manufacturing defects | Early field failures, weld defects |
| Weibull (β>1) | e^(−(t/η)^β) | Increasing ↗ | 2–4 typical | Wear-out, fatigue, ageing | Bearings, tyres, mechanical wear |
| Lognormal | 1−Φ[(ln t−µ)/σ] | Peaks then drops | — | Fatigue crack propagation, corrosion | Metals fatigue, semiconductor oxide |
| Normal | 1−Φ[(t−µ)/σ] | Increasing ↗ | — | Tight wear-out with known life | Light bulbs, precision wear mechanisms |
| Gamma | 1−I(λt, k) | Varies with k | — | Systems requiring k failures to fail | Standby redundancy, shock models |
Which model to choose? Plot your data on Weibull probability paper first. If it falls on a straight line, Weibull fits. If the β you estimate is 1.0, use the simpler exponential. Only choose lognormal or normal when engineering knowledge of the failure mechanism supports it.
The Bathtub Curve — Failure Rate Over Product Lifetime
The bathtub curve describes how the failure rate λ(t) changes across a product's life. Three distinct phases require different engineering strategies.
Decreasing Failure Rate
High initial failure rate that falls rapidly. Caused by manufacturing defects, design weaknesses, and substandard components.
- Burn-in / ESS testing
- Process improvement (SPC)
- Incoming inspection
Constant Failure Rate
Low, approximately constant random failure rate. MTBF = 1/λ applies here. Normal operating life of the product.
- Exponential distribution (β=1)
- Preventive maintenance
- Redundancy design
Increasing Failure Rate
Failure rate rises as components age, fatigue, or corrode. Planned maintenance replaces components before this phase starts.
- Predictive maintenance
- Scheduled replacement (B10)
- Weibull β > 1
Weibull Analysis — The Universal Reliability Distribution
The Weibull distribution models all three bathtub phases by adjusting a single parameter β. It's the most widely used distribution in reliability engineering.
Interpreting β
Characteristic Life η: Always the time at which 63.2% of units fail, regardless of β. F(η) = 1 − e⁻¹ = 0.632. On a Weibull probability plot, η is where the fitted line crosses the 63.2% horizontal.
Generalised Bx Life — Beyond B10
B10 is the automotive standard, but any Bx life (the time by which x% of units have failed) can be computed directly from the Weibull parameters. This is the NIST-standard approach (NIST 8.2.2).
B1 = η · (0.01005)^(1/β)
B10 = η · (0.10536)^(1/β)
B50 = η · (0.69315)^(1/β)
B10 = 8000 · (0.10536)^(1/2.5) = 8000 · 0.4065 = 3,252 hr
B50 = 8000 · (0.69315)^(1/2.5) = 8000 · 0.8636 = 6,909 hr
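The same Bx computation as a one-line Python helper; it reproduces the B10 and B50 figures above for β = 2.5, η = 8,000 hr.

```python
import math

def bx_life(eta, beta, x_percent):
    """Time by which x% of units have failed, for a Weibull(β, η)."""
    return eta * (-math.log(1 - x_percent / 100)) ** (1 / beta)

print(round(bx_life(8000, 2.5, 10)))   # ≈ 3,252 hr (B10)
print(round(bx_life(8000, 2.5, 50)))   # ≈ 6,909 hr (B50)
```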
Weibull Probability Plotting — Rank Regression Method (NIST 8.2.2)
The Weibull probability plot linearises the Weibull CDF so failure data falls on a straight line — slope gives β, x-intercept at 63.2% gives η. The step-by-step NIST procedure:
Order n failures as t₁ < t₂ < … < tₙ. Assign median ranks (Benard's approximation): F̂ᵢ = (i − 0.3)/(n + 0.4).
Take the double log of both sides of R(t) = e^(−(t/η)^β), which is linear in X = ln(t) and Y = ln[ln(1/(1−F̂))]:
Y = β·X − β·ln(η)
Fit straight line to (ln(tᵢ), ln[ln(1/(1−F̂ᵢ))]) by least squares:
η̂ = exp(−intercept / β̂)
n = 5, Benard ranks:
i=1: F̂ = 0.70/5.4 = 0.130
i=2: F̂ = 1.70/5.4 = 0.315
i=3: F̂ = 2.70/5.4 = 0.500
i=4: F̂ = 3.70/5.4 = 0.685
i=5: F̂ = 4.70/5.4 = 0.870
→ Plot, fit line → β̂ ≈ 2.1, η̂ ≈ 1,580 hr
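A compact rank-regression sketch following the NIST steps above; the five failure times are hypothetical, purely to show the mechanics (median ranks, double-log transform, least-squares slope and intercept).

```python
import numpy as np

def weibull_rank_regression(failure_times):
    """Estimate β and η by median-rank (Benard) regression."""
    t = np.sort(np.asarray(failure_times, dtype=float))
    i = np.arange(1, len(t) + 1)
    f_hat = (i - 0.3) / (len(t) + 0.4)          # Benard median ranks
    x = np.log(t)                               # X = ln(t)
    y = np.log(np.log(1 / (1 - f_hat)))         # Y = ln ln[1/(1−F̂)]
    beta, intercept = np.polyfit(x, y, 1)       # fit Y = β·X − β·ln(η)
    return beta, np.exp(-intercept / beta)

beta_hat, eta_hat = weibull_rank_regression([420, 780, 1150, 1600, 2300])  # hypothetical hours
print(f"beta ≈ {beta_hat:.2f}, eta ≈ {eta_hat:.0f} hr")
```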
📈 Weibull Quick Ref
η (Characteristic Life)
63.2% of units fail by η. Always true regardless of β.
B10 Life
Time by which 10% of units fail. Standard bearing and automotive spec metric.
Weibull Probability Plot
Plot ln(ln(1/(1−F))) vs ln(t). Slope = β. Intercept gives η. Straight line confirms Weibull fit.
Random Number Gen.
t = η(−ln ξ)^(1/β) where ξ ~ Uniform(0,1) — the inverse-CDF method. From the Stockholm Distributions Handbook.
Series vs Parallel Systems
Design implication: Critical single-point failures (no redundancy = series) dramatically reduce system reliability. Adding even one parallel backup on a 0.9 R component raises it from 0.9 to 0.99 — a 10× reduction in failure probability for that subsystem.
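A two-function sketch of the series and parallel formulas; the 0.9 reliabilities are the illustrative values used in the design note above.

```python
from math import prod

def series_reliability(blocks):
    """All blocks must survive: R_sys = Π Rᵢ."""
    return prod(blocks)

def parallel_reliability(blocks):
    """System fails only if every block fails: R_sys = 1 − Π (1 − Rᵢ)."""
    return 1 - prod(1 - r for r in blocks)

print(series_reliability([0.9, 0.9, 0.9]))   # 0.729 — weaker than the weakest block
print(parallel_reliability([0.9, 0.9]))      # 0.99  — one backup on a 0.9 subsystem
```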
Probability Foundations for Reliability — NIST 8.1.6
Reliability is fundamentally a probability — the probability that a device performs its intended function during a specified period under stated conditions. The four-function framework below is the mathematical backbone of all reliability analysis, per NIST Engineering Statistics Handbook Section 8.1.6.
The Four Functions — Complete Derivation Chain
f(t) — probability density. Requirements: f(t) ≥ 0, ∫₀^∞ f(t)dt = 1. Meaning: likelihood of failure per unit time at t.
F(t) = ∫₀ᵗ f(u)du — cumulative distribution. F(0) = 0, F(t) → 1 as t → ∞. Meaning: fraction failed by time t.
R(t) = 1 − F(t) = ∫ₜ^∞ f(u)du — reliability (survival) function. Meaning: probability of surviving beyond t.
h(t) = f(t)/R(t) = −d[ln R(t)]/dt — hazard function. Meaning: conditional failure rate at time t, given survival to t.
R(t) = exp[−H(t)] ← universally valid
f(t) = h(t)·R(t) = h(t)·exp[−H(t)]
MTTF = ∫₀^∞ R(t)dt = E[T]
Five Distributions — h(t), R(t), F(t), f(t) Side-by-Side
| Distribution | h(t) Hazard | R(t) Reliability | F(t) CDF | MTTF | Shape |
|---|---|---|---|---|---|
| Exponential | λ (constant) | e^(−λt) | 1 − e^(−λt) | 1/λ | Flat — useful life, β=1 |
| Weibull | (β/η)(t/η)^(β−1) | exp[−(t/η)^β] | 1 − exp[−(t/η)^β] | η·Γ(1+1/β) | Power — all phases |
| Lognormal | φ(z)/[σt·Φ(−z)], z = (ln t−µ)/σ | 1 − Φ[(ln t−µ)/σ] | Φ[(ln t−µ)/σ] | exp(µ+σ²/2) | IFR then DFR — fatigue, corrosion |
| Normal | φ(z)/[σ(1−Φ(z))], z = (t−µ)/σ | 1 − Φ[(t−µ)/σ] | Φ[(t−µ)/σ] | µ | IFR — tight wear-out |
| Gamma | Complex — see NIST 8.1.9 | 1 − I(t/β, k) (incomplete gamma) | I(t/β, k) | kβ | k<1: DFR, k=1: Exp, k>1: IFR |
Hazard Function Shapes — The Physical Meaning
Decreasing (DFR): failure rate decreases with time. Indicates infant mortality — early failures remove weak units. Example: Weibull β < 1, Gamma k < 1.
Constant (CFR): constant failure rate. Memoryless — age does not affect remaining life. Exponential distribution. Example: electronic components in useful life.
Increasing (IFR): failure rate increases — the component ages and wears out. Weibull β > 1, Normal, most mechanical components under fatigue and corrosion.
Real-world products combine all three phases. The lognormal has a unimodal hazard — rises then falls. Mixed Weibull populations generate bathtub curves.
Types of Events — Probability Rules
Mutually exclusive: cannot occur simultaneously — P(A ∩ B) = 0, so P(A ∪ B) = P(A) + P(B).
Independent: A's occurrence doesn't affect P(B) — P(A ∩ B) = P(A)·P(B).
Complement: A' is the event that A does NOT occur — P(A') = 1 − P(A).
MTBF Worked Examples
100 items tested for 10,000 hours. 5 items failed at 5,000 hours.
MTTF = (95×10,000 + 5×5,000) / 100 = 975,000 / 100 = 9,750 hrs
MTBF = (95×10,000 + 5×5,000) / 5
= 975,000 / 5 = 195,000 hrs
MTTF divides by total units (100); MTBF divides by failed units only (5)
Exponential distribution with λ = 0.001 failures/hr. Find h(t), R(t) at t = 500 hr:
F(t) = 1 − e^(−0.001t)
R(t) = e^(−0.001t)
h(t) = f(t)/R(t) = λ = 0.001 (constant)
R(500) = e^(−0.5) = 0.6065 = 60.65%
Exponential h(t) = λ always — this is the memoryless property
Fault Tree Analysis — Top-Down Deductive Reliability
Fault Tree Analysis (FTA) is a top-down, deductive technique that models how a defined system failure (the top event) can occur through combinations of component failures and human errors. It uses Boolean logic gates to trace failure pathways. Foundational to MIT 22.38 and MIL-STD-1629A. Complement to FMEA: FTA asks "what combinations of events cause this failure?" while FMEA asks "what does each component failure cause?"
The Logic Gates — Boolean Building Blocks
AND gate: the output event occurs only if all input events occur simultaneously. Represents redundancy — protective when components are independent.
OR gate: the output event occurs if at least one input event occurs. The most common gate — any single failure propagates upward.
Basic event: the lowest-level failure event in the tree. Has an assigned failure probability λ (from field data, MIL-HDBK-217F, or manufacturer's data).
Undeveloped event: an event not developed further — either insufficient data, or judged insufficiently important. Marked explicitly so reviewers know it was a conscious decision.
The FTA Process — 6 Steps
Minimal Cut Sets — The Mathematics
For a system with minimal cut sets K₁, K₂, …, Kₘ, the top event T occurs if any cut set occurs completely. Using the inclusion-exclusion principle:
P(T) = Σ P(Kᵢ) − Σ P(Kᵢ ∩ Kⱼ) + Σ P(Kᵢ ∩ Kⱼ ∩ Kₖ) − …
Minimal cut sets: K₁ = {A,B}, K₂ = {C}, K₃ = {A,D}
q_A = 0.01 q_B = 0.02 q_C = 0.005 q_D = 0.03
P(K₁) = 0.01 × 0.02 = 2×10⁻⁴
P(K₂) = 0.005
P(K₃) = 0.01 × 0.03 = 3×10⁻⁴
P(T) ≈ 2×10⁻⁴ + 5×10⁻³ + 3×10⁻⁴
= 5.5×10⁻³
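A sketch of the first-order (rare-event) approximation used in the worked example — the higher-order intersection terms are dropped; the basic-event probabilities follow the example, with q_A = 0.01 and q_B = 0.02 inferred from P(K₁).

```python
from math import prod

def top_event_probability(cut_sets, q):
    """P(T) ≈ Σ P(Kᵢ), where a minimal cut set needs all of its basic events to fail."""
    return sum(prod(q[e] for e in ks) for ks in cut_sets)

q = {"A": 0.01, "B": 0.02, "C": 0.005, "D": 0.03}
cut_sets = [{"A", "B"}, {"C"}, {"A", "D"}]
print(top_event_probability(cut_sets, q))   # ≈ 5.5e-3, as in the worked example
```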
Component Importance Measures
| Measure | Formula | Interpretation | Use when |
|---|---|---|---|
| Birnbaum (Structural) | IB(i) = ∂P(T)/∂qᵢ | Rate of change of top event probability with respect to component i's failure probability | Comparing sensitivity — which component improvement gives biggest P(T) reduction? |
| Fussell-Vesely | IFV(i) = P(at least one MCS containing i fails) / P(T) | Fraction of total risk contributed by cut sets containing component i | Maintenance prioritisation — where does this component contribute most to risk? |
| Risk Reduction Worth (RRW) | RRW(i) = P(T) / P(T given qᵢ=0) | Factor by which P(T) decreases if component i is made perfect (qᵢ→0) | Investment decisions — what is the maximum achievable benefit of improving component i? |
| Risk Achievement Worth (RAW) | RAW(i) = P(T given qᵢ=1) / P(T) | Factor by which P(T) increases if component i is guaranteed to fail | Maintenance criticality — how important is it to keep this component working? |
FTA:
- ▸ Top-down (deductive)
- ▸ Starts from a specific failure
- ▸ Finds all combinations that cause it
- ▸ Handles complex logic & dependencies
- ▸ Quantitative probability output
- ▸ Best for: safety-critical top events
FMEA:
- ▸ Bottom-up (inductive)
- ▸ Starts from each component
- ▸ Traces all effects of each failure
- ▸ Covers the full system comprehensively
- ▸ RPN prioritisation (qualitative)
- ▸ Best for: comprehensive coverage of all failure modes
Best practice: use FMEA first for broad coverage, then FTA for deep analysis of the highest-severity failure modes identified by FMEA. Together they give both breadth and depth.
Reliability Block Diagrams — System Architecture & Redundancy
A Reliability Block Diagram (RBD) is a success-oriented model that shows how components must function for the system to function. Unlike FTA which models failure, RBD models success paths. Based on MIT 22.38 Section IX (Simple Logical Configurations) and Rausand & Høyland Chapter 4.
Series, Parallel, and k-out-of-n Systems
Series — all components must work
System fails if any single component fails: R_series = R₁·R₂·…·Rₙ. Reliability is always lower than the weakest component. The engineering challenge: every component is a single point of failure.
Parallel — any one component is sufficient
System fails only if all parallel components fail: R_parallel = 1 − (1−R₁)(1−R₂)…(1−Rₙ). Reliability always exceeds the best single component. Each component runs continuously (hot standby).
k-out-of-n Systems — Voting Architectures
A k-out-of-n system succeeds if at least k of n components function. This generalises both series (k=n) and parallel (k=1). Common in safety systems: 2-out-of-3 voting gives high reliability without the cost of full parallel redundancy.
R(2-out-of-3) = C(3,2)·R²(1−R) + C(3,3)·R³, with R = 0.9:
= 3×0.81×0.1 + 1×0.729
= 0.243 + 0.729 = 0.972
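A minimal binomial sketch of the k-out-of-n formula for identical, independent components; it reproduces the 2-out-of-3 voting result above.

```python
from math import comb

def k_out_of_n_reliability(k, n, r):
    """At least k of n identical, independent components must function."""
    return sum(comb(n, j) * r**j * (1 - r)**(n - j) for j in range(k, n + 1))

print(k_out_of_n_reliability(2, 3, 0.9))   # 0.972 — the 2-out-of-3 voting example
print(k_out_of_n_reliability(3, 3, 0.9))   # 0.729 — k = n collapses to a series system
```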
Standby Redundancy — Cold, Warm, and Hot
| Type | Standby State | Switch Reliability | Reliability Formula | Application |
|---|---|---|---|---|
| Hot Standby | Fully powered, running at full load — instant takeover | Near 1.0 (automatic) | Same as active parallel: R = 1−(1−R)ⁿ | Aircraft hydraulics, nuclear safety systems |
| Warm Standby | Partially energised — reduced failure rate λ_s < λ during standby | High, with brief startup | Requires Markov model — intermediate between hot/cold | Generator sets, server farms |
| Cold Standby | De-energised — zero failure rate during standby | R_sw required (switch may fail) | R_s = e^(−λt)(1 + λt) for 1-unit standby with perfect switch | Backup pumps, emergency systems |
R_s(t) = e^(−λt) + ∫₀ᵗ λe^(−λτ) · e^(−λ(t−τ)) dτ
= e^(−λt) + λt·e^(−λt)
= e^(−λt)(1 + λt)
Stress-Strength Interference — MIT 22.38 Section IX.3
A component fails when applied stress S exceeds its strength R. Both are random variables. Reliability = P(R > S). This is the probabilistic basis for design margins.
Reliability = ∫₋∞^∞ f_S(s) · [1 − F_R(s)] ds
= ∫₋∞^∞ f_S(s) · P(R > s) ds
(R−S) ~ N(µ_R−µ_S, σ_R²+σ_S²)
Reliability = Φ[(µ_R−µ_S) / √(σ_R²+σ_S²)]
= Φ[z_margin]
Accelerated Life Testing — Compressing Time to Failure
ALT subjects products to stresses (temperature, voltage, vibration, humidity) higher than normal use conditions to induce failures faster, then models the stress-life relationship to extrapolate reliability at use conditions. The core challenge: accelerate only the same failure mechanisms that would occur in service.
The Three Primary Life-Stress Models
Arrhenius Model — the most widely used ALT model: AF = exp[(Eₐ/k)·(1/T_use − 1/T_test)]
Derived from the Arrhenius equation for chemical reaction rates. Valid when the dominant failure mechanism is thermally activated — oxidation, corrosion, electromigration, diffusion, creep.
T_use = 55°C = 328 K
T_test = 125°C = 398 K
AF = exp[0.7/8.617×10⁻⁵ × (1/328 − 1/398)]
= exp[8123 × (0.003049 − 0.002513)]
= exp[8123 × 0.000536]
= exp[4.354]
= 77.8×
0.3–0.5 eV: Electromigration in Al
0.5–0.7 eV: Oxide breakdown
0.7–1.0 eV: Corrosion mechanisms
1.0–1.4 eV: Si-SiO₂ interface traps
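A small helper for the Arrhenius acceleration factor; it takes temperatures in °C (converted internally to kelvin) and reproduces the 0.7 eV, 55 °C → 125 °C example to within rounding.

```python
import math

BOLTZMANN_EV = 8.617e-5  # Boltzmann constant, eV/K

def arrhenius_af(ea_ev, t_use_c, t_test_c):
    """AF = exp[(Ea/k)·(1/T_use − 1/T_test)], temperatures given in °C."""
    t_use, t_test = t_use_c + 273.15, t_test_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV * (1 / t_use - 1 / t_test))

print(round(arrhenius_af(0.7, 55, 125), 1))   # ≈ 78×
```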
Inverse Power Law Model — for non-thermal stress (voltage, load, pressure): L(S) = A/Sⁿ, so AF = (S_test/S_use)ⁿ
Used when failure mechanism is driven by mechanical stress, voltage, or other non-thermal accelerants. L(S) follows a power law relationship with the stress level S.
Use voltage: V_use = 50V; Test voltage: V_test = 100V
Power law exponent: n = 4
AF = (100/50)⁴ = 2⁴ = 16×
Extends Arrhenius to include a second stress variable (humidity, voltage, vibration). Derived from quantum mechanics (reaction rate theory). Used in humidity + temperature testing (85°C/85% RH, JEDEC JESD22-A101).
| Test | Stress 1 | Stress 2 | Standard |
|---|---|---|---|
| HAST | 130°C | 85% RH | JESD22-A110 |
| 85/85 | 85°C | 85% RH | JESD22-A101 |
| THB | 85°C | 85% RH + bias | AEC-Q100 |
| HTOL | 125–150°C | Full voltage | JESD22-A108 |
HALT, HASS, and ESS — Qualitative vs Quantitative ALT
Highly Accelerated Life Test
Apply stepwise increasing stress (temperature, vibration, both combined) to failure. Goal: find the operating limit and destruct limit. Qualitative — not intended for life prediction, but for design robustness discovery.
Highly Accelerated Stress Screening
Production screen applied to every unit (or sample). Uses stress levels below HALT destruct limits to precipitate latent defects before shipment without consuming life of good units.
Environmental Stress Screening
Temperature cycling and/or random vibration screen applied post-assembly. MIL-HDBK-2164 defines profiles. Addresses infant mortality phase of bathtub curve — forces early failures to occur in factory, not in the field.
Combining Weibull with ALT — Life Data Analysis
In ALT data analysis, Weibull distribution is fitted at each stress level. The assumption is that the shape parameter β is constant across stress levels (same failure mechanism), while the scale parameter η changes with stress according to the life-stress model.
R(t,T) = exp[−(t/η(T))^β]
β = constant (same failure mechanism)
Reliability Demonstration Testing — Proving What You Claim
A reliability demonstration test (RDT) answers a specific question: "Can I claim with C% confidence that the true reliability is at least R* at time t?" It requires defining a reliability target, a confidence level, a mission time, and a test termination criterion — before running a single unit.
The Mathematics of Demonstration Testing
The fundamental statistical basis: if a sample of n units is tested and c failures are observed, the lower confidence bound on the true failure probability p at a given confidence level C is derived from the binomial distribution (or Poisson for time-terminated tests).
Required sample: n = ln(1−C) / ln(R*)
Or: n = ln(α) / ln(R*) where α = 1−C
Solve for n given C, R*, and allowed failures c
| Reliability R* | 90% Confidence | 95% Confidence | 99% Confidence |
|---|---|---|---|
| 0.900 | 22 | 29 | 44 |
| 0.950 | 45 | 59 | 90 |
| 0.990 | 230 | 299 | 459 |
| 0.999 | 2302 | 2995 | 4603 |
| 0.9999 | 23026 | 29957 | 46051 |
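Every cell of that table comes from the zero-failure (success-run) formula; a minimal sketch:

```python
import math

def success_run_sample_size(reliability, confidence):
    """Smallest n with (R*)^n ≤ 1 − C, i.e. zero failures allowed."""
    return math.ceil(math.log(1 - confidence) / math.log(reliability))

print(success_run_sample_size(0.95, 0.90))   # 45 — matches the table row for R* = 0.95
print(success_run_sample_size(0.99, 0.95))   # 299
```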
Time-Terminated Tests — Poisson Basis
When units are tested for a fixed time T (each), total accumulated test time = n × T. For an exponential (constant failure rate) model, the number of failures follows a Poisson distribution. This allows MTBF/failure rate demonstration.
T_total = total accumulated test time
c = observed failures
α = 1 − C (risk level)
Confidence required: 90% (α = 0.10)
Test plan: 10 units × 1,000 hr each
T_total = 10,000 hr
Result: 0 failures observed
MTBF_lower = 2·T_total / χ²(α; 2c+2)
= 2×10,000 / χ²(0.10; 2) = 20,000 / 4.605
= 4,343 hr
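A sketch of the chi-square lower bound used above (scipy's `chi2.ppf` supplies the percentile); it also shows how a single observed failure lowers the demonstrated MTBF.

```python
from scipy.stats import chi2

def mtbf_lower_bound(total_time, failures, confidence):
    """One-sided lower limit for a time-terminated exponential test:
    MTBF_L = 2·T_total / chi2(alpha; 2c+2), with alpha = 1 − confidence."""
    return 2 * total_time / chi2.ppf(confidence, 2 * failures + 2)

print(round(mtbf_lower_bound(10_000, 0, 0.90)))   # ≈ 4,343 hr — the example above
print(round(mtbf_lower_bound(10_000, 1, 0.90)))   # ≈ 2,571 hr — one failure reduces the claim
```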
Producer & Consumer Risk — The OC Curve for Reliability
Consumer's Risk (β): probability that a product with reliability below the requirement passes the test. A false accept.
Producer's Risk (α): probability that a product with reliability above the requirement fails the test. A false reject.
Discrimination Ratio: the ratio between the MTBF that should be accepted (θ₀, upper test MTBF) and the MTBF that should be rejected (θ₁, lower test MTBF).
Distribution Functions — Complete Reliability Toolkit
NIST 8.1.7–8.1.9 covers the full family of distributions used in reliability engineering. Each distribution is defined by its hazard function shape — choosing the right one is not a statistical preference but a physical claim about the failure mechanism.
Exponential Distribution — NIST 8.1.7
The exponential is the only continuous distribution with the memoryless property: P(T > t+s | T > t) = P(T > s). A component that has survived to time t has the same remaining life distribution as a new component. This applies only during the useful-life phase (constant failure rate).
F(t) = 1 − e^(−λt)
R(t) = e^(−λt)
h(t) = λ (constant)
H(t) = λt
MTTF = 1/λ
Var(T) = 1/λ²
Median = ln(2)/λ = 0.693/λ
MTBF = 1/λ = 50,000 hr
R(t=8760 hr) = e^(−2×10⁻⁵ × 8760)
= e^(−0.1752) = 83.9% (1-year reliability)
R(t=40000) = e^(−0.8) = 44.9%
P(fail before 40,000 hr) = 55.1%
Common misconception: MTBF = 50,000 hr does NOT mean the component lasts 50,000 hr. It means ~63.2% fail BEFORE 50,000 hr. At t = MTBF, R(MTBF) = e⁻¹ = 36.8% survive.
Lognormal Distribution — NIST 8.1.9
If ln(T) ~ Normal(µ, σ²), then T ~ Lognormal(µ, σ). Best for failure mechanisms driven by multiplicative damage accumulation: fatigue, corrosion, crack propagation. The hazard function is unimodal — rises then decreases (IFR then DFR), making it physically realistic for degradation processes.
F(t) = Φ[(ln t−µ)/σ]
R(t) = 1 − Φ[(ln t−µ)/σ]
h(t) = f(t)/R(t) [no closed form]
MTTF = exp(µ + σ²/2)
Median = e^µ
Var(T) = e^(2µ+σ²)·(e^(σ²)−1)
µ = 10.5, σ = 0.8 (in ln-hours). Find R at 30,000 hr and MTTF.
z = (ln 30,000 − µ)/σ = (10.309 − 10.5) / 0.8 = −0.239
R(30000) = 1 − Φ(−0.239) = 59.4%
MTTF = exp(10.5 + 0.8²/2) = exp(10.82) = 49,916 hr
Normal Distribution in Reliability
The Normal(µ, σ) is appropriate when failure times have a symmetric distribution — tight wear-out mechanisms where fatigue accumulates uniformly. The hazard function is strictly increasing (IFR), making it suitable for components that reliably wear out at a predictable age.
F(t) = Φ[(t−µ)/σ]
R(t) = 1 − Φ[(t−µ)/σ]
h(t) = φ(z) / [σ(1−Φ(z))] strictly IFR
MTTF = µ
B10 = µ − 1.282σ (10th percentile)
µ = 60,000 km, σ = 8,000 km. Find B10 and R at 45,000 km.
B10 = µ − 1.282σ = 60,000 − 1.282×8,000 = 49,744 km
R(45,000) = 1 − Φ[(45,000−60,000)/8,000]
= 1 − Φ(−1.875) = 97.0%
Gamma Distribution — NIST 8.1.9
Gamma(k, β) is the distribution of the sum of k independent exponential(1/β) random variables. Shape parameter k controls hazard function shape: k < 1 gives DFR, k = 1 gives exponential, k > 1 gives IFR.
F(t) = I(t/β, k) [incomplete gamma ratio]
R(t) = 1 − I(t/β, k)
MTTF = kβ
Var(T) = kβ²
Mode = (k−1)β for k ≥ 1
| Failure Mechanism | Best Distribution |
|---|---|
| Constant random failures | Exponential |
| Infant mortality / any phase | Weibull |
| Fatigue, corrosion, crack growth | Lognormal |
| Symmetric, tight wear-out | Normal |
| Sum of k failure events | Gamma |
| Unknown — fit all, use AIC/BIC | Probability plot comparison |
Parameter Estimation — MLE, Rank Regression & Censored Data
Fitting a reliability distribution to field or test data is a statistical inference problem. Two main methods: Maximum Likelihood Estimation (MLE) — the NIST-preferred method for accuracy and confidence interval generation — and Rank Regression — graphical, intuitive, and useful for small samples. Both must handle censored data correctly.
Censoring — The Core Challenge of Reliability Data
The exact failure time tᵢ is known. Contributes f(tᵢ) to the likelihood. The ideal case — often impractical in life testing.
Unit survived to time cᵢ (end of test or withdrawal). We know T > cᵢ but not exact failure time. Most common type.
Unit already failed before first inspection at time dᵢ. We know T < dᵢ. Common in inspection data.
Failure in interval [Lᵢ, Rᵢ] — inspected OK at Lᵢ, failed at Rᵢ. Very common in periodic inspection.
Maximum Likelihood Estimation (MLE) — NIST 8.2.6
MLE finds the parameter values that make the observed data most probable. For mixed censored data with r failures and (n−r) censored units:
Log-likelihood: ℓ(θ) = Σᵢ ln f(tᵢ) + Σⱼ ln R(cⱼ)
Maximise ℓ(θ) by solving: ∂ℓ/∂θ = 0 (numerically)
∂ℓ/∂η: −rβ/η + (β/η^(β+1))Σ tᵢᵝ = 0
→ Solve numerically (Newton-Raphson or EM algorithm)
- Asymptotically unbiased and efficient
- Handles all censoring types correctly
- Provides Fisher information for confidence intervals
- Can be used with covariates (regression models)
- Standard in Minitab, ReliaSoft Weibull++
95% CI on R(t): use log-log transform
θ = ln(−ln R̂(t))
Var(θ) ≈ [Σ dᵢ/(nᵢ(nᵢ−dᵢ))] / [ln R̂(t)]²
CI: R̂(t)^exp(±1.96√Var(θ))
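A minimal MLE sketch for Weibull data with right censoring, maximising the mixed log-likelihood above numerically; the ten failure/suspension times are hypothetical and the Nelder-Mead starting point is arbitrary.

```python
import numpy as np
from scipy.optimize import minimize

def weibull_mle(times, censored):
    """MLE of (beta, eta): failures contribute ln f(t); censored units contribute ln R(t) = -(t/eta)^beta."""
    times, censored = np.asarray(times, float), np.asarray(censored, bool)

    def negloglik(params):
        beta, eta = params
        if beta <= 0 or eta <= 0:
            return np.inf
        z = (times / eta) ** beta
        log_f = np.log(beta / eta) + (beta - 1) * np.log(times / eta) - z
        return -(log_f[~censored].sum() - z[censored].sum())

    return minimize(negloglik, x0=[1.0, times.mean()], method="Nelder-Mead").x

t = [310, 540, 790, 1100, 1450, 1900, 2000, 2000, 2000, 2000]   # hypothetical hours
c = [False] * 6 + [True] * 4                                    # last four right-censored
beta_hat, eta_hat = weibull_mle(t, c)
print(f"beta ≈ {beta_hat:.2f}, eta ≈ {eta_hat:.0f} hr")
```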
Kaplan-Meier Estimator — Non-Parametric Survival (NIST 8.2.1)
The Kaplan-Meier estimator computes the empirical survival function without assuming any parametric form. Essential for exploratory analysis. Correctly handles right-censored data (suspended items).
R̂(t) = ∏ over failure times tᵢ ≤ t of (1 − dᵢ/nᵢ)  — the product-limit estimate, where:
tᵢ = ordered failure times
dᵢ = deaths (failures) at tᵢ
nᵢ = units at risk just before tᵢ
(includes censored units still alive)
(† = censored/suspended)
t=500: n=8, d=1 R̂ = 1·(7/8) = 0.875
t=1100: n=6, d=1 R̂ = 0.875·(5/6) = 0.729
t=1400: n=5, d=1 R̂ = 0.729·(4/5) = 0.583
t=2200: n=3, d=1 R̂ = 0.583·(2/3) = 0.389
t=2700: n=2, d=1 R̂ = 0.389·(1/2) = 0.194
t=3200: n=1, d=1 R̂ = 0.194·(0/1) = 0.000
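The worked example as code — a bare-bones product-limit estimator. The two suspension times (800 and 1,900 hr) are assumed for illustration; any censoring times falling in those gaps give the same at-risk counts and the same curve.

```python
import numpy as np

def kaplan_meier(times, is_failure):
    """R(t) = product of (1 - d_i/n_i) over failure times <= t; censored units only shrink the risk set."""
    order = np.argsort(times)
    times, is_failure = np.asarray(times)[order], np.asarray(is_failure)[order]
    at_risk, r_hat, curve = len(times), 1.0, []
    for t, failed in zip(times, is_failure):
        if failed:
            r_hat *= 1 - 1 / at_risk
            curve.append((int(t), round(r_hat, 3)))
        at_risk -= 1
    return curve

times      = [500, 800, 1100, 1400, 1900, 2200, 2700, 3200]
is_failure = [True, False, True, True, False, True, True, True]   # False = suspended (†)
print(kaplan_meier(times, is_failure))
# [(500, 0.875), (1100, 0.729), (1400, 0.583), (2200, 0.389), (2700, 0.194), (3200, 0.0)]
```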
Competing Failure Modes & Stress-Strength Interference
Real systems fail from multiple distinct mechanisms — corrosion, fatigue, overload — acting simultaneously in competition. A single Weibull distribution fitted to mixed data gives misleading results. Understanding competing failure modes and probabilistic stress-strength interaction is essential for design and maintenance decisions.
Competing Failure Modes — The Series System of Mechanisms
If a unit can fail by any of k independent modes, the system survives only if all modes survive. This is a series reliability model on the failure mechanisms:
R_sys(t) = R₁(t) · R₂(t) · … · Rₖ(t) (if modes are independent)
F_sys(t) = 1 − ∏ᵢ [1 − Fᵢ(t)]
h_sys(t) = h₁(t) + h₂(t) + … + hₖ(t) ← hazard functions ADD
For exponential modes: λ_sys = λ₁ + λ₂ + … + λₖ
Mixed Weibull Populations — Bimodal Failure Data
f(t) = p·f₁(t) + (1−p)·f₂(t)
R(t) = p·R₁(t) + (1−p)·R₂(t)
where p = fraction from subpopulation 1
F₁(t) = Weibull(β₁, η₁) [infant mortality]
F₂(t) = Weibull(β₂, η₂) [wear-out]
10% of assemblies have a solder defect (β₁=0.6, η₁=200 hr), 90% are good (β₂=3.5, η₂=12,000 hr).
At t = 100 hr:
F₁(100) = 1−exp[−(100/200)^0.6] = 0.483
F₂(100) ≈ 0.000
F_mix(100) = 0.10×0.483 + 0.90×0 = 4.8%
At t = 8000 hr:
F₁(8000) = 1−exp[−(8000/200)^0.6] ≈ 1.000
F₂(8000) = 1−exp[−(8000/12000)^3.5] = 0.215
F_mix(8000) = 0.10×1.000 + 0.90×0.215 = 29.3%
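A two-function mixture model in code; it reproduces the percentages above for the 10% defective / 90% good solder population.

```python
import math

def weibull_cdf(t, beta, eta):
    return 1 - math.exp(-((t / eta) ** beta))

def mixed_cdf(t, p, pop1, pop2):
    """F_mix(t) = p·F1(t) + (1 − p)·F2(t) for a two-subpopulation mixture."""
    return p * weibull_cdf(t, *pop1) + (1 - p) * weibull_cdf(t, *pop2)

for t in (100, 8000):
    print(t, round(mixed_cdf(t, 0.10, (0.6, 200), (3.5, 12_000)), 3))
# → 0.048 at 100 hr and 0.293 at 8,000 hr
```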
Stress-Strength Interference Model — NIST 8.1.11
R = ∫₋∞^∞ f_S(s) · [1 − F_R(s)] ds
= ∫₋∞^∞ f_S(s) · P(R > s) ds
(R−S) ~ N(µ_R−µ_S, σ_R²+σ_S²)
Reliability = Φ[z]
z = (µ_R − µ_S) / √(σ_R² + σ_S²)
z = "reliability index" β
R ~ N(µ_R=500 MPa, σ_R=40 MPa); S ~ N(µ_S=350 MPa, σ_S=30 MPa)
z = (500 − 350) / √(40² + 30²)
= 150 / √2500 = 150/50 = 3.0
Reliability = Φ(3.0) = 99.865%
P(failure) = 1,350 ppm
Safety factor = µ_R/µ_S = 500/350 = 1.43
→ Safety factor 1.43 → 1,350 ppm failure
→ z = 3.0 is the real risk metric
Safety Factor vs z: A high deterministic safety factor with high variability may give worse reliability than a lower safety factor with tight distributions. The reliability index z accounts for both mean margins AND variability — it is the true engineering measure of safety.
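The normal-normal interference case in a few lines, using only the standard library; it reproduces the z = 3.0 / 1,350 ppm example.

```python
from statistics import NormalDist

def stress_strength(mu_r, sd_r, mu_s, sd_s):
    """Reliability = Phi(z), z = (mu_R − mu_S) / sqrt(sd_R² + sd_S²)."""
    z = (mu_r - mu_s) / (sd_r**2 + sd_s**2) ** 0.5
    return z, NormalDist().cdf(z)

z, rel = stress_strength(500, 40, 350, 30)
print(z, round(rel, 5), round((1 - rel) * 1e6), "ppm")   # 3.0, 0.99865, 1350 ppm
```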
Maintainability & Availability — Repairable Systems
Most real-world systems are repairable. Reliability alone is insufficient; engineering must also quantify maintainability (ease and speed of repair) and availability (net fraction of time the system is operational). This section covers NIST 8.4 and renewal theory fundamentals.
Three Levels of Availability
Inherent Availability (Aᵢ) — design-level, ideal conditions
Considers only corrective maintenance. Ignores PM time, logistics, supply delays. The theoretical maximum: Aᵢ = MTBF/(MTBF + MTTR).
Achieved Availability (Aₐ) — operations, CM + PM included
Includes corrective and preventive maintenance downtime. Does not include logistics/administrative delays.
Operational Availability (Aₒ) — real-world, all delays included
Includes logistics delay time (LDT) and administrative delay time (ADT). The real-world user experience.
Steady-State Availability — Markov Model Derivation
UP → DOWN at rate λ (failure)
DOWN → UP at rate µ = 1/MTTR
A(t) = µ/(λ+µ) + [λ/(λ+µ)]·e^(−(λ+µ)t)
Steady-state (t → ∞):
A(∞) = µ/(λ+µ) = MTBF/(MTBF+MTTR)
With MTBF = 1,000 hr and MTTR = 4 hr: A(∞) = 1000/1004 = 99.60%
MTBF → 2,000 hr (2× reliability improvement):
A = 2000/2004 = 99.80% (+0.20%)
MTTR → 2 hr (2× maintainability improvement):
A = 1000/1002 = 99.80% (+0.20%)
→ Equal gain! Compare investment costs.
Optimal PM Interval — Cost Minimisation
C_P = planned PM cost, C_F = corrective failure cost. Optimise the long-run cost rate of replacing at age t:
C(t) = [C_P·R(t) + C_F·F(t)] / M(t)
M(t) = ∫₀ᵗ R(u)du [expected cycle length up to t]
Solve dC(t)/dt = 0 → find t*
β=2.5, η=3,000 hr. C_P=£500 (planned), C_F=£8,000 (failure + downtime cost).
F(1800) = 1−exp[−(1800/3000)^2.5] = 0.243; M(1800) ≈ 1,670 hr
Cost rate with PM at 1,800 hr = (500×0.757 + 8000×0.243) / 1,670 ≈ £1.39/hr
Run-to-failure: £8,000/MTTF = £8,000/2,662 hr ≈ £3.00/hr
→ PM saves roughly half the maintenance cost rate (~54%)
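A numeric search sketch for t*: evaluate the age-replacement cost rate over a grid of candidate PM intervals and take the minimum. The grid, step size, and integration resolution are illustrative choices.

```python
import numpy as np

def pm_cost_rate(t_pm, beta, eta, c_pm, c_fail, n=2000):
    """C(t) = [C_P·R(t) + C_F·F(t)] / integral of R(u) from 0 to t, for Weibull(beta, eta)."""
    u = np.linspace(0, t_pm, n)
    r = np.exp(-((u / eta) ** beta))
    cycle_length = np.sum((r[:-1] + r[1:]) / 2) * (u[1] - u[0])   # trapezoid rule
    r_t = r[-1]
    return (c_pm * r_t + c_fail * (1 - r_t)) / cycle_length

intervals = np.arange(500, 4001, 50)
rates = [pm_cost_rate(t, 2.5, 3000, 500, 8000) for t in intervals]
best = intervals[int(np.argmin(rates))]
print(f"t* ≈ {best} hr at ≈ £{min(rates):.2f}/hr")
```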
Renewal Theory — HPP vs NHPP (NIST 8.3)
Repairable systems restored to "as good as new" follow a Homogeneous Poisson Process (HPP). Partially-repaired systems follow a Non-Homogeneous Poisson Process (NHPP) with time-dependent intensity ρ(t).
E[N(t)] = λt
Var[N(t)] = λt
Test: cumulative failures vs t is linear
E[N(t)] = λt^β
β < 1: improving (reliability growth)
β = 1: HPP (constant)
β > 1: worsening (reliability decay)
MLE: β̂ = n / Σᵢ ln(T/tᵢ)
β̂ = 12 / [Σᵢ ln(2000/tᵢ)]
= 12 / 21.4 ≈ 0.560
β̂ = 0.56 < 1 → reliability is growing
Projected failures at T=4,000 hr:
λ̂ = n/T^β̂ = 12/2000^0.56 ≈ 0.17
E[N(4000)] = 0.17×4000^0.56 ≈ 17.7
MIL-HDBK-189C: Plot cumulative failures vs ln(t) on log-log paper. A straight line confirms the Power Law NHPP. Slope = β. Standard for reliability growth tracking during development testing.
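A closed-form sketch of the Crow-AMSAA estimates, applied to the summary figures from the example (n = 12 failures in T = 2,000 hr with Σ ln(T/tᵢ) = 21.4, taken as given rather than from a raw failure log).

```python
n, T = 12, 2000
sum_log = 21.4                       # Σ ln(T/tᵢ) from the development-test failure log
beta_hat = n / sum_log               # ≈ 0.56 < 1 → reliability is growing
lam_hat = n / T ** beta_hat          # ≈ 0.17
projected = lam_hat * 4000 ** beta_hat
print(round(beta_hat, 2), round(lam_hat, 3), round(projected, 1))   # ≈ 17.7 failures by 4,000 hr
```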
Statistical Distributions
A distribution is not just a formula. It is a model of how data behaves: where values cluster, how tails behave, what kinds of outcomes are possible, and what assumptions your downstream analysis is making.
This page is designed as a world-class reference and teaching system: an 8-distribution visual studio, a 30-family continuous catalog, a 9-family discrete catalog, and a selector guide that tells users which distribution to choose and under what conditions.
Essential Distributions Studio — Visual, Formula-Driven, Example-Led
NIST/SEMATECH emphasizes that distribution choice should be supported by graphics and goodness-of-fit checks, including probability plots for competing families. This studio front-loads the distributions engineers use most often and connects each one to a graph, formula, parameter meaning, and actual engineering use case.
Normal distribution
The normal distribution is the default model for many physical measurements when variation comes from many small additive sources. It is symmetric, bell-shaped, and fully determined by μ and σ.
Real engineering example
Coating thickness across a stable roll-to-roll process often looks approximately normal when the process is centered and major disturbances are absent. That is why capability analysis and Z-based defect estimates often start here.
Condition for use
Use it for continuous measurements when the histogram is approximately symmetric, the tails are not wildly heavy, and the normal probability plot is reasonably straight.
Lognormal distribution
A variable is lognormal when its logarithm is normally distributed. Values are strictly positive and the distribution is right-skewed, often with a long tail.
Real engineering example
Cycle times, repair times, particle sizes, and supplier lead times often show lognormal behavior because many multiplicative factors stretch the upper tail.
Condition for use
Use it when values cannot be negative and the upper tail stretches farther than the lower side; especially when multiplicative factors drive the data.
Weibull distribution
Weibull is the workhorse of life-data analysis because its shape parameter β changes the hazard behavior. That makes it useful for infant mortality, random failure, and wear-out.
Real engineering example
Cycles-to-failure of tabs, seal fatigue life, or motor bearing failure times are often modeled with Weibull because the failure pattern changes across the life cycle.
Condition for use
Use it for life / failure data when the hazard is not obviously constant and you need a flexible reliability model tied to physics of failure.
Exponential distribution
The exponential distribution models waiting times when the event rate is constant. It is memoryless, so the future does not depend on how long you have already waited.
Real engineering example
If rare unscheduled line stoppages occur independently at a roughly constant average rate, time-between-stoppages is often modeled exponentially.
Condition for use
Use it for interarrival times and random-failure periods only when the hazard is approximately constant. If hazard changes with age, move to Weibull.
Binomial distribution
The binomial distribution models the number of successes or defectives in a fixed number of independent yes/no trials with the same probability p.
Real engineering example
If you inspect 20 welds and each weld is either acceptable or defective, the number of defectives in the sample is binomial.
Condition for use
Use it when you have a fixed number of independent trials, each trial has only two outcomes, and the probability of success/defect is constant.
Poisson distribution
The Poisson distribution models counts of rare events per unit area, time, volume, or opportunity when events occur independently at a constant average rate λ.
Real engineering example
Pinholes per square meter, scratches per panel, voids per electrode sheet, or complaints per day often start with a Poisson model.
Condition for use
Use it for counts of events per fixed opportunity when events are independent and the average rate is reasonably stable.
Student's t distribution
The t distribution is used when estimating a mean from a small sample and the population standard deviation is unknown. It has heavier tails than the normal distribution.
Real engineering example
Suppose you have only 8 peel-strength results from a pilot line and need a confidence interval for the mean. That interval is built with a t critical value, not a Z critical value.
Condition for use
Use it when the sample is small and population sigma is unknown; it is a reference distribution for inference, not usually the raw data model itself.
Chi-square distribution
The chi-square distribution is built from sums of squared standard normal variables. It appears in variance confidence intervals, chi-square tests, and goodness-of-fit problems.
Real engineering example
If you want a confidence interval for process variance, or you need a chi-square goodness-of-fit test for counts in categories, chi-square is the reference distribution.
Condition for use
Use it whenever squared deviations and sample variance are central to the question, such as variance intervals and goodness-of-fit tests.
How to choose rigorously: NIST recommends comparing competing distributions with graphics such as probability plots and checking whether the selected model is consistent with the process mechanism and the observed tail behavior.
Continuous Distribution Catalog — 30 Families in Selector Studio
This catalog uses the same click-to-learn approach as the Visual Studio. Select any continuous family to see the formula, symbol explanations, characteristics, use conditions, and a larger visual preview.
Gamma
Positive right-skewed data such as waiting times or accumulated damage.
f(x) = x^(k−1) e^(−x/θ) / (Γ(k) θ^k) for x > 0; k = shape, θ = scale, Γ(k) = gamma function that generalizes the factorial.
Strictly positive, right-skewed, flexible body and tail. As k increases, the curve becomes less skewed and more bell-like.
Time to absorb moisture to a threshold, service duration, or rainfall-like waiting quantity.
Discrete Distribution Catalog — 9 Families in Selector Studio
Select any discrete family to view its formula, symbol meanings, characteristics, use conditions, and a larger visual preview.
Bernoulli
Single pass/fail trial.
P(X = x) = p^x (1−p)^(1−x) for x ∈ {0, 1}; p = success probability.
Only two outcomes are possible. It is the atomic building block for binomial-type models.
One weld acceptable or not; one part passes or fails.
Selector Guide — Which Distribution Should I Use?
Start with the data type, then the mechanism, then the shape. This is the practical decision flow quality engineers need.
Continuous measurement, symmetric histogram
Start with Normal. Confirm with a histogram and normal probability plot.
Continuous, positive only, strong right skew
Check Lognormal, Gamma, Weibull, or Log-logistic. Use process mechanism to decide.
Time-to-failure or cycles-to-failure
Start with Weibull. Use Exponential only if the hazard appears constant. Consider Lognormal when multiplicative degradation dominates.
Pass/fail counts in fixed sample size
Use Binomial. If sampling is without replacement from a finite lot, use Hypergeometric.
Defects per unit / event counts per time
Use Poisson for rare-event counts. If variance is much larger than the mean, consider Negative Binomial.
Need confidence interval for mean with small n
Use the t distribution for the inferential step, even if the underlying raw process data are approximately normal.
Need variance interval or goodness-of-fit test
Use Chi-square. For ANOVA or variance-ratio tests, use F.
Bounded proportion from 0 to 1
Use Beta or a transform-normal bounded family such as Johnson SB when shape flexibility is needed.
Best-practice workflow
1) Plot the data. 2) Use process knowledge to narrow the candidate families. 3) Compare competing fits with probability plots or fit statistics. 4) Choose the simplest defensible model that matches both the data and the mechanism.
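A sketch of step 3 — comparing candidate life distributions on the same data with a fit statistic. The simulated sample and the three candidate families are illustrative assumptions; location is fixed at zero because life data cannot be negative, and the AIC comparison should always be backed by a probability plot.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
data = stats.weibull_min.rvs(2.2, scale=1500, size=60, random_state=rng)  # e.g. cycles to failure

for name, dist in [("Weibull", stats.weibull_min), ("Lognormal", stats.lognorm), ("Gamma", stats.gamma)]:
    params = dist.fit(data, floc=0)                    # shape + scale, location pinned at 0
    aic = 2 * 2 - 2 * np.sum(dist.logpdf(data, *params))
    print(f"{name:10s} AIC = {aic:.1f}")               # lower is better, all else equal
```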
Design of Experiments (DOE)
A practical guide to DOE — from foundational concepts through full factorial, fractional factorial, Taguchi, and mixture designs. Every concept is illustrated with real worked examples. Pioneered by Sir Ronald A. Fisher in the 1920s and extended by Taguchi, Box, Plackett & Burman — DOE remains the most powerful process optimisation tool available to quality engineers.
What is Design of Experiments?
DOE is the simultaneous study of several process variables. Rather than changing one factor at a time, you combine multiple factors in one study — drastically reducing the amount of testing required while gaining far deeper process understanding. It is primarily a logic tool, not an advanced mathematics tool.
Why NOT One-Factor-At-A-Time (OFAT)?
- ▸ Change Temperature → measure
- ▸ Change Pressure → measure
- ▸ Change Speed → measure
- Cannot detect interactions between factors
- Wastes runs. Misleading conclusions possible.
- ▸ All combinations tested together
- ▸ Same data used for multiple factors
- ▸ Detects interactions between factors
- ▸ Fewer total runs for same information
- ▸ Builds a predictive model of the process
The 9 Steps for Analysis of Effects
Every experiment in this module follows these nine analytical steps. Steps 3–6 are skipped for unreplicated experiments, attribute data, and Taguchi S/N ratio analyses — a half-normal plot is used instead.
The 6 Objectives of DOE
Key Concepts & Vocabulary
DOE has its own precise vocabulary. Understanding these terms is essential — both for exam questions and for reading DOE results correctly.
| Term | Definition | Example |
|---|---|---|
| Factor | A controllable input variable (X) that may affect the response. Also called independent variable. | Temperature, Pressure, Vendor, Catalyst concentration |
| Level | The specific setting or value used for a factor in an experiment. Two-level designs use High (+) and Low (−). | Temperature: Low = 580°F, High = 600°F |
| Response | The output (Y) being measured and improved. Also called dependent variable. | Bond strength, Yield %, Weight loss, Hardness |
| Run / Treatment | A unique combination of factor levels. Each run may be performed more than once. | A+ B− = High Temp + Vendor Y |
| Replication | An independent repeat of a run that includes a completely new setup. Provides estimate of inherent variation. | Running A+B+ three times from scratch |
| Repeat | Repetition of a run WITHOUT a new setup. Not the same as replication — does not estimate experimental error independently. | Running the same conditions back-to-back without reset |
| Full Factorial (2ᵏ) | All possible combinations of factor levels. 2 factors × 2 levels = 4 runs (2²). 3 factors = 8 runs (2³). | 2² design: 4 unique treatments |
| Main Effect | The average change in response when moving a factor from its low to its high level, averaged across all levels of other factors. | E(A) = Ȳ(A+) − Ȳ(A−) = +2.05 units |
| Interaction | When the effect of one factor depends on the level of another factor. If interactions are significant, the interaction plot is more meaningful than the main effect plots. | Temperature effect is +5.1 with Vendor X but −1.0 with Vendor Y |
| Confounding / Alias | When two effects are indistinguishable from each other because they produce identical sign patterns in the design matrix. | In a ½ fraction of a 2³, C is confounded with AB |
| Resolution | Describes the severity of confounding. Resolution III: main effects aliased with 2-factor interactions. Resolution V: 2-factor interactions not aliased with each other. | Res III = screening only; Res V = can estimate all interactions |
| Randomization | Running trials in random order to protect against unknown time-related trends or disturbances. The "insurance policy" against misleading results. | Draw numbered cards from a hat to determine run order |
| Blocking | Grouping experimental runs to account for a known source of variation that cannot be randomized (e.g., different batches of raw material). | Run half the trials with Batch 1, half with Batch 2 |
| Center Points | Runs at the midpoint of all factor levels (coded value = 0). Used to detect nonlinearity/curvature and increase degrees of freedom. | If Temp range is 580–600°F, center point = 590°F |
| Residual | The difference between the actual observed response and the value predicted by the model. Used to validate model assumptions. | Residual = Observed − Predicted |
| Inherent Variation | The random background noise of a process. In DOE = "experimental error." In SPC = "common cause variation." | The natural process scatter that is always present |
Quantitative vs Qualitative Factors
Levels can be set along a continuous measurement scale. Preferred because they allow interpolation and optimization across the range. Example: Temperature (580–600°F), Time (45–90 sec), Concentration (10%–20%).
Levels are discrete categories — a finite number of options with no natural numeric order. Example: Vendor (X vs Y), Machine type (A vs B), Operator (Shift 1 vs Shift 2). Cannot interpolate between levels.
Coded Values — The +1 / −1 System
DOE encodes factor levels as −1 (low), 0 (center), and +1 (high). This allows the same mathematical framework to work for any factor regardless of its physical units.
Statistical Foundations for DOE
Hypothesis Testing — Type I and Type II Errors
Every DOE conclusion is a hypothesis test. Understanding error types and risks is fundamental to interpreting results correctly.
| Decision | H₀ is TRUE (no real effect) | H₀ is FALSE (real effect exists) |
|---|---|---|
| Accept H₀ (fail to reject) | ✓ Correct — Probability = 1 − α | ✗ Type II Error — Probability = β (miss a real effect) |
| Reject H₀ | ✗ Type I Error — Probability = α (false alarm) | ✓ Correct — Probability = 1 − β (Power) |
Claiming a significant effect when there isn't one. A false alarm. Typical α = 0.05 means you'll incorrectly claim significance 5 times in 100.
Missing a real effect — declaring no significance when a real difference exists. Power = 1 − β. Increase sample size to reduce β.
One-Tail vs Two-Tail Tests
Normal Probability Plots — Recognising Patterns
If data is normally distributed, points fall on a straight line. Deviations from the line reveal the distribution's character. The "pencil test": if a pencil covers all the points, the data is approximately normal.
Dean & Dixon Outlier Test
Used to detect outliers in normally distributed data before running a DOE. Data must be sorted smallest to largest. The formula used depends on sample size.
| n | Test Statistic for Smallest Value | Test Statistic for Largest Value | Decision Rule |
|---|---|---|---|
| 3 to 7 | r₁₀ = (X₂ − X₁) / (Xₙ − X₁) | r₁₀ = (Xₙ − Xₙ₋₁) / (Xₙ − X₁) | If r_calc > r_crit → outlier at chosen α |
| 8 to 10 | r₁₁ = (X₂ − X₁) / (Xₙ₋₁ − X₁) | r₁₁ = (Xₙ − Xₙ₋₁) / (Xₙ − X₂) | |
| 11 to 13 | r₂₁ = (X₃ − X₁) / (Xₙ₋₁ − X₁) | r₂₁ = (Xₙ − Xₙ₋₂) / (Xₙ − X₂) | |
| 14 to 30 | r₂₂ = (X₃ − X₁) / (Xₙ₋₂ − X₁) | r₂₂ = (Xₙ − Xₙ₋₂) / (Xₙ − X₃) |
Worked Example (n = 10): Data: 1, 3, 6, 7, 8, 9, 10, 11, 12, 23. For largest value: r₁₁ = (23 − 12)/(23 − 3) = 11/20 = 0.550. Critical value r₁₁ at α=0.05 = 0.477. Since 0.550 > 0.477, the value 23 IS an outlier at 95% confidence. The smallest value 1 gives r₁₁ = 0.182 < 0.477 — not an outlier.
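The same ratios as a short helper for the n = 8–10 case (the r₁₁ statistic used in the worked example); critical values still come from a Dixon table.

```python
def dixon_r11(data):
    """Dean & Dixon r11 ratios (valid for n = 8 to 10), for the smallest and largest values."""
    x = sorted(data)
    r_smallest = (x[1] - x[0]) / (x[-2] - x[0])
    r_largest = (x[-1] - x[-2]) / (x[-1] - x[1])
    return r_smallest, r_largest

r_lo, r_hi = dixon_r11([1, 3, 6, 7, 8, 9, 10, 11, 12, 23])
print(round(r_lo, 3), round(r_hi, 3))   # 0.182 and 0.550 vs r_crit = 0.477 (α = 0.05, n = 10)
```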
Analysis of Variance (ANOVA)
ANOVA partitions the total variation in a dataset into components from different sources. It tests whether three or more group means are equal — a generalisation of the t-test. It produces an F-statistic: the ratio of between-group variance to within-group variance.
One-Way ANOVA — Testing One Factor
Tests whether a single factor (with 3+ levels) significantly affects the response. Assumptions: normality, independence, equal variances, interval data.
| Source | SS | df | MS | F Calculated | F Critical | Decision |
|---|---|---|---|---|---|---|
| Between groups | 46.8 | 2 | 23.4 | 10.97 | 3.89 | Reject H₀ |
| Within groups (error) | 25.6 | 12 | 2.1 | — | — | — |
| Total | 72.4 | 14 | — | — | — | — |
Two-Way ANOVA — Testing Two Factors + Interaction
Extends one-way ANOVA to test two factors simultaneously AND their interaction. Example: Press (2 levels) × Dwell Time (3 levels).
| Source | SS | df | MS | F Calculated | F Critical (α=0.05) | Decision |
|---|---|---|---|---|---|---|
| Rows (Press) | 1.4 | 1 | 1.4 | 0.74 | 4.75 | Fail to reject — Press NOT significant |
| Columns (Dwell time) | 46.3 | 2 | 23.2 | 12.21 | 3.89 | Reject H₀ — Dwell time IS significant |
| Rows × Columns (Interaction) | 3.5 | 2 | 1.8 | 0.95 | 3.89 | Fail to reject — No significant interaction |
| Within (error) | 23.3 | 12 | 1.9 | — | — | — |
| Total | 74.5 | 17 | — | — | — | — |
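One-way ANOVA is a one-liner with scipy; the three groups below are hypothetical tensile results, included only to show the call and the F/p output that feeds decision tables like the ones above.

```python
from scipy import stats

supplier_a = [68.2, 69.1, 67.8, 70.0, 68.5]
supplier_b = [71.4, 72.0, 70.8, 71.9, 72.3]
supplier_c = [69.0, 68.4, 69.8, 70.2, 69.1]

f_stat, p_value = stats.f_oneway(supplier_a, supplier_b, supplier_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# Reject H₀ if p < α — then run a post-hoc test to identify which pairs differ
```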
2-Factor Full Factorial — Completely De-mystified
A 2² full factorial is the simplest true experiment. Two factors, each at two levels. Four unique combinations. Run them all — then the mathematics tells you exactly which factors matter, how much, and whether they interact. No guessing. No one-factor-at-a-time (OFAT) blindness.
A plastics engineer is investigating weld line strength (MPa) in injection-moulded parts. Weld lines form where two flow fronts meet and are a known weak point. Two factors are suspected to influence strength: Melt Temperature and Injection Speed.
Goal: maximise weld line strength. Budget: 12 shots total. Each of the 4 combinations is run 3 times (replicated).
Step 1 — The Design Matrix & Experimental Data
Run all 4 combinations in random order (to prevent time-trend bias). Replicate each 3 times. Record the weld line strength for each shot. These are the actual results from the study:
| Run | A (Temp) | B (Speed) | Coded A | Coded B | Rep 1 (MPa) | Rep 2 (MPa) | Rep 3 (MPa) | Mean Ȳ | s² (Variance) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 230°C | 40 mm/s | −1 | −1 | 28.4 | 27.9 | 28.8 | 28.37 | 0.205 |
| 2 | 260°C | 40 mm/s | +1 | −1 | 33.1 | 34.0 | 33.5 | 33.53 | 0.203 |
| 3 | 230°C | 80 mm/s | −1 | +1 | 31.2 | 30.5 | 31.8 | 31.17 | 0.423 |
| 4 ★ | 260°C | 80 mm/s | +1 | +1 | 38.6 | 39.2 | 38.9 | 38.90 | 0.090 |
Step 2 — Visualise the Design Space
Plot the four treatment means on a 2D square. Each corner is one combination. The response values immediately reveal the pattern — and hint at whether an interaction exists.
Step 3 — Calculate the Three Effects
Every 2² factorial has exactly three estimable effects: Main Effect A, Main Effect B, and Interaction AB. The formula is always the same: Effect = (average of high-level runs) − (average of low-level runs).
Ȳ(A+) = (33.53 + 38.90)/2 = 36.22
Ȳ(A−) = (28.37 + 31.17)/2 = 29.77
E(A) = 36.22 − 29.77 = +6.45 MPa
Ȳ(B+) = (31.17 + 38.90)/2 = 35.03
Ȳ(B−) = (28.37 + 33.53)/2 = 30.95
E(B) = 35.03 − 30.95 = +4.08 MPa
[Ȳ(++) + Ȳ(−−)]/2 = (38.90+28.37)/2 = 33.64
[Ȳ(+−) + Ȳ(−+)]/2 = (33.53+31.17)/2 = 32.35
E(AB) = 33.64 − 32.35 = +1.29 MPa
Step 4 — Test for Statistical Significance
An effect that looks large might just be noise. The decision limit (DL) separates real effects from random variation. Any effect whose absolute value exceeds DL is statistically significant.
s_pooled = √((0.205+0.203+0.423+0.090)/4)
= √(0.230) = 0.480 MPa
SE(effect) = s_pooled × √(4/N), N = 12 total shots
= 0.480 × √(4/12)
= 0.480 × 0.577 = 0.277 MPa
df = (replicates − 1) × runs = (3 − 1) × 4 = 8 df
DL = t(0.025, 8) × SE(effect)
= 2.306 × 0.277
= ±0.639 MPa
| Effect | Calculated Value | Absolute Value | Decision Limit | Significant? | Engineering Conclusion |
|---|---|---|---|---|---|
| A — Temperature | +6.45 MPa | 6.45 | ±0.639 | ✓ YES | Temperature is the dominant factor. Run at 260°C. |
| B — Injection Speed | +4.08 MPa | 4.08 | ±0.639 | ✓ YES | Speed matters. Run at 80 mm/s. |
| AB — Interaction | +1.29 MPa | 1.29 | ±0.639 | ✓ YES | Synergy: A+B+ together gives extra benefit. |
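The whole Step 3–4 calculation in a dozen lines of Python, using the replicate data from the design matrix; it reproduces the effect estimates and the ±0.64 MPa decision limit to within rounding.

```python
import numpy as np
from scipy.stats import t as t_dist

# Replicates (MPa) in standard order: (−,−), (+,−), (−,+), (+,+)
data = np.array([[28.4, 27.9, 28.8],
                 [33.1, 34.0, 33.5],
                 [31.2, 30.5, 31.8],
                 [38.6, 39.2, 38.9]])
A = np.array([-1, 1, -1, 1])
B = np.array([-1, -1, 1, 1])
means = data.mean(axis=1)

def effect(col):
    return means[col == 1].mean() - means[col == -1].mean()

effects = {"A": effect(A), "B": effect(B), "AB": effect(A * B)}
s_pooled = np.sqrt(data.var(axis=1, ddof=1).mean())          # pool the four run variances
dl = t_dist.ppf(0.975, df=8) * s_pooled * np.sqrt(4 / data.size)
print({k: round(v, 2) for k, v in effects.items()}, "DL ≈", round(dl, 3))
# {'A': 6.45, 'B': 4.08, 'AB': 1.28}  DL ≈ 0.639
```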
Step 5 — The Interaction Plot (Most Important Graph in DOE)
When an interaction is significant, the main effects alone are misleading. Plot the response at each combination — one line per level of Factor B. Non-parallel lines = interaction. Crossing lines = strong interaction where the best level of A depends on B.
Step 6 — The Prediction Equation & Optimal Settings
C_A = E(A)/2 = 6.45/2 = 3.225
C_B = E(B)/2 = 4.08/2 = 2.040
C_AB = E(AB)/2 = 1.29/2 = 0.645
Grand mean = (28.37+33.53+31.17+38.90)/4 = 32.99
Ŷ = 32.99 + 3.225A + 2.040B + 0.645AB
= 32.99 + 3.225 + 2.040 + 0.645
= 38.90 MPa ✓ (matches Run 4)
Predicted strength at A = +0.5 (245°C), B = +1 (80 mm/s):
Ŷ = 32.99 + 3.225(0.5) + 2.040(1) + 0.645(0.5)(1)
= 32.99 + 1.613 + 2.040 + 0.323
= 36.97 MPa
Run at 260°C melt temperature and 80 mm/s injection speed. Both main effects are significant and positive. The positive interaction (AB = +1.29) means the two factors work better together than the sum of their individual effects — there is a genuine synergy at the high-high combination. Setting A=+1, B=+1 gives the maximum predicted strength of 38.90 MPa — a 37% improvement over the worst combination (28.37 MPa at A−B−).
3-Factor Experiments — Full, Half, Quarter & Plackett-Burman
Adding a third factor multiplies complexity but unlocks far more information. A 2³ full factorial estimates 7 effects from 8 runs. When resources are limited, fractional designs cut runs in half (or more) by making smart aliasing trade-offs. This tab works one engineering study through all four design types so you can see exactly what each gives you — and what each costs you.
A process engineer is investigating solder joint shear strength (N) on a circuit board assembly line. Three process factors are suspected. The goal is to identify which factors matter and set them to maximise strength. Response: joint shear strength (N). Objective: Maximise.
Weak solder joints cause field failures. One-factor-at-a-time testing found that increasing temperature helped — but only sometimes. That inconsistency is the signature of an interaction. DOE will find it.
Design Matrix & Data — Full Factorial
| Std Order | A (Temp) | B (Speed) | C (Flux) | AB | AC | BC | ABC | Y₁ (N) | Y₂ (N) | Ȳ |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | − | − | − | + | + | + | − | 41.2 | 40.8 | 41.00 |
| 2 | + | − | − | − | − | + | + | 49.6 | 50.2 | 49.90 |
| 3 | − | + | − | − | + | − | + | 43.1 | 42.5 | 42.80 |
| 4 | + | + | − | + | − | − | − | 55.8 | 56.4 | 56.10 |
| 5 | − | − | + | + | − | − | + | 44.3 | 43.9 | 44.10 |
| 6 | + | − | + | − | + | − | − | 52.1 | 51.7 | 51.90 |
| 7 | − | + | + | − | − | + | − | 45.6 | 46.2 | 45.90 |
| 8 ★ | + | + | + | + | + | + | + | 60.3 | 61.1 | 60.70 |
Calculating All 7 Effects
For any effect, the formula is: Effect = (average of Ȳ where that column = +) − (average of Ȳ where that column = −). Use the sign column for each effect:
| Effect | + Runs (means) | Avg(+) | − Runs (means) | Avg(−) | Effect Value | |Effect| |
|---|---|---|---|---|---|---|
| A — Temperature | 49.90, 56.10, 51.90, 60.70 | 54.65 | 41.00, 42.80, 44.10, 45.90 | 43.45 | +11.20 N | 11.20 |
| B — Speed | 42.80, 56.10, 45.90, 60.70 | 51.38 | 41.00, 49.90, 44.10, 51.90 | 46.73 | +4.65 N | 4.65 |
| C — Flux Type | 44.10, 51.90, 45.90, 60.70 | 50.65 | 41.00, 49.90, 42.80, 56.10 | 47.45 | +3.20 N | 3.20 |
| AB — Temp × Speed | 41.00, 56.10, 44.10, 60.70 | 50.48 | 49.90, 42.80, 51.90, 45.90 | 47.63 | +2.85 N | 2.85 |
| AC — Temp × Flux | 41.00, 42.80, 51.90, 60.70 | 49.10 | 49.90, 56.10, 44.10, 45.90 | 49.00 | +0.10 N | 0.10 |
| BC — Speed × Flux | 41.00, 49.90, 45.90, 60.70 | 49.38 | 42.80, 56.10, 44.10, 51.90 | 48.73 | +0.65 N | 0.65 |
| ABC — 3-way | 49.90, 42.80, 44.10, 60.70 | 49.38 | 41.00, 56.10, 51.90, 45.90 | 48.73 | +0.65 N | 0.65 |
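The same seven effects computed from the run means and the sign columns of the design matrix (interaction columns are simply element-wise products of the main-effect columns); this reproduces the table above, including the small AC, BC, and ABC estimates.

```python
import numpy as np

# Run means (N) in standard order, plus the A, B, C sign columns from the design matrix
y = np.array([41.00, 49.90, 42.80, 56.10, 44.10, 51.90, 45.90, 60.70])
A = np.array([-1, 1, -1, 1, -1, 1, -1, 1])
B = np.array([-1, -1, 1, 1, -1, -1, 1, 1])
C = np.array([-1, -1, -1, -1, 1, 1, 1, 1])

def effect(col):
    return y[col == 1].mean() - y[col == -1].mean()

for label, col in [("A", A), ("B", B), ("C", C),
                   ("AB", A * B), ("AC", A * C), ("BC", B * C), ("ABC", A * B * C)]:
    print(f"{label:4s} {effect(col):+6.2f} N")
```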
A half fraction runs 4 of the 8 full factorial runs. We choose which 4 by defining a generator: C = AB. This means column C is the same as column AB — so we cannot tell C apart from the AB interaction. This is called aliasing.
A ↔ BC (A is aliased with BC)
B ↔ AC (B is aliased with AC)
C ↔ AB (C is aliased with AB)
Half Fraction Design Matrix (Runs 2, 3, 5, 8 from the Full Factorial — the runs where ABC = +)
| Run | A | B | C=AB | Y₁ (N) | Y₂ (N) | Ȳ | Note |
|---|---|---|---|---|---|---|---|
| 1 | − | − | + | 44.3 | 43.9 | 44.10 | A−B−C+ |
| 2 | + | − | − | 49.6 | 50.2 | 49.90 | A+B−C− |
| 3 | − | + | − | 43.1 | 42.5 | 42.80 | A−B+C− |
| 4 ★ | + | + | + | 60.3 | 61.1 | 60.70 | A+B+C+ |
E(A) [aliased with BC] = (49.90 + 60.70)/2 − (44.10 + 42.80)/2 = 55.30 − 43.45 = +11.85 N
E(B) [aliased with AC] = (42.80 + 60.70)/2 − (44.10 + 49.90)/2 = 51.75 − 47.00 = +4.75 N
E(C) [aliased with AB] = (44.10 + 60.70)/2 − (49.90 + 42.80)/2 = 52.40 − 46.35 = +6.05 N
A quarter fraction uses ¼ of the full factorial runs. For 3 factors, a quarter fraction would be only 2 runs — not useful. Quarter fractions become practical at 5+ factors: a 2⁵ full factorial needs 32 runs, but a 2⁵⁻² needs only 8 runs.
Generator 1: D = AB
Generator 2: E = AC
Defining relation: I = ABD = ACE = BCDE
A ↔ BD ↔ CE
B ↔ AD ↔ CDE
C ↔ AE ↔ BDE
D ↔ AB ↔ BCE
E ↔ AC ↔ BCD
| Design | Factors | Runs | Resolution | What you can estimate | What's aliased |
|---|---|---|---|---|---|
| Full 2ᵏ | k | 2ᵏ | Full | All main effects AND all interactions | Nothing — complete information |
| Half 2ᵏ⁻¹ | k | 2ᵏ/2 | III or IV | All main effects (if Res IV); some 2FIs | Some 2FIs aliased with each other (Res IV) or with main effects (Res III) |
| Quarter 2ᵏ⁻² | k | 2ᵏ/4 | III | All main effects (assuming 2FIs negligible) | Main effects aliased with 2FIs — screening only |
| Plackett-Burman | up to N−1 | 12, 20, 24… | III | All main effects | Each main effect partially confounded with ALL 2FIs not involving it |
Plackett-Burman designs are non-geometric screening designs: the run count is a multiple of 4 (not a power of 2). The 12-run PB can screen up to 11 factors — far more efficient than any 2ᵏ fractional design. The trade-off: each main effect is partially confounded with every two-factor interaction not containing that factor.
PB12 — Applied to Our Solder Study (Extended to 5 Factors)
We extend the solder study by adding 2 more factors: D = Preheat Time (30s vs 60s) and E = Board Orientation (flat vs angled). Now 5 factors. Full factorial = 32 runs. PB12 = 12 runs.
The PB12 is constructed by cycling this first row: + + − + + + − − − + −. Each subsequent row is a cyclic right-shift. Row 12 is all minuses.
| Run | A (Temp) | B (Speed) | C (Flux) | D (Preheat) | E (Orient) | F* | G* | H* | J* | K* | L* | Y (N) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | + | + | − | + | + | + | − | − | − | + | − | 53.2 |
| 2 | − | + | + | − | + | + | + | − | − | − | + | 44.8 |
| 3 | + | − | + | + | − | + | + | + | − | − | − | 58.1 |
| 4 | − | + | − | + | + | − | + | + | + | − | − | 42.3 |
| 5 | − | − | + | − | + | + | − | + | + | + | − | 43.1 |
| 6 | − | − | − | + | − | + | + | − | + | + | + | 40.5 |
| 7 | + | − | − | − | + | − | + | + | − | + | + | 51.9 |
| 8 | + | + | − | − | − | + | − | + | + | − | + | 55.6 |
| 9 | + | + | + | − | − | − | + | − | + | + | − | 59.4 |
| 10 | − | + | + | + | − | − | − | + | − | + | + | 46.2 |
| 11 | + | − | + | + | + | − | − | − | + | − | + | 57.3 |
| 12 | − | − | − | − | − | − | − | − | − | − | − | 39.8 |
Effect of any factor = (mean of Y where that column is +) − (mean of Y where that column is −):
| Factor | + Runs | Mean(+) | Mean(−) | Effect Estimate | Screening Decision |
|---|---|---|---|---|---|
| A — Temperature | 1,3,7,8,9,11 | 55.92 | 43.12 | +12.80 N | ✓ INCLUDE — large, positive |
| B — Speed | 1,2,4,8,9,10 | 50.25 | 48.47 | +5.15 N | ✓ INCLUDE — moderate |
| C — Flux Type | 2,3,5,9,10,11 | 51.48 | 47.22 | +4.26 N | Borderline — follow up |
| D — Preheat Time | 1,3,4,6,10,11 | 49.73 | 49.03 | +0.70 N | Not significant — set by convenience |
| E — Orientation | 1,2,5,7,9,11 | 51.32 | 47.48 | +3.84 N | Moderate — check in follow-up |
Choosing Your Design — Decision Framework
Screening & Fractional Factorial Designs
When you have many potential factors, running a full factorial is impractical — a 2⁷ design requires 128 runs. Screening designs let you study 5–15+ factors in far fewer runs by deliberately aliasing (confounding) higher-order interactions with main effects. The goal is to identify the vital few factors that drive most of the variation, then follow up with a focused optimisation study.
1/16 Fraction: 2⁷⁻⁴ Screening Design — 8 Runs for 7 Factors
A plastics injection moulding team suspects 7 process variables affect warpage. A full 2⁷ requires 128 runs — weeks of production time. A 2⁷⁻⁴ Resolution III design needs only 8 runs.
| Factor | Label | Low (−1) | High (+1) |
|---|---|---|---|
| Melt Temperature | A | 220°C | 260°C |
| Injection Speed | B | 60 mm/s | 100 mm/s |
| Hold Pressure | C | 40 MPa | 80 MPa |
| Hold Time | D | 5 s | 15 s |
| Cooling Time | E | 10 s | 25 s |
| Gate Size | F | Small | Large |
| Mould Temp | G | 30°C | 60°C |
The 2⁷⁻⁴ design uses a base 2³ design in A, B, C — then assigns D=AB, E=AC, F=BC, G=ABC. This gives a saturated Resolution III design: main effects are clear of one another, but each is aliased with two-factor interactions (e.g. A ↔ BD ↔ CE ↔ FG), so the analysis assumes 2FIs are negligible.
| Run | A | B | C | D=AB | E=AC | F=BC | G=ABC | Warpage (mm) |
|---|---|---|---|---|---|---|---|---|
| 1 | −1 | −1 | −1 | +1 | +1 | +1 | −1 | 0.42 |
| 2 | +1 | −1 | −1 | −1 | −1 | +1 | +1 | 0.61 |
| 3 | −1 | +1 | −1 | −1 | +1 | −1 | +1 | 0.38 |
| 4 | +1 | +1 | −1 | +1 | −1 | −1 | −1 | 0.55 |
| 5 | −1 | −1 | +1 | +1 | −1 | −1 | +1 | 0.47 |
| 6 | +1 | −1 | +1 | −1 | +1 | −1 | −1 | 0.58 |
| 7 | −1 | +1 | +1 | −1 | −1 | +1 | −1 | 0.44 |
| 8 | +1 | +1 | +1 | +1 | +1 | +1 | +1 | 0.72 |
Calculating Main Effect Estimates
Each main effect = (average of high runs − average of low runs). For factor A (Melt Temperature): E(A) = (0.61 + 0.55 + 0.58 + 0.72)/4 − (0.42 + 0.38 + 0.47 + 0.44)/4 = 0.615 − 0.428 = +0.188 mm.
| Factor | Effect Estimate | Abs. Effect | Verdict |
|---|---|---|---|
| A — Melt Temperature | +0.188 | 0.188 | ★ Active |
| B — Injection Speed | +0.023 | 0.023 | Inert |
| C — Hold Pressure | −0.053 | 0.053 | Inert |
| D — Hold Time | +0.148 | 0.148 | ★ Active |
| E — Cooling Time | −0.118 | 0.118 | Marginal |
| F — Gate Size | +0.018 | 0.018 | Inert |
| G — Mould Temp | +0.033 | 0.033 | Inert |
Plackett-Burman Designs
Plackett-Burman (PB) designs are Resolution III screening designs that study up to N−1 factors in N runs, where N is a multiple of 4 (12, 20, 24, 28…). They are more economical than fractional factorials for large factor counts but have complex aliasing — every main effect is partially aliased with every 2-factor interaction not involving that factor.
| Design | Runs | Max Factors | Resolution | Best Used For |
|---|---|---|---|---|
| PB-12 | 12 | 11 | III | Rapid screening; 2FI negligible assumption |
| PB-20 | 20 | 19 | III | Large screening studies |
| 2⁴⁻¹ | 8 | 4 | IV | 4-factor screening; 2FI estimable with follow-up |
| 2⁵⁻² | 8 | 5 | III | 5-factor screening; main effects only |
| 2⁶⁻² | 16 | 6 | IV | 6-factor study; cleaner aliasing than PB |
| 2⁷⁻³ | 16 | 7 | IV | 7-factor screening with good resolution |
| 2⁷⁻⁴ | 8 | 7 | III | Maximum economy; 7 factors in 8 runs |
Design Selection Decision Guide
Choose a standard fractional factorial (2ᵏ⁻ᵖ) when:
- ▸ You want clean, interpretable aliasing
- ▸ You may need Resolution IV or V
- ▸ Factor count is modest (4–8 factors)
- ▸ You anticipate a follow-up optimisation study
Choose a Plackett-Burman design when:
- ▸ You have 9–19 factors to screen
- ▸ Resources are very limited
- ▸ 2-factor interactions are expected to be small
- ▸ You only need to identify the vital few factors
- Always randomise run order to protect against lurking time trends.
- Add 2–4 centre points to check for curvature without inflating run count.
- Use a half-normal plot to visually separate active effects from noise.
- If a 2FI is important, upgrade to Resolution V or run a follow-up fold-over.
- Screen first, optimise second — never skip directly to RSM on 8+ factors.
Taguchi Methods
Genichi Taguchi developed a system for improving quality by designing processes that are robust — insensitive to noise factors like temperature drift, humidity, and raw material variation. His philosophy: it is cheaper to design robustness in than to control every noise factor in production.
Signal-to-Noise (S/N) Ratios
The S/N ratio is the primary Taguchi response metric. A higher S/N = a more robust product/process. The formula depends on the optimization objective.
| Objective | S/N Formula | Use When | Example |
|---|---|---|---|
| Smaller is Better | S/N = −10 log(Σy²/n) | Defects, contamination, noise, corrosion, error | Minimise weight loss in corrosion test |
| Larger is Better | S/N = −10 log(Σ(1/y²)/n) | Strength, yield, throughput, efficiency | Maximise bond strength, chemical yield |
| Nominal is Best | S/N = 10 log(ȳ²/s²) | Dimensional tolerances, target values | Hit target wall thickness of 3.0mm ± 0.1 |
| Ordered Categorical | S/N based on scores | Attribute data with ranked categories | Defect severity: none / minor / major / critical |
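A minimal Python sketch of the three quantitative S/N ratios in the table above (the ordered-categorical case is score-based and omitted); the function names are illustrative.

```python
import math

def sn_smaller_is_better(y):
    """S/N = -10 log10( sum(y^2) / n ) -- zero is ideal."""
    return -10 * math.log10(sum(v ** 2 for v in y) / len(y))

def sn_larger_is_better(y):
    """S/N = -10 log10( sum(1/y^2) / n ) -- more is always better."""
    return -10 * math.log10(sum(1 / v ** 2 for v in y) / len(y))

def sn_nominal_is_best(y):
    """S/N = 10 log10( ybar^2 / s^2 ) -- on target with low variance."""
    n = len(y)
    ybar = sum(y) / n
    s2 = sum((v - ybar) ** 2 for v in y) / (n - 1)
    return 10 * math.log10(ybar ** 2 / s2)

# Hypothetical replicate data for a 3.0 mm wall-thickness target:
print(round(sn_nominal_is_best([3.02, 2.98, 3.01, 2.99]), 1))
```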
Taguchi Orthogonal Arrays
Taguchi developed standardised balanced designs called orthogonal arrays (L4, L8, L9, L12, L16…). The notation L₈(2⁷) means: 8 runs, up to 7 factors, each at 2 levels.
| Array | Runs | Max Factors | Levels | Has Interaction Table? | Best Use |
|---|---|---|---|---|---|
| L4 | 4 | 3 | 2 | Yes | Quick 3-factor screen; plastic sealing example |
| L8 | 8 | 7 | 2 | Yes (2⁷) / No (4¹×2⁴) | Standard 2-level screening; steel heat-treat example |
| L9 | 9 | 4 | 3 | No | 3-level factors; plastic processing with 4 factors |
| L12 | 12 | 11 | 2 | No | Main effects only; interactions roughly distributed to all columns |
| L16 | 16 | 15 | 2 | Yes | Large 2-level screening |
Accuracy vs Precision — Taguchi's Starting Point
Mixture Designs
Mixture designs are used when the factors are components of a mixture that must sum to a constant (typically 100% or 1.0). The response depends on the proportions of ingredients, not their absolute amounts. Common in chemicals, food, pharmaceuticals, and polymer formulation.
Key constraint: x₁ + x₂ + x₃ + … = 1. Because of this constraint, standard factorial designs cannot be used directly — you cannot independently vary all components. The feasible experimental region is a simplex (a triangle for three components, a tetrahedron for four).
| Design Type | Points Included | Model Fitted | Use When |
|---|---|---|---|
| Simplex Design | Vertices only (pure components) | Linear | First screening — assume no blend effects |
| Simplex Centroid | Vertices + midpoints + centroid | Quadratic / Cubic | When blend synergism or antagonism is likely |
| Simplex Lattice | Evenly spaced grid across simplex | Polynomial (degree q) | Space-filling coverage; complex response surfaces |
| Extreme Vertices | Constrained vertices + centroid | Quadratic / special cubic | When components have upper/lower bounds (real formulations) |
Blown Film Example: A polymer film is made from three components (A, B, C) that must total 100%. The team runs a three-component quadratic simplex design and measures tensile strength. The model identifies the optimal blend ratio that maximises strength — something impossible to find with OFAT or standard factorial designs.
DOE Quick Reference — Exam Summary
Design Selection Guide
Key Formulas at a Glance
| Quantity | Formula | Notes |
|---|---|---|
| Main Effect E(A) | E(A) = Ȳ(A+) − Ȳ(A−) | Average response at high level minus average at low level |
| Std dev of experiment | sₑ = √(Σs²/k) | k = number of runs; s² = variance per run |
| Std dev of effects | sEff = sₑ × √(4/n) | n = total number of trials |
| Degrees of freedom | df = (obs/run − 1) × runs | If obs/run − 1 = 0, use multiplier of 1 |
| Decision limit | DL = t(α/2, df) × sEff | Effects outside ±DL are statistically significant |
| F-test (variances) | F = s²_larger / s²_smaller | Larger variance always in numerator → one-tail test |
| Nonlinearity effect | E(NL) = Ȳ_center − Ȳ_grand | Significant → linear model invalid; need ≥3 levels |
| Residual | Res = Y_observed − Y_predicted | Used for residual analysis in unreplicated designs |
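The significance test implied by the table (sₑ → sEff → decision limit) takes only a few lines of code. A minimal sketch, assuming hypothetical per-run variances and two observations per run; scipy supplies the t quantile, and the effect values reuse the solder-study estimates purely for illustration.

```python
from scipy import stats

run_variances = [0.08, 0.18, 0.18, 0.32, 0.02, 0.18, 0.08, 0.32]  # s^2 per run (hypothetical)
obs_per_run = 2
k = len(run_variances)                 # number of runs
n = obs_per_run * k                    # total number of trials
df = (obs_per_run - 1) * k             # degrees of freedom

s_e   = (sum(run_variances) / k) ** 0.5        # std dev of the experiment
s_eff = s_e * (4 / n) ** 0.5                   # std dev of an effect
DL    = stats.t.ppf(1 - 0.05 / 2, df) * s_eff  # decision limit at alpha = 0.05

for name, eff in {"A": 11.20, "B": 4.65, "C": 3.20, "AB": 2.85}.items():
    verdict = "significant" if abs(eff) > DL else "not significant"
    print(f"{name:>3}: {eff:+.2f} N vs DL = ±{DL:.2f} N -> {verdict}")
```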
Common Pitfalls to Avoid
| Trap | Correct Understanding |
|---|---|
| Repeat vs Replication | Repeat = same conditions, no new setup (does NOT estimate experimental error). Replication = independent new setup (DOES estimate error). |
| When interaction is significant | The interaction plot is MORE important than the main effect plots. Main effects describe averages; the interaction describes the joint effect. |
| Hierarchy rule | If an interaction AB is significant, include both main effects A and B in the model — even if A or B alone are not significant. |
| Significant nonlinearity | If center points show significant nonlinearity, the linear model is invalid and you cannot interpolate. Must repeat with ≥3 levels. |
| Variation vs Mean analysis | A factor can be insignificant for the mean but critically important for reducing variation. Always run both analyses. |
| Resolution III designs | Main effects are aliased with 2-factor interactions. You can identify which factors matter, but you cannot separate main effects from interactions. |
| Randomisation purpose | Randomisation protects against unknown time-related trends. It is the "insurance policy" — not optional. |
| OFAT advantage claimed | OFAT CANNOT detect interactions between factors. This is a fundamental limitation, not a minor one. DOE is always better when interactions are possible. |
| Factor C (ramp time) in yield example | C was not significant for the mean, but was critical for reducing variation. The "diamond factor" — rare and extremely valuable. |
DOE Procedural Checklist (10 Practical Rules)
Design for Six Sigma (DFSS)
DFSS is not an improvement methodology — it is a design methodology. Where DMAIC fixes a broken process, DFSS builds the right process from scratch. Used when you are creating something new: a product, a service, a manufacturing line. The goal is to design quality in, not inspect it out.
What is DFSS — and when do you use it?
DFSS answers one question: "How do we build a product that is right first time, every time, at the right cost?" It is not a repair kit. It is a design philosophy applied before a single part is cut.
- Designing a completely new product or service
- Existing process cannot meet new requirements
- Entering a new market or technology domain
- Customer requirements are not yet fully understood
- Target sigma level is ≥ 4.5σ from the start
- An existing process is underperforming
- Root cause is unknown but process exists
- Incremental improvement is the goal
- Product design is already locked
- Defect rate needs reduction in current production
The Four DFSS Methodologies — Side by Side
DFSS is not one framework — it is a family. Different industries and organisations use different variants. All share the same core philosophy.
| Methodology | Phases | Best For | Origin |
|---|---|---|---|
| DMADV | Define · Measure · Analyse · Design · Verify | New product or process design — the most widely taught | GE, Motorola |
| IDOV | Identify · Design · Optimise · Validate | Hardware-heavy design; aerospace, automotive | Six Sigma Academy |
| DMADOV | Define · Measure · Analyse · Design · Optimise · Verify | Complex multi-stage designs needing explicit optimisation loop | Honeywell |
| CDOV | Concept · Design · Optimise · Verify | Product platform design, systems engineering | Creveling |
Which should you use? DMADV is the best starting point — it maps cleanly to the Six Sigma belt structure, has the richest toolset documentation, and is recognised across industries. This module teaches DMADV throughout, with notes on where the others differ.
DFSS vs DMAIC — The Core Difference
DMAIC: You have a process. It is producing defects. You investigate, find root causes, implement solutions. Improvement happens on an existing platform.
DFSS: You have a customer need. Nothing exists yet. You translate that need into requirements, generate concepts, select and optimise the best one, then validate it meets the requirements.
The 70% rule: It is widely cited that 70–80% of a product's quality and cost is determined at the design stage. DFSS is the methodology that addresses this window — before tooling is cut, before supply chains are locked, before the cost of change becomes prohibitive.
The DMADV Roadmap — Phase by Phase
Each phase of DMADV has a clear deliverable, a gate review question, and a defined set of tools. You cannot progress to the next phase without answering the gate question. This is what keeps DFSS honest.
01 · Define
What you do: Establish project scope, business case, customer segments, and high-level requirements. Define what success looks like in measurable terms.
02 · Measure
What you do: Translate Voice of Customer into Critical to Quality (CTQ) characteristics. Benchmark competitors. Establish target performance levels with measurable specifications.
03 · Analyse
What you do: Generate multiple design concepts. Use structured methods to evaluate and select the best. Identify critical design parameters and their relationships to CTQs.
04 · Design
What you do: Develop the detailed design. Run DOE to optimise critical parameters. Apply tolerance design and Design for Manufacture/Assembly (DFM/DFA). Predict capability.
05 · Verify
What you do: Validate the design against customer requirements using prototypes and pilot runs. Confirm predicted capability with real data. Hand off to production with full control plan.
Voice of Customer — From Feedback to Specification
VOC is the most underinvested step in most organisations. Teams rush to design solutions before truly understanding the problem. DFSS forces you to slow down here — because every hour spent understanding customers saves ten hours of redesign later.
Step 1 — Gather VOC Data
- Customer interviews (structured)
- Focus groups
- Field observation (Gemba)
- Prototype feedback sessions
- Warranty & complaint data
- Online reviews mining
- Sales team feedback
- Regulatory requirements
- Teardown analysis
- Patent landscape
- Benchmarking studies
- Industry standards review
Step 2 — Kano Model: Not All Requirements Are Equal
The Kano model sorts customer requirements into three categories. Knowing which category each requirement falls into prevents over-engineering the basics and missing the delighters.
Must-Be (Basic): Expected basics. Their presence doesn't delight — their absence causes immediate rejection. Example: a car must start reliably.
Performance: More is better. Directly proportional to satisfaction. Example: fuel economy — customers always want more.
Delighter (Excitement): Not expected, but creates strong positive reaction. Example: automatic parking — customers didn't ask, but love it.
Step 3 — CTQ Tree: Translate Words into Numbers
A CTQ tree converts vague customer language into specific, measurable engineering requirements. Each branch goes from customer need → driver → specification.
"I need to know the pump is working correctly"
Alarm reliability
Alarm response ≤ 2 seconds, 100% of the time
Step 4 — QFD: Linking Customer Needs to Design Parameters
Quality Function Deployment (QFD) — also called the House of Quality — ensures every engineering decision can be traced back to a customer requirement. It prevents the classic trap of designing what is technically elegant rather than what is actually needed.
| Customer Need | Importance (1–5) | Design Parameter | Relationship | Target |
|---|---|---|---|---|
| Light weight | ⭐⭐⭐⭐⭐ 5 | Enclosure material density | Strong (9) | ≤ 1.5 g/cm³ |
| Accurate dosing | ⭐⭐⭐⭐⭐ 5 | Pump mechanism tolerance | Strong (9) | ±0.5% dose accuracy |
| Long battery life | ⭐⭐⭐⭐ 4 | Motor efficiency | Medium (3) | ≥ 72 hr at standard rate |
| Alarm is audible | ⭐⭐⭐⭐ 4 | Speaker output power | Strong (9) | ≥ 75 dB at 1 m |
Concept Design — Generating and Selecting the Best Idea
This is where most engineers spend too little time. The quality of your final design is bounded by the quality of your concept space. If you evaluate only one concept, you are not designing — you are just executing an assumption.
Morphological Chart — Systematic Concept Generation
A morphological chart forces you to decompose the design problem into independent sub-functions and generate alternatives for each. Combining one option from each row creates a unique concept.
| Sub-function | Option A | Option B | Option C |
|---|---|---|---|
| Power source | Rechargeable Li-ion | Disposable alkaline | Mains powered |
| Pump mechanism | Peristaltic | Syringe driver | Rotary gear |
| Display type | LCD numeric | OLED graphic | LED indicator only |
| Alarm | Audible buzzer | Vibration + audible | Wireless to receiver |
| Housing material | ABS plastic | Polycarbonate | Aluminium alloy |
The above chart yields 3⁵ = 243 possible concepts. You don't evaluate all of them — you use engineering judgment to select 3–5 promising combinations for formal comparison.
Pugh Concept Selection — Structured Comparison Against a Datum
The Pugh matrix evaluates concepts against criteria using a datum (reference concept, often the current design or market leader). Scores: + (better), − (worse), S (same).
| Criterion | Weight | Datum (Concept A) | Concept B | Concept C | Concept D |
|---|---|---|---|---|---|
| Weight | 5 | D | + | S | + |
| Battery life | 4 | D | S | + | − |
| Dose accuracy | 5 | D | + | S | + |
| Alarm clarity | 4 | D | S | + | S |
| Manufacturability | 3 | D | − | S | + |
| Weighted score | — | 0 | +14 | +13 | +11 |
The Pugh matrix does not give you the answer — it structures your thinking. Concept B scores highest, but notice its manufacturability weakness. The right response is not to blindly select B, but to ask: "Can we redesign B to address manufacturability while keeping its weight and accuracy advantages?"
Transfer Functions — Linking Design to CTQ
A transfer function is a mathematical relationship: CTQ = f(design parameters). You must establish this before running experiments. Without it, you cannot predict the effect of design changes.
Y = f(RPM, L, A, η)
Each parameter becomes a factor in the DOE. The transfer function tells you which factors matter most.
Design Optimisation — Finding the Best Settings
Once you have a chosen concept and transfer functions, you optimise. This means running designed experiments to find the factor settings that simultaneously maximise performance and minimise sensitivity to variation.
The Two-Step Optimisation Strategy (Taguchi)
Step 1: Find the factor settings that make the output least sensitive to noise (uncontrollable variation). Use Signal-to-Noise ratio as the optimisation metric. Fix these settings first.
Step 2: With variation minimised, use a scaling factor (a factor that affects mean but not variance) to move the mean to the target. This preserves the robustness gained in Step 1.
Signal-to-Noise Ratios — Choosing the Right One
| Characteristic | S/N Formula | When to use | Example |
|---|---|---|---|
| Smaller-the-Better | −10·log(Σy²/n) | Defect counts, vibration, shrinkage — zero is ideal | Dimensional deviation, leakage rate |
| Larger-the-Better | −10·log(Σ1/y²/n) | Strength, yield, life — more is always better | Tensile strength, battery life |
| Nominal-the-Best | 10·log(µ²/σ²) | Target value with symmetric tolerance | Shaft diameter, fill volume, resistance |
Response Surface Methodology (RSM)
When factors are continuous and you need to find an optimal point (not just compare levels), RSM maps the response across the design space. It answers: "At exactly what values of A and B is Y maximised?"
Central Composite Design (CCD): 2ᵏ factorial + star points (±α) + centre points. Fits a full quadratic model. Best for 2–5 continuous factors. Rotatable — equal prediction variance at equal distance from centre.
Box-Behnken Design: Midpoints of cube edges + centre points. Never tests extreme corners — safer when extreme combinations are physically dangerous or impossible. Fewer runs than CCD for k ≥ 3.
The RSM optimum is not the same as "maximise the CTQ." You optimise Value minus Cost. A material that gives 5% better strength but costs 40% more may not be the right choice. Always include cost in the optimisation objective.
Tolerance Design and Variation Management
Tolerances are not free. Too tight — manufacturing cost explodes. Too loose — the product fails in the field. Tolerance design finds the optimal balance using statistical methods rather than engineering gut feel.
Tolerance Stack-Up Analysis
When multiple components assemble together, their individual dimensional variations combine. The question is: what is the probability the assembly falls within its specification?
Worst-case stack-up: Guarantees 100% of assemblies work, but assumes all parts are at their worst-case limits simultaneously. Very conservative — drives unnecessarily tight component tolerances.
Statistical (RSS) stack-up: Accounts for the fact that all parts being at worst-case simultaneously is extremely unlikely. Allows looser component tolerances for the same assembly yield. Requires knowledge of σᵢ per component.
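For a simple linear stack the two methods differ only in how the component tolerances are combined. A minimal sketch with hypothetical tolerances (RSS assumes independent, roughly normal, centred components):

```python
tolerances = [0.10, 0.05, 0.08, 0.12]            # +/- tolerance of each component, mm

worst_case = sum(tolerances)                     # every part at its limit simultaneously
rss = sum(t ** 2 for t in tolerances) ** 0.5     # root-sum-square (statistical) combination

print(f"Worst-case assembly tolerance: +/-{worst_case:.3f} mm")
print(f"Statistical (RSS) tolerance:   +/-{rss:.3f} mm")
```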
Propagation of Variance — The Design Engineer's Formula
If the CTQ (Y) is a function of multiple input variables (X₁, X₂, ...), how does variation in the inputs propagate to variation in Y? To first order: σ²_Y ≈ Σᵢ (∂Y/∂Xᵢ)² · σ²_Xᵢ.
The partial derivative (∂Y/∂Xᵢ) is the sensitivity coefficient — how much Y changes per unit change in Xᵢ. It is squared, multiplied by the variance of Xᵢ, and summed over all inputs.
Practical insight: The sensitivity coefficient squared means that the dominant source of variation in Y is often one or two inputs with high sensitivity — not all inputs equally. Focus tolerance investment on the highest-sensitivity parameters.
Monte Carlo Simulation for Tolerance Verification
When the transfer function is complex or non-linear, analytical propagation is difficult. Monte Carlo simulation draws random values from each input distribution, computes Y, and builds up a Y distribution from thousands of trials.
- Define distributions for each input (X₁, X₂, ...) — mean and std dev from capability data
- Randomly sample one value from each input distribution
- Compute Y using the transfer function
- Record the Y value. Repeat 10,000+ times.
- The resulting Y distribution gives you predicted Cpk, % out-of-spec, and percentiles
Monte Carlo answers the question your tolerance stack-up cannot: "What is the actual predicted yield of this assembly design, given real component capability data?" Use it before committing to tooling.
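A minimal sketch of those five steps for a hypothetical clearance CTQ (Y = bore − shaft); all means, sigmas and spec limits are illustrative, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
N = 100_000                                         # number of Monte Carlo trials

# Steps 1-2: define each input distribution (from capability data) and sample it
bore  = rng.normal(loc=20.05, scale=0.010, size=N)  # mm
shaft = rng.normal(loc=20.00, scale=0.012, size=N)  # mm

# Steps 3-4: compute Y through the transfer function for every trial
Y = bore - shaft                                    # clearance, mm

# Step 5: summarise the predicted Y distribution against the spec
LSL, USL = 0.010, 0.090                             # mm, hypothetical spec limits
sigma = Y.std(ddof=1)
ppk = min(USL - Y.mean(), Y.mean() - LSL) / (3 * sigma)
pct_out = 100 * np.mean((Y < LSL) | (Y > USL))
print(f"Predicted Ppk = {ppk:.2f}, predicted out-of-spec = {pct_out:.2f}%")
```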
Verify — Confirming the Design Works in the Real World
Verification is not the last step — it is the proof that all previous steps were done correctly. A strong Verify phase should produce no surprises. If it does, it means the Analyse or Design phases were incomplete.
Verification vs Validation — Know the Difference
"Did we build it right?"
Confirms the design meets its specifications. Compares actual measurements to design targets. Typically done on prototypes and pre-production units.
"Did we build the right thing?"
Confirms the design meets customer needs in real use conditions. Typically done with real users in real environments. Answers the VOC question from Phase 1.
Capability Confirmation — The Ppk Requirement
The pilot run is your first real capability data. Minimum requirement: Ppk ≥ 1.67 for new designs going to production (some industries require ≥ 2.00). Calculate Ppk — not Cpk — because Ppk includes all sources of long-term variation.
| Index | Formula | What it tells you | Target |
|---|---|---|---|
| Cp | (USL−LSL)/(6σ̂) | Potential: does the spec window fit the process? | ≥ 2.00 for new design |
| Cpk | min(Cpu, Cpl) | Short-term actual: centred and capable? | ≥ 1.67 for new design |
| Ppk | min(Ppu, Ppl) using s_total | Long-term actual: including all drift and shifts | ≥ 1.33 in production |
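The distinction in the table comes down to which sigma goes in the denominator. A minimal sketch with hypothetical subgroup data: Cp/Cpk use the pooled within-subgroup sigma, Ppk uses the overall sigma of all readings.

```python
import numpy as np

LSL, USL = 9.7, 10.3
subgroups = np.array([                       # hypothetical pilot-run subgroups
    [10.02, 10.05,  9.98, 10.01, 10.04],
    [10.08, 10.11, 10.06, 10.09, 10.12],     # a drifted subgroup inflates long-term sigma
    [ 9.96,  9.99,  9.94,  9.97, 10.00],
    [10.03, 10.06, 10.01, 10.04, 10.07],
])

xbar = subgroups.mean()
sigma_within  = np.sqrt(subgroups.var(axis=1, ddof=1).mean())  # short-term (within-subgroup)
sigma_overall = subgroups.flatten().std(ddof=1)                # long-term (all data)

cp  = (USL - LSL) / (6 * sigma_within)
cpk = min(USL - xbar, xbar - LSL) / (3 * sigma_within)
ppk = min(USL - xbar, xbar - LSL) / (3 * sigma_overall)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}, Ppk = {ppk:.2f}")
```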
Design Scorecard — Closing the Loop
Every CTQ identified in Measure must be verified in this phase. The design scorecard maps each requirement to its measured result.
| CTQ | Target | Tolerance | Measured | Ppk | Status |
|---|---|---|---|---|---|
| Dose accuracy | 0% deviation | ±0.5% | ±0.31% | 1.82 | ✓ Pass |
| Weight | 750 g | ≤ 800 g | 763 g | — | ✓ Pass |
| Alarm response | 1.2 s | ≤ 2.0 s | 1.4 s | 2.1 | ✓ Pass |
| Battery life | 80 hr | ≥ 72 hr | 77 hr | — | ⚠ Monitor |
DFSS Toolbox — When to Use What
| Phase | Tool | Purpose | Output |
|---|---|---|---|
| Define | Project Charter | Scope, timeline, team, business case | Signed charter document |
| Define | SIPOC | High-level process map | Scope boundaries |
| Define | VOC methods | Capture customer language before interpreting it | Raw VOC statements |
| Measure | Kano model | Classify requirements by type | Kano chart |
| Measure | CTQ tree | Translate VOC to measurable specs | CTQ specifications with LSL/USL |
| Measure | QFD / House of Quality | Link customer needs to engineering parameters | Prioritised design parameters |
| Analyse | Morphological chart | Systematic concept generation | Concept alternatives |
| Analyse | Pugh matrix | Structured concept selection | Winning concept with rationale |
| Analyse | Design FMEA | Identify design failure risks early | Risk register + mitigation actions |
| Design | Screening DOE | Identify the vital few factors | Significant factors list |
| Design | Taguchi / Robust design | Minimise sensitivity to noise | Robust parameter settings |
| Design | RSM / CCD | Find optimal factor settings | Contour plots, optimal point |
| Design | Tolerance design | Allocate tolerances statistically | Component tolerance targets |
| Verify | Pilot run Ppk study | Confirm capability in production | Ppk ≥ 1.33 |
| Verify | MSA / GR&R | Confirm measurement system is adequate | %GR&R ≤ 10% |
| Verify | Design scorecard | Close the loop on every CTQ | Pass/fail per requirement |
Full Project Walkthrough: Designing a Smart Water Meter
Follow one product through the complete DMADV process — from customer complaint to production-ready design. This is the kind of project a Black Belt would lead over 6–9 months.
A utility company wants to replace 500,000 mechanical water meters with smart digital meters over 5 years. Current meters have a 12% annual replacement rate due to reading errors, jamming, and battery failure. The project team must design a new smart meter that customers trust and engineers can manufacture to ≥ 4.5σ.
Business case: 12% replacement rate × 500,000 meters × £85/replacement = £5.1M/year avoidable cost. Reducing to 2% saves £4.1M/year. Project charter signed. Team: 1 Black Belt, 2 Green Belts, design engineer, manufacturing engineer, customer service lead.
Scope: New meter design only — no installation process
Timeline: 9 months to pilot, 18 months to full launch
Goal: Annual replacement rate ≤ 2% within 3 years
VOC gathered from 80 interviews (householders, plumbers, meter readers, utility managers). Top themes:
| Customer Voice | Kano Type | CTQ Specification |
|---|---|---|
| "I need to trust the reading is accurate" | Must-Be | Reading accuracy ±0.5% of actual volume |
| "It should last without maintenance" | Must-Be | Battery life ≥ 10 years at standard transmission rate |
| "I want to see my usage on my phone" | Performance | Data transmission ≤ 15 min latency, 99.5% uptime |
| "No leaks around the meter body" | Must-Be | IP68 rated — 1 m immersion for 30 min, zero leakage |
| "Easy to read without bending down" | Delighter | Remote reading via app — no physical access needed |
Three concepts generated from morphological chart, then evaluated in Pugh matrix:
| Concept | Flow sensor | Comms | Battery | Pugh score |
|---|---|---|---|---|
| A — Ultrasonic (datum) | Ultrasonic | LoRaWAN | Li-thionyl | 0 (datum) |
| B — Magnetic | Magnetic | NB-IoT | Li-thionyl | −7 |
| C — Ultrasonic + NB-IoT | Ultrasonic | NB-IoT | Li-SOCl₂ | +18 ✓ Selected |
Key insight from DFMEA: Ultrasonic transducer bond failure identified as top risk (RPN 280). Mitigation: change from adhesive bond to mechanical clamp with O-ring seal. RPN reduced to 48 after redesign.
DOE results: L9 Taguchi OA run on 4 factors (transducer gap, signal frequency, temperature compensation algorithm, housing wall thickness). Two CTQs measured: reading accuracy and signal strength.
| Factor | Effect on Accuracy | Effect on Signal | Optimal Setting |
|---|---|---|---|
| Transducer gap | Significant ✓ | Not significant | 8.5 mm ± 0.2 mm |
| Signal frequency | Significant ✓ | Significant ✓ | 1.0 MHz |
| Temp. compensation | Significant ✓ | Not significant | Algorithm v3 (quadratic) |
| Wall thickness | Not significant | Significant ✓ | 3.5 mm (min weight) |
Tolerance design: Monte Carlo simulation (10,000 runs) with production Cpk data from transducer supplier predicts assembly accuracy Ppk = 1.87 — exceeding the 1.67 target. Transducer gap tolerance tightened from ±0.5 mm to ±0.2 mm based on sensitivity analysis.
Pilot run: 200 units manufactured at supplier. Full measurement on all CTQs.
| CTQ | Target | Pilot Result | Ppk | Status |
|---|---|---|---|---|
| Reading accuracy | ±0.5% | ±0.28% avg | 1.93 | ✓ Pass |
| Battery life (projected) | ≥ 10 yr | 12.3 yr (accelerated test) | — | ✓ Pass |
| Transmission latency | ≤ 15 min | 4.2 min avg | — | ✓ Pass |
| IP68 seal integrity | Zero failures | 0/200 failures | — | ✓ Pass |
Project outcome: Design approved for full production. Projected annual replacement rate: 1.8% — below the 2% target. Estimated annual saving vs current state: £4.3M. Full deployment over 5 years. DFSS project closed.
DFSS Quick Reference
| Phase | Gate Question | Key Deliverable | Common Mistake |
|---|---|---|---|
| Define | Is this the right problem? | Signed project charter | Scope too broad — fix the scope first |
| Measure | Do we understand the customer? | CTQ specifications with LSL/USL | Going straight to solutions before completing VOC |
| Analyse | Is this the best concept? | Selected concept with rationale | Evaluating only one concept — not a selection |
| Design | Does the design meet targets? | Optimised design with predicted Cpk | Optimising mean without addressing variation |
| Verify | Is it ready for production? | Ppk ≥ 1.33 on all CTQs | Verifying on prototype, not production tooling |
10 Rules That Separate Good DFSS from Bad DFSS
- VOC before solutions. You cannot design the right thing if you haven't confirmed what "right" means to the customer.
- Measurable CTQs only. "Reliable" is not a CTQ. "Zero failures in 10 years at 95% confidence" is.
- At least 3 concepts. One concept is not a selection — it is an assumption with extra steps.
- Transfer functions before experiments. Know what you are testing and why before running a single trial.
- Optimise variation before mean. A process on target with high variance will drift off target. A robust process stays on target.
- Tolerance design is not the last step. Do it during Design, not after all decisions are made.
- Ppk, not Cpk, for verification. Cpk is a short-term study. Production will never be as controlled as a capability study.
- Design FMEA before prototype. Find failure modes on paper, not in the field.
- Gate reviews are not approval ceremonies. Each gate question must have a data-backed answer — not a slide.
- DFSS ends at design handoff, not project close. Track production Ppk for 3 months post-launch to confirm predictions.
Advanced: Strategic Experimentation & Value Engineering
This section covers H.E. Cook's DFSS as Strategic Experimentation (SE) approach — a powerful extension that translates experimental results directly into financial projections. Used by teams who need to connect engineering decisions to boardroom metrics: price, market share, and cash flow.
The Three Fundamental Metrics
Cook's insight: in any competitive market, three conditions are always true about your current product. Use them as your strategic compass.
Improve attributes customers actually value
Reduce variable cost through design choices
Compress development cycle with DFSS
Your U must be ≥ your best competitor's U. Improve value, reduce cost, and speed up innovation simultaneously.
Value Curves — Quantifying What Customers Will Pay
The value curve V(g) answers: "If we improve this attribute by X%, how much more will the customer pay?" This converts engineering decisions into price and demand projections.
Nominal-is-best attributes: Interior dimensions, shaft diameter. Ideal is a specific target value. Value decreases if too high or too low.
Smaller-is-better attributes: Defects, vibration, noise. Ideal is zero. V(0) = maximum. Value decreases monotonically.
Larger-is-better attributes: Fuel economy, battery life, strength. Ideal is infinity. V increases with attribute — diminishing returns.
From Experiment to Cash Flow — The Lambda Framework
Lambda (λ) coefficients connect experimental results to financial outcomes. Each λ tells you: "What is the projected change in value, cost, or cash flow if this factor changes from baseline to its experimental level?"
In the usual least-squares form, λ̂ = (XᵀX)⁻¹XᵀY, where X is the design matrix and Y is the vector of experimental outcomes; λ̂ gives the projected effect of each factor on each strategic outcome.
The full SE methodology — including Monte Carlo cash-flow simulation, Cournot-Bertrand pricing, and the DV survey method — is mathematically rigorous and beyond most DFSS projects. It is most valuable in oligopoly markets where small value improvements translate to large market share shifts. Reference: H.E. Cook, Design for Six Sigma as Strategic Experimentation (ASQ Quality Press).
Military & Defense Quality Standards
Key U.S. military and NATO defense quality standards — with full coverage of MIL-STD-1916 (DoD Preferred Methods for Acceptance of Product), including all sampling tables, worked examples, and switching rules.
MIL-STD-1916 — DoD Preferred Methods for Acceptance of Product
Published 1 April 1996. The fundamental philosophy shift: away from AQL-based detection (sampling to find defects) toward prevention-based quality systems (SPC, process control, continuous improvement).
The core philosophy (Foreword §7): "Contractors are responsible for establishing their own manufacturing and process controls. Contractors are expected to use recognized prevention practices such as process controls and statistical techniques." Sampling inspection alone does not control or improve quality — it is redundant when effective process controls exist.
Two Acceptance Paths
Submit a prevention-based quality system as alternate to sampling. Must demonstrate:
- Documented quality system plan
- Process focus (SPC, FMEA, PDCA evidence)
- Objective evidence of effectiveness
- Cpk: Critical≥2.00, Major≥1.33, Minor≥1.00
Use the prescribed sampling plans indexed by Verification Level and Code Letter. Three plan types:
- Table II — Attributes (lot/batch)
- Table III — Variables (lot/batch)
- Table IV — Continuous attributes
Verification Levels (VL-I through VL-VII)
VL prescribes the level of significance of a characteristic. VL-VII = highest effort (most critical), VL-I = lowest. Specified in the contract or product specifications.
| VL | Significance | Attributes n (CL-A lot) | Variables n (CL-A lot) |
|---|---|---|---|
| VII (Tightened T) | Highest / Critical | 3072 | 113 |
| VII | Critical | 1280 | 87 |
| VI | Very high | 512 | 64 |
| V | High | 192 | 44 |
| IV | Moderate | 80 | 29 |
| III | Standard | 32 | 18 |
| II | Below standard | 12 | 9 |
| I (Reduced R) | Minimum | 5 | 4 |
Critical Characteristic Requirements (§4.4)
For each critical characteristic, the contractor MUST implement an automated screening or fail-safe manufacturing operation AND apply sampling plan VL-VII to verify performance. When a critical nonconformance is found at any phase:
- Immediately prevent delivery to Government
- Notify Government representative
- Identify the cause
- Take corrective action
- Screen ALL available units
Zero tolerance on critical characteristics. No AQL exists for critical characteristics in MIL-STD-1916 — the acceptance criterion is zero nonconformances, reinforced by automated screening.
📋 Key Definitions (§3)
Critical Characteristic
Must be met to avoid hazardous conditions OR to assure tactical function of major systems (aircraft, tank, missile).
Major Characteristic
Must be met to avoid failure or material reduction of usability. One step below critical.
Minor Characteristic
Departure not likely to reduce usability materially. Least stringent.
Verification Level (VL)
VL-VII = highest sampling effort. VL-I = lowest. Set by contract.
Production Interval
Period of continuous sampling assumed homogeneous quality. Normally a single shift, max one day.
Cpk Thresholds (§4.1.2b)
Critical: ≥2.00 Major: ≥1.33 Minor: ≥1.00 — required for alternate acceptance method.
New to acceptance sampling? The Sampling Theory tab explains OC curves, AQL, RQL, producer/consumer risk, and the mathematics behind these tables — read it first for the full picture.
MIL-STD-1916 Sampling Tables
Three matched plan types — all indexed by VL and Code Letter. The Code Letter (CL) is determined from lot size using Table I.
Table I — Code Letters by Lot Size and VL
| Lot Size | VL-VII | VL-VI | VL-V | VL-IV | VL-III | VL-II | VL-I |
|---|---|---|---|---|---|---|---|
| 2–170 | A | A | A | A | A | A | A |
| 171–288 | A | A | A | A | A | A | B |
| 289–544 | A | A | A | A | A | B | C |
| 545–960 | A | A | A | A | B | C | D |
| 961–1,632 | A | A | A | B | C | D | E |
| 1,633–3,072 | A | A | B | C | D | E | E |
| 3,073–5,440 | A | B | C | D | E | E | E |
| 5,441–9,216 | B | C | D | E | E | E | E |
| 9,217–17,408 | C | D | E | E | E | E | E |
| 17,409–30,720 | D | E | E | E | E | E | E |
| 30,721+ | E | E | E | E | E | E | E |
Table II — Attributes Sampling (Zero Acceptance)
Acceptance criterion: zero nonconformances in the sample. If any found → reject lot.
| CL | T (Tightened) | VII | VI | V | IV | III | II | I | R (Reduced) |
|---|---|---|---|---|---|---|---|---|---|
| A | 3072 | 1280 | 512 | 192 | 80 | 32 | 12 | 5 | 3 |
| B | 4096 | 1536 | 640 | 256 | 96 | 40 | 16 | 6 | 3 |
| C | 5120 | 2048 | 768 | 320 | 128 | 48 | 20 | 8 | 3 |
| D | 6144 | 2560 | 1024 | 384 | 160 | 64 | 24 | 10 | 4 |
| E | 8192 | 3072 | 1280 | 512 | 192 | 80 | 32 | 12 | 5 |
Table III — Variables Sampling (k and F Criteria)
| CL | T | VII | VI | V | IV | III | II | I | R |
|---|---|---|---|---|---|---|---|---|---|
| Sample sizes (nv) | |||||||||
| A | 113 | 87 | 64 | 44 | 29 | 18 | 9 | 4 | 2 |
| B | 122 | 92 | 69 | 49 | 32 | 20 | 11 | 5 | 2 |
| C | 129 | 100 | 74 | 54 | 37 | 23 | 13 | 7 | 2 |
| D | 136 | 107 | 81 | 58 | 41 | 26 | 15 | 8 | 3 |
| E | 145 | 113 | 87 | 64 | 44 | 29 | 18 | 9 | 4 |
| k values (one- or two-sided) | |||||||||
| A | 3.51 | 3.27 | 3.00 | 2.69 | 2.40 | 2.05 | 1.64 | 1.21 | 1.20 |
| E | 3.76 | 3.51 | 3.27 | 3.00 | 2.69 | 2.40 | 2.05 | 1.64 | 1.21 |
| F values (two-sided double spec only) | |||||||||
| A | .136 | .145 | .157 | .174 | .193 | .222 | .271 | .370 | .707 |
| E | .128 | .136 | .145 | .157 | .174 | .193 | .222 | .271 | .370 |
Variables Acceptance Criteria (§5.2.2.2.3)
Single-sided specification: QU = (USL − x̄) / s (or QL = (x̄ − LSL) / s for a lower limit); accept if Q ≥ k.
Double-sided specification: both QU and QL must be ≥ k, and F = s / (USL − LSL) must not exceed the Table III F value.
Switching Rules — Normal / Tightened / Reduced
Inspection intensity is not fixed — it responds to demonstrated supplier quality history. Good history earns reduced sampling. Poor performance triggers tightened inspection.
Switching Rules — Detailed Criteria
| Transition | Trigger (Lot/Batch) | Additional Requirement |
|---|---|---|
| Normal → Tightened | 2 lots withheld within last 5 lots | — |
| Tightened → Normal | 5 consecutive lots accepted | Cause for nonconformances corrected |
| Normal → Reduced | 10 consecutive lots accepted | Steady production rate + Govt. approval |
| Reduced → Normal | Any 1 lot withheld | OR: irregular production, unsatisfactory QS |
| Discontinuation | Stays tightened (repeated fails) | Govt. may halt all acceptance |
When sampling restarts after discontinuation, it begins at tightened inspection — not normal. Switching procedures are applied independently for each group of characteristics or individual characteristic.
Worked Examples from MIL-STD-1916 Appendix
Example 1 — Attributes Sampling (Wing Nuts, VL-IV)
Inspection for missing thread. VL-IV specified. Table II, attributes plan. Lot sizes vary.
| Lot # | Lot Size | CL | Sample n | NCRs Found | Disposition | Stage | Action |
|---|---|---|---|---|---|---|---|
| 1 | 5,000 | D | 160 | 2 | Withhold | N | Start at normal VL-IV |
| 2 | 900 | A | 80 | 0 | Accept | N | — |
| 3 | 3,000 | C | 128 | 1 | Withhold | N | 2/5 fail → switch to Tightened |
| 4 | 1,000 | B | 256 | 0 | Accept | T | — |
| 5 | 1,000 | B | 256 | 0 | Accept | T | — |
| 6 | 900 | A | 192 | 0 | Accept | T | — |
| 7 | 2,000 | C | 320 | 0 | Accept | T | — |
| 8 | 2,500 | C | 320 | 0 | Accept | T | 5 consec. pass → back to Normal |
| 9 | 3,000 | C | 128 | 0 | Accept | N | — |
| 10 | 5,000 | D | 160 | 0 | Accept | N | — |
Example 2 — Variables, Single-Sided Spec (VL-I)
Maximum operating temperature = 209°F on a circuit board relay. Lot of 40 units. VL-I specified, CL-A → nv = 4, k = 1.64 (from Table III).
Step 1 — Draw a random sample of nv = 4 units; measured values: 197, 188, 184, 205 °F
Step 2 — x̄ = (197+188+184+205) ÷ 4 = 193.5 °F
Step 3 — s = √[Σ(xᵢ−x̄)² ÷ (n−1)] = √(265÷3) = 9.399
Step 4 — Quality Index Q = (USL − x̄) ÷ s
Q = (209 − 193.5) ÷ 9.399 = 1.649
Step 5 — Compare Q ≥ k: 1.649 ≥ 1.64 ✅
Example 3 — Variables, Double-Sided Spec (VL-I)
Same relay batch. Temperature must stay within 180–209°F. Same 4 measurements: 197, 188, 184, 205. Both quality indices (QL and QU) must meet k, and the F criterion must also be satisfied.
QL = (x̄ − LSL) / s = (193.5 − 180) / 9.399 = 1.436 → vs k = 1.64 → ✗ FAIL
QU = (USL − x̄) / s = (209 − 193.5) / 9.399 = 1.649 → vs k = 1.64 → ✅ PASS
F = s / (USL − LSL) = 9.399 / 29 = 0.324; Table F value at VL-I, CL-A = 0.370
Check: F ≤ F_table → 0.324 ≤ 0.370 ✅ PASS
Disposition: QU and F pass, but QL < k, so the lot fails the double-sided criterion and is withheld.
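The whole double-sided check fits in a few lines. A minimal sketch of Example 3 using only the Python standard library:

```python
import statistics

x = [197, 188, 184, 205]           # sample measurements, deg F
LSL, USL = 180, 209
k, F_table = 1.64, 0.370           # Table III values quoted in the example

xbar = statistics.mean(x)
s = statistics.stdev(x)
QL = (xbar - LSL) / s              # lower quality index
QU = (USL - xbar) / s              # upper quality index
F = s / (USL - LSL)                # variability criterion (double-sided spec only)

accept = QL >= k and QU >= k and F <= F_table
print(f"xbar={xbar:.1f}, s={s:.3f}, QL={QL:.3f}, QU={QU:.3f}, F={F:.3f} -> "
      f"{'accept' if accept else 'withhold lot'}")
```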
Example 4 — Continuous Sampling (Spot Welds, VL-II)
CL-C, VL-II → i=116 (clearance number), f=1/48 (sampling frequency).
| Item # | Action | Stage |
|---|---|---|
| 1 | Start 100% screening. i=116. | N |
| 8 | Found defective unit — reset counter. | N |
| 124 | 116 consecutive conforming units cleared → begin sampling f=1/48 | N |
| 9,697 | 200 consecutive conforming sampled → switch to Reduced f=1/68 | R |
| 13,982 | Production interval tripled → CL-C to CL-E, f=1/136 | R |
| 16,290 | Nonconforming unit found → switch to Normal, restart screening i=228 | N |
| 16,518 | 228 consecutive conforming cleared → sampling f=1/96 | N |
Key Military & Defence Standards — Deep Reference
Beyond MIL-STD-1916, six standards define how defence contractors predict, test, and manage reliability and safety. Each one has a direct commercial equivalent — knowing both is essential for cross-sector work.
MIL-HDBK-217F — Reliability Prediction of Electronic Equipment
Published 1991. The DoD's framework for predicting failure rates of electronic components and systems during design. Two prediction methods exist — choose based on design maturity.
Parts Count method: Used in early design when full stress analysis isn't possible. Requires: component quantities, generic quality level, and use environment. Quick and conservative.
Parts Stress method: Used for detailed design when actual operating stresses are known. More accurate but requires thermal, electrical, and environmental stress data per component.
Base failure rate: λ_b = 0.0012 failures/10⁶ hours
Temperature factor: πT = 2.8 (85°C junction temp)
Environment factor: πE = 4.0 (GM Ground Mobile)
Quality factor: πQ = 1.0 (MIL-R-11 qualified)
─────────────────────────────────────────────
λ_p = 0.0012 × 2.8 × 4.0 × 1.0 = 0.01344 failures/10⁶ hrs
MTBF = 1 / λ_p = 74.4 million hours (single resistor)
πE is the dominant multiplier — ground mobile environment is 4× more harsh than ground benign. Reducing operating temperature from 85°C → 55°C would cut πT from 2.8 → 1.4, halving the failure rate.
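The parts-stress arithmetic above is a straight product of factors. A minimal sketch reproducing it:

```python
lambda_b = 0.0012                  # base failure rate, failures per 10^6 hours
pi_T, pi_E, pi_Q = 2.8, 4.0, 1.0   # temperature, environment and quality factors

lambda_p = lambda_b * pi_T * pi_E * pi_Q   # part failure rate, failures per 10^6 hours
mtbf_hours = 1e6 / lambda_p                # = 74.4 million hours for these inputs
print(f"lambda_p = {lambda_p:.5f} per 10^6 h, MTBF = {mtbf_hours / 1e6:.1f} million hours")
```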
MIL-STD-1629A — FMECA (Failure Mode Effects & Criticality Analysis)
The military extension of commercial FMEA. Adds a quantitative Criticality Number and a Criticality Matrix that plots every failure mode visually by severity and probability. Required on all major defence system acquisitions.
| I — Catastrophic | Death / system loss |
| II — Critical | Severe injury / major damage |
| III — Marginal | Minor injury / minor damage |
| IV — Negligible | No injury / negligible damage |
Cm = β × α × λp × t (failure mode criticality number). Because λp is in failures per 10⁶ hours, the Cm values in the table below are shown in units of 10⁻³.
β = conditional probability of loss, given that the failure mode occurs
α = failure mode ratio (fraction of the part's failures attributable to this mode)
λp = part failure rate
t = operating time
Each failure mode is plotted by severity category (x-axis) vs criticality number Cm (y-axis). Modes in the upper-left require immediate design action.
| Failure Mode | Severity | β | α | λp ×10⁻⁶ | t (hrs) | Cm | Priority |
|---|---|---|---|---|---|---|---|
| Seal leak → loss of pressure | I | 0.9 | 0.35 | 4.2 | 2000 | 2.646 | 🔴 Redesign |
| Caliper piston stick | II | 0.7 | 0.20 | 3.1 | 2000 | 0.868 | 🟡 Action |
| Brake fade under load | III | 0.5 | 0.30 | 2.8 | 2000 | 0.840 | 🔵 Monitor |
| Warning light false trigger | IV | 1.0 | 0.15 | 5.0 | 2000 | 1.500 | 🟢 Accept |
Note: A Severity I mode always demands action regardless of Cm value. High Cm on a Severity IV mode (warning light) is acceptable — it's a nuisance, not a safety hazard.
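As a quick arithmetic check, a minimal sketch reproducing the four Cm values in the table:

```python
failure_modes = [                                # (name, beta, alpha, lambda_p x 1e-6, t hours)
    ("Seal leak -> loss of pressure", 0.9, 0.35, 4.2, 2000),
    ("Caliper piston stick",          0.7, 0.20, 3.1, 2000),
    ("Brake fade under load",         0.5, 0.30, 2.8, 2000),
    ("Warning light false trigger",   1.0, 0.15, 5.0, 2000),
]
for name, beta, alpha, lam, t in failure_modes:
    cm = beta * alpha * lam * t / 1000           # reported in units of 10^-3, as in the table
    print(f"{name:<32} Cm = {cm:.3f}")
```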
MIL-STD-882E — System Safety
The DoD system safety standard. Required for all acquisitions. Defines hazard identification, risk assessment, and risk management for hardware, software, and human factors. Risk = f(Severity, Probability).
| Probability | Cat I (Catastrophic) | Cat II (Critical) | Cat III (Marginal) | Cat IV (Negligible) |
|---|---|---|---|---|
| A — Frequent | 1 High | 2 High | 5 Med | 10 Low |
| B — Probable | 2 High | 3 High | 6 Med | 11 Low |
| C — Occasional | 3 High | 4 Med | 7 Low | 14 Low |
| D — Remote | 4 Med | 8 Low | 12 Low | 16 Low |
| E — Improbable | 5 Med | 9 Low | 13 Low | 17 Low |
High risk (red) = Unacceptable — programme stop until mitigated. Medium = Acceptable with senior approval. Low = Acceptable with programme manager approval.
The On-Board Inert Gas Generation System (OBIGGS) prevents fuel tank explosions by replacing ullage with nitrogen-enriched air. Under MIL-STD-882E, a failure of OBIGGS is Severity Cat I (catastrophic — fuel tank explosion). Probability was classified as D (Remote) given redundant sensors and pre-flight checks. Risk rating: 4 (Medium). The programme invested in a secondary inerting monitor to reduce probability to E (Improbable), moving the risk to 5 (Medium) — still requiring senior approval. This drove the system architecture decision to add the backup monitor.
MIL-STD-810H — Environmental Engineering & Laboratory Tests
The definitive environmental testing standard — now used extensively in commercial product ruggedisation (laptops, phones, industrial equipment) not just defence. 29 test methods covering every environmental stress a product might encounter.
| Method | Test | Typical Conditions | Real-World Stress |
|---|---|---|---|
| 500.6 | Low Pressure (Altitude) | 70,000 ft equivalent | Aircraft cargo bay, unpressurised |
| 501.7 | High Temperature | +71°C storage, +49°C operating | Desert deployment, vehicle interior |
| 502.7 | Low Temperature | −51°C storage, −32°C operating | Arctic operations, stratospheric |
| 507.6 | Humidity | 95% RH, 30 days cycling | Tropical jungle, ship deck |
| 509.7 | Salt Fog | 5% NaCl, 96 hrs | Naval/maritime environment |
| 510.7 | Sand & Dust | 1.06 g/m³ dust concentration | Middle East desert, helicopter downwash |
| 514.8 | Vibration | Tailored PSD per platform | Vehicle road, aircraft turbulence |
| 516.8 | Shock | Half-sine, sawtooth, trapezoidal | Rough handling, explosive nearby |
MIL-STD-785B — Reliability Programme for Systems & Equipment
The lifecycle reliability management standard. Defines the tasks, reviews, and evidence a contractor must demonstrate across programme phases from concept through production.
FRACAS (Failure Reporting, Analysis, and Corrective Action System) is the heart of MIL-STD-785B. Every failure in test or field must be formally reported, root-caused, and corrective action verified — creating a closed feedback loop that drives reliability growth throughout the programme.
AS9100D — Aerospace Quality Management System
ISO 9001 + 60+ aerospace-specific requirements. The entry ticket for Boeing, Airbus, Lockheed Martin, Northrop Grumman, and most tier-1 primes. Mandatory for the civil aerospace supply chain globally.
- First Article Inspection (FAI) per AS9102
- Foreign Object Damage/Debris (FOD) prevention
- Key Characteristics (KC) identification and control
- Configuration management requirements
- Counterfeit parts prevention (clause 8.1.4)
- On-time delivery as a quality metric
- AS9100D — Design & manufacture
- AS9110C — MRO / maintenance organisations
- AS9120B — Distributors / stockists
- Audited by IAQG-accredited CBs (BSI, Bureau Veritas, etc.)
- Certificate validity: 3 years with annual surveillance
Military Standards Quick Reference
| Standard | Topic | Commercial Equivalent | Status |
|---|---|---|---|
| MIL-STD-1916 | DoD Preferred Acceptance Methods | ISO 2859 / ANSI Z1.4 | Active (1996) |
| MIL-STD-785B | Reliability Program Mgmt | IEC 60300-2 | Active |
| MIL-HDBK-217F | Electronic Reliability Prediction | IEC TR 62380, Telcordia SR-332 | Active (frozen) |
| MIL-STD-1629A | FMECA | AIAG FMEA, SAE J1739 | Active |
| MIL-STD-105E | Attribute Acceptance Sampling | ANSI/ASQ Z1.4, ISO 2859 | Cancelled → use Z1.4 |
| MIL-STD-414 | Variables Acceptance Sampling | ANSI/ASQ Z1.9, ISO 3951 | Cancelled → superseded by 1916 |
| MIL-STD-45662A | Calibration Systems | ISO/IEC 17025, ISO 10012 | Cancelled → use ISO 17025 |
| MIL-STD-882E | System Safety | IEC 61508, SAE ARP4761 | Active |
| MIL-STD-810H | Environmental Testing | IEC 60068, RTCA DO-160 | Active |
| AS9100D | Aerospace QMS | ISO 9001 + Aerospace CSR | Active (Rev D) |
| AQAP-2110 | NATO Quality Assurance | ISO 9001 + NATO CSR | Active (Ed. 3) |
MIL-STD-1916 supersedes MIL-STD-414 and MIL-STD-1235 (single/multi-level continuous sampling). The key difference from MIL-STD-105E: 1916 has a zero-acceptance criterion (Ac=0 always) versus 105E's AQL-based accept numbers. 1916 is philosophically aligned with prevention and SPC; 105E was detection-based.
Acceptance Sampling Theory — Errors, AOQ, AOQL, ATI & Dodge-Romig
The mathematics behind acceptance sampling — understanding what happens to quality as lots pass through a sampling plan, and the trade-offs between producer and consumer risk.
Type I & Type II Errors — Producer's Risk vs Consumer's Risk
| Actual: Good Lot | Actual: Bad Lot | |
|---|---|---|
| Decision: Accept | ✅ Correct | ✗ Type II Error (β) |
| Decision: Reject | ✗ Type I Error (α) | ✅ Correct |
| Type I Error (α) | Type II Error (β) | |
|---|---|---|
| Name | Producer's risk | Consumer's risk |
| What happens | Good lot rejected — producer loses | Bad lot accepted — consumer receives defectives |
| Fire alarm analogy | False alarm — inconvenience | Missed fire — disaster |
| Control method | Fixed at pre-determined level (1%, 5%, 10%) | Controlled to <10% by appropriate sample size |
| Simple definition | Innocent declared guilty | Guilty declared innocent |
As α (producer's risk) increases (e.g. 0.01→0.05), β (consumer's risk) goes down — they trade off against each other. To reduce BOTH Type I and II errors simultaneously: increase the sample size.
RQL / LTPD — Rejectable Quality Level
The defect rate we want to reject a high proportion of the time (controlled by β, the consumer's risk).
Example: β = 0.10, RQL = 8% means: we would expect to accept lots with 8% defectives only 10% of the time maximum. Equivalently: 90% of lots at RQL quality will be rejected.
The OC Curve has three zones:
- ✅ Acceptable quality zone — near AQL, high P(accept)
- ⚠️ Indifferent zone — between AQL and RQL, intermediate P(accept)
- ✗ Rejectable quality zone — near RQL/LTPD, low P(accept)
Increasing n (sample size) steepens the OC curve — narrows the indifferent zone and brings it closer to the ideal step function.
Interactive OC Curve — See How n and Ac Shape Acceptance Probability
Adjust the sample size (n) and acceptance number (Ac) to see how the Operating Characteristic curve changes. A steeper curve gives sharper discrimination between good and bad lots — but costs more to inspect.
AOQ, AOQL & ATI Formulas
The average quality of outgoing product, accounting for the fact that rejected lots are screened 100% and returned perfect.
AOQ ≈ p × Pₐ (exactly, p × Pₐ × (N − n)/N), where p = incoming defect rate, Pₐ = probability of acceptance, N = lot size, n = sample size
The maximum (worst) AOQ for a given sampling plan — the peak of the AOQ curve. As incoming quality deteriorates beyond AOQL, AOQ actually improves because more lots get rejected and 100% screened.
The Dodge-Romig sampling plan uses AOQL as its design criterion.
ATI = n + (1 − Pₐ)(N − n): the total average number of pieces inspected per lot, combining the sample (from accepted lots) and 100% screening (from rejected lots).
ATI increases sharply as incoming quality deteriorates — minimising ATI is the design goal of Dodge-Romig.
Worked Example — AOQ Calculation
Sampling plan: N = 1,000, n = 80, Ac = 3. Incoming lot has 2% defectives.
AOQ = p × Pₐ = 0.02 × 0.921 = 0.0184 (1.84%)
ATI = n + (1−Pₐ)(N−n) = 80 + (1−0.921)(1000−80) = 80 + 0.079×920 = 80 + 72.7 = 152.7 pieces/lot
Interpretation: The average outgoing quality is 1.84% defective — slightly better than incoming (2%) because 8% of lots are 100% screened and returned perfect.
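A minimal sketch of the same calculation, using the binomial OC curve for Pₐ (the worked example's 0.921 appears consistent with a Poisson approximation, so the figures differ slightly in the third decimal); requires scipy.

```python
from scipy.stats import binom

N, n, Ac = 1000, 80, 3
p = 0.02                                   # incoming defect rate

Pa = binom.cdf(Ac, n, p)                   # probability of acceptance at this p
AOQ = p * Pa                               # as in the example; multiply by (N - n)/N for the exact form
ATI = n + (1 - Pa) * (N - n)

print(f"Pa = {Pa:.3f}, AOQ = {100 * AOQ:.2f}%, ATI = {ATI:.1f} pieces/lot")
```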
Inspection Levels — ANSI/ASQ Z1.4
| Level | Sample size | When to use |
|---|---|---|
| Level I | Smaller n | Less discrimination needed — use when lower risk, trusted supplier |
| Level II | Standard n | Default / normal use — used unless otherwise specified |
| Level III | Larger n | Greater discrimination — use for critical characteristics or new suppliers |
| S-1 to S-4 | Small n | Special levels — small sample sizes when large sampling risks are acceptable. S-4 > S-3 > S-2 > S-1 in sample size. |
Sample size relationship: n(Level III) > n(Level II) > n(Level I). A larger sample size steepens the OC curve — better discrimination between good and bad lots, but higher inspection cost. The relationship between lot size and sample size is defined in Table I (code letters A–R).
Dodge-Romig Sampling Plans
| Attribute | MIL-STD-105 / ANSI Z1.4 | Dodge-Romig |
|---|---|---|
| Basis | AQL — protects the producer | LTPD (consumer's risk) or AOQL — protects the consumer |
| Sampling types | Single, Double, Multiple | Single and Double only |
| Primary design goal | Ensure high-quality lots are accepted at a defined rate | Minimise ATI — minimise total inspection effort for a given quality protection level |
| Requires | AQL specification | Estimate of process average (from recent data). If unknown, use largest table value. |
| Example | AQL=1.5%, N=1000 → n=80, Ac=3 | AOQL=3%, N=1000, Process avg=1.5% → n=44, c=2, LQL=11.8% |
Dodge-Romig is the preferred plan when the consumer wants assurance that the outgoing quality will not exceed a stated limit (AOQL) regardless of incoming quality — ideal for critical product or safety-related items.
FMEA & RPN — Failure Mode & Effects Analysis
FMEA is the discipline of imagining every way something can go wrong — before it does. Two distinct types: Design FMEA catches failures born in the blueprint; Process FMEA catches failures born on the shop floor.
What is FMEA and Why Does It Matter?
FMEA forces you to think about failure proactively — before a customer finds it in the field, before a recall, before someone gets hurt. It is the bridge between design intent and production reality.
The core question for every item on the FMEA: "In what ways could this fail, what happens when it does, and what are we doing about it?"
Severity (S): 10 = Safety/regulatory
Occurrence (O): 10 = Inevitable
Detection (D): 10 = No detection
DFMEA vs PFMEA — Two Different Questions
"Is the design itself capable of meeting its intended function under all expected use conditions?"
Owner: Design Engineering. Done during concept/development phase. Corrective actions = design changes.
"Can the manufacturing process consistently produce a conforming part without creating a defect?"
Owner: Manufacturing Engineering. Done pre-launch. Corrective actions = process controls, poka-yokes.
The RPN trap. Two different failure modes can share the same RPN yet have radically different risk profiles. S=10, O=1, D=1 (RPN=10) is a potential safety catastrophe; S=2, O=5, D=1 (RPN=10) is inconsequential. Always act on high Severity first, regardless of RPN.
🔑 When to Do FMEA
New design or product
DFMEA during concept phase when changes are cheap. PFMEA before production launch.
Design/process changes
Update affected FMEA rows whenever a change is made — even "minor" changes.
Field failure or warranty
Use FMEA to document and prevent recurrence. Add new failure modes discovered.
PPAP requirement
DFMEA (if design owner) + PFMEA both required for Level 3 PPAP submission.
Design FMEA (DFMEA) — Catching Failures in the Blueprint
DFMEA asks: "Even if we manufacture this perfectly, does the design itself do what it's supposed to do under all conditions?" It lives in the design engineer's world — materials, tolerances, geometry, load cases, wear-out mechanisms, edge cases.
DFMEA is required when the supplier owns the product design. If you are making a part to a customer drawing, a PFMEA is sufficient. If you designed the part, you must also do a DFMEA. Required for PPAP Level 3 when design responsibility is with the supplier.
The DFMEA Thought Process
1. Define the Function — For each component, state its intended function precisely. "Transmit torque of 50±2 Nm from input shaft to output shaft without slippage under 100,000 cycles at 80°C."
2. Identify Failure Modes — How could this component fail to perform its function? Examples: fracture, deformation, corrosion, excessive wear, loss of insulation, intermittent contact.
3. Determine Effects — What does the next-higher assembly experience when this fails? What does the customer ultimately experience? Rate Severity 1–10.
4. Find Root Causes — Design-level causes: insufficient material strength, wrong tolerance stack, inadequate surface finish spec, missing environmental protection, wrong material selection.
5. List Current Design Controls — Prevention: design guidelines, material specs, analysis (FEA, fatigue). Detection: prototype testing, DVP&R, simulation, inspection. Rate Occurrence and Detection.
6. Take Action & Re-evaluate — Implement design changes. Update specs, drawings, test plans. Recalculate RPN. Verify effectiveness.
Worked Example — EV Battery Cell Aluminium Casing
Function: Aluminium casing must contain electrolyte, withstand 50 bar internal pressure during thermal runaway, and maintain electrical isolation from adjacent cells for the 15-year vehicle life.
| Function | Failure Mode | Effect on Customer | S | Root Cause | O | Detection Control | D | RPN |
|---|---|---|---|---|---|---|---|---|
| Contain electrolyte | Casing crack / leak | Electrolyte contact → fire risk → vehicle loss | 10 | Wall thickness < 0.8 mm at weld seam; fatigue from thermal cycling | 3 | FEA fatigue analysis; 1,000-cycle pressure test at DVP stage | 2 | 60 |
| Withstand 50 bar pressure | Burst / catastrophic rupture | Thermal runaway propagation → vehicle fire | 10 | Insufficient yield strength spec; wrong alloy grade selected | 2 | Burst test per UL 2580; FEA pressure simulation; material cert review | 1 | 20 |
| Maintain electrical isolation | Dielectric breakdown | Cell-to-cell short → fire / BMS fault | 9 | Coating thickness < 20 µm at edges; holiday defects in anodising | 4 | Hi-pot test 100% incoming; SEM cross-section at sampling frequency | 3 | 108 |
| 15-year corrosion resistance | Pitting corrosion at weld | Gradual electrolyte seep → premature capacity fade | 6 | Wrong filler wire alloy in laser weld; porosity from humidity contamination | 4 | Salt spray test per ISO 9227; weld procedure qualification | 4 | 96 |
DFMEA Corrective Action Priority for this example: Row 3 (RPN=108, S=9) and Row 1 (RPN=60, S=10) are both flagged. The dielectric breakdown row is prioritised because S=9 AND the combined RPN is highest. Actions: increase anodising spec to ≥25 µm, add 100% hi-pot in design verification, update drawing callout.
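A small Python sketch of this flagging logic, using the casing example's S/O/D values. The "high RPN" threshold of 100 is an illustrative assumption (thresholds are organisation-specific, not part of any standard); the S ≥ 9 rule mirrors the guidance that Severity 9 or 10 always requires action.

```python
# Flag DFMEA rows for action: high severity first, then high RPN.
rows = [
    {"mode": "Casing crack / leak",  "S": 10, "O": 3, "D": 2},
    {"mode": "Burst / rupture",      "S": 10, "O": 2, "D": 1},
    {"mode": "Dielectric breakdown", "S": 9,  "O": 4, "D": 3},
    {"mode": "Pitting corrosion",    "S": 6,  "O": 4, "D": 4},
]
RPN_THRESHOLD = 100   # assumption for illustration only

for r in rows:
    r["RPN"] = r["S"] * r["O"] * r["D"]
    # High severity forces action regardless of RPN (the "RPN trap")
    needs_action = r["S"] >= 9 or r["RPN"] >= RPN_THRESHOLD
    print(f'{r["mode"]:<22} S={r["S"]:>2} RPN={r["RPN"]:>3} action={needs_action}')
```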
Process FMEA (PFMEA) — Catching Failures on the Shop Floor
PFMEA asks: "Even with a perfect design, how could our manufacturing process build it wrong?" It lives in the process engineer's world — machines, operators, tooling, fixtures, parameters, environment, and measurement systems.
PFMEA is always required. Whether you own the design or not, you always own your process. PFMEA is linked directly to the Process Flow Diagram and drives the Control Plan — these three documents must be consistent with each other.
PFMEA is Linked to Three Documents
Process Flow Diagram → PFMEA → Control Plan: the flow diagram defines the process steps, the PFMEA analyses how each step can fail, and the Control Plan defines how the resulting significant characteristics are controlled in production. All three must stay consistent.
PFMEA Thought Process
1. Map Every Process Step — Start from the Process Flow Diagram. Each operation becomes one or more PFMEA rows. Be specific: "Laser weld casing", not just "welding."
2. State the Process Function — What is this step supposed to achieve? "Weld casing at 1.5 kW, 3.5 m/min to achieve ≥ 0.8 mm penetration with ≤ 0.1 mm porosity."
3. Identify Failure Modes — Ways the process step could go wrong: under-weld, over-weld, porosity, misalignment, wrong parameters, incorrectly seated part, worn fixture.
4. Assess Effects — What is the impact on the next operation? On the final customer? Rate Severity. Separate internal effects (scrap/rework) from external ones (field failure).
5. Find Process Causes — Process-level causes (not design): machine wear, incorrect setup, operator error, wrong material lot, ambient temperature change, gage drift.
6. List Controls → Rate O and D — Prevention: SPC, poka-yoke, maintenance plan, training. Detection: 100% visual, CMM check, functional test, SPC chart. Rate Occurrence and Detection honestly.
Worked Example — Laser Weld Station (Battery Cell Casing)
Process Step: Laser weld aluminium casing lid to body. Process parameters: Power = 1.5 kW, Speed = 3.5 m/min, Focus offset = 0 mm. Key characteristic: weld penetration ≥ 0.8 mm, porosity ≤ 0.1 mm dia.
| Process Function | Failure Mode | Effect on Customer | S | Cause | O | Current Control | D | RPN |
|---|---|---|---|---|---|---|---|---|
| Weld penetration ≥ 0.8 mm | Under-penetration (< 0.8 mm) | Casing leak in field → electrolyte contact → fire | 10 | Laser power drift below threshold; focus offset shift; contaminated optic | 4 | SPC on laser power; cross-section destructive test 1/shift; weekly lens cleaning | 5 | 200 |
| Porosity ≤ 0.1 mm dia. | Excess weld porosity | Reduced seal strength → gradual leak → capacity fade | 7 | Surface contamination (oil, moisture); shielding gas flow low; wrong travel speed | 5 | X-ray inspection 100% per lot; shielding gas flow alarm; pre-clean station | 3 | 105 |
| Weld path alignment ±0.1 mm | Weld off-seam | Incomplete seal → early leak in service | 9 | Fixture wear > 0.05 mm; incorrect seam-tracking calibration | 2 | Vision system seam-tracker; fixture Cmk ≥ 1.67 validated monthly | 2 | 36 |
| Heat input to cell ≤ 80°C | Thermal damage to cell chemistry | Premature capacity degradation; early cell death | 8 | Excessive weld speed reduction; multiple re-welds; coolant system failure | 2 | Thermocouple on fixture; weld parameter lockout; coolant flow alarm | 2 | 32 |
Row 1 (RPN=200, S=10) demands immediate action. Recommended actions: ① Install inline laser power monitoring with automatic stop if power deviates >2% for >50 ms. ② Increase cross-section check from 1/shift to 1/100 units for 4 weeks until process is validated. ③ Add daily optics cleaning to PM schedule. Target RPN after actions: S=10, O=2, D=2 → RPN=40.
Interactive RPN Calculator
Drag the sliders to set Severity, Occurrence, and Detection. RPN and Action Priority update instantly. Remember: S = 9 or 10 always requires action, regardless of RPN.
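The calculator's core logic fits in a few lines. Below is a minimal sketch: RPN = S × O × D with the rule that Severity 9 or 10 always requires action; the "high RPN" cut-off of 100 is an assumed value for illustration, not part of AIAG or any standard.

```python
def rpn(severity: int, occurrence: int, detection: int) -> tuple[int, bool]:
    """Return (RPN, action_required) for 1-10 S/O/D ratings."""
    for name, value in (("S", severity), ("O", occurrence), ("D", detection)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    score = severity * occurrence * detection
    action_required = severity >= 9 or score >= 100   # 100 is an assumed cut-off
    return score, action_required

print(rpn(10, 4, 5))   # (200, True)  -- the laser-weld example above
print(rpn(2, 5, 1))    # (10, False)  -- low severity, low RPN
print(rpn(10, 1, 1))   # (10, True)   -- S=10 forces action despite RPN=10
```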
Severity / Occurrence / Detection Rating Scales (AIAG PFMEA)
| S | Effect | Criteria |
|---|---|---|
| 10 | Hazardous — no warning | Safety issue, regulatory non-compliance. Failure without warning. |
| 9 | Hazardous — with warning | Safety issue. Failure with warning before occurrence. |
| 8 | Very High | System inoperable, loss of primary function. |
| 7 | High | System operable, reduced performance. Customer dissatisfied. |
| 6 | Moderate | System operable, comfort item inoperable. Customer discomfort. |
| 5 | Low | System operable, comfort item reduced performance. |
| 4 | Very Low | Fit/finish defect noticed by most customers (70%). |
| 3 | Minor | Fit/finish defect noticed by average customers (50%). |
| 2 | Very Minor | Defect noticed only by discriminating customers (25%). |
| 1 | None | No discernible effect whatsoever. |
| O | Probability | Approximate Rate |
|---|---|---|
| 10 | Very High | ≥ 1 in 2 |
| 9 | Very High | 1 in 3 |
| 8 | High | 1 in 8 |
| 7 | High | 1 in 20 |
| 6 | Moderate | 1 in 80 |
| 5 | Moderate | 1 in 400 |
| 4 | Moderate | 1 in 2,000 |
| 3 | Low | 1 in 15,000 |
| 2 | Low | 1 in 150,000 |
| 1 | Remote | ≤ 1 in 1,500,000 |
| D | Ability to Detect | Typical Control |
|---|---|---|
| 1 | Almost Certain | Proven poka-yoke — physically impossible to pass |
| 2 | Very High | 100% automated gauge with alarm & stop |
| 3 | High | 100% automated gauge, no automatic stop |
| 4 | Moderately High | SPC with immediate reaction plan |
| 5 | Moderate | SPC — operator reacts to out-of-control signal |
| 6 | Low | 100% manual inspection (attribute or variable gauging) |
| 7 | Very Low | Random or double sampling only |
| 8 | Remote | Visual inspection only, no documented method |
| 9 | Very Remote | No detection control — will be found by end user |
| 10 | No Control | No inspection. Defect certain to reach customer. |
Detection scale is counter-intuitive: D=1 is best (certain to detect before customer), D=10 is worst (no control). The inverse scale trips people up constantly — lower Detection score means better controls. A poka-yoke that makes a defect physically impossible to produce gets D=1.
AIAG-VDA FMEA 2019 — What Changed and Why It Matters
The 2019 AIAG-VDA FMEA Handbook supersedes both AIAG FMEA 4th Edition and VDA Volume 4. It represents the most significant overhaul of automotive FMEA methodology in 25 years.
The Core Problem with Classic RPN
S=10, O=1, D=1 gives RPN=10
S=2, O=5, D=1 also gives RPN=10
The first case is a potential safety catastrophe. The second is trivial. Classic RPN treats them identically.
An Action Priority (AP) Table replaces the single RPN number. It uses a three-dimensional lookup — S, O, and D — to determine priority rather than simple multiplication.
S=9/10 always → High AP, regardless of O or D values.
Action Priority (AP) Categories
| Priority | Action Required | What it Means |
|---|---|---|
| High (H) | Mandatory action | Team MUST identify appropriate actions to improve prevention and/or detection. Management review required. Escalate if no actions identified. |
| Medium (M) | Action recommended | Team SHOULD identify improvement actions. Management discretion on whether to escalate. Document rationale if no action taken. |
| Low (L) | At team discretion | Team should consider improvement if easily achievable. Document rationale if no action taken. |
Key Changes in AIAG-VDA 2019 vs Classic FMEA
| Topic | Classic AIAG FMEA 4th Ed. | AIAG-VDA 2019 |
|---|---|---|
| Risk metric | Single RPN number (S×O×D) | Action Priority (AP) table — 3-dimensional |
| Severity 9/10 | May have low RPN, ignored | Always = High AP, always requires action |
| Process | 5 steps | 7 steps (added Planning & Preparation, Documentation) |
| Prevention vs Detection | Single "Current Controls" column | Separate: Prevention Controls + Detection Controls |
| New FMEA type | Not present | MSR — Monitoring & System Response (functional safety) |
| Failure chain | Mode → Effect → Cause | Structure Analysis → Function Analysis → Failure Analysis |
| Standard | AIAG FMEA 4th Ed. (2008) | AIAG-VDA FMEA Handbook (2019, joint) |
Transition Note: Many automotive OEMs are migrating to AIAG-VDA 2019 format and will begin requiring it in new PPAP packages. However, the traditional S×O×D RPN approach remains valid for non-automotive applications and is still widely used in military standards (MIL-STD-1629A), medical devices (ISO 14971), and aerospace (SAE ARP4761). When in doubt, confirm the customer's required FMEA format before starting.
Design & Development — ISO 9001:2015 §8.3
ISO 9001:2015 Section 8.3 establishes requirements for controlling the design and development of products and services. 80% of product costs are fixed at the design stage — making rigorous design control the highest-leverage quality activity.
ISO 9001:2015 §8.3 Structure
| Clause | Requirement | Key points |
|---|---|---|
| §8.3.1 | General | Establish, implement, and maintain a design and development process appropriate to ensure products meet requirements |
| §8.3.2 | Planning | Determine stages and controls, reviews, responsibilities, interfaces; consider nature, duration, and complexity |
| §8.3.3 | Inputs | Functional and performance requirements; statutory and regulatory requirements; previous similar designs; standards; potential failure consequences (FMEA, QFD, DFX, DFSS) |
| §8.3.4 | Controls | Reviews — evaluate results vs requirements (§8.3.4b); Verification — outputs meet inputs (§8.3.4c); Validation — product meets intended use (§8.3.4d) |
| §8.3.5 | Outputs | Meet input requirements; specify characteristics for provision; include acceptance criteria; identify critical characteristics |
| §8.3.6 | Changes | Identify, review, and control changes; review effects on constituent parts and already-delivered products |
Design Review vs Verification vs Validation
Design Review: Evaluate the ability of design results to meet requirements. Typically at 30%, 60%, 90% milestones. Multi-disciplinary for complex products. Areas: objectives, assumptions, alternatives, risks, budget, safety, maintainability.
Verification: Ensure design outputs meet design input requirements. "Are we building it right?" Checks design-to-spec conformance.
Validation: Ensure products meet requirements for intended use. "Are we building the right thing?" Tests against real-world customer use.
Design for X (DFX) — Design Excellence Disciplines
80% of product costs are fixed at the design stage. DFX disciplines optimise a specific aspect of the product. Note they sometimes conflict — integrated product development teams balance competing objectives.
| DFX | Full name | Primary objective | Key actions |
|---|---|---|---|
| DFM | Design for Manufacturing | Reduce manufacturing cost and difficulty | Reduce parts count; minimise fasteners; use standard parts (lower cost, shorter lead time, more reliable) |
| DFA | Design for Assembly | Ease and speed of assembly | Reduce parts; self-locating features; single-direction assembly |
| DFMaint | Design for Maintainability | Reduce downtime and maintenance cost | Easy access to serviceable parts; standardised replacement parts; reduced skill level; easy fault detection |
| DFR | Design for Reliability | Extend product useful life | Design for useful life; consider infant mortality and wear-out; remove weaknesses via FMEA; stress and derating |
| DFC | Design for Cost | Minimise total lifecycle cost | Use standard components; optimise tolerances; design for reuse and modularity |
| DFLog | Design for Logistics | Ease transport, storage, and tracking | Easy transport and storage; barcodes/traceability; standardisation; reusable packaging |
| DFEnv | Design for Environment | Minimise environmental impact | Design for repair, reuse, recycling; minimise hazardous materials; easy disassembly |
Design for Six Sigma (DFSS) — Methodologies
DFSS applies to new product/process design where no existing process exists to improve. Unlike DMAIC (which improves existing processes), DFSS builds quality in from concept.
DMADV
| Phase | Focus |
|---|---|
| Define | Process/design goals; identify CTQs |
| Measure | Measure CTQ aspects; establish baseline |
| Analyse | Analyse designs; identify best alternatives |
| Design | Detail design of product or process |
| Verify | Verify the chosen design meets requirements |
DMADOV
| Phase | Focus |
|---|---|
| Define | Goals and customer needs |
| Measure | CTQs and performance gaps |
| Analyse | Design alternatives |
| Design | Detail the design |
| Optimise | Refine — parameter and tolerance design |
| Verify | Verify and validate the design |
IDOV
| Phase | Focus |
|---|---|
| Identify | Voice of Customer; translate to CTQs |
| Design | Detail design of product or process |
| Optimise | Analyse and optimise design alternatives |
| Verify | Verify the chosen design |
IDOV explicitly starts with VOC — most customer-centric of the three DFSS methodologies
Technical Drawings, Tolerances & GD&T
Technical drawings are the universal language between design and manufacturing. The quality engineer must read drawings, understand tolerances, and interpret GD&T symbols — skills applied directly in everyday engineering practice.
1st Angle vs 3rd Angle Projection
| Attribute | 1st Angle (Europe / ISO) | 3rd Angle (USA / ASME) |
|---|---|---|
| Object position | Object in the first quadrant | Object in the third quadrant |
| View relationship | Object between observer and projection plane | Projection plane between observer and object |
| Projection plane | Non-transparent | Transparent |
| Top view placement | Top view placed below front view | Top view placed above front view |
| Standards | ISO / BS / DIN | ASME / ANSI |
Title Block Contents
| Mandatory | Additional |
|---|---|
| Organisation name/logo | Bill of materials |
| Drawing title & number | Notes & zone references (e.g. A5, B3) |
| Sheet & revision number | Finish / Weight / Heat treatment |
| Approvals (Prepared/Checked/Approved) | General tolerances |
| Units, scale, projection symbol | Surface roughness |
Engineering Drawing Line Types
| Line type | Purpose |
|---|---|
| Construction (light thin) | Auxiliary construction, projection lines |
| Outline (thick continuous) | Visible boundary of the object |
| Hidden (thin dashed) | Edges not visible from the current view |
| Centreline (chain) | Axis of symmetry, hole centres, pitch circles |
| Dimension line | Shows extent of a dimension with arrowheads |
| Break line (zigzag) | Object continues beyond drawn portion |
| Cutting plane (thick chain) | Defines plane of a section view |
| Hatch / Section lines | Material cross-section in section views |
Dimensioning Methods & Tolerance Fit Types
Dimensioning Methods
| Method | Description | Risk |
|---|---|---|
| Chain | Dimensions placed end-to-end | Tolerance accumulation / stack-up |
| Parallel | Multiple dimension lines all from same datum; no accumulation | More space required |
| Running | Parallel style but superimposed on one line; origin point marked | Can be harder to read |
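The stack-up risk of chain dimensioning is easy to quantify. A minimal sketch follows, assuming three hypothetical features each dimensioned at ±0.1 mm; worst-case (linear) accumulation is shown alongside a statistical (RSS) estimate for comparison.

```python
# Why chain dimensioning accumulates tolerance (illustrative values).
import math

tols = [0.1, 0.1, 0.1]                      # per-dimension tolerances, mm

worst_case = sum(tols)                      # chain stack-up: +/- 0.3 mm overall
rss = math.sqrt(sum(t**2 for t in tols))    # statistical (RSS) estimate: +/- 0.17 mm

print(f"Chain dimensioning, worst case: +/- {worst_case:.2f} mm")
print(f"Chain dimensioning, RSS:        +/- {rss:.2f} mm")
# Parallel dimensioning from one datum avoids the stack-up: each feature's
# location tolerance is just its own +/- 0.1 mm from the datum.
```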
MMC, LMC & Fit Types
| Term | Definition | Example |
|---|---|---|
| MMC | Maximum material within tolerance | Smallest hole, largest pin |
| LMC | Least material within tolerance | Largest hole, smallest pin |
| Clearance fit | Always space between mating parts | Sliding bearings |
| Interference fit | Parts always interfere — press/shrink fit | Press fits, permanent assembly |
| Transition fit | May be clearance or interference depending on actual dims | Locating fits |
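The three fit types follow directly from the MMC/LMC limits. Here is a short sketch that classifies a fit from hole and shaft limit dimensions; the numeric limits are hypothetical, chosen only to produce one example of each case.

```python
# Classify a fit from hole and shaft limits (illustrative values).
def fit_type(hole_min, hole_max, shaft_min, shaft_max):
    if hole_min > shaft_max:      # smallest (MMC) hole still clears largest (MMC) shaft
        return "clearance"
    if hole_max < shaft_min:      # largest (LMC) hole still smaller than smallest shaft
        return "interference"
    return "transition"           # depends on the actual mating sizes

print(fit_type(25.02, 25.05, 24.95, 24.98))   # clearance
print(fit_type(25.00, 25.02, 25.05, 25.08))   # interference
print(fit_type(25.00, 25.03, 25.02, 25.05))   # transition
```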
GD&T — Geometric Dimensioning & Tolerancing (ASME Y14.5)
GD&T is a symbolic language (ASME Y14.5-2009) that defines geometry according to functional limits. It provides a universal language between supplier, checker, and buyer — eliminating ambiguity in conventional ± tolerances.
Datum Reference Frame & Degrees of Freedom
A Datum is a perfect theoretical point, line, or plane. A Datum Feature is the physical surface where the datum is located. Three perpendicular datum planes constrain all 6 degrees of freedom:
| Datum | DOF constrained | Running total |
|---|---|---|
| Primary (A) | 3 (1 translation + 2 rotations) | 3 of 6 |
| Secondary (B) | 2 (1 translation + 1 rotation) | 5 of 6 |
| Tertiary (C) | 1 (the last translation) | 6 of 6 — fully constrained |
GD&T Characteristic Categories
| Category | Characteristics (ASME Y14.5) |
|---|---|
| Form | Flatness, Straightness, Circularity, Cylindricity |
| Orientation | Angularity, Perpendicularity, Parallelism |
| Location | True Position, Concentricity, Symmetry |
| Runout | Circular Runout, Total Runout |
| Profile | Profile of a Line, Profile of a Surface |
Flatness example: A glass sheet 1000×500 mm (size tolerance ±5 mm) with flatness 0.2 mm means the entire surface must lie within two parallel planes separated by 0.2 mm — a form control independent of the ±5 mm size tolerance.
Robust Design & Signal-to-Noise Ratios
Robust design improves quality by minimising the effects of variation without eliminating the causes. Taguchi's SNR ratios identify control factor settings that make the product insensitive to noise.
Control Factors vs Noise Factors
| Type | Definition | Examples |
|---|---|---|
| Control Factors | Can be set and controlled by the engineer | Welding: electrode type, position, preheat |
| Outer Noise | Consumer use conditions — difficult/expensive to control | Temperature, humidity, vibration, UV |
| Inner Noise | Product deterioration over time | Rusting, oxidation, wear, degradation |
| Between-Product Noise | Piece-to-piece variation | Dimensional variation, material property variation |
Three Design Stages (Taguchi)
1. System Design — Select the best design concept from alternatives using feasibility and technology benchmarking.
2. Parameter Design — Identify control factor settings that maximise SNR and make the product insensitive to noise. Uses orthogonal arrays. This is where Taguchi's method adds the most value.
3. Tolerance Design — Tighten tolerances only where necessary, reducing cost by avoiding unnecessarily tight tolerances everywhere.
Signal-to-Noise Ratio — What It Means Visually
SNR is the ratio of useful signal (what you want) to noise (what you don't want). A higher SNR means the product's response is dominated by the intended behaviour, not by variation. Taguchi's insight: maximise SNR always — regardless of whether the goal is smaller, larger, or on-target.
Three SNR Formulas — With Visual Context
The goal is always to maximise SNR. Taguchi unified three different engineering objectives into one consistent framework by choosing formulas where the maximum SNR always corresponds to the desired outcome.
Smaller-the-better: SNR = −10 log₁₀( (1/n) Σ yᵢ² ). Ideal = 0 or minimum. Wear, defects, contamination, shrinkage, response time.
Larger-the-better: SNR = −10 log₁₀( (1/n) Σ (1/yᵢ²) ). Ideal = maximum. Tensile strength, yield, fuel efficiency, pull force, adhesion.
Nominal-the-best: SNR = 10 log₁₀( ȳ² / s² ). Specific target value. Dimensions, resistance, weight, temperature, voltage output.
Key insight: All three SNR formulas use log base 10 (decibels). A higher SNR always means a more robust product. The sign convention ensures that maximising SNR always corresponds to the engineering objective — this is Taguchi's elegant unification of the three cases.
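A minimal Python sketch of the three formulas, using the standard textbook definitions above; the response values are hypothetical repetitions of a single experimental run.

```python
# Taguchi signal-to-noise ratios (dB) for the three objective types.
import math

def snr_smaller_is_better(y):
    return -10 * math.log10(sum(v**2 for v in y) / len(y))

def snr_larger_is_better(y):
    return -10 * math.log10(sum(1 / v**2 for v in y) / len(y))

def snr_nominal_is_best(y):
    mean = sum(y) / len(y)
    var = sum((v - mean) ** 2 for v in y) / (len(y) - 1)   # sample variance
    return 10 * math.log10(mean**2 / var)

shrinkage  = [0.8, 1.1, 0.9]       # smaller is better (e.g. % shrinkage)
pull_force = [48.0, 52.0, 50.0]    # larger is better (e.g. N)
diameter   = [10.02, 9.98, 10.01]  # nominal is best (target 10 mm)

print(f"{snr_smaller_is_better(shrinkage):.2f} dB")
print(f"{snr_larger_is_better(pull_force):.2f} dB")
print(f"{snr_nominal_is_best(diameter):.2f} dB")
```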
Risk Management
A structured approach to identifying, analyzing, and responding to uncertainty. Covers risk definitions (ISO 31000 & ISO 9000:2015), the full 5-step risk management process, qualitative and quantitative analysis tools including the Probability & Impact Matrix, and response strategies for both negative and positive risks.
Risk: Definitions & Key Concepts
Risk has two authoritative definitions in the quality engineering world. Understanding the nuance between them — and how they relate to opportunities and issues — is fundamental to every practitioner.
ISO 31000 — "Effect of uncertainty on objectives."
The enterprise risk management standard. Broad definition applicable to any organization at any level — strategic, operational, project, or product.
ISO 9000:2015 — "Effect of uncertainty."
The quality management vocabulary standard. An effect is a deviation from the expected — positive or negative. Risk is characterized by potential events, consequences, and their likelihood of occurrence.
| Term | Definition | Key Distinction |
|---|---|---|
| Risk | Effect of uncertainty on objectives; can be positive or negative | Future event — has not yet occurred |
| Opportunity | A positive risk — uncertainty with a favorable effect on objectives | You want to maximize these; exploit them |
| Issue | A risk that has already occurred | No longer a future uncertainty — it is a current problem requiring immediate response |
| Threat | A negative risk — uncertainty with an unfavorable effect on objectives | You want to minimize, transfer, or avoid these |
| Risk Appetite | The amount and type of risk an organization is willing to pursue or accept | Set by leadership; informs prioritization thresholds |
| Residual Risk | The risk remaining after risk responses have been implemented | Even after mitigating, some risk always remains |
Risk vs. Issue: Risk = future potential event. Issue = risk that has materialised. Once a risk occurs, it transitions to an issue and requires a workaround or corrective action, not a contingency plan.
Why Take Risk?
There is always a balance between risk and reward. Managing risk means finding the optimal point — not eliminating all risk.
"Higher risk, higher reward" is generally true — but not always. Higher risk does not guarantee higher reward. Smart risk management seeks better returns per unit of risk taken.
The goal is more rewards with less risk — achieved through systematic identification, analysis, and response planning.
ISO 9000:2015 — Key Nuances
- ▸An effect is a deviation from the expected — positive or negative
- ▸Risk is often characterized by reference to potential events and consequences, or a combination
- ▸Risk is often expressed as a combination of consequences × likelihood
- ▸The word "risk" is sometimes used only for negative consequences — but ISO 9000 explicitly includes positive effects
Risk Management: The 5-Step Process
Risk management is the identification, assessment, and prioritization of risks (positive or negative) followed by coordinated and economical application of resources to minimize, monitor, and control the probability and/or impact of unfortunate events — or to maximize the realization of opportunities.
| Step | Process | Key Activities | Output |
|---|---|---|---|
| 1 | Plan Risk Management | Define risk terms; define roles & responsibilities; select tools & templates; establish how to identify, analyze, respond, monitor risks | Risk Management Plan |
| 2 | Identify Risks | Systematic, methodic group process involving management, employees, customers, and other stakeholders; use brainstorming, FMEA, SWOT, Ishikawa | Risk Register |
| 3 | Analyze Risks | Qualitative (P&I Matrix — quick, subjective) and/or Quantitative (EMV, Monte Carlo, Decision Tree — detailed, analytic); prioritize risks | Prioritized Risk List |
| 4 | Plan Risk Response | For negative risks: Avoid, Mitigate, Transfer, Accept. For positive risks: Exploit, Enhance, Share, Accept. Assign risk owners. | Risk Response Plan |
| 5 | Monitor & Control Risks | Periodically review risk register; identify new risks; close resolved risks; conduct risk audits; handle unexpected risks with workarounds | Updated Risk Register; Workarounds |
Risk Management is iterative, not sequential: Although presented as 5 steps, risk management is a continuous loop. New risks emerge throughout a project or product lifecycle. The risk register is a living document that must be reviewed regularly — not created once and filed away.
Step 1 Detail: Plan Risk Management
The Risk Management Plan sets the ground rules — terms, roles, tools, scales, and thresholds — and defines how the team will carry out each subsequent step:
- ✦ Risk-related terms and definitions
- ✦ Roles and responsibilities (Risk Owner concept)
- ✦ Tools and templates for risk management
- ✦ Probability & impact scales to be used
- ✦ Risk thresholds (what score triggers action)
- ✦ Identify risks (who, when, tools)
- ✦ Analyze risks (qualitative and/or quantitative)
- ✦ Plan risk responses (owners, strategies)
- ✦ Monitor and control risks (frequency, triggers)
Step 2: Identify Risks
Risk identification is a systematic and methodic process best performed in a group environment. A wide range of stakeholders participate — management, employees, customers, and other interested parties. The output is a Risk Register listing all identified risks.
Key Characteristics
- ▸Systematic and methodic — not ad hoc
- ▸Best done in a group environment
- ▸Involves wide range of stakeholders
- ▸Identifies both positive and negative risks
- ▸Iterative — risks can emerge at any time
Who participates? Management, employees, customers, and other interested stakeholders — risk identification works best as a cross-functional group exercise.
Tools for Risk Identification
| Tool | Type | How It's Used for Risk ID | Best For |
|---|---|---|---|
| Brainstorming | Group technique | Most common approach; free-form idea generation in a group; facilitator captures all risks without judgment | All risk identification; starting point for any risk session |
| Ishikawa Diagram | Cause & Effect | Systematically explores causes across categories (Man, Machine, Method, Material, Environment, Measurement) | Process risks; identifying root-cause risk categories |
| Flow Diagram | Process mapping | Map the process; identify each step where something could go wrong — inputs, outputs, handoffs, decision points | Operational and process risks; supply chain risk |
| SWOT Analysis | Strategic tool | Strengths, Weaknesses, Opportunities, Threats; internal and external risk identification | Strategic and organizational risk; positive risks (opportunities) |
| FMEA | Failure analysis | Systematically identifies failure modes and their effects; each failure mode is a potential risk | Product/process design risks; manufacturing risks |
| Checklist / Historical Data | Historical reference | Review lessons learned from previous projects/products; use industry-standard risk checklists | Repeatable processes; established product lines |
| Expert Interviews / Delphi | Expert elicitation | Individual or structured group interviews; Delphi uses iterative anonymous surveys to converge on consensus | Novel technologies; unique or high-stakes projects |
The Risk Register
The risk register is the primary output of the Identify Risks process. It is a living document that is updated throughout all subsequent risk management steps.
| Risk Register Field | Description |
|---|---|
| Risk ID | Unique identifier for each risk |
| Risk Description | Clear statement of the risk event and its potential cause and effect |
| Risk Category | Classification (Technical, Schedule, Cost, Scope, External, etc.) |
| Probability Score | Likelihood of occurrence (added during Analyze step) |
| Impact Score | Consequence severity if risk occurs (added during Analyze step) |
| Risk Score | Probability × Impact (added during Analyze step) |
| Risk Owner | Person responsible for monitoring and responding to this risk |
| Response Strategy | Planned approach (Avoid/Mitigate/Transfer/Accept or Exploit/Enhance/Share/Accept) |
| Response Actions | Specific actions to implement the chosen strategy |
| Status | Active / Closed / Occurred (became an Issue) |
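For teams maintaining the register in code rather than a spreadsheet, a register entry maps naturally onto a small data structure. The sketch below mirrors the fields in the table above; the class name, field names, and example values are illustrative assumptions.

```python
# Hypothetical risk-register entry mirroring the fields listed above.
from dataclasses import dataclass, field

@dataclass
class RiskEntry:
    risk_id: str
    description: str
    category: str
    probability: int = 0          # filled in during the Analyze step
    impact: int = 0               # filled in during the Analyze step
    owner: str = ""
    response_strategy: str = ""   # Avoid/Mitigate/Transfer/Accept or Exploit/Enhance/Share/Accept
    response_actions: list[str] = field(default_factory=list)
    status: str = "Active"        # Active / Closed / Occurred (became an Issue)

    @property
    def risk_score(self) -> int:
        return self.probability * self.impact

r = RiskEntry("R-014", "Laser supplier lead time slips past launch date",
              "Schedule", probability=5, impact=7, owner="Program Manager",
              response_strategy="Mitigate")
print(r.risk_score)   # 35
```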
Step 3: Analyze Risks
Risk analysis prioritizes identified risks so that resources and attention can be focused on the highest-priority items. There are two main approaches: qualitative and quantitative.
Qualitative analysis: Quick and easy to perform. Uses descriptive or ordinal scales. Subjective by nature but valuable for initial prioritization when data is limited or time is short.
- ✦ Fast and cost-effective
- ✦ Subjective judgment
- ✦ Uses rating scales (Low/Medium/High or 1–9)
- ✦ Primary tool: Probability & Impact Matrix
- ✦ Good for all risks as initial screen
Quantitative analysis: Detailed and time-consuming. Uses numerical data to produce a statistical analysis of risk impact. Analytic, data-driven, and defensible.
- ✦ Requires real data or estimates
- ✦ Objective and numeric
- ✦ Tools: EMV Analysis, Monte Carlo, Decision Tree
- ✦ Used for high-priority risks (from qualitative screen)
- ✦ Provides probability distributions of outcomes
Quantitative Analysis Tools
| Tool | Description | When to Use |
|---|---|---|
| Expected Monetary Value (EMV) | EMV = Probability × Impact ($). Calculates the expected financial value of a risk. Positive EMV = opportunity; Negative EMV = threat. Sum all EMVs to get overall risk exposure. | Cost/benefit decisions on risk responses; comparing alternative responses; setting contingency reserves |
| Monte Carlo Analysis | Computer simulation that runs the project/process thousands of times with randomly sampled input values. Produces a probability distribution of outcomes (cost, schedule, etc.). | Complex projects with many interacting risks; when you need confidence intervals on outcomes |
| Decision Tree | Diagram showing decisions, chance events (with probabilities), and outcomes (with values). Calculate EMV at each branch to determine best decision path. | Go/no-go decisions; make-or-buy; alternative response strategies; multi-stage decisions under uncertainty |
| Sensitivity Analysis | Determines which risk variable has the most impact on outcomes. Often visualized as a Tornado Diagram — bars sorted by impact magnitude. | Identifying which risks deserve the most attention; resource prioritization |
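EMV and Monte Carlo are simple to sketch in code. Below is a minimal example using three hypothetical risks (one opportunity with a positive impact); the probabilities and dollar impacts are assumptions for illustration only.

```python
# EMV and a tiny Monte Carlo over hypothetical risks.
import random

risks = [
    {"name": "Tooling damage",     "p": 0.10, "impact": -50_000},
    {"name": "Launch delay",       "p": 0.30, "impact": -20_000},
    {"name": "Early volume award", "p": 0.20, "impact": +40_000},  # opportunity
]

emv_total = sum(r["p"] * r["impact"] for r in risks)
print(f"Overall risk exposure (EMV): {emv_total:+,.0f}")   # -3,000

# Monte Carlo: sample which risks occur in each trial and look at the
# distribution of total impact, not just its expected value.
random.seed(1)
totals = sorted(sum(r["impact"] for r in risks if random.random() < r["p"])
                for _ in range(10_000))
print(f"mean = {sum(totals)/len(totals):,.0f}")
print(f"5th / 95th percentile = {totals[500]:,.0f} / {totals[9500]:,.0f}")
```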
FMEA vs. P&I Matrix — Key Comparison
| Aspect | FMEA (Risk Priority Number) | Probability & Impact Matrix |
|---|---|---|
| Formula | RPN = Severity × Occurrence × Detection | Risk Score = Probability × Impact |
| Dimensions | 3 dimensions (adds Detection) | 2 dimensions (no Detection factor) |
| Impact / Severity | Severity (1–10 scale) | Impact (similar concept; often 1–9 scale) |
| Probability | Occurrence (1–10 scale) | Probability (1–9 or Low/Med/High) |
| Detection | Detectability score (1–10, inverse) | Not included |
| Primary Use | Product/process failure analysis | Project/process risk prioritization |
| Context | Design and process engineering | General risk management |
Probability & Impact (P&I) Matrix
The Probability and Impact Matrix is the primary qualitative risk analysis tool. It evaluates each risk on two dimensions — likelihood of occurrence and potential consequence — then combines them into a risk score used for prioritization.
Higher scores = higher priority risks requiring more immediate attention and resource allocation.
Sample Probability Scale
| Category | Score | Description |
|---|---|---|
| Very High | 9 | Risk event expected to occur |
| High | 7 | Risk event more likely than not to occur |
| Probable | 5 | Risk event may or may not occur (50/50) |
| Low | 3 | Risk event less likely than not to occur |
| Very Low | 1 | Risk event not expected to occur |
Sample Impact Scale (by Project Objective)
| Objective | Very Low (1) | Low (3) | Moderate (5) | High (7) | Very High (9) |
|---|---|---|---|---|---|
| Cost | Insignificant | <10% cost impact | 10–20% cost impact | 20–40% cost impact | >40% cost impact |
| Schedule | Insignificant | <5% schedule slip | 5–10% schedule slip | 10–20% schedule slip | >20% schedule slip |
| Scope | Barely noticeable | Minor areas impacted | Major areas impacted | Changes unacceptable to client | Product becomes useless |
| Quality | Barely noticeable | Minor functions impacted | Client must approve reduction | Quality reduction unacceptable | Product becomes useless |
P&I Matrix — Numerical (1–9 Scale)
| Prob ↓ / Impact → | 1 (Very Low) | 3 (Low) | 5 (Moderate) | 7 (High) | 9 (Very High) |
|---|---|---|---|---|---|
| 9 (Very High) | 9 | 27 | 45 | 63 | 81 ★ |
| 7 (High) | 7 | 21 | 35 | 49 | 63 |
| 5 (Moderate) | 5 | 15 | 25 | 35 | 45 |
| 3 (Low) | 3 | 9 | 15 | 21 | 27 |
| 1 (Very Low) | 1 ☆ | 3 | 5 | 7 | 9 |
Exam Example: A risk has Very Low probability (score = 1) but Very High impact (score = 9). Risk Score = 1 × 9 = 9. This falls in the yellow zone — medium priority. Compare to a risk with Moderate probability (5) and Moderate impact (5) = score of 25 — which is higher priority despite neither dimension being extreme.
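The scoring itself is one multiplication plus a zone lookup. A minimal sketch follows; the zone boundaries are illustrative assumptions chosen so the example above lands in the medium (yellow) zone — in practice each organisation sets its own thresholds in the Risk Management Plan.

```python
# Qualitative P&I scoring on the 1-9 scales above (zone cut-offs assumed).
def pi_score(probability: int, impact: int) -> tuple[int, str]:
    score = probability * impact
    zone = "high" if score >= 45 else "medium" if score >= 9 else "low"
    return score, zone

print(pi_score(1, 9))   # (9, 'medium')  -- very low probability, very high impact
print(pi_score(5, 5))   # (25, 'medium') -- moderate on both dimensions, higher priority
print(pi_score(9, 9))   # (81, 'high')
```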
Step 4: Plan Risk Response
Risk response planning determines how to decrease the possibility of negative risks affecting objectives and how to increase the possibility of positive risks helping objectives. Strategies differ depending on whether the risk is negative (threat) or positive (opportunity).
Negative risks (threats) — Goal: Reduce the probability, impact, or both of a negative event affecting your objectives.
Positive risks (opportunities) — Goal: Increase the probability, impact, or both of a positive event benefiting your objectives.
Accept applies to both sides: Accept is the only strategy that appears in both negative and positive risk response tables — but its meaning differs. For threats, Accept means tolerating the risk (passive or active). For opportunities, Accept means welcoming the benefit if it naturally occurs without actively pursuing it.
Step 5: Monitor & Control Risks
Risk monitoring and control is an ongoing process throughout the entire project or product lifecycle — not just a phase-end activity. The goal is to keep the risk register current, ensure response plans are being executed, and handle unexpected risks as they arise.
Core Activities
- ✦ Regularly review identified risks — are they still relevant?
- ✦ Identify and add new risks that emerge
- ✦ Remove risks that are no longer relevant or have been resolved
- ✦ Track triggers (warning signs) that indicate a risk is about to occur
- ✦ Verify that risk response plans are actually being implemented
- ✦ Confirm effectiveness of implemented responses
- ✦ Document lessons learned for future risk management
- ✦ May be conducted by an independent auditor
Handling Unexpected Risks: Workarounds
A workaround is an unplanned response to a risk event that was not identified or not expected. When a risk materializes as an issue and no contingency plan exists, a workaround is improvised to minimize the impact.
- ▸ Used to deal with unexpected risks to reduce their impact
- ▸ Workarounds should be documented — they become lessons learned and may identify new risks
- ▸ Distinguished from contingency plans: contingency = planned in advance; workaround = improvised on the fly
Risk vs. Issue vs. Workaround — Key Distinctions
| Concept | Timing | Response Type | Documentation |
|---|---|---|---|
| Risk | Future — has not yet occurred | Contingency Plan (planned in advance) | Risk Register |
| Issue | Present — the risk has now occurred | Execute contingency plan (if one exists) or workaround | Issue Log / Risk Register update |
| Workaround | Present — unidentified risk has occurred | Improvised, unplanned response | Document for lessons learned; update risk register |
| Contingency Plan | Created in advance (during Plan Risk Response) | Pre-defined actions triggered when a specific risk occurs | Risk Register / Risk Response Plan |
Risk Monitoring is Continuous: Risks change over time. A low-probability risk can become high-probability as circumstances change. A risk can be closed when conditions change such that it can no longer occur. New risks can emerge at any stage. Regular risk review meetings are best practice.
Risk Management — Quick Reference & Exam Summary
Key formulas, mnemonics, and comparison tables for rapid reference.
"Please Identify All Risk Management" — steps in order
Negative vs. Positive Risk Strategies — Side by Side
| Negative Risk (Threat) | Description | Positive Risk (Opportunity) | Description |
|---|---|---|---|
| Avoid | Eliminate the risk entirely — change the plan | Exploit | Ensure the opportunity definitely happens |
| Mitigate | Reduce probability and/or impact | Enhance | Increase probability and/or impact |
| Transfer | Shift financial impact to a third party | Share | Share the opportunity with a third party |
| Accept | Tolerate the risk (passive or active) | Accept | Welcome it if it occurs — without actively pursuing |
Common Pitfalls to Avoid
| Trap | Correct Understanding |
|---|---|
| Thinking all risks are negative | ISO 9000:2015 explicitly includes positive risks (opportunities). "Positive risk" is not an oxymoron. |
| Confusing Risk with Issue | Risk = future potential event. Issue = risk that has already materialized. They require different responses. |
| Thinking Transfer eliminates risk | Transfer moves the financial consequence to a third party — the risk event can still occur. It's not Avoid. |
| Confusing FMEA RPN with P&I Matrix score | FMEA uses 3 factors (including Detection). P&I Matrix uses only 2 (Probability × Impact, no Detection). |
| Thinking Qualitative must always precede Quantitative | Usually true in practice — qualitative analysis screens and prioritizes first — but it is not mandatory; either approach can be applied to a given risk depending on data availability. |
| Passive vs. Active Acceptance | Passive = no plan, just deal with it if it happens. Active = create contingency plan in advance for the accepted risk. |
| Workaround vs. Contingency Plan | Contingency plan = pre-planned response for an identified risk. Workaround = improvised response for an unexpected/unidentified risk. |
Risk Management Tools Summary
| Tool | Step Used | Type | Key Feature |
|---|---|---|---|
| Brainstorming | Identify | Group technique | Most common identification tool |
| Ishikawa Diagram | Identify | Cause & Effect | Organizes causes by category (6M) |
| SWOT Analysis | Identify | Strategic | Captures positive risks (Opportunities) |
| FMEA | Identify / Analyze | Failure analysis | RPN = Severity × Occurrence × Detection |
| P&I Matrix | Analyze (Qualitative) | Risk prioritization | Risk Score = Probability × Impact; color-coded zones |
| EMV Analysis | Analyze (Quantitative) | Financial analysis | EMV = P(%) × Impact($); sum across all risks |
| Monte Carlo | Analyze (Quantitative) | Simulation | Probability distribution of project outcomes |
| Decision Tree | Analyze (Quantitative) | Decision analysis | Visual branching of decisions and outcomes |
| Risk Register | All steps | Living document | Central repository for all risk information |
About the Author
Mahesh Babu Nelakurthi
I currently work as a Senior Quality and Reliability Engineer at Ultium Cells LLC — a GM and LG joint venture at the forefront of America's push toward electric mobility. In advanced, high-volume manufacturing, the stakes of getting quality right are real and immediate. That environment teaches you quickly that the most dependable approach is to think in first principles: go back to what is actually known, build your reasoning from there, and trust the data to show you where the process is telling the truth.
That experience also reinforced a conviction I have held for a long time: that the fundamentals matter most. Not because advanced methods are unimportant — but because the right basic question, asked precisely, almost always points to the answer. Quality Datalabs is built around that idea: a resource grounded in first principles, free to use, and written for engineers who want to understand the why behind every decision.
"We stand on the shoulders of giants. Deming, Juran, Shewhart, Taguchi, Ishikawa — they spent lifetimes building the foundations. That knowledge belongs to all of us."
Quality engineering knowledge has too often been locked behind expensive certifications, paywalled journals, and five-day seminars. This reference was built to change that — to make the full depth of quality engineering accessible to every engineer, at every level of their career.
Every Cpk we compute, every control chart we plot, every FMEA we run — these are acts of responsibility. Somewhere at the end of the supply chain is a person who will use what we make. They trust us, without knowing us, to have done the work properly.
As we stand on the shoulders of giants, we have a responsibility to be better, to strive continuously for quality products reaching the customer. That responsibility is not a burden — it is the privilege of the profession.
Have a suggestion, found an error, or want to contribute? Reach out — this reference grows through the community it serves.
🔗 Connect on LinkedIn · 📬 Send an Enquiry
Found an error, have a question about the content, or want to suggest a new topic? Fill in the form — I read every submission and will get back to you directly.
Live Calculator
24 interactive calculators covering Six Sigma, Probability, Reliability, GR&R, DOE, SPC, and Sampling. Enter values — results update instantly. No data leaves your browser.
Six Sigma & Process Capability
Convert between sigma levels, DPMO, and capability indices. Enter any combination — all results update instantly.
DPMO ↔ Sigma Level
Convert between defects per million opportunities and sigma level
Cp / Cpk / Ppk Calculator
Process capability from specification limits and process statistics
Z-Score ↔ Probability
Standard normal conversions — enter Z or probability
Sample Size for Capability Study
Minimum n to estimate Cpk with specified confidence
Probability
Classical probability rules, conditional probability, Bayes, and common distributions. Enter values and see step-by-step working.
Basic Probability Rules
Union, intersection, conditional — enter P(A) and P(B)
Bayes' Theorem
Update probability given new evidence — posterior from prior
Binomial Distribution
Probability of exactly k successes in n independent trials
Poisson Distribution
Count events per unit — defects per part, failures per hour
Reliability Engineering
MTBF, MTTR, availability, Weibull B-life, system reliability, and stress-strength interference — with live results.
MTBF / MTTR / Availability
Core reliability metrics from failure and repair data
Weibull B-Life & R(t)
Survival probability and B-life for any Weibull distribution
System Reliability
Series, parallel, or k-out-of-n configurations — up to 5 components
Stress-Strength Interference
Reliability when both stress and strength are random variables
GR&R / Measurement System Analysis
Gauge Repeatability & Reproducibility — enter variance components to get %GR&R, ndc, and AIAG acceptance guidance.
%GR&R from Variance Components
AIAG MSA 4th Ed. — enter EV, AV, PV standard deviations
%GR&R — Range Method (Quick)
From operator averages and range averages — AIAG short form
Design of Experiments
Number of runs, resolution, and design properties for full factorial, fractional factorial, Plackett-Burman, and Taguchi designs.
Experiment Run Calculator
How many runs for your design type? Enter factors and levels.
Main Effect & S/N Ratio
Factor effect magnitude and Taguchi Signal-to-Noise ratio
Statistical Process Control
Control limits for variables and attribute charts — enter your process data to get UCL, LCL, and center line instantly.
Control Limits Calculator
X̄-R, X̄-s, p, np, c, u charts — select type and enter data
Capability from Control Chart
Estimate Cp, Cpk from R̄ or s̄ without raw data
Sampling & Confidence Intervals
AQL sampling plans, confidence intervals for means and proportions, and reliability demonstration sample sizes.
AQL Sample Size — ANSI Z1.4
Lot-based acceptance sampling — single sampling plan
Confidence Intervals
For mean (t-interval) and proportion (Wilson score)
Reliability Demonstration
Zero-failure test: sample size to prove R* at confidence C