Introduction
CUSUM is a set of statistical procedures used in quality control. CUSUM stands for Cumulative Sum of Deviations.
In any ongoing process, be it the manufacture or delivery of products and services, once the process is established and running, the outcome should be stable and within defined limits near a benchmark. The situation is then said to be In Control.
When things go wrong, the outcomes depart from the defined benchmark. The situation is then said to be Out of Control.
In some cases, things go catastrophically wrong, and the outcomes depart from the benchmark in a dramatic and obvious manner, so that investigation and remedy follow. For example, a gear in an engine may fracture, causing the machine to seize. An example in health care is the employment of an unqualified fraud as a surgeon, followed by a sudden and massive increase in mortality and morbidity.
The detection of catastrophic departure from the benchmark is usually by the Shewhart Chart, not covered on this site. Usually, some statistically improbable outcome, such as two consecutive measurements outside 3 Standard Deviations, or 3 consecutive measurements outside 2 Standard Deviations, is used to trigger an alarm that all is not well.
In many instances, however, the departures from the outcome benchmark are gradual and small in scale, and these are difficult to detect. Examples are changes in the size and shape of products caused by progressive wearing out of machinery parts, reduced success rates over time as experienced staff are gradually replaced by novices in a work team, or increases in client complaints to a service department following a loss of adequate supervision.
CUSUM is a statistical process of sampling outcomes and summing departures from benchmarks. When the situation is in control, departures caused by random variations cancel each other numerically. In the out of control situation, departures from the benchmark tend to be unidirectional, so the sum of departures accumulates until it becomes statistically identifiable.
All CUSUM methods conform to the same conceptual model, although how the coefficients and parameters are calculated depends on the nature of the numbers being used. This section briefly describes this conceptual model. Greater detail is provided in the sections for each probability distribution.
- Background constants
  - Model constants are parameters that are assumed to remain the same throughout the process.
    - For Normally distributed means, the in control Standard Error is expected to remain the same even when the situation is out of control. If the CUSUM is calculated using single measurements, the standard error is the same as the standard deviation. If it uses means of multiple (n) samples, then the standard error is the standard deviation / sqrt(n)
    - For Normally distributed Standard Deviations, the same sample size used to calculate each Standard Deviation is used throughout the exercise
    - For Binomially distributed proportions, the same sample size is used to estimate the proportion throughout. For Bernoulli distributed proportions, the sample size is 1
    - For Negative Binomial proportions, the number of positive cases referenced is determined from the in control proportion and sample size; this number of positives remains the same throughout
    - For the Inverse Gaussian distribution, the λ value of the data is expected to remain the same
  - Benchmark values are either defined by the quality controller, or taken as the constant value produced when the situation is in control
    - For Normally distributed means and Inverse Gaussian distributed means, the in control mean value
    - For Normally distributed Standard Deviations, the in control Standard Deviation
    - For Poisson distributed counts, the in control count
    - For both Binomially and Bernoulli distributed proportions, and for Negative Binomial distributed counts, the in control proportion
  - Average Run Length (ARL) is the expected average number of consecutive samples required to produce one false positive alarm while the situation is still in control. It can be considered equivalent to the false positive rate, or the probability of a Type I Error. For example, ARL=100 corresponds to a 1% False Positive Rate, or p<=0.01. The ARL defines the sensitivity of the CUSUM process being planned.
    In the manufacturing industry, with thousands of items produced every hour, the ARL is often set to thousands. In health care, depending on the competing priorities of risks and costs, the ARL can be set from 20 (FPR=5%) to 1000 (FPR=0.1%). Often this depends on the amount of data being monitored, and the ARL is set so that a false alarm occurs only once a month to once a year.
- Statistical Parameters: two statistical parameters are calculated using the background constants
  - The Reference Value k is calculated from the in control and out of control values. The mathematics differs according to the probability distribution of the numbers being used. The reference value is used to modify the departure from the benchmark before it is used to create the CUSUM
  - The Decision Interval h is calculated using k and the ARL. The mathematics varies according to the probability distribution of the numbers being used. During monitoring, an alarm is triggered when the CUSUM value crosses the decision interval
- Measurement of Departure from Benchmark is represented by the symbol X, the comparison between the current measurement and the expected value if the situation is in control. In nearly all cases, the departure is the difference (current value - expected in control value). In CUSUM for Normally distributed variance, however, it is the ratio of variances ((current SD)^2 / (expected SD)^2)
- Winsorization is a statistical process whereby unexpected outliers with extreme values are modified before they are used to calculate the CUSUM. The algorithm for winsorization provided by Hawkins is not included in the StatTools programs, and users will need to modify extreme outlier values manually if they choose to do so
- Calculation and Plotting of CUSUM: during auditing, the value of the CUSUM is updated and plotted with each sampling, with the following conventions (a code sketch of this loop follows below)
  - The value of the CUSUM at the start can be
    - zero (0), assuming that all is well and there is no departure
    - a proportion of the decision interval h, usually half (h/2). The argument is that, if things are already out of control, the CUSUM will reach the decision interval sooner; if things are in control, the CUSUM will drift towards the baseline value of 0
  - With each new measurement, the departure (X) is calculated, and the CUSUM is amended using the formula CUSUM(t) = CUSUM(t-1) + X - k
  - If the CUSUM crosses the 0 value, it is reset to 0
  - If the CUSUM crosses the decision interval (h), an alarm is triggered, and the auditor then has two options
    - In a real, live audit situation, the CUSUM ends. Investigations and remedies are applied, and a new CUSUM then starts
    - In a research situation, the CUSUM continues without resetting, showing the movements of the CUSUM throughout the data space
The diagram to the left demonstrates the essential features of the CUSUM plot
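The loop described above is the same for every distribution; only X, k and h change. Below is a minimal sketch of this generic one-sided CUSUM loop in Python (the StatTools programs themselves are written in PHP; the function name here is illustrative, and k and h must come from the distribution-specific calculations described in the panels below).

```python
def run_cusum(departures, k, h, start_at_half_h=True):
    """Generic one-sided (upward) CUSUM loop.

    departures -- the departures X from the benchmark, one per sample
    k          -- reference value (distribution specific)
    h          -- decision interval (distribution specific)
    """
    c = h / 2 if start_at_half_h else 0.0  # starting value: h/2 or 0
    path, alarms = [], []
    for t, x in enumerate(departures, 1):
        c = c + x - k         # CUSUM(t) = CUSUM(t-1) + X - k
        if c < 0:
            c = 0.0           # crossing 0 defaults to 0
        path.append(c)
        if c >= h:            # crossing the decision interval triggers an alarm
            alarms.append(t)
            c = h / 2         # restart (a live audit would stop and investigate;
                              # research use would continue without resetting)
    return path, alarms
```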
Computer programs
The CUSUM programs in StatTools are written in PHP. The programs are adapted from the following sources:
- CRAN, the statistical resource archive for the R language, contains R and FORTRAN code for Hawkins' general programs for calculating k and h for Normally distributed location and variation, Inverse Gaussian, Binomial and Negative Binomial, and Poisson distributed data, as described in Hawkins DM (1992) Evaluation of average run lengths of cumulative sum charts for an arbitrary data distribution. Communications in Statistics - Simulation and Computation 21:4 p. 1001-1020.
  This code was converted to PHP for use in the StatTools programs. The k and h calculations were checked against the GetAnyH program downloaded from Hawkins and Olwell's web site http://www.stat.umn.edu/cusum/software.htm, and the StatTools results were found to be correct.
- Hawkins did not provide an algorithm for CUSUM when the data are proportions based on the Bernoulli distribution. The StatTools program was constructed following the algorithm described in Reynolds Jr. MR and Stoumbos ZG (1999) A CUSUM Chart for Monitoring a Proportion when Inspecting Continuously. Journal of Quality Technology 31:1 p. 87-108. No worked examples were provided in that paper, so the results of computation could not be checked for accuracy; however, the results appear plausible and reflect changes in proportions.
- Hawkins provided a CUSUM algorithm for the Inverse Gaussian distribution, which measures time to completion in individual cases, but not for the Exponential distribution. Gan provided a CUSUM algorithm for the Exponential distribution, which concerns the time between occurrences of an event (the Inverse Poisson distribution). The StatTools program was constructed following the algorithm described in Gan FF and Choi KP (1994) Computing Average Run Lengths for Exponential CUSUM Schemes. Journal of Quality Technology 26:2 p. 134-143. The results of calculations were compared with graphs published in that paper and found to be similar (close but not exact, as calculated numbers were compared with lines on a graph).
One Tail Tests
All CUSUM programs in StatTools use the one-tailed model to calculate k and h, so the CUSUM plots are always one sided, detecting either an increase or a decrease in measurements when the situation is out of control.
This means that, should users require CUSUM monitoring for departures from the benchmark in both directions, two CUSUM charts need to be constructed, one for an upward shift and another for a downward shift.
Different models of CUSUM are available in StatTools, depending on the nature of the data. On this page, a panel is provided to describe each of these models, but here is a summary
- There are 2 models using data that are Normally distributed
- CUSUM for means monitors changes to the mean value
- CUSUM for Standard Deviations monitors changes in variability of the data
- There are 2 models using data that are Proportions
- CUSUM for proportions (Bernoulli distribution) monitors changes to proportions of 1s using the
frequencies of 0/1 values
- CUSUM for proportions (Binomial distribution) monitors changes to proportions using the number of 1s in
sets of defined sample size
- There is 1 model using data that are Poisson distributed counts of events
- There are 2 models using data that have a positively skewed distribution, usually with a long tail to the right
  - CUSUM for Inverse Gaussian monitors changes to harmonic means. It is a useful method when the data are related to time as a measurement of duration, such as the time to run 100 m, the time to burn-out for light bulbs, the duration of labour, and so on.
    Ratios also have Inverse Gaussian distributions. As nearly all scalar measurements are ratios to a standard, it can be said that all measurements are Inverse Gaussian. When the range of interest and the variance are small and not close to zero, the λ value is large and the measurements are approximately Normal. If the variance is large compared to the range of interest, or the values are close to 0, then λ is small and the distribution is skewed. For example, body weight in adults is accepted as Normally distributed, while the mass of a recently fertilized embryo is Inverse Gaussian. Similarly, human height is accepted as Normally distributed, but distances between stars are Inverse Gaussian. In obstetrics, maternal age is considered Normally distributed, but the duration of the first stage of labour Inverse Gaussian, even though both are measurements of duration.
  - CUSUM for the Exponential distribution (also called the Inverse Poisson distribution) monitors changes to the Poisson process.
    In the Poisson distribution, the number of events occurring over an interval (usually a period of time) is the averaged count (λ). The inverse of this (β = 1 / λ) is the average interval (time) between events, which follows the Exponential distribution.
    Please note that this distribution is for the time between events, different from the Inverse Gaussian, which is for time as a measurement of duration, the time to complete a process in an individual.
    The Exponential distribution does not apply only to time, but to other environmental measurements as well. For example, the number of cells in a set volume of fluid follows the Poisson distribution, so the volume of fluid required to contain 1 cell follows the Exponential distribution.
    Other examples are the time interval between successive patient falls in an aged care facility, the time between traffic accidents at an intersection, and the time between births of abnormal babies in a particular population.
This panel explains and supports the calculations from CUSUMMean_Pgm.php. This is CUSUM for Normally distributed means (location), which is probably the most commonly used CUSUM model in quality assessment
- The first reason is that most measurements are Normally distributed, or approximately so
- The second reason is that data from many other distributions can be transformed to an approximately Normal distribution, which can then take advantage of this model. Numerical transformations are explained in Transformation_Exp.php, which also provides a link to algorithms for transformations.
- Finally, the Normal distribution mean model is visually simple and conceptually easy to understand.
The default parameters and data are an example using computer generated data. They represent a CUSUM process set up to monitor the amount of sugar being placed in packages for sale by an automated dispenser
- Mean (in control) is the benchmark, the mean value expected when the system is in control.
In our example,
we expect that each packet contains 100g of sugar if the process is in control and working properly
- Standard Error (SE) in control is the expected Standard Error when the situation is in control, and it is assumed to remain the same even when the system is out of control. Please note that the term Standard Error is used because the CUSUM sometimes uses the mean of a number of samples as input, so the Standard Deviation (SD) needs to be adjusted as SE = SD / sqrt(n). In most cases individual samples are monitored, so n=1 and SE = SD.
  In our example, we use individual samples for our CUSUM, and we expect the Standard Deviation (and therefore the Standard Error) to be 10g
- Mean (out of control) is the mean value when the system is out of control. It defines the departure from the benchmark that the CUSUM is designed to detect.
  In our example, if the dispenser wears out, we expect it to gradually increase the amount it dispenses. We would like to be notified if the dispenser regularly dispenses 102g or more, an increase of 2g, or one fifth of a Standard Error
- From the in control and out of control parameters, the reference value is calculated. In our example, k = 1 (for Normally distributed means, k lies half way between the in control and out of control departures)
- Average Run Length (ARL) is the expected average number of samples required to trigger one false alarm when the situation is actually in control. Statistically it represents the False Positive Rate, or α, the probability of a Type I Error.
  In our example, we set our ARL to 100, FPR=0.01, α=0.01
- From k and ARL, the decision interval is calculated to be 63.5
- The data represent the weights of the sugar packets as they are dispensed. These are computer generated random numbers with a Standard Deviation of 10; the first 30 values have a mean of 100, and the rest a mean of 102
The resulting CUSUM plot is as shown to the left. Please note the following
- The CUSUM starts at a value that is half of the decision interval
- With each new measurement, the departure X is calculated, and the CUSUM is amended so that C(t) = C(t-1) + X - k
- When the CUSUM crosses the value 0, it defaults to 0
- When the CUSUM reaches or exceeds the decision interval (h), an alarm is triggered, and the CUSUM restarts at h/2
- CUSUM values remain under the decision interval h while the process is in control, then increase after the shift in mean and repeatedly exceed the decision interval
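A sketch of this panel's monitoring loop follows, using the parameters above (in control mean 100g, SD 10g, k = 1, h = 63.5). The weights are simulated with numpy here, so alarm positions will differ from the program's example data; the 50/50 split between in control and shifted values is an assumption for illustration.

```python
import numpy as np

mu0, k, h = 100.0, 1.0, 63.5                 # example parameters from this panel

rng = np.random.default_rng(1)
weights = np.concatenate([rng.normal(100.0, 10.0, 50),   # in control
                          rng.normal(102.0, 10.0, 50)])  # shifted up by 2g

c = h / 2                                    # start at half the decision interval
for t, w in enumerate(weights, 1):
    x = w - mu0                              # departure X = measurement - benchmark
    c = max(0.0, c + x - k)                  # C(t) = C(t-1) + X - k, floored at 0
    if c >= h:
        print(f"alarm at sample {t}")
        c = h / 2                            # restart after the alarm
```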
This panel explains and supports the calculations from CUSUMSD_Pgm.php. This is a CUSUM for detecting changes in variability, expressed in terms of the Standard Deviation. Although it is not as commonly carried out as CUSUM for means, it is nevertheless important
- Firstly, one of the earliest changes when a system becomes out of control is that the output becomes more variable. When this happens, the Standard Deviation of the measurements may well increase before the mean value of the output does.
- Secondly, a change in the variability of the output usually alters the mean values as well. When a change in mean is detected, it is important to check the Standard Deviation, so that the faults in the system can be better identified.
The mathematics actually involves the ratio of variances, X = (current SD)^2 / (in control SD)^2, rather than a difference from the in control Standard Deviation. However, the parameters and the data are usually stated in terms of the Standard Deviation. The default parameters and data are an example using computer generated data. They represent a CUSUM process set up to monitor the amount of sugar being placed in packages for sale by an automated dispenser. Instead of monitoring the mean values, however, this CUSUM monitors the variability in terms of the Standard Deviation
- Standard Deviation (in control) is the benchmark, the Standard Deviation of the measurement expected when the system is in control.
  In our example, we expect that the Standard Deviation is 10g if the process is in control and working properly
- Standard Deviation (out of control) defines the departure from the in control value that the CUSUM is designed to detect.
  In our example, the CUSUM is designed to detect an increase to 10.5 (0.5/10 = a 5% increase in the in control Standard Deviation)
- Sample size (m) is the sample size that will be used to estimate the Standard Deviation as monitoring proceeds.
  In our example this is 10. This means that the Standard Deviation will be calculated on every 10 cases, and this is used for the CUSUM
- From in control and out of control parameters, and the sample size, the reference value k is calculated. In
our example, k = 1.05
- Average Run Length (ARL) is the expected average number of samples required to trigger one false alarm when the situation is actually in control. Statistically it represents the False Positive Rate, or α, the probability of a Type I Error.
  In our example, we set our ARL to 100, FPR=0.01, α=0.01
- From k and ARL, the decision interval is calculated to be 3.05
- The data represent the Standard Deviations of the sugar weights, each Standard Deviation calculated from 10 consecutive measurements. The 100 values therefore represent the data from 1000 samples of sugar. The data are computer generated random numbers, the first 50 centred around 10 and the rest around 10.5
The resulting CUSUM plot is as shown to the left. Please note the following
- The CUSUM starts at a value that is half of the decision interval
- When the CUSUM is less than 0, it defaults to 0
- When the CUSUM reaches or exceeds the decision interval (h), an alarm is triggered, and the CUSUM restarts at h/2
- CUSUM values remain under the decision interval h in the first 50 values, then increase in the second 50 and repeatedly exceed the decision interval
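A sketch of this panel follows, assuming the parameters above (in control SD 10, m = 10, k = 1.05, h = 3.05) and using the variance ratio as the departure, as described; the exact StatTools calculation may differ in detail, and the groups are simulated here for illustration.

```python
import numpy as np

sd0, m, k, h = 10.0, 10, 1.05, 3.05          # example parameters from this panel

rng = np.random.default_rng(2)
groups = [rng.normal(100.0, 10.0, m) for _ in range(50)] + \
         [rng.normal(100.0, 10.5, m) for _ in range(50)]
sds = [g.std(ddof=1) for g in groups]        # one SD per group of m = 10 cases

c = h / 2
for t, s in enumerate(sds, 1):
    x = s**2 / sd0**2                        # departure is the variance ratio
    c = max(0.0, c + x - k)
    if c >= h:
        print(f"alarm at group {t}")
        c = h / 2
```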
Two CUSUM procedures related to changes in proportions are provided by StatTools, the Binomial distribution and the Bernoulli distribution. This panel discusses the Binomial distribution, and supports the calculations in
CUSUMBinomial_Pgm.php.
The Binomial distribution examines the probability of having the number of positive cases (N+) in a group with sample size m.
The sensitivity and the stability of the CUSUM depend on the sample size m: the smaller the group, the more sensitive and the less stable it is.
The default and example is an audit of Caesarean Section in an obstetric unit
- The Proportion in control p0 is our benchmark. We expect that 20% (0.2) of our deliveries would be
by Caesarean Section
- The Proportion out of control p1 is 25% (0.25). We design the CUSUM to detect an increase from 20% to 25%
- The CUSUM is carried out by counting the number of positive cases (N+) in each group. This number is a function of the proportion and the sample size. In our example, we examine data in groups of 10 (m=10), so the expected number of positives is
- When in control, N+ = p0 * m = 0.2 * 10 = 2
- When out of control, N+ = p1 * m = 0.25 * 10 = 2.5
- The Average Run Length is the average number of groups observed before an alarm is triggered even though the situation is in control. This is equivalent to the False Positive Rate. In our example we assigned ARL=100, a false positive rate of 1%, or α=0.01
- From the proportions in and out of control, the group sample size, and the ARL, two parameters are calculated. From our example parameters, the Reference Value k = 2, and the Decision Interval h = 11.5
- The data is a single column, each row containing the number of positives (N+) in a group of size m (in our example m=10). The data in our example are 20 randomly generated integers, the first 10 centred around 2 and the second 10 centred around 2.5. These data represent a total of 200 deliveries, divided into 20 consecutive groups of 10, and each number represents the number of Caesarean Sections in that group.
- The CUSUM starts with a value of half the decision interval (h/2). With each new group
  - If the CUSUM crosses the 0 value, it is reset to 0
  - If the CUSUM crosses the Decision Interval h, the alarm is triggered, and the CUSUM starts again with the value of h/2
The resulting CUSUM plot is as shown to the left. Please note the following
- The CUSUM starts at a value that is half of the decision interval
- With each new group, where the number of positives is N+, the CUSUM is amended so that C(t) = C(t-1) + N+ - k
- When the CUSUM crosses the value 0, it defaults to 0
- When the CUSUM reaches or exceeds the decision interval (h), an alarm is triggered, and the CUSUM restarts at h/2
- CUSUM values remain under the decision interval h in the first 10 values (groups), then increase in the second 10 and eventually exceed the decision interval
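A sketch of this panel follows, assuming the parameters above (m = 10, p0 = 0.2, p1 = 0.25, k = 2, h = 11.5); the group counts are simulated with numpy for illustration, so alarm positions will differ from the program's example data.

```python
import numpy as np

m, k, h = 10, 2.0, 11.5                      # example parameters from this panel

rng = np.random.default_rng(3)
n_pos = np.concatenate([rng.binomial(m, 0.20, 10),   # 10 in control groups
                        rng.binomial(m, 0.25, 10)])  # 10 out of control groups

c = h / 2
for t, npos in enumerate(n_pos, 1):
    c = max(0.0, c + npos - k)               # C(t) = C(t-1) + N+ - k
    if c >= h:
        print(f"alarm at group {t}")
        c = h / 2
```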
Two CUSUM procedures related to changes in proportions are provided by StatTools, the Binomial distribution and the Bernoulli distribution. This panel discusses the Bernoulli distribution, and supports the calculations in
CUSUMBernoulli_Pgm.php.
The Bernoulli distribution is a special case of the Binomial Distribution. It examines the probability of each individual case having a value of true (1, +) or false (-, 0).
The advantage of using the Bernoulli distribution for CUSUM is that the CUSUM value can be updated with every case, based on whether the case is true or false. It is therefore more responsive to changes, as it does not have to wait for the collection of a group of cases before a proportion can be calculated.
The disadvantage of doing CUSUM using the Bernoulli distribution is that the model is highly sensitive to any change, so short term variations may cause marked changes to the CUSUM and trigger false alarms. An example is monitoring adverse surgical outcomes when most of the dangerous operations are carried out on a particular day by a senior surgeon, so that adverse outcomes peak one day a week rather than being averaged over the whole week, causing a false alarm to be triggered
The default and example is an audit of Caesarean Section in an obstetric unit
- The Proportion in control is our benchmark. We expect that 20% (0.2) of our deliveries would be
by Caesarean Section
- The Proportion out of control is 25% (0.25). We design the CUSUM to detect an increase from 20% to 25%
- The Average Number of Observations to Signal (ANOS) is similar in concept to the Average Run Length (ARL) in other CUSUMs, and is the average number of observations before an alarm is triggered even though the situation is in control. This is equivalent to the False Positive Rate. In our example we assigned ANOS=100, a false positive rate of 1%, or α=0.01
- From the proportions in and out of control, and the ANOS, two parameters are calculated
  - The Reference Value, γB, is equivalent to k in other CUSUM programs, except that it is used differently to construct the CUSUM. From our example, γB=0.2243
  - The Decision Interval, h, is similar to that of other CUSUM models, and is used to trigger an alarm
- The data is a single column of 0s and 1s, representing false/true, -/+, no/yes. The data in our example are 60 randomly generated values, the first 30 with a 20% frequency of 1s, and the second 30 with 25% 1s.
- The CUSUM starts with a value of half the decision interval (h/2). With each new case
  - If the case is 1 (true, +, yes), CUSUM(t) = CUSUM(t-1) + 1/γB - 1
  - If the case is 0 (false, -, no), CUSUM(t) = CUSUM(t-1) - 1
  - If the CUSUM crosses the 0 value, it is reset to 0
  - If the CUSUM crosses the Decision Interval h, the alarm is triggered, and the CUSUM starts again with the value of h/2
The resulting CUSUM plot is as shown to the left. Please note the following
- The CUSUM starts at the value h/2
- When the CUSUM is less than 0, it defaults to 0
- When the CUSUM reaches or exceeds the decision interval (h), an alarm is triggered, and the CUSUM restarts at h/2
- The CUSUM has a saw tooth appearance, as it either increases with a 1 or decreases with a 0 with every case
- CUSUM values increase in the second 30 cases and eventually exceed the decision interval
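The reference value γB can be obtained from the likelihood-ratio formulation of the Bernoulli CUSUM; the formula below reproduces the γB = 0.2243 quoted above for p0 = 0.2 and p1 = 0.25. This is a sketch assuming that formulation (following Reynolds and Stoumbos) rather than the StatTools implementation itself, and since the decision interval is not stated numerically in this panel, the h used here is purely illustrative.

```python
import math
import random

p0, p1 = 0.20, 0.25
h = 5.0                                      # illustrative only; not given in this panel

# gamma_B from the likelihood ratio; evaluates to 0.2243 for these proportions
gamma_b = math.log((1 - p0) / (1 - p1)) / \
          math.log(p1 * (1 - p0) / (p0 * (1 - p1)))

random.seed(4)
cases = [1 if random.random() < p0 else 0 for _ in range(30)] + \
        [1 if random.random() < p1 else 0 for _ in range(30)]

c = h / 2
for t, x in enumerate(cases, 1):
    c += (1 / gamma_b - 1) if x == 1 else -1.0   # saw tooth: up on 1, down on 0
    c = max(0.0, c)
    if c >= h:
        print(f"alarm at case {t}")
        c = h / 2
```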
The Poisson distribution concerns events occurring within a defined environment, such as the number of cells in a volume of fluid, or the number of tadpoles in a pond.
The most common environment, however, is time, so most counts are in terms of a unit of time. In health care, commonly used counts are the number of complaints received by a hospital in a month, the number of falls in an aged care facility per month, the number of adverse incidents in an Intensive Care Unit per week, and so on.
The classical example is von Bortkiewicz's data on the number of soldiers in the Prussian army who were accidentally kicked to death by horses.
When monitoring Poisson distributed counts, it is important that the environment is clearly defined and remains constant throughout. For this reason, evaluations can only take place at set intervals, as the intervals must be long enough for some events to occur.
Consequently, the newer method in the CUSUM for Exponentially Distributed Data Explained Page, which uses the interval between events (the inverse of the Poisson count), is increasingly preferred, as evaluation can take place after each event.
This panel, however, supports the algorithm of CUSUM for Poisson distributed counts, in the program CUSUM and Shewhart Charts for Poisson Distributed Counts Program Page.
The default and example is an audit of the number of complaints per month from patients and relatives in a hospital
- The Mean Count in control λ0 is our benchmark, derived from experience. We expect 3 complaints a month. However, if supervision and care deteriorate, complaints may increase
- The Mean Count out of control λ1 is set at 5. Our CUSUM is designed to trigger an alarm if the number of complaints increases to 5 per month or beyond
- The Average Run Length is the average number of episodes, in this example the number of months, observed before an alarm is triggered even though the situation is in control. This is equivalent to the False Positive Rate. In our example we assigned ARL=100, a false positive rate of 1%, or α=0.01
- From the mean counts in and out of control, and the ARL, the Reference Value k = 4 and the Decision Interval h = 5.6 are calculated
- The data is a single column, each row containing the number of events in the period. The data in our example are 20 randomly generated integers, the first 10 centred around 2 and the second 10 centred around 5. These represent the number of complaints per month over 20 months of monitoring
- The CUSUM starts with a value of half the decision interval (h/2). With each new month
  - If the CUSUM crosses the 0 value, it is reset to 0
  - If the CUSUM crosses the Decision Interval h, the alarm is triggered, and the CUSUM starts again with the value of h/2
The resulting CUSUM plot is as shown to the left. Please note the following
- The CUSUM starts at a value that is half of the decision interval
- With each month, the CUSUM is amended so that C(t) = C(t-1) + count - k
- When the CUSUM crosses the value 0, it defaults to 0
- When the CUSUM reaches or exceeds the decision interval (h), an alarm is triggered, and the CUSUM restarts at h/2
- CUSUM values remain under the decision interval h in the first 10 values (months), then increase in the second 10 and eventually exceed the decision interval. An alarm is triggered and the CUSUM starts again, reaching the decision interval again, and a second alarm is triggered
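A sketch of this panel follows, assuming the parameters above (λ0 = 3, λ1 = 5, k = 4, h = 5.6); the monthly counts are simulated with numpy for illustration.

```python
import numpy as np

k, h = 4.0, 5.6                              # example parameters from this panel

rng = np.random.default_rng(5)
counts = np.concatenate([rng.poisson(3, 10),    # 10 in control months
                         rng.poisson(5, 10)])   # 10 out of control months

c = h / 2
for month, n in enumerate(counts, 1):
    c = max(0.0, c + n - k)                  # C(t) = C(t-1) + count - k
    if c >= h:
        print(f"alarm in month {month}")
        c = h / 2
```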
This panel supports the algorithm of CUSUM for Inverse Gaussian distributed measurements, in the program CUSUM for Inverse Gaussian Distribution.
The following paragraph is a summary of the description of the Inverse Gaussian distribution provided by Wikipedia.
Formally, the distribution is the probability distribution of the harmonic mean, 1/x, when x is Normally distributed. The distribution is skewed with a long tail to the right, and its characteristics are defined by its mean, μ, and its level of skewness, λ. Both μ and λ are greater than 0. Skewness is most extreme with low values of λ, and the distribution becomes Normal (Gaussian) when λ=∞.
λ is difficult to compute formally, but an approximation can be obtained by rearranging the skewness formula
skewness = 3 * sqrt(μ / λ)
μ / λ = (skewness / 3)^2
λ = μ / (skewness / 3)^2
For any set of data, the mean (μ) and skewness can be determined, so λ can also be determined. StatTools provides calculations for mean and skewness at Data Testing for Normal Distribution Program Page
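As a worked instance of this rearrangement (with illustrative numbers rather than StatTools output):

```python
def estimate_lambda(mean, skewness):
    """Approximate the Inverse Gaussian lambda by rearranging
    skewness = 3 * sqrt(mu / lambda)."""
    return mean / (skewness / 3) ** 2

# a sample with mean 1.0 and skewness 3 gives lambda = 1.0
print(estimate_lambda(1.0, 3.0))
```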
Although CUSUM for the Inverse Gaussian distribution can be a useful method to audit any measurement that has a positive skew (long right tail), it is particularly appropriate for measurements of time as duration, or time to complete a task: in manufacturing, the time required to produce a product or complete planning; in health care, the duration on a waiting list; in labour, the durations of the first and second stages.
Ratios also have an Inverse Gaussian distribution. Given that nearly all measurements are ratios to a standard unit, all measurements can be considered Inverse Gaussian. However, where the variance is small compared with the range of interest, most measurements are approximately Normally distributed. For example, the weight of a newborn is usually accepted as Normally distributed, but the mass of stars has an Inverse Gaussian distribution
In many circumstances, the use of the complex mathematics involved with Inverse Gaussian distribution may be unnecessary, as skewed data can be transformed by logarithm, square root, cube root, or the Box Cox transformation to become approximately Normal. StatTools provides algorithms for transformation in Numerical Transformation Program Page
Should users still wish to proceed with CUSUM for the Inverse Gaussian distribution, the following discussion uses the example parameters and data from CUSUM for Inverse Gaussian Distribution. This example evaluates the waiting time of patients in a clinic waiting to be seen.
- Mean in control μ0 is our benchmark. We expect that patients wait about an hour before they are seen
- Lambda in control λ0 is the level of our skewness. From experience, we decided that λ0=1
- Mean out of control μ1 is set at 1.5. Our CUSUM is designed to trigger an alarm if the waiting time increases to 1.5 hours or more
- Average Run Length is the average number of patients reviewed before an alarm is triggered, even though the situation is in control. This is equivalent to the False Positive Rate. In our example we assigned ARL=100, a false positive rate of 1%, or α=0.01
- From the means in and out of control, λ0, and the ARL, the Reference Value k = 1.2 and the Decision Interval h = 5.6 are calculated
- The data is a single column, each row containing the waiting time of a patient. The data in our example are 100 randomly generated values with a positive skew, the first 50 centred around 1.0 and the second 50 centred around 1.5.
- The CUSUM starts with a value of half the decision interval (h/2). With each new patient
  - If the CUSUM crosses the 0 value, it is reset to 0
  - If the CUSUM crosses the Decision Interval h, the alarm is triggered, and the CUSUM starts again with the value of h/2
The resulting CUSUM plot is as shown to the left. Please note the following
- The CUSUM starts at a value that is half of the decision interval
- With each patient, the CUSUM is amended so that C(t) = C(t-1) + X - k (where X is the waiting time)
- When the CUSUM crosses the value 0, it defaults to 0
- When the CUSUM reaches or exceeds the decision interval (h), an alarm is triggered, and the CUSUM restarts at h/2
- CUSUM values remain under the decision interval h in the first 50 values, then increase in the second 50 and eventually exceed the decision interval. An alarm is triggered and the CUSUM starts again, reaching the decision interval again, and a second alarm is triggered
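A sketch of this panel follows, assuming the parameters above (μ0 = 1, λ0 = 1, μ1 = 1.5, k = 1.2, h = 5.6); numpy's Wald sampler (the Inverse Gaussian distribution) is used to simulate the waiting times for illustration.

```python
import numpy as np

k, h = 1.2, 5.6                              # example parameters from this panel

rng = np.random.default_rng(6)
waits = np.concatenate([rng.wald(1.0, 1.0, 50),    # in control: mu = 1, lambda = 1
                        rng.wald(1.5, 1.0, 50)])   # out of control: mu = 1.5

c = h / 2
for t, x in enumerate(waits, 1):             # X is the waiting time itself
    c = max(0.0, c + x - k)                  # C(t) = C(t-1) + X - k
    if c >= h:
        print(f"alarm at patient {t}")
        c = h / 2
```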
Exponentially distributed measurements are data that follow the inverse of the Poisson distribution, sometimes referred to as the Poisson process. The distribution is positively skewed with a long right tail.
In the Poisson distribution, the number of events occurring over an interval (usually a period of time) is the averaged count (λ). The inverse of this (β = 1 / λ) is the average interval (time) between events, which follows the Exponential distribution.
Many measurements follow the Exponential distribution. Examples are
- Time to events, the time interval between occurrences. For example, the time between adverse events in a health facility.
- Other environmental qualifiers. For example, instead of the number of cells in a volume of fluid, the volume of fluid required to find one cell.
In many circumstances, the use of the complex mathematics involved with Exponential distribution may be unnecessary, as skewed data can be transformed by logarithm, square root, cube root, or the Box Cox transformation to become approximately Normal. StatTools provides algorithms for transformation in Numerical Transformation Program Page
Should users still wish to proceed with CUSUM for the Exponential distribution, the following discussion uses the example parameters and data from the CUSUM for Exponentially Distributed Measurements Program Page. This example evaluates the quality of an aged care facility. Instead of using the number of falls per month as a measurement of quality of care (a Poisson distributed count), the time intervals between successive falls are used (the Exponential or Inverse Poisson distribution).
- Measurement in control β0 is our benchmark, and in our example β0 = 100. We expect that, in our nursing home, patients do not fall more frequently than once every 100 days
- Measurement out of control β1 is, in our example, β1 = 80. If the quality of care deteriorates, falls become more frequent and the intervals between falls decrease. Our CUSUM is designed to trigger an alarm if the interval between falls decreases to 80 days or less
- Average Run Length is the average number of falls observed before an alarm is triggered, even though the situation is in control. This is equivalent to the False Positive Rate. In our example we assigned ARL=100, a false positive rate of 1%, or α=0.01
- From the measurements in and out of control, and the ARL, the Reference Value k = 0.89 and the Decision Interval h = -5.55 are calculated
- The data is a single column, each row containing the number of days between successive falls. The data in our example are 100 randomly generated values with a positive skew, the first 50 centred around 100 and the second 50 centred around 80.
- The CUSUM starts with a value of half the decision interval (h/2). With each fall, where n is the number of days since the previous fall
  X = n / β0
  C(t) = C(t-1) + X - k
  - If the CUSUM crosses the 0 value, it is reset to 0
  - If the CUSUM crosses the Decision Interval h, the alarm is triggered, and the CUSUM starts again with the value of h/2
The resulting CUSUM plot is as shown to the left. Please note the following
- To detect a decrease, h is a negative value
- The CUSUM starts at a value that is half of the decision interval
- When the CUSUM crosses the value 0, it defaults to 0
- When the CUSUM reaches or falls below the decision interval (C <= h), an alarm is triggered, and the CUSUM restarts at h/2
- CUSUM values remain above the decision interval h in the first 50 values, then decrease in the second 50 and eventually cross the decision interval. An alarm is triggered and the CUSUM starts again
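A sketch of this panel follows, assuming the parameters above (β0 = 100, β1 = 80, k = 0.89, h = -5.55). Because a decrease is being detected, the CUSUM is capped above at 0 and the alarm fires when it falls to h or below; the intervals are simulated with numpy for illustration.

```python
import numpy as np

beta0, k, h = 100.0, 0.89, -5.55             # example parameters; h is negative

rng = np.random.default_rng(7)
gaps = np.concatenate([rng.exponential(100.0, 50),   # in control: ~100 days apart
                       rng.exponential(80.0, 50)])   # out of control: ~80 days

c = h / 2                                    # start halfway to the decision interval
for t, n in enumerate(gaps, 1):
    x = n / beta0                            # scaled interval X = n / beta0
    c = min(0.0, c + x - k)                  # capped above at 0, drifts downwards
    if c <= h:
        print(f"alarm at fall {t}")
        c = h / 2
```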
CUSUM for Normal, Binomial, Poisson, and Inverse Gaussian Distributions
- Hawkins DM, Olwell DH (1997) Cumulative Sum Charts and Charting for Quality Improvement. Springer-Verlag, New York. ISBN 0-387-98365-1. p. 47-74
- A computer program to calculate CUSUM decision limits can be downloaded from http://www.stat.umn.edu/cusum/software.htm
- Hawkins DM (1992) A fast accurate approximation for average run lengths of CUSUM control charts. Journal of Quality Technology 24:1 (Jan) p. 37-43 (this is the algorithm used by StatTools in the CUSUM for Normally Distributed Means Program Page)
CUSUM for Proportions (Bernoulli) Distribution
- Reynolds Jr. MR and Stoumbos ZG (1999) A CUSUM Chart for Monitoring a Proportion when Inspecting Continuously. Journal of Quality Technology 31:1 p. 87-108
- Reynolds MR and Stoumbos ZG (2000) A general approach to modeling CUSUM charts for a proportion. IIE Transactions 32:6 p. 515-535
CUSUM for Exponential (Inverse Poisson) Distribution
- Gan FF (1994) Design of Optimal Exponential CUSUM Charts. Journal of Quality Technology 26:2 p. 109-124. Program code in FORTRAN available at http://lib.stat.cmu.edu/jqt/26-2
- Gan FF and Choi KP (1994) Computing Average Run Lengths for Exponential CUSUM Schemes. Journal of Quality Technology
26:2 p. 134-143