CUSUM proportion exp

The CUSUM chart is used to detect small, persistent changes. It is based on the cumulative sum of the differences between sampled measurements and the mean (hence CUSUM). In the "in control" state the CUSUM hovers around the expected level, because deviations around the mean cancel each other out. In the "out of control" state there is a bias away from the expected mean, and the CUSUM drifts away from the expected level.

The CUSUM programs on this site follow the approach outlined in the textbook by Hawkins and Olwell (see references), summarized as follows.

The user defines the "in control" state. The central tendency and variance are defined according to the nature of the data. For normally distributed measurements, these are the mean and the standard deviation.
Using specific algorithms, the program calculates the level of departure from the central tendency at which the "out of control" alarm should be triggered. This level is abbreviated as h.
h depends on the amount of departure (k) the system is designed to detect, and on the sensitivity of detection, expressed as the average run length (ARL). The ARL is the expected number of observations between false alarms when the situation is "in control". Conceptually this corresponds to the probability of Type I Error (α), roughly α = 1/ARL: an ARL of 20 is equivalent to α=0.05, and an ARL of 100 to α=0.01.
Once the chart and the ARL are defined, sampling takes place at regular intervals. The departure from the expected value is corrected by k, then added to the CUSUM. If the CUSUM regresses to 0 or beyond, as it often does when the situation is "in control", the CUSUM recommences at 0.
In most cases, therefore, two CUSUMs can be plotted, one for an excessive increase in value and one for an excessive decrease in value (two tails). In most quality control situations, however, only one of the tails is of interest.
In the programs of this site, 3 levels of h are offered by default, for ARLs of 20 (α=0.05) for a yellow alert, 50 (α=0.02) for an orange alert, and 100 (α=0.01) for a red alert. The idea is that a yellow alert should trigger heightened expectancy, an orange alert an investigation, and a red alert an immediate response. However, these are merely recommendations, and users should define their own levels of sensitivity.
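
The steps above can be sketched in Python for normally distributed measurements. This is a minimal illustration, not the site's program: the function name upper_cusum, the sample data, and the values of k and h are all hypothetical, and h is taken as given rather than derived from the ARL.

```python
def upper_cusum(values, target_mean, k, h):
    """One-sided (upper) CUSUM; returns the path and the index of the first alarm."""
    cusum = 0.0
    path = []
    alarm_at = None
    for i, x in enumerate(values):
        # Departure from the expected mean, corrected by the reference value k.
        cusum += (x - target_mean) - k
        # Recommence at 0 whenever the CUSUM regresses to 0 or beyond.
        cusum = max(0.0, cusum)
        path.append(cusum)
        if alarm_at is None and cusum > h:
            alarm_at = i
    return path, alarm_at


# Hypothetical data: in control around 10, then shifted upward by about 1.
data = [10.1, 9.8, 10.0, 10.2, 9.9, 11.2, 10.9, 11.1, 11.3, 10.8]
path, alarm_at = upper_cusum(data, target_mean=10.0, k=0.5, h=2.0)
print(alarm_at)  # 8 (zero-based index of the first observation above h)
```

A chart for an excessive decrease would mirror this, accumulating (target_mean - x) - k instead.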

CUSUM charts, similar to those for normally distributed measurements, can also be constructed for proportions. However, they have the following disadvantages.

Proportions are difficult to handle when repeated evaluations are carried out. The variance of a proportion is sample size dependent, and the sample size changes as data are continuously collected. This makes the definition of the "in control" mean and its variance too unstable to use.
A proportion can only be calculated after a defined sample size has been reviewed and the number of positive cases counted. Conflicting priorities therefore exist in determining the best sample size to use. Too small a sample size produces unstable results. Too large a sample size delays evaluation and detection of the "out of control" state.

These serious shortcomings have made CUSUM based on the Binomial distribution impractical.
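
The dependence on sample size can be seen from the standard error of a proportion, sqrt(p(1 - p)/n). The short sketch below uses a hypothetical helper name and hypothetical sample sizes purely for illustration.

```python
import math


def proportion_se(p, n):
    """Standard error of a proportion p estimated from n cases."""
    return math.sqrt(p * (1.0 - p) / n)


# The same underlying proportion (20%) looks far less stable in small samples.
for n in (20, 100, 500):
    print(n, round(proportion_se(0.2, n), 4))
```

With p = 0.2 the standard error is about 0.09 for 20 cases but 0.04 for 100 cases, which is the trade-off between stability and delay described above.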

CUSUM for Proportions Based on Bernoulli Distribution

The Bernoulli distribution is a special case of the Binomial distribution. It has only two values: 1 for positive and 0 for negative. The Bernoulli distribution gives the probability that a single case will be 0 or 1, depending on the background binomial distribution. The use of the Bernoulli distribution allows the situation to be evaluated after each case is observed, making monitoring programs such as the CUSUM practicable.

This section supports the CUSUM for proportions Program Page based on the Bernoulli distribution.

Instead of calculating the proportion of a sample with positive attributes, the Bernoulli approach calculates the probabilities of obtaining an individual with a positive (1) or negative (0) attribute, based on the underlying binomial distribution of these attributes.

The use of the Bernoulli distribution therefore allows continuous CUSUM monitoring of binomially distributed proportions on a case by case basis, so that a sensitive and robust decision can be made as soon as an out of control situation occurs.

The parameters required are as follows:

The proportion when in control (p0) is the expected proportion of cases which are positive in the attribute of interest (death, Caesarean Section, complications), in other words, the benchmark. This is expressed as a number between 0 and 1 (e.g. 0.15 means 15%).

The proportion when out of control (p1) is the proportion the CUSUM is designed to detect. If p1>p0, then CUSUM is designed to detect an increase in proportion. If p1<p0, then CUSUM is designed to detect a decrease in proportion. Please note that, unlike CUSUM for means and counts, CUSUM for proportion is a one sided test, and only detects an increase or decrease at any one time.

The average number of observations to signal (ANOS) is similar in concept to the average run length (ARL) in other CUSUMs. It is the average number of observations before an alarm is signalled even though the situation is in control. This represents the sensitivity of the system.

The CUSUM produced represents the accumulation of each observation adjusted by γB, the change reference value.

We run an obstetric unit, and we expect the Caesarean Section (CS) rate to be 20% (0.2). We want our CUSUM program to trigger an alarm should the CS rate increase to 25% (0.25). From this we obtain the change reference value γB=0.2243.
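
The page does not state how γB is obtained. One common definition, the log-likelihood ratio form used by Reynolds and Stoumbos, is γB = r1/r2 with r1 = -ln((1 - p1)/(1 - p0)) and r2 = ln(p1(1 - p0)/(p0(1 - p1))). Treating that formula as an assumption, the sketch below (with a hypothetical function name) reproduces the quoted value.

```python
import math


def reference_value(p0, p1):
    """Change reference value (gamma_B) for in-control p0 and out-of-control p1.

    Assumes the log-likelihood ratio form gamma_B = r1 / r2.
    """
    r1 = -math.log((1.0 - p1) / (1.0 - p0))
    r2 = math.log((p1 * (1.0 - p0)) / (p0 * (1.0 - p1)))
    return r1 / r2


gamma_b = reference_value(0.20, 0.25)
print(round(gamma_b, 4))  # 0.2243
```

Under this definition γB always lies between p0 and p1, so positive cases push the CUSUM up and negative cases pull it down.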

We set the yellow alert at ANOS=20, the orange alert at ANOS=50, and the red alert at ANOS=100.

The program iterates the decision line until the required ANOS is reached or exceeded. The resulting values are displayed for checking purposes, but can be ignored by the user.

	Yellow	Orange	Red
ANOS	20	50	100
h	1.416	2.2797	3.1647

The first display contains the decision lines, shown above. A yellow alert will be triggered when the CUSUM is above 1.4, an orange alert when it is above 2.3, and a red alert when it is above 3.2.
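
As a small illustration of how the three decision lines might be applied, the sketch below maps a CUSUM value to an alert level using the h values from the table above. The function name and the "none" label are hypothetical.

```python
# Decision lines from the table above (h for ANOS of 100, 50 and 20),
# ordered from most to least severe.
DECISION_LINES = [("red", 3.1647), ("orange", 2.2797), ("yellow", 1.416)]


def alert_level(cusum):
    """Return the most severe alert whose decision line the CUSUM exceeds."""
    for name, h in DECISION_LINES:
        if cusum > h:
            return name
    return "none"


print(alert_level(2.5))  # orange
```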

proportion	Yellow	Orange	Red
0.21	18	44	84
0.22	17	39	72
0.23	16	35	63
0.24	15	32	55

The full table of analysis is then presented, as shown above. Each row represents a theoretical level the proportion has shifted to when the situation is out of control. Column 1 contains the out of control proportion, and the remaining columns the average number of observations to signal (ANOS) before the CUSUM reaches each decision line.

For example, if the situation is out of control and the proportion shifts from the benchmark of 20% (0.20) to 24% (last line), then it will take on average 15 observations to trigger a yellow alert. If the proportion increases only to 22% (0.22), then on average it takes 17 observations to trigger a yellow alert.
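
The meaning of ANOS can be checked informally by simulation. The sketch below is a Monte Carlo estimate, not the program's method (which the page does not describe), so its results need not match the table exactly. It assumes the conventional Bernoulli CUSUM update, adding 1 - γB for a positive case and subtracting γB for a negative case with truncation at zero; the function name, run count and seed are hypothetical.

```python
import random


def simulate_anos(p, gamma_b, h, runs=2000, seed=1):
    """Estimate the average number of observations to signal by simulation."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        cusum, n = 0.0, 0
        # Keep observing until the CUSUM rises above the decision line h.
        while cusum <= h:
            n += 1
            x = 1 if rng.random() < p else 0
            # Assumed update: add (x - gamma_b), truncate at zero.
            cusum = max(0.0, cusum + x - gamma_b)
        total += n
    return total / runs


in_control = simulate_anos(0.20, 0.2243, 1.416)
shifted = simulate_anos(0.24, 0.2243, 1.416)
print(in_control, shifted)
```

The estimate for the shifted proportion should be smaller than the in control estimate, mirroring the pattern in the table.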

The default data averaged 20% positives (1s) in the first 20 observations, and 25% in the last 40 observations. The resulting CUSUM is shown in the plot.

This plot shows the change in the CUSUM with each individual observation. When a positive case is encountered the CUSUM increases by 1 - γB, so that B_i = B_(i-1) + 1 - γB, and when a negative case is encountered the CUSUM decreases by γB, so that B_i = B_(i-1) - γB. The CUSUM is truncated at zero if it falls into the negative.
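
A minimal sketch of the case by case update follows. It assumes the conventional Bernoulli CUSUM increments, namely adding 1 - γB for a positive case and subtracting γB for a negative case, with truncation at zero. The function name and the sequence of cases are hypothetical.

```python
def bernoulli_cusum(observations, gamma_b):
    """Upper Bernoulli CUSUM path for a sequence of 0/1 observations."""
    cusum = 0.0
    path = []
    for x in observations:
        # A positive case (1) adds 1 - gamma_b; a negative case (0) subtracts gamma_b.
        cusum += x - gamma_b
        # Truncate at zero if the CUSUM falls into the negative.
        cusum = max(0.0, cusum)
        path.append(cusum)
    return path


# Hypothetical sequence of cases (1 = positive, 0 = negative).
path = bernoulli_cusum([0, 1, 0, 1, 1, 0], gamma_b=0.2243)
print([round(v, 4) for v in path])
# [0.0, 0.7757, 0.5514, 1.3271, 2.1028, 1.8785]
```

In this made-up sequence the CUSUM first rises above the yellow decision line (1.416) at the fifth case.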

Should the CUSUM be set up to detect a decrease in proportion (p1<p0), the same calculation using γB takes place, but the CUSUM remains in the negative range and is truncated at zero (0) whenever it rises above it.
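
The downward case can be sketched in the same way. The example below assumes the same log-likelihood ratio form for γB as is commonly used for an increase, and the same increment of x - γB per case, but keeps the CUSUM at or below zero. The function name, the value p1 = 0.15 and the sequence of cases are hypothetical.

```python
import math


def lower_bernoulli_cusum(observations, p0, p1):
    """Lower Bernoulli CUSUM path (p1 < p0), kept at or below zero."""
    # Assumed reference value: both logarithms are negative when p1 < p0,
    # so their ratio is positive and lies between p1 and p0.
    r1 = -math.log((1.0 - p1) / (1.0 - p0))
    r2 = math.log((p1 * (1.0 - p0)) / (p0 * (1.0 - p1)))
    gamma_b = r1 / r2
    cusum = 0.0
    path = []
    for x in observations:
        # Negative cases move the CUSUM down; truncate when it rises above zero.
        cusum = min(0.0, cusum + x - gamma_b)
        path.append(cusum)
    return gamma_b, path


gamma_b, path = lower_bernoulli_cusum([0, 0, 1, 0, 0, 0], p0=0.20, p1=0.15)
print(round(gamma_b, 4), [round(v, 4) for v in path])
```

An alarm would then correspond to the CUSUM falling below the negative of the decision line.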

Hawkins DM, Olwell DH (1997) Cumulative Sum Charts and Charting for Quality Improvement. Springer-Verlag, New York. ISBN 0-387-98365-1. p. 47-74

Reynolds MR Jr, Stoumbos ZG (1999) A CUSUM Chart for Monitoring a Proportion When Inspecting Continuously. Journal of Quality Technology 31(1): 87-108