Related links:
Sequential Analysis Introduction and Explained Page
Sequential Probability Ratio Tests for Means Program Page
Sequential Probability Ratio Tests for Proportions Program Page
Introduction
SPRT for Means
SPRT for Proportions
References
General discussions of Sequential Analysis are covered in the Sequential Analysis Introduction and Explained Page
and quality statistics in the Quality Statistics Explained Page
and will not be repeated here.
This page discusses the Sequential Probability Ratio Test (SPRT), one of the
earliest sequential methods. It was developed by Abraham Wald in the 1940s, building on
the likelihood ratio theory that Neyman and Pearson developed in the 1930s. The method was initially developed for
quality control, and it forms the basis of many subsequent and more sophisticated
developments in sequential and quality control methodologies.
The model aims to determine the quality of a batch of products with minimal sampling.
The idea is to sample the batch sequentially until a decision can be made on whether
the batch conforms to specification and can be accepted, or should be rejected.
These pages offer the two major models, for proportions and for means.
Common Terms and Abbreviations
- α : also written as alpha or p, is the probability of Type I Error. Commonly, p<0.05
or p<0.01 is used as the criterion for rejecting the null hypothesis.
- β : is the probability of Type II Error. Commonly, β<0.2 is used at the planning stage to
determine sample size or to calculate borders for sequential analysis.
- Power : is 1 - β, a concept intuitively easier to understand, and represents the ability to detect
a difference if it is really there. A power of 0.8 (80%) is usually used, as this is the same as β=0.2.
- Value to accept null hypothesis : is the lower value (proportion or mean) below which the decision to
accept the null hypothesis (not significantly different from zero) can be made.
- Value to reject null hypothesis : is the higher value (proportion or mean) above which the decision to
reject the null hypothesis (significantly different from zero) can be made.
- k : was used by Wald to represent the effect size, and is equivalent to θ which is now more commonly used.
- Decision borders : are two parallel lines drawn on the sequential chart. Data are plotted as they are
sampled. Sampling continues while the plot coordinates are between the two decision lines. Sampling stops when the
plot coordinates are outside the two lines. If the plot is above the rejection line, the null hypothesis is rejected.
If it is below the acceptance line, the null hypothesis is accepted.
- Truncation : is the maximum number of samples. Sampling stops at this point even if the
coordinates still fall between the two lines. The null hypothesis is usually then accepted. In some
algorithms (such as that on these pages), lines are drawn between the two borders to a midpoint at truncation,
and the decision to reject or accept the null hypothesis is made according to which line the last data point crosses.
Input / Output
Example
Data Input : In addition to α and power that are common inputs to all models, the following inputs are used
- Standard Deviation is the expected Standard Deviation of the samples to be taken. In quality control
this value is usually known from past history
- Mean (accept null hypothesis) mean0: is the mean value below which a decision that
the mean is null (zero) can be made. In quality control, to detect defects in production, this is the mean
of departure from expected measurements that can be accepted as normal variations.
- Mean (reject null hypothesis) mean1: is the mean value above which a decision that
the mean is not null (zero) can be made. In quality control, to detect defects in production,
this is the mean value of departure from expected values which will trigger an alert that something is amiss.
Results Output :
- The decision borders : are calculated using α, β, the standard deviation, and the two means (see references)
Data, x = (5 - wt), for 15 eggs in order of measurement: 1, -1, 1, 0, 1, 0, 1, 2, 0, -1, 1, 1, -1, 1, 0
We are the buying department of a grocery chain, and we purchase eggs from farmers.
We expect the standard deviation of the weight of eggs to be 1g, and we stipulate that
the mean weight of the eggs we purchase must not be more than 1 SD (1g) below the average of
5g (x = 5 - wt of egg). For every batch of eggs delivered, we wish to measure
the eggs until we are satisfied whether this benchmark is met.
Rejection Line is y = 2.7726 + 0.5 n
Acceptance Line is y = -1.5581 + 0.5 n
Expected sample size n = 5
Truncation sample size = 15
We set the parameters as alpha (α) = 0.05, power = 0.8, expected standard
deviation = 1g, benchmark for rejection of the null hypothesis (eggs too small)
1 (mean of x = 1, that is 1g less than the mean of 5g), and benchmark for accepting the
null hypothesis (eggs not too small) 0 (mean of x = 0, that is not less than 5g).
The borders are as shown above and to the right. The border for rejection of the null
hypothesis is y = 2.7726 + 0.5 n, and for acceptance of the null hypothesis y = -1.5581 + 0.5 n,
where n = number sampled, and y = sum of x (x = 5 - wt of egg). If the borders are not
crossed after 15 measurements, the null hypothesis is accepted (eggs not too small).
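The two border lines quoted above can be reproduced with Wald's formulas for a normal mean with known standard deviation. The sketch below is illustrative only: the function name `sprt_mean_borders` and its argument names are hypothetical, and it assumes the program on these pages uses the standard Wald thresholds log((1-β)/α) and log(β/(1-α)), which do match the numbers in this example.

```python
import math

def sprt_mean_borders(alpha, power, sd, mean0, mean1):
    """Return (reject_intercept, accept_intercept, slope) for the lines
    y = intercept + slope * n, where y is the cumulative sum of x."""
    beta = 1.0 - power
    # Scale factor converting log likelihood ratio thresholds to the scale of sum(x)
    scale = sd ** 2 / (mean1 - mean0)
    reject_intercept = scale * math.log((1.0 - beta) / alpha)
    accept_intercept = scale * math.log(beta / (1.0 - alpha))
    # Both lines share a slope midway between the two benchmark means
    slope = (mean0 + mean1) / 2.0
    return reject_intercept, accept_intercept, slope

# Egg example: alpha = 0.05, power = 0.8, SD = 1, mean0 = 0, mean1 = 1
reject_a, accept_a, slope = sprt_mean_borders(0.05, 0.8, 1.0, 0.0, 1.0)
print(f"Rejection line:  y = {reject_a:.4f} + {slope:.1f} n")
print(f"Acceptance line: y = {accept_a:.4f} + {slope:.1f} n")
```

With the egg example's inputs this gives intercepts of 2.7726 and -1.5581 and a slope of 0.5, in agreement with the borders stated on this page.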
The plot remained between the rejection and acceptance lines until the maximum
number of samples was reached at 15 measurements. At that stage the decision
was made to stop further measurements and accept the null hypothesis (eggs not too small).
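The sequence of decisions in this example can be traced with a short loop. This is a minimal sketch, not the program used on these pages: the function name `sprt_walk` is hypothetical, the borders are the rounded values quoted in the example, and at truncation it applies the simple rule stated in the example (accept the null hypothesis if no border has been crossed) rather than the midpoint rule mentioned under Truncation.

```python
def sprt_walk(xs, reject_a, accept_a, slope, max_n):
    """Accumulate x and compare the running sum with both borders after each sample."""
    total = 0.0
    for n, x in enumerate(xs, start=1):
        total += x
        if total >= reject_a + slope * n:
            return n, "reject null hypothesis"
        if total <= accept_a + slope * n:
            return n, "accept null hypothesis"
        if n >= max_n:
            # Assumed truncation rule: accept the null hypothesis by default
            return n, "truncated: accept null hypothesis"
    return len(xs), "continue sampling"

# x = 5 - wt of egg, for the 15 eggs in the example
x = [1, -1, 1, 0, 1, 0, 1, 2, 0, -1, 1, 1, -1, 1, 0]
n_stop, decision = sprt_walk(x, 2.7726, -1.5581, 0.5, max_n=15)
print(n_stop, decision)
```

For the example data, the running sum stays between the two lines at every step, so the loop ends at the 15th egg with the truncation decision.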
Input / Output
Example
Data Input : In addition to α and power that are common inputs to all models, the following inputs are used
- Proportion (accept null hypothesis) p0: is the proportion of positives below which a decision that
the proportion is null (zero) can be made. In quality control, to detect defective items in a batch of
products, this is the proportion of defectives below which the batch can be accepted by the client.
- Proportion (reject null hypothesis) p1: is the proportion of positives above which a decision that
the proportion is not null (zero) can be made. In quality control, to detect defective items in a batch of
products, this is the proportion of defectives above which the batch will not be accepted by the client.
Results Output :
- k : or θ, is the effect size expressed as a log odds ratio (natural logarithm), where k = log((p1*(1-p0)) / (p0*(1-p1)))
- The decision borders : are calculated using α, β, and k (see references)
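As an illustration of how k and the borders relate, the sketch below applies Wald's formulas for a binomial proportion. The function name `sprt_proportion_borders` is hypothetical, and the code assumes the program on these pages divides the standard Wald thresholds by k and plots the cumulative number of positives against n. Under that assumption it reproduces the figures quoted in the driving test example on this page (p0 = 0.4, p1 = 0.8).

```python
import math

def sprt_proportion_borders(alpha, power, p0, p1):
    """Return (k, reject_intercept, accept_intercept, slope) for the lines
    y = intercept + slope * n, where y is the cumulative number of positives."""
    beta = 1.0 - power
    # Effect size: natural log of the odds ratio between p1 and p0
    k = math.log((p1 * (1.0 - p0)) / (p0 * (1.0 - p1)))
    reject_intercept = math.log((1.0 - beta) / alpha) / k
    accept_intercept = math.log(beta / (1.0 - alpha)) / k
    slope = math.log((1.0 - p0) / (1.0 - p1)) / k
    return k, reject_intercept, accept_intercept, slope

k, reject_a, accept_a, slope = sprt_proportion_borders(0.05, 0.8, 0.4, 0.8)
print(f"Effect size k = {k:.2f}")
print(f"Rejection border:  y = {reject_a:.4f} + {slope:.4f} n")
print(f"Acceptance border: y = {accept_a:.4f} + {slope:.4f} n")
```

The result is k = 1.79, a rejection intercept of 1.5474, an acceptance intercept of -0.8696, and a common slope of 0.6131.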
The transport department noticed that fewer than half of teenagers pass their
driving test at their first attempt, a distressing and wasteful situation. A
suggestion was made that a training course be offered before the test, but this
requires organisation and resources, so there is a need to pre-test the course.
Data (1 = passed, 0 = failed), in order of candidates: 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1
It was agreed that the course should result in 80% (0.8) of the candidates
passing the driving test at the first attempt, and that it should be considered
useless if the pass rate remained at the current 40% (0.4). We also set our
statistical parameters so that alpha = 0.05 and power = 0.8.
Effect size = 1.79
Rejection border y = 1.5474 + 0.6131 n
Acceptance border y = -0.8696 + 0.6131 n
Max n = 18
The borders are shown to the right. The border to decide rejection of the
null hypothesis (concluding that the pass rate is
80% or better) is y = 1.5474 + 0.6131 n, and to accept the null hypothesis
(concluding that the pass rate remains at the current 40%) y = -0.8696 + 0.6131 n, n being
the number of candidates reviewed, and y the cumulative number of passes.
The data are shown above and to the left, with the value 1 for those who passed the
test and 0 for those who failed (the first, sixth, and ninth candidates).
The plot is shown to the right. By the 12th candidate, the border for
rejecting the null hypothesis has been reached. The study can at this stage
be terminated and the conclusion drawn that the pass rate is 80% or better.
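The stopping point can be checked by accumulating the passes and comparing the running count with both borders. The sketch below is illustrative only: the function name `first_crossing` is hypothetical, and the borders are the rounded values quoted in this example.

```python
def first_crossing(outcomes, reject_a, accept_a, slope):
    """Return (n, decision) at the first border crossing, or None if no border is crossed."""
    cumulative = 0
    for n, outcome in enumerate(outcomes, start=1):
        cumulative += outcome
        if cumulative >= reject_a + slope * n:
            return n, "reject null hypothesis"
        if cumulative <= accept_a + slope * n:
            return n, "accept null hypothesis"
    return None

# 1 = passed, 0 = failed, in order of candidates
passes = [0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1]
result = first_crossing(passes, 1.5474, -0.8696, 0.6131)
print(result)
```

At the 12th candidate the cumulative number of passes is 9, while the rejection border is 1.5474 + 0.6131 x 12 = 8.9046, so the first crossing occurs there.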
Wald A (1947) Sequential Analysis. John Wiley and Sons, Inc., New York.
This book contains the original descriptions but is now out of print, although it can be found in most university
libraries, or obtained (as I did) through interlibrary loan.
https://en.wikipedia.org/wiki/Sequential_probability_ratio_test is a complete and excellent explanation of
Wald's SPRT, including the formulae used.