Related links:
Poisson Test Program Page
Compare Two Counts Program Page
Sample Size for Comparing Two Counts, Explanations, Calculations, and Tables Page
Poisson Probability
Introduction
Poisson was a French mathematician who, amongst his many contributions,
proposed the Poisson distribution. A classic early illustration, usually credited
to Bortkiewicz, modelled the number of soldiers killed by kicks from horses.
The distribution is useful because it models counts of events, particularly uncommon events.
Counts of events, based on the Poisson distribution, are a frequently encountered
model in medical research. Examples are the number of falls, asthma
attacks, cells, and so on. The Poisson parameter Lambda (λ)
is the total number of events (k) divided by the number of units (n)
in the data (λ = k/n). The unit forms the basis or denominator for
calculation of the average, and need not be individual cases or research subjects.
For example, the number of asthma attacks may be based on the number of child months,
or the number of pregnancies on the number of woman years of using a particular contraceptive.
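For reference, the standard Poisson probability of observing exactly x events can be written as below. The symbols x and μ are notation introduced here and are not used elsewhere on this page: μ is the expected count, that is, the rate λ multiplied by the number of units n over which events are counted.

```latex
P(X = x) = \frac{e^{-\mu}\,\mu^{x}}{x!}, \qquad x = 0, 1, 2, \dots, \qquad \mu = \lambda n
```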
This differs from the Binomial parameter of proportion or risk, where the proportion
is the number of individuals classified as positive (p) divided by the total number
of individuals in the data (r = p/n). Proportion or risk must always be a number
between 0 and 1, while λ may be any positive number.
For example, if we have 100 people, and only 90 of
them go shopping in a week, then the binomial risk of shopping is 90/100 = 0.9.
However, some of the people will go shopping more than once in the week. If the total
number of shopping trips among the 100 people is 160, the Poisson
Lambda is 160/100 = 1.6 trips per person week.
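The distinction can be illustrated with a small sketch. The individual trip counts below are hypothetical, chosen only so that the totals match the shopping example (100 people, 90 shoppers, 160 trips); the variable names are ours.

```python
# Hypothetical individual-level data chosen to match the totals in the text:
# 100 people, 90 of whom shop at least once, 160 trips in total.
trips_per_person = [0] * 10 + [1] * 40 + [2] * 30 + [3] * 20

n_people = len(trips_per_person)
shoppers = sum(1 for t in trips_per_person if t > 0)
total_trips = sum(trips_per_person)

# Binomial view: each person is either a shopper or not (always between 0 and 1).
binomial_risk = shoppers / n_people
# Poisson view: events per unit (trips per person week), any positive number.
poisson_lambda = total_trips / n_people

print(binomial_risk, poisson_lambda)
```

The same data therefore yield two different summaries, depending on whether people or events are being counted.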
A large Lambda (λ = k/n), say over 200, has an approximately normal
distribution, and the count (or its square root) can be used as a parametric measurement. If the
events occur so few times per individual that individuals can be
classified as positive or negative cases, then the Binomial distribution can
be assumed and statistics related to proportions used. In between, or when
events are infrequent, the Poisson distribution is used.
A detailed discussion of the use of Poisson related tests is in the references
listed below. Some clarification of nomenclature may be useful.
- Counts of events (e.g. number of asthma attacks recorded) are represented by k
(k1 and k2 for the 2 groups). These counts must be expressed as the number of
events over a defined period or environment (e.g. 100 attacks in 300 children over 6 months,
or 10 cells seen in 5 microlitres of fluid).
- The denominators are represented by n (n1 and n2 for the 2 groups), e.g. 1800 child months
or 5 microlitres.
- The mean count, or count rate (k/n), is represented by
λ for Lambda (λ1 and λ2 for the 2 groups). λ and
n are used in sample size and power calculations, while k and n are used in
the test of statistical significance. For example, 100 attacks (k) in 1800 child months (n)
produces λ = 100/1800 ≈ 0.056 attacks per child month.
- Commonly, the one tail test is used for the Poisson distribution, testing whether
one group has more events than the other, rather than whether the two groups
are different without stating which one has more events.
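As a minimal sketch of the nomenclature above, the asthma figures can be turned into a denominator and a count rate as follows. The function name count_rate and its parameter names are ours, introduced only for illustration.

```python
def count_rate(events, subjects, periods_per_subject):
    """Return (n, lam): the number of units observed and the events per unit."""
    n = subjects * periods_per_subject   # e.g. children x months = child months
    lam = events / n                     # lambda = k / n
    return n, lam

# 100 attacks (k) in 300 children followed for 6 months each
n, lam = count_rate(events=100, subjects=300, periods_per_subject=6)
print(n, round(lam, 4))
```

The denominator is the product of subjects and time, which is why the rate is expressed per child month rather than per child.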
Example 1
Looking back, the number of patient complaints per month in a ward was
2, 5, 4, and 3 in the last 4 months, an average of 3.5 per month (λ). Since a
new ward manager arrived, the number of complaints last month was 6 (k). We
want to know the probability of seeing a count this high if the mean were still 3.5.
Using the Poisson Test Program Page, we found that
the probability of 6 or more complaints when λ = 3.5 is 0.14. If we use
p < 0.05 as the decision criterion, then we cannot confidently conclude that 6 is greater than
the mean of 3.5.
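The tail probability in this example can be checked with a short sketch using only the Python standard library. This is not the code behind the Poisson Test Program Page; the function names are ours.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events when `mean` events are expected."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def poisson_upper_tail(k, mean):
    """Probability of k or more events, computed as 1 - P(X <= k - 1)."""
    return 1.0 - sum(poisson_pmf(i, mean) for i in range(k))

# 6 or more complaints in a month when 3.5 are expected
p = poisson_upper_tail(6, 3.5)
print(round(p, 2))
```

The result agrees with the 0.14 quoted above when rounded to two decimal places.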
Example 2
In the same ward as in Example 1,
the total number of complaints over 6 months was 30 (an average of 5 per month).
We now want to know whether this should be considered higher than the previous average of 3.5.
If the mean (λ) is 3.5 per month, then over 6 months the number of complaints expected is
6 x 3.5 = 21. The probability of having 30 or more complaints when the expected number is
21 is 0.0374. If we use p < 0.05 as the decision criterion, then we can
conclude that 30 complaints over 6 months is significantly greater than expected from an average of 3.5 per month.
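A sketch of the same calculation, assuming only the Python standard library, shows the one extra step in this example: the monthly rate is first scaled to the length of the observation period. The names used are ours.

```python
import math

def poisson_upper_tail(k, mean):
    """P(X >= k) for a Poisson count with the given expected value."""
    lower = sum(math.exp(-mean + i * math.log(mean) - math.lgamma(i + 1))
                for i in range(k))          # P(X <= k - 1)
    return 1.0 - lower

monthly_rate = 3.5
months = 6
expected = monthly_rate * months            # complaints expected over the period

# 30 or more complaints over 6 months when 21 are expected
p = poisson_upper_tail(30, expected)
print(expected, round(p, 3))
```

The observed total must always be compared with the expected count for the same period, not with the monthly rate itself.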
Example 3
We wish to establish some standards for detecting
blood in the urine by microscopic examination. If the urine is normal, we
expect, on average, 3 or fewer red blood cells per high power field (hpf) under the microscope.
We wish to establish the maximum total red cell count over 5 high power fields
acceptable for normality.
If the average (λ) is 3/hpf, then the expected total
count over 5 fields is 15. The probability and cumulative probabilities of
observing counts from 0 to 25 are shown in the following table.
Expected count | Observed count | Probability of this count | Cumulative probability (this count or fewer) | Cumulative probability (this count or more) |
15 | 0 | <0.0001 | <0.0001 | >0.9999 |
15 | 1 | <0.0001 | <0.0001 | >0.9999 |
15 | 2 | <0.0001 | <0.0001 | >0.9999 |
15 | 3 | 0.0002 | 0.0002 | >0.9999 |
15 | 4 | 0.0006 | 0.0009 | 0.9998 |
15 | 5 | 0.0019 | 0.0028 | 0.9991 |
15 | 6 | 0.0048 | 0.0076 | 0.9972 |
15 | 7 | 0.0104 | 0.018 | 0.9924 |
15 | 8 | 0.0194 | 0.0374 | 0.982 |
15 | 9 | 0.0324 | 0.0699 | 0.9626 |
15 | 10 | 0.0486 | 0.1185 | 0.9301 |
15 | 11 | 0.0663 | 0.1848 | 0.8815 |
15 | 12 | 0.0829 | 0.2676 | 0.8152 |
15 | 13 | 0.0956 | 0.3632 | 0.7324 |
15 | 14 | 0.1024 | 0.4657 | 0.6368 |
15 | 15 | 0.1024 | 0.5681 | 0.5343 |
15 | 16 | 0.096 | 0.6641 | 0.4319 |
15 | 17 | 0.0847 | 0.7489 | 0.3359 |
15 | 18 | 0.0706 | 0.8195 | 0.2511 |
15 | 19 | 0.0557 | 0.8752 | 0.1805 |
15 | 20 | 0.0418 | 0.917 | 0.1248 |
15 | 21 | 0.0299 | 0.9469 | 0.083 |
15 | 22 | 0.0204 | 0.9673 | 0.0531 |
15 | 23 | 0.0133 | 0.9805 | 0.0327 |
15 | 24 | 0.0083 | 0.9888 | 0.0195 |
15 | 25 | 0.005 | 0.9938 | 0.0112 |
From column 5, we can see that the probability of observing
22 or more cells is 0.0531, and of 23 or more cells 0.0327. If we treat p < 0.05
as unlikely, we can conclude that we are unlikely to see
a total of 23 or more cells in 5 high power fields if we expect an average of 3 per high
power field. We can therefore use a total of 23 or more red blood cells in
5 high power fields as the diagnostic criterion for
detecting blood in the urine.
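The search for the cut-off can also be sketched directly, without reading it from a table. The following uses only the Python standard library; the function name smallest_unlikely_count is ours, and the 0.05 threshold is the one used in this example.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events when `mean` events are expected."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def smallest_unlikely_count(mean, alpha=0.05):
    """Smallest count c for which P(X >= c) falls below alpha."""
    cumulative = 0.0          # holds P(X <= c - 1) at the top of each loop
    c = 0
    while 1.0 - cumulative >= alpha:
        cumulative += poisson_pmf(c, mean)
        c += 1
    return c

per_field = 3
fields = 5
cutoff = smallest_unlikely_count(per_field * fields)
print(cutoff)
```

Changing the number of fields or the threshold gives the corresponding cut-off for other counting protocols.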
Comparing Two Counts
Introduction
There are 3 approaches to comparing two counts, and these are best considered in historical perspective.
- The Poisson test comparing two counts was initially described by Przyborowski and Wilenski
(see references), and is known as the Conditional Test (the C Test). The test is based on the
null hypothesis that the ratio of the two count rates (λ2 / λ1) is equal to
1. Most statistical text books describe the C Test, so this test can be considered the standard. The C Test is, however,
the least powerful of the 3, and requires a larger sample size to reject the null hypothesis than the others.
- Krishnamoorthy and Thomson (see references) proposed an improvement on the C Test (the E Test),
where the null hypothesis is that the difference between the two count rates
(λ2 - λ1) is equal to 0. Although computation for this
test is more complex, it is both more robust and more powerful, so the
sample size required is smaller than that of the C Test.
- Whitehead (see references), in his text book on unpaired sequential analysis, provided
algorithms to determine sample sizes for non-sequential methods, and also described a method for comparing
two counts. This test transforms the difference into a normally distributed mean
and tests that mean against the null hypothesis of 0. It is the most powerful of the 3, requiring the smallest
sample size to reject the null hypothesis. Its disadvantage is its dependence on the transformation,
which becomes less appropriate as the sample size decreases.
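The conditioning idea behind the C Test can be written compactly as below. This is the usual textbook statement expressed in the notation of this page, not a quotation from the reference; K1 denotes the count in group 1 treated as a random variable, and k is the combined count of the two groups.

```latex
H_0:\ \lambda_2 / \lambda_1 = 1
\quad\Longrightarrow\quad
K_1 \mid (K_1 + K_2 = k) \;\sim\; \mathrm{Binomial}\!\left(k,\ \frac{n_1}{n_1 + n_2}\right)
```

With equal denominators (n1 = n2), the comparison therefore reduces to asking whether the events are split evenly between the two groups.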
Example
Sample Size
We wish to compare two methods of contraception. From past records we know that
the pregnancy rate for the intrauterine device is 0.6 per 100 woman years
(λ1 = 0.006), and we think that halving that (λ2 = 0.003) with the
new contraceptive pill would be a meaningful improvement.
We are only interested in whether the pill is better, so a one tail test
is planned. We decided to use α = 0.05 (the same as 0.1 for the two tail test), a power of 0.8, λ1 = 0.006,
and λ2 = 0.003.
Looking up the tables in the Sample Size for Comparing Two Counts, Explanations, Calculations, and Tables Page,
we estimate that the sample size is 5719 woman years for each group (Whitehead), 6545 (C Test), and 6209 (E Test).
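As a rough check on the order of magnitude, the sketch below uses a normal approximation on the log of the rate ratio. The formula is our assumption and is not taken from the sample size page or from the references; with the values of this example it gives a figure close to the one quoted for Whitehead, but it should not be read as the algorithm behind the tables.

```python
import math
from statistics import NormalDist

def sample_size_log_rate_ratio(lam1, lam2, alpha=0.05, power=0.8):
    """Units per group for a one tail comparison of two count rates,
    using an assumed normal approximation on the log rate ratio."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # one tail critical value
    z_beta = NormalDist().inv_cdf(power)
    theta = math.log(lam1 / lam2)               # log rate ratio to detect
    mean_rate = (lam1 + lam2) / 2
    # Information required is ((z_alpha + z_beta) / theta) ** 2, and two groups
    # of n units each are assumed to supply information n * mean_rate / 2.
    return 2 * (z_alpha + z_beta) ** 2 / (mean_rate * theta ** 2)

n_per_group = sample_size_log_rate_ratio(0.006, 0.003)
print(round(n_per_group))
```

Because the required size is inversely proportional to the mean rate, rare events such as pregnancies under contraception demand thousands of woman years of follow-up.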
Comparison
We recruited women into the two groups for our study. At the end of the study,
we had 17 pregnancies in 5000 woman years of using the pill (0.0034 per woman year),
and 32 pregnancies in 5000 woman years of using the intrauterine device (0.0064 per woman year).
Using the program in the Compare Two Counts Program Page,
we can establish that this difference is significant at p = 0.016 (1 tail Whitehead), 0.02 (1 tail C Test),
and < 0.001 (1 tail E Test).
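Two of these figures can be approximated with a short standard-library sketch. The first function is the conditional (binomial) form of the C Test; the second is a normal approximation score test that we assume is close to the Whitehead method because it gives a similar p value for these data. Neither is the code of the Compare Two Counts Program Page, the E Test is not attempted, and all function names are ours.

```python
import math
from statistics import NormalDist

def c_test_one_tail(k1, n1, k2, n2):
    """Conditional test: given the total count, the count in group 1 is
    Binomial(k1 + k2, n1 / (n1 + n2)) when the two rates are equal.
    Returns P(count in group 1 <= k1), i.e. group 1 has fewer events."""
    total = k1 + k2
    p = n1 / (n1 + n2)
    return sum(math.comb(total, i) * p ** i * (1 - p) ** (total - i)
               for i in range(k1 + 1))

def score_test_one_tail(k1, n1, k2, n2):
    """Assumed normal approximation: one tail p for the rate in group 2
    exceeding the rate in group 1."""
    z_stat = (n1 * k2 - n2 * k1) / (n1 + n2)
    variance = n1 * n2 * (k1 + k2) / (n1 + n2) ** 2
    return 1 - NormalDist().cdf(z_stat / math.sqrt(variance))

# Group 1: pill, 17 pregnancies in 5000 woman years
# Group 2: intrauterine device, 32 pregnancies in 5000 woman years
p_c = c_test_one_tail(17, 5000, 32, 5000)
p_score = score_test_one_tail(17, 5000, 32, 5000)
print(round(p_c, 2), round(p_score, 3))
```

With equal denominators the score statistic simplifies to (k2 - k1) / sqrt(k1 + k2), which makes the calculation easy to verify by hand.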
Power Analysis
Again using the program in the Sample Size for Comparing Two Counts, Explanations, Calculations, and Tables Page,
we can establish that the power of the data obtained is 0.8312 (α = 0.05, Whitehead, 1 tail),
0.7742 (α = 0.05, C Test, 1 tail), and 0.8057 (α = 0.05, E Test, 1 tail).
Please note: Studies of contraception often take a few years, as the
pregnancy rate is low. In some studies, a comparison of the proportion of women
who got pregnant is used, based on the Binomial distribution.
The problems with the Binomial distribution in such studies, however, are that
firstly the pregnancy rate is close to 0, where the binomial distribution has greater variability.
Secondly, some women do not get pregnant at all, while others may become pregnant
more than once during the study.
The model based on the Poisson distribution is therefore more appropriate for
this sort of situation.
References
Poisson Probability
Steel RGD, Torrie JH, Dickey DA (1997) Principles and Procedures of Statistics.
A Biometrical Approach. The McGraw-Hill Companies, Inc, New York. p. 558
The C Test
The C Test was the original test used to compare two counts, and is quoted by most journals and text books. It
is based on the ratio of the two λs under comparison.
Przyborowski J and Wilenski H (1940) Homogeneity of results in testing samples
from Poisson series. Biometrika 31:313-323.
The E Test
The E Test is based on a difference between the two λs under comparison, and is marginally more
powerful, but requires extensive computing power because it requires mathematical iteration.
Krishnamoorthy, K and Thomson, J. (2004). A more powerful test for comparing
two Poisson means. Journal of Statistical Planning and Inference, 119, 249-267.
Program adapted from FORTRAN program by same author, downloaded from
www.ucs.louisiana.edu/~kxk4695/
Whitehead John (1992). The Design and Analysis of Sequential Clinical Trials
(Revised 2nd Edition). John Wiley & Sons Ltd., Chichester, ISBN 0 47197550 8. p. 48-50