Poisson Exp

Poisson was a French mathematician who, amongst his many contributions, proposed the Poisson distribution. A classic illustration of it, published later by Bortkiewicz, is the number of soldiers accidentally killed by kicks from horses. The distribution became useful because it models counts of events, particularly uncommon events.

Counts of events, modelled by the Poisson distribution, are frequently encountered in medical research. Examples are the number of falls, the number of asthma attacks, the number of cells, and so on. The Poisson parameter Lambda (λ) is the total number of events (k) divided by the number of units (n) in the data (λ = k/n). The unit forms the denominator for calculating the average, and need not be individual cases or research subjects. For example, the number of asthma attacks may be expressed per child-month, or the number of pregnancies per woman-year of using a particular contraceptive.

This is different to the Binomial parameter of proportion or risk where proportion is the number of individuals classified as positive (p) divided by the total number of individuals in the data (r = p/n). Proportion or risk must always be a number between 0 and 1, while λ may be any positive number.

For example, if we have 100 people, and only 90 of them go shopping in a week, then the binomial risk of shopping is 90/100 = 0.9. However, some of the people will go shopping more than once in the week, so the total number of shopping trips among the 100 people may be 160, and the Poisson Lambda is 160/100 = 1.6 trips per person-week.
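The distinction between the two parameters can be sketched in a few lines of Python. The variable names are illustrative only; the numbers are those of the shopping example.

```python
# Shopping example: 100 people, each observed for one week.
people = 100     # n: the denominator (person-weeks, as each person is followed for 1 week)
shoppers = 90    # people who went shopping at least once
trips = 160      # k: total number of shopping trips (events)

binomial_risk = shoppers / people   # proportion of people, always between 0 and 1
poisson_rate = trips / people       # lambda: events per person-week, may exceed 1

print(binomial_risk)   # 0.9
print(poisson_rate)    # 1.6
```

The risk counts each person at most once, while the rate counts every event, which is why only the rate can exceed 1.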

When the expected count is large, say over 200, the Poisson distribution is approximately normal, and the count (or sqrt(count)) can be used as a parametric measurement. If the events occur very few times per individual, so that individuals can be classified as positive or negative cases, then the Binomial distribution can be assumed and statistics related to proportions used. In between, or when events are infrequent, the Poisson distribution is used.
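As a rough illustration of why large counts can be treated as normally distributed, the sketch below compares an exact Poisson upper-tail probability with a normal approximation (mean λ, variance λ, with a continuity correction). The function names and the two test cases are assumptions chosen for illustration and do not come from the reference.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson mean lam, computed in log space to avoid overflow."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def poisson_upper_tail(k, lam):
    """Exact P(X >= k), as 1 minus the sum of P(X = i) for i < k."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))

def normal_upper_tail(k, lam):
    """Normal approximation to P(X >= k), using mean lam, variance lam
    and a continuity correction of 0.5."""
    z = (k - 0.5 - lam) / math.sqrt(lam)
    return 0.5 * math.erfc(z / math.sqrt(2))

# Illustrative cases only: a small mean and a large mean.
for lam, k in [(3.5, 9), (200, 220)]:
    exact = poisson_upper_tail(k, lam)
    approx = normal_upper_tail(k, lam)
    print(f"lambda={lam} k={k} exact={exact:.4f} normal={approx:.4f}")
```

With the small mean, the approximation tends to understate the tail because the Poisson distribution is skewed. With the large mean, the two values should be close.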

A detailed discussion of the use of Poisson related tests is in the reference listed below. Some clarification of nomenclature may be useful.

Counts of events (e.g. the number of asthma attacks recorded) are represented by k (k1 and k2 for the 2 groups). These counts must be stated as the number of events over a defined period or environment (e.g. 100 attacks in 300 children over 6 months, or 10 cells seen in 5 microlitres of fluid).
The denominators are represented by n (n1 and n2 for the 2 groups), e.g. 1800 child-months or 5 microlitres.
The mean count, or count rate (k/n), is represented by λ for Lambda (λ1 and λ2 for the 2 groups). λ and n are used in sample size and power calculations, while k and n are used in the test of statistical significance. For example, 100 attacks (k) in 1800 child-months (n) gives λ = 100/1800 = 0.056 attacks per child-month.
Commonly, the one-tail test is used for the Poisson distribution, testing whether one group has more events than the other, rather than whether the two groups differ without stating which one has more events.
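The asthma example can be worked through as follows, making the denominator explicit. This is a minimal sketch with illustrative variable names.

```python
children = 300
months = 6
attacks = 100                      # k: events counted

child_months = children * months   # n: the denominator, in child-months
rate = attacks / child_months      # lambda: attacks per child-month

print(child_months)      # 1800
print(round(rate, 4))    # 0.0556
```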

Example 1. Looking back, the number of patient complaints per month in a ward was 2, 5, 4 and 3 in the last 4 months, an average of 3.5 per month (λ). Since a new ward manager arrived, the number of complaints last month was 6 (k). We want to know how probable a count this high would be if the underlying rate were still 3.5 per month. Using the Poisson Test Program Page, we found that the probability of 6 or more complaints when λ = 3.5 is 0.14. If we use p < 0.05 as the decision criterion, then we cannot confidently conclude that 6 is greater than the mean of 3.5.
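The probability quoted above can be checked with a few lines of Python. This is a minimal sketch using only the standard library, not the code behind the Poisson Test Program Page, and the function names are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_upper_tail(k, lam):
    """One-tail probability P(X >= k) = 1 - P(X <= k - 1)."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))

monthly_counts = [2, 5, 4, 3]                     # complaints in the last 4 months
lam = sum(monthly_counts) / len(monthly_counts)   # 3.5 per month
p = poisson_upper_tail(6, lam)                    # probability of 6 or more complaints

print(lam, round(p, 2))   # 3.5 0.14
```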

Example 2. In the same ward as in example 1, the total number of complaints over 6 months was 30 (an average of 5 per month). We now want to know whether this should be considered higher than the previous average of 3.5.

If the mean (λ) is 3.5 per month, then the number of complaints expected over 6 months is 6 × 3.5 = 21. The probability of having 30 or more complaints when 21 are expected is 0.0374. If we use p < 0.05 as the decision criterion, then we can conclude that 30 complaints over 6 months is significantly more than the 21 expected at an average of 3.5 per month.
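The same calculation scales to the longer observation period by multiplying the monthly rate by the number of months. The sketch below, with hypothetical function names, should give a value close to the 0.0374 quoted above.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k), computed in log space so that larger counts do not overflow."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def poisson_upper_tail(k, lam):
    """One-tail probability P(X >= k) = 1 - P(X <= k - 1)."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))

rate_per_month = 3.5
months = 6
expected = rate_per_month * months   # 21 complaints expected over 6 months
observed = 30

p = poisson_upper_tail(observed, expected)
print(expected, round(p, 4))
print("significant at 0.05" if p < 0.05 else "not significant at 0.05")
```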

Example 3. We wish to establish a standard for detecting blood in the urine by microscopic examination. If the urine is normal, we expect, on average, 3 or fewer red blood cells per high power field under the microscope. We wish to establish the maximum total red cell count over 5 high power fields that is acceptable as normal.

If the average (λ) is 3/hpf, then the expected total count over 5 fields is 15. The probability and cumulative probability of observing counts from 0 to 25 are shown in the following table.

Expected	Observed	Probability	Cumulative probability <=	Cumulative probability >=
15	0	<0.0001	<0.0001	>0.9999
15	1	<0.0001	<0.0001	>0.9999
15	2	<0.0001	<0.0001	>0.9999
15	3	0.0002	0.0002	>0.9999
15	4	0.0006	0.0009	0.9998
15	5	0.0019	0.0028	0.9991
15	6	0.0048	0.0076	0.9972
15	7	0.0104	0.018	0.9924
15	8	0.0194	0.0374	0.982
15	9	0.0324	0.0699	0.9626
15	10	0.0486	0.1185	0.9301
15	11	0.0663	0.1848	0.8815
15	12	0.0829	0.2676	0.8152
15	13	0.0956	0.3632	0.7324
15	14	0.1024	0.4657	0.6368
15	15	0.1024	0.5681	0.5343
15	16	0.096	0.6641	0.4319
15	17	0.0847	0.7489	0.3359
15	18	0.0706	0.8195	0.2511
15	19	0.0557	0.8752	0.1805
15	20	0.0418	0.917	0.1248
15	21	0.0299	0.9469	0.083
15	22	0.0204	0.9673	0.0531
15	23	0.0133	0.9805	0.0327
15	24	0.0083	0.9888	0.0195
15	25	0.005	0.9938	0.0112

From column 5, we can see that the probability of observing 22 or more cells is 0.0531, and of observing 23 or more cells is 0.0327. If we use p < 0.05 as unlikely, we can conclude that we are unlikely to see a total of 23 or more cells in 5 high power fields if we expect an average of 3 per high power field. We can therefore use a total of 23 or more red blood cells in 5 high power fields as the diagnostic criterion for detecting blood in the urine.
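The table and the cut-off can be reproduced with a short script. This is a sketch under stated assumptions: the function names and the upper search limit are hypothetical, and the printed values are plain 4-decimal numbers rather than the "<0.0001" style used in the table.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam (log space for stability)."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def poisson_rows(lam, max_count):
    """Yield (k, P(X = k), P(X <= k), P(X >= k)) for k = 0..max_count."""
    below = 0.0                        # running P(X <= k - 1)
    for k in range(max_count + 1):
        p = poisson_pmf(k, lam)
        yield k, p, below + p, 1.0 - below
        below += p

def smallest_unlikely_count(lam, alpha=0.05):
    """Smallest k for which P(X >= k) < alpha."""
    search_limit = 10 * int(lam) + 50  # arbitrary, generous search bound (assumption)
    for k, _, _, upper in poisson_rows(lam, search_limit):
        if upper < alpha:
            return k
    raise ValueError("no threshold found in the range searched")

expected_total = 3 * 5   # 3 cells per field over 5 fields
for k, p, lower, upper in poisson_rows(expected_total, 25):
    print(f"{expected_total}\t{k}\t{p:.4f}\t{lower:.4f}\t{upper:.4f}")

print(smallest_unlikely_count(expected_total))   # 23
```

Changing `alpha` or `expected_total` gives the corresponding cut-off for a different decision criterion or a different number of fields.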

Poisson Probability

The C Test

The E Test