This page provides a description and general comments on programs that compare proportions in unpaired groups.
StatTools provides two groups of such comparisons: the commonly used comparison of two proportions, and the more esoteric but powerful comparisons of multiple proportions. Sample size calculations are discussed in the Sample Size for Two Proportions Explanations and Tables Page and the Sample Size for Matched Paired Controlled Studies Explained and Tables Page, and will not be repeated in this page.
Chi Square
Fisher's Exact Probability
Risk Difference
Risk Ratio
Odds Ratio
Chi Square for the 2x2 contingency table, from the Unpaired Comparison of Two Proportions Program Page, tests the probability that the numbers of positive and negative cases in the two groups come from a similar population; it is a goodness of fit test. Until tests using confidence intervals became popular in the 1990s, this was the principal test used to compare two proportions.
For Chi Square to be used, the total number in the study should exceed 30 cases, and each cell should have at least 5 cases. Short of these numbers, the assumptions of the Chi Square distribution cannot be assured and there is a possibility of misinterpretation.

An example: We wish to study the difference in the preferences for specialty training between the male and female interns in a hospital. We found a group of 16 male interns, 10 of whom chose surgical specialties. We also found a group of 21 female interns, 5 of whom chose surgical specialties. The Chi Square is 4.15 and α=0.04. From this we can conclude that male and female interns differ significantly (at the p<0.05 decision level) in choosing surgical specialties for training.
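As an illustration (not the StatTools code itself), the calculation can be reproduced with a general purpose statistical library. The sketch below, in Python with scipy, arranges the intern data as a 2x2 table and applies the Chi Square test with Yates' continuity correction, which should give approximately the 4.15 quoted above.

```python
# Chi Square test for a 2x2 table (Yates' continuity correction),
# using the intern example: 10/16 males and 5/21 females chose surgery.
from scipy.stats import chi2_contingency

table = [[10, 6],   # male interns: surgical, non-surgical
         [5, 16]]   # female interns: surgical, non-surgical

chi2, p, dof, expected = chi2_contingency(table, correction=True)
print(f"Chi Square = {chi2:.2f}, df = {dof}, p = {p:.3f}")

# Expected counts should all exceed 5 for the test to be reliable.
print("smallest expected cell count =", expected.min().round(1))
```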
Fisher's Exact Probability, from the Unpaired Comparison of Two Proportions Program Page, estimates the probability of observing a difference in proportions between the two groups at least as extreme as that found, under the null hypothesis that the two groups are the same. The calculation provided estimates the probability for the two tail test.
As the probability is obtained by enumerating the possible tables (a permutation argument), no approximation to the underlying binomial distribution is made, and the resulting probability is exact. Because of this, the test remains valid even when the sample size is small, and Fisher's test is therefore often used when the sample or cell sizes are insufficient for the Chi Square test. The calculation uses factorials repeatedly, so computing time increases rapidly with the number of cases; the test is therefore usually reserved for situations where the conditions for a Chi Square test cannot be satisfied.

An example: We repeat the study in the Chi Square panel, but use Fisher's Exact Probability instead. We wish to study the difference in the preferences for specialty training between the male and female interns in a hospital. We found a group of 16 male interns, 10 of whom chose surgical specialties. We also found a group of 21 female interns, 5 of whom chose surgical specialties. Fisher's Exact Probability = 0.02, and we can conclude that male and female interns differ significantly (at the p<0.05 decision level) in choosing surgical specialties for training.
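For comparison, a minimal sketch of the same data analysed with Fisher's exact test in Python; scipy's fisher_exact enumerates the hypergeometric probabilities directly rather than relying on the Chi Square approximation, and its two-tailed p should be close to the 0.02 quoted above.

```python
# Fisher's Exact Probability (two-tailed) for the same intern data.
from scipy.stats import fisher_exact

table = [[10, 6],   # male interns: surgical, non-surgical
         [5, 16]]   # female interns: surgical, non-surgical

odds_ratio, p_two_tail = fisher_exact(table, alternative="two-sided")
print(f"two-tailed Fisher's exact p = {p_two_tail:.3f}")
```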
Risk Difference, from the Unpaired Comparison of Two Proportions Program Page, is usually used to evaluate the results of a controlled trial. The risk is the proportion of the group with positive outcomes, so that the risk in group 1 is r1 = Pos1/(Pos1+Neg1), and the risk in group 2 is r2 = Pos2/(Pos2+Neg2). The risk difference is then rd = r1 - r2.
There are two methods to estimate the Standard Error (se) of the risk difference, from which the 95% confidence interval is calculated: the traditional large sample method and a method suited to small samples.
After a controlled trial, a useful piece of information is how many subjects need to change their treatment to produce one case with a different outcome. This is known as the number needed to treat (NNT), which is the inverse of the risk difference, NNT = 1 / rd.

An example: We wish to study whether preoperative antibiotics reduce postoperative infections in a randomised controlled trial. Group 1 consists of 11 patients who received antibiotics, 4 of whom had postoperative infections. Group 2 consists of 15 patients who received no antibiotics, 12 of whom had postoperative infections. The risk of infection in the group receiving antibiotics is r1 = 4/11 = 0.31 (31%), and the risk in the group that did not receive antibiotics is r2 = 12/15 = 0.8 (80%). Because the sample size is small, the small sample method is used. The risk difference is 0.31 - 0.8 = -0.49 (-49%), with the 95% confidence interval -0.71 to -0.12. In other words, those receiving antibiotics were 12% to 71% less likely to have postoperative infections than those not receiving antibiotics. If the traditional method were used, the 95% confidence interval would be -0.81 to -0.17 (17% to 81%).

The number needed to treat is based on the risk difference, which is the same under both methods. NNT = 1 / rd = 1 / 0.49 = 2.04, rounded upwards to 3. In other words, to prevent 1 case of infection, three more patients will need to receive preoperative antibiotics.
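The arithmetic can be sketched as follows in Python. Only the traditional (large sample, Wald) Standard Error is shown; the small sample interval used by StatTools is a different formula and is not reproduced here, so the confidence limits printed will not match the small sample values quoted above.

```python
# Risk difference with the traditional (large sample / Wald) 95% CI and NNT.
# The small sample method used by StatTools is not reproduced here.
import math

pos1, neg1 = 4, 7     # antibiotics group: infected, not infected
pos2, neg2 = 12, 3    # no-antibiotics group: infected, not infected

n1, n2 = pos1 + neg1, pos2 + neg2
r1, r2 = pos1 / n1, pos2 / n2
rd = r1 - r2

se = math.sqrt(r1 * (1 - r1) / n1 + r2 * (1 - r2) / n2)
lo, hi = rd - 1.96 * se, rd + 1.96 * se
nnt = math.ceil(1 / abs(rd))          # NNT is conventionally rounded upwards

print(f"rd = {rd:.2f}, traditional 95% CI {lo:.2f} to {hi:.2f}, NNT = {nnt}")
```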
Risk Ratio (also called Relative Risk), from the Unpaired Comparison of Two Proportions Program Page, is used to evaluate risks in epidemiological studies, and to a lesser extent also in controlled trials. The Risk Ratio is the ratio of the risks in the two groups, rr = r1/r2 = (Pos1(Pos2+Neg2))/(Pos2(Pos1+Neg1)).
Risk Ratio addresses one of the problems associated with risk differences, that of comparing different effects or outcomes in a study. For example, in a study of fetal monitoring, there is an increase in the risk of Caesarean section with the use of monitoring, and possibly a reduction in fetal death. The problem is that a doubling of Caesarean sections from 10% to 20% is a difference of 10% (rd=0.1), whereas a halving of fetal deaths, from 2 in a thousand to 1 in a thousand, is a difference of only 0.1% (rd=0.001). The Risk Ratios provide a better comparison, as one is 2 and the other 0.5 (1/2).

The Risk Ratio (rr) is a ratio, which has a log-normal distribution. All the calculations are therefore conducted using log(rr), and the final results are reverse transformed to the non-logarithmic scale for clinical interpretation. The 95% confidence interval of the non-log Risk Ratio is therefore asymmetrical around the rr value, being wider on the side of higher values. Graphical representation of the Risk Ratio therefore uses a logarithmic scale.

An example: We wish to study whether men are more likely to have car accidents than women. We collected a database of drivers, and found that out of 500 men 23 had an accident in the past year, while out of 530 women 10 had an accident during the same period. The risk of a car accident amongst men is r1 = 23/500 = 0.046 (4.6%), and the risk amongst women is r2 = 10/530 = 0.019 (1.9%). The Risk Ratio is rr = 0.046/0.019 = 2.44, with 95% CI 1.17 to 5.07. In other words, male drivers have 1.17 to 5.07 times the risk of having a car accident compared with female drivers.
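The log-scale calculation can be sketched in Python as follows; the Standard Error of log(rr) uses the usual 1/Pos1 - 1/n1 + 1/Pos2 - 1/n2 formula, and with the driver data it should reproduce the 2.44 (1.17 to 5.07) quoted above.

```python
# Risk Ratio with a 95% CI calculated on the log scale,
# using the driver example: 23/500 men and 10/530 women had an accident.
import math

pos1, n1 = 23, 500    # men: accidents, total
pos2, n2 = 10, 530    # women: accidents, total

rr = (pos1 / n1) / (pos2 / n2)
se_log = math.sqrt(1/pos1 - 1/n1 + 1/pos2 - 1/n2)   # SE of log(rr)
lo = math.exp(math.log(rr) - 1.96 * se_log)
hi = math.exp(math.log(rr) + 1.96 * se_log)

print(f"rr = {rr:.2f}, 95% CI {lo:.2f} to {hi:.2f}")   # approx 2.44 (1.17 to 5.07)
```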
Odds Ratio, from the Unpaired Comparison of Two Proportions Program Page, is often used in situations where cause and effect are unclear, and the concept of risk cannot be applied. An example is the relationship between social class and educational achievement, when it is uncertain whether the lack of education results in a lower social class or whether a lower social class leads to under-achievement in education.
The odds is the ratio of positives to negatives, so that the odds in group 1 is o1 = Pos1/Neg1, and in group 2 o2 = Pos2/Neg2. The Odds Ratio is or = o1/o2 = (Pos1×Neg2)/(Pos2×Neg1). The Odds Ratio (or) is a ratio, which has a log-normal distribution. All the calculations are therefore conducted using log(or), and the final results are reverse transformed to the non-logarithmic scale for clinical interpretation. The 95% confidence interval of the non-log Odds Ratio is therefore asymmetrical around the or value, being wider on the side of higher values. Graphical representation of the Odds Ratio therefore uses a logarithmic scale.

An example: We wish to study the relationship between deflexion of the fetal head and the occipital posterior position in early labour. We do not know, and are unconcerned, which comes first, but merely want to know if there is an association. We found 20 babies in the occipital posterior position in early labour, 12 of whom had a deflexed head (Pos1=12) and 8 a flexed head (Neg1=8). We also found 35 babies in the occipital anterior position, 10 of whom had a deflexed head (Pos2=10) and 25 a flexed head (Neg2=25). The Odds Ratio is 3.75, with 95% confidence interval 1.4 to 9.9. We can therefore conclude that the occipital posterior position and deflexion of the fetal head in early labour are related.

The Odds Ratio is also used extensively in multivariate assessments of risk, such as in Logistic Regression and Bayesian Probability Models. A particularly useful research model is the Retrospective Matched Paired Controlled Comparison of two groups, which allows a test of causal factors in groups with different outcomes. This is discussed in the Sample Size for Matched Paired Controlled Studies Explained and Tables Page and will not be repeated here.
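A minimal sketch of the log-scale calculation in Python, using Woolf's Standard Error for log(or). StatTools may calculate its interval differently (for example by an exact method), so the limits printed need not coincide exactly with the 1.4 to 9.9 quoted above.

```python
# Odds Ratio with a 95% CI on the log scale (Woolf's method),
# using the fetal head example: Pos1=12, Neg1=8, Pos2=10, Neg2=25.
import math

pos1, neg1 = 12, 8    # occipital posterior: deflexed, flexed
pos2, neg2 = 10, 25   # occipital anterior: deflexed, flexed

or_ = (pos1 * neg2) / (pos2 * neg1)
se_log = math.sqrt(1/pos1 + 1/neg1 + 1/pos2 + 1/neg2)   # SE of log(or)
lo = math.exp(math.log(or_) - 1.96 * se_log)
hi = math.exp(math.log(or_) + 1.96 * se_log)

print(f"or = {or_:.2f}, 95% CI {lo:.1f} to {hi:.1f}")
```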
Multiple Proportions
Large Contingency Table
Standardization
Regression of Proportions
The Marascuilo Procedure, from the Comparison of Proportions in Multiple Groups Program Page, is used to compare proportions when there are more than two groups with binary outcomes (yes/no). It first performs a test of overall homogeneity for a large contingency table, followed by multiple post hoc comparisons between pairs of groups in the data.
The Marascuilo procedure is simpler than performing repeated two-group comparisons with a Bonferroni correction, so it is preferred when there are multiple groups to be compared at the same time.

An example uses the default example data from the Comparison of Proportions in Multiple Groups Program Page. We wish to study academic achievement in 3 social groups of students.
An initial analysis using the Chi Square Test indicates that the rates of under-achievement are significantly different in the 3 groups (Chi Sq = 38.04, df=2, α<0.001). Using the Marascuilo procedure to perform the post hoc analysis, we found that group 1 is significantly different from group 2 (difference = 35%, α<0.001), and from group 3 (difference = 43.3%, α<0.001). However, the difference between groups 2 and 3 (difference = 8.3%, α=0.47) is not statistically significant.
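The procedure itself is straightforward to sketch. The Python code below uses hypothetical counts (the page's example data are not reproduced here): each pairwise difference in proportions is compared against a critical range derived from the Chi Square distribution with k-1 degrees of freedom.

```python
# The Marascuilo procedure: pairwise comparison of k proportions after an
# overall Chi Square test. The counts below are hypothetical illustrations.
from itertools import combinations
import math
from scipy.stats import chi2

pos = [70, 35, 27]      # hypothetical: under-achievers per social group
n   = [100, 100, 90]    # hypothetical: group sizes

k = len(n)
p = [a / b for a, b in zip(pos, n)]
crit = math.sqrt(chi2.ppf(0.95, k - 1))   # sqrt of Chi Square critical value, df = k-1

for i, j in combinations(range(k), 2):
    diff = abs(p[i] - p[j])
    rng = crit * math.sqrt(p[i]*(1-p[i])/n[i] + p[j]*(1-p[j])/n[j])
    verdict = "significant" if diff > rng else "not significant"
    print(f"group {i+1} vs {j+1}: diff = {diff:.3f}, critical range = {rng:.3f} -> {verdict}")
```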
The Chi Square Test for contingency tables with multiple rows and columns tests whether the distribution across the columns is similar in all the rows. Like the Chi Square Test for the 2x2 table, it is a goodness of fit test.
An example: We wish to find out whether the shape of the pelvis is similar between different ethnic groups. We reviewed all our pelvimetry records, and found the following.
Please note: I used a random number generator to create the data to demonstrate the statistics; the data do not represent any reality. In any case, modern obstetrics generally considers these classifications of the pelvis arbitrary and irrelevant anyway. The large contingency table is also used for the calculation of the nonparametric Spearman's Correlation Coefficient. This is discussed in the Correlation and Regression Explained Page, and is not repeated here.
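A sketch of the r x c Chi Square test in Python, using made-up counts purely for illustration, since the example table is not reproduced in this text. The row and column labels are assumptions.

```python
# Chi Square test for an r x c contingency table. The counts are hypothetical:
# rows are ethnic groups, columns are pelvis types
# (gynaecoid, android, anthropoid, platypelloid).
from scipy.stats import chi2_contingency

table = [[40, 12,  8, 5],
         [35, 15, 10, 6],
         [30, 20,  9, 4]]

chi2, p, dof, expected = chi2_contingency(table)
print(f"Chi Square = {chi2:.2f}, df = {dof}, p = {p:.3f}")
```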
Standardization was originally developed for the analysis of death rates in epidemiological studies, to correct for age differences between population groups. An example is the analysis of cancer deaths in two populations when the age distributions of the populations are dissimilar.
The calculation estimates the number of positive cases that would exist if the age distribution were standardized, making it possible to compare the rates (incidences) without the distortion of differing age distributions. The method has been generalised so that comparisons of rates between groups can be made after correction for one or more other variables.

Standardization requires the nomination of a standard population. The most common approach is to nominate one of the groups under comparison as the standard, and adjust the number of positive cases in all other groups accordingly. Usually, the group that represents the control group is used. When there is no clear choice of which group should be the standard, the mean values across all the groups are used for standardization.

This section describes the use of the program in the Standardization of Proportions Program Page. The input data consist of 4 columns.
Section 1 displays the input data, and can be referred to when the results are examined.
Section 2 displays the results of the analysis. The program in the Standardization of Proportions Program Page calculates the standardization multiple times, using in turn each group as the standard, as well as the means of all the groups. The format of these results is the same, and the user should choose only one of the results to present, basing the choice on the most appropriate standard. The results are presented in 2 tables. Table 1 examines each group as it is modified by standardization.
Table 2 performs pair-wise comparisons between the groups, with differences, Standard Errors and 95% confidence intervals. This table contains the final results output.
Example
Standardization is usually performed on epidemiological data, involving large sample sizes. The following example, however, uses computer generated data and contains only a few subjects; this is done merely to demonstrate the use of the method. We wish to study the epidemiology of unemployment, in particular whether educational level has an effect. The data are as shown to the right. We have 3 groups to compare.
However, we realise that the sex and marital status of the subjects would also have an influence on employment, so we also subdivided our cases into 4 parameter subgroups.
We collected our data as shown above and to the right (we did not; I made up the data). The data input matrix is therefore as shown to the left. Please note that column 1 designates the groups that we primarily intend to address, that of educational level. Column 2 contains the 4 qualifying groups of sex and marital status.
The first output, table 1 to the left, shows the total numbers in each subgroup. Please note that the numbers in the subgroups differ considerably amongst the three educational level groups, so that, without some form of correction, the effect of education on unemployment would be masked by its relationship with sex and marital status.
The second table (to the right) shows the percentage unemployed (+ve) in each subgroup, and the uncorrected total unemployment rate in each group. Although it is fairly obvious that the 3 groups are different, the total unemployment rates cannot really be compared directly: unemployment varies widely with sex and marital status, and the proportions of the different sex and marital status subgroups are quite different in the 3 groups.
The first table of section 2 (left) shows the analysis. As none of the groups can be taken as the standard, the standard is calculated from the means of the 3 groups (last column). The results are in the last 4 rows. Adjusted for sex and marital status, the standardized unemployment rates (Adj%) are 23% (95% CI = 8% to 38%) for those with less than secondary education (group 1), 9% (95% CI = 3% to 15%) for those who finished secondary education (group 2), and 4% (95% CI = 0% to 8%) for those with post-secondary education (group 3).
The last table (to the right) compares the 3 groups. The differences and the Standard Errors of the differences are as shown in the table, but the 95% confidence intervals require recalculation. As there are 3 groups and therefore 3 comparisons, the 95% CI (p=0.025, z=1.96) needs to be recalibrated. For a 95% confidence interval with 3 comparisons, Bonferroni's correction means p = 0.025 / 3 = 0.0083, and z for p=0.0083, as looked up in our z table in the Probability of z Explained and Table Page, is 2.4. The 95% CIs are therefore difference ±2.4SE instead of ±1.96SE. The difference is trivial in this example once the percentage values are rounded to integers, but would be more noticeable if there were many groups to compare. The results are therefore very similar to those without Bonferroni's correction, and are as follows:
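The standardization arithmetic can be sketched as follows. The subgroup counts below are hypothetical (they do not reproduce the example data); the standard is taken as the mean subgroup distribution across the groups, the Standard Error of each adjusted rate is the usual weighted binomial form, and the pairwise intervals use the Bonferroni-adjusted z of about 2.4 described above. StatTools' own calculation may differ in detail.

```python
# Direct standardization of proportions: a minimal sketch under assumed data.
# Rows are the 3 educational groups; columns are the 4 sex / marital status
# subgroups. For each cell we have (positive cases, total). Counts are made up.
import numpy as np
from scipy.stats import norm

pos = np.array([[3, 5, 2, 4],      # group 1: +ve in each subgroup
                [2, 3, 1, 2],      # group 2
                [1, 1, 0, 1]])     # group 3
tot = np.array([[10, 15, 8, 12],
                [20, 25, 15, 18],
                [25, 30, 20, 22]])

rate = pos / tot                           # unemployment rate per subgroup
weight = tot.sum(axis=0) / tot.sum()       # standard = mean subgroup distribution
adj = (rate * weight).sum(axis=1)          # standardized (adjusted) rate per group
print("adjusted rates:", np.round(adj, 3))

# Pairwise differences with a Bonferroni-adjusted z (3 comparisons -> p = 0.025/3)
se = np.sqrt(((weight**2) * rate * (1 - rate) / tot).sum(axis=1))
z = norm.ppf(1 - 0.025 / 3)                # approx 2.4
for i, j in [(0, 1), (0, 2), (1, 2)]:
    d = adj[i] - adj[j]
    s = np.hypot(se[i], se[j])
    print(f"group {i+1} - group {j+1}: {d:.3f} ({d - z*s:.3f} to {d + z*s:.3f})")
```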
The program in the Regression Analysis Using Proportions Program Page handles a special case of the contingency table analysis. The entry data consist of 3 columns: the first column is the ordered row value against which the regression is calculated (the year, in the example below), and the second and third columns are the numbers of cases with positive (Npos) and negative (Nneg) attributes.
We want to study the changing business environment by looking at the proportion of business failures over the years. In 1990 there were 110 business start-ups, 10 of which failed (100 successful). In 1992 there were 58 start-ups, and 8 failed (50 successful). We ran out of research money in 1993 and 1994 so did not collect any data, but in 1995 we found 361 start-ups, 61 of which failed (300 successful). The data matrix for input is therefore as shown on the left.
The data are first displayed by the program, as shown to the right. We can see that the failure rates were 9.1% for 1990, 13.8% for 1992, and 16.9% for 1995, and that the overall failure rate is 14.9%.
The program then partitions the chi square, as shown in the table to the left. The analysis shows that there is a regression that is significant at the p<0.05 level, and once this is partitioned out, the residual variation is not statistically significant. Finally, the regression coefficient is calculated. The change in proportion per unit row value is 0.015, indicating that, between 1990 and 1995, business failures increased by 1.5% per year.
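The partition can be sketched in Python as follows, following the standard split of the overall Chi Square into a 1 degree of freedom linear trend (regression) component and a residual, as described by Steel, Torrie and Dickey. With the business failure data it should reproduce a slope of about 0.015 per year and a significant regression component, though StatTools' output format may differ.

```python
# Partition of the Chi Square for a trend in proportions (regression of
# proportions), using the business failure example.
from scipy.stats import chi2, chi2_contingency

year = [1990, 1992, 1995]
pos  = [10, 8, 61]        # failed start-ups
neg  = [100, 50, 300]     # successful start-ups

n = [p + q for p, q in zip(pos, neg)]
N, P = sum(n), sum(pos) / sum(n)                      # overall failure rate
xbar = sum(ni * x for ni, x in zip(n, year)) / N      # weighted mean year

sxy = sum(ni * (x - xbar) * (p / ni - P) for ni, x, p in zip(n, year, pos))
sxx = sum(ni * (x - xbar) ** 2 for ni, x in zip(n, year))
slope = sxy / sxx                                      # change in proportion per year

chi_total = chi2_contingency(list(zip(pos, neg)), correction=False)[0]
chi_reg = slope * sxy / (P * (1 - P))                  # 1 df regression component
chi_res = chi_total - chi_reg                          # k-2 df residual

print(f"slope = {slope:.3f} per year")
print(f"regression Chi Sq = {chi_reg:.2f} (df=1), p = {1 - chi2.cdf(chi_reg, 1):.3f}")
print(f"residual   Chi Sq = {chi_res:.2f} (df={len(year)-2})")
```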
Two Proportions
Altman DG (1994) Practical Statistics for Medical Research. Chapman & Hall, London. ISBN 0 412 276205 (First Ed. 1991)
Altman DG, Machin D, Bryant TN, Gardner MJ (2000) Statistics with Confidence (2nd Ed). BMJ Books. ISBN 0 7279 1375 1
Machin D, Campbell M, Fayers P, Pinol A (1997) Sample Size Tables for Clinical Studies. 2nd Ed. Blackwell Science. ISBN 0-86542-870-0
Multiple Proportions
Marascuilo LA (1966) Large-sample multiple comparisons. Psychological Bulletin 65: 280-290.
Daniel WW (1990) Applied Nonparametric Statistics. 2nd Ed. PWS-Kent, Boston.
Zwick R, Marascuilo LA (1984) Selection of pairwise multiple comparison procedures for parametric and nonparametric analysis of variance models. Psychological Bulletin 95(1): 148-155.
Regression Using Proportions
Steel RGD, Torrie JH, Dickey DA (1997) Principles and Procedures of Statistics: A Biomedical Approach. 3rd Ed. ISBN 0-07-061028-2. p. 520-521.
Standardization
Armitage P (1980) Statistical Methods in Medical Research. Blackwell Scientific Publications, Oxford UK. ISBN 0 632 05430 1. p. 384-388, p. 371.