StatTools : Pattern Probability Explained

Related link :
Pattern Probability Analysis (Analysis Using Reference Data) Program Page
Pattern Probability (Use of Reference Pattern on New Data) Program Page

Although the classical Bayesian probability model, as explained in the Bayesian Classification Explained Page, is well accepted and its usefulness has stood the test of time, a number of difficulties arise in using it. These are:

  • If 2 binary variables are used, 4 patterns are produced (--, -+, +-, ++). If 3 binary variables are used, 8 patterns are produced (---, --+, -+-, -++, +--, +-+, ++-, +++). The number of patterns required when n binary variables are used is therefore 2^n, and it grows faster still if the variables have more than 2 categories. This exponential increase means that the number of variables used to assign a case to a group must be limited.
  • Because there are so many combinations of variables, some combinations will be uncommon, and their probabilities must be estimated from very few cases. Very large databases are therefore required to develop a template whose performance is stable.
  • There is a lack of flexibility once the template is developed. All the information required in a pattern must be available for a decision to be made. In real-life situations such as medical diagnosis or evaluating students, the complete information may take time and resources to obtain, while interim decisions based on incomplete information may be required and are often adequate for action. An example is the diagnosis of appendicitis, which is made using a combination of signs, symptoms, tests, and development over time, but where in many cases a decision to operate must be made before most of the required information is available.
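The growth in the number of joint patterns can be checked with a few lines of Python. This is only an illustrative sketch: the function name full_patterns and the coding of options as integers are assumptions made for the example, not part of the StatTools programs.

```python
from itertools import product

def full_patterns(options_per_variable):
    """Enumerate every joint pattern for variables with the given option counts.

    Each variable's options are coded 0, 1, ..., k-1 (a hypothetical coding).
    """
    return list(product(*(range(k) for k in options_per_variable)))

# Two binary variables give 2 * 2 = 4 joint patterns.
print(len(full_patterns([2, 2])))

# Three binary variables give 2 * 2 * 2 = 8 joint patterns.
print(len(full_patterns([2, 2, 2])))

# Ten binary variables already need 2**10 = 1024 joint patterns,
# each of which must be seen often enough in the reference data.
print(len(full_patterns([2] * 10)))
```

Each added binary variable doubles the number of patterns the template must cover, which is why the reference database has to grow so quickly.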

The Pattern Probability Model addresses these difficulties by assuming that the variables (attributes) used to assign a case to a group are unrelated, that is, that there is no within-group correlation between them. Once this assumption is accepted, the influence of each variable on the final decision can be calculated independently, and the total influence is merely a combination of the individual influences under the rules of probability.
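One standard way to write this assumption down is the "naive Bayes" form shown here. It is offered as a sketch of the idea, and it is an assumption that the StatTools program uses exactly this expression. The symbols are introduced only for illustration: G_j is group j, x_1 to x_m are the observed values of the m variables, P(G_j) is the prior probability of group j, and P(x_i | G_j) is the proportion of cases in group j that show value x_i.

```latex
P(G_j \mid x_1, \dots, x_m)
  = \frac{P(G_j) \prod_{i=1}^{m} P(x_i \mid G_j)}
         {\sum_{k} P(G_k) \prod_{i=1}^{m} P(x_i \mid G_k)}
```

Because each factor P(x_i | G_j) involves only one variable, it can be estimated separately from the reference data, and a variable that has not yet been observed is simply left out of the product.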

  • If a binary variable is used, 2 patterns are produced (one for present and one for absent). If a variable with 3 options is used, 3 patterns are produced (one each for option 1, option 2, and option 3). The number of patterns required therefore increases only linearly, by the number of options in each variable added. In so doing, the model is able to accommodate a large set of variables (attributes or predictors).
  • Although a particular combination of variables may be uncommon, each variable in the combination is less uncommon. The combined influence is calculated mathematically, and a stable and consistent model is easier to produce.
  • The model can be used flexibly. At any stage, the probability that a case belongs to a particular group can be calculated using only the information available at the time. When the full set of information is available, the results are the same regardless of the order in which the variables are included in the model.
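The flexibility described in the points above can be illustrated with a short Python sketch. It assumes the usual Bayes updating rule under independence, which may differ in detail from the StatTools program. The group names, the finding names, and all the probabilities are hypothetical values invented for the example.

```python
from math import isclose

def update(current, likelihoods):
    """One updating step: weight each group's probability by how often that
    group shows the observed finding, then rescale so the values sum to 1."""
    unnormalized = {g: current[g] * likelihoods[g] for g in current}
    total = sum(unnormalized.values())
    return {g: p / total for g, p in unnormalized.items()}

# Hypothetical prior probabilities for two groups.
prior = {"A": 0.5, "B": 0.5}

# Hypothetical reference pattern: the proportion of each group showing each finding.
reference = {
    "sign1+": {"A": 0.8, "B": 0.3},
    "sign2-": {"A": 0.4, "B": 0.7},
    "test3+": {"A": 0.9, "B": 0.2},
}

def classify(prior, findings):
    """Apply the findings one at a time, in the order given."""
    probabilities = dict(prior)
    for name in findings:
        probabilities = update(probabilities, reference[name])
    return probabilities

# Interim decision using only the first finding.
interim = classify(prior, ["sign1+"])

# Full information, entered in two different orders.
full_a = classify(prior, ["sign1+", "sign2-", "test3+"])
full_b = classify(prior, ["test3+", "sign1+", "sign2-"])

print(round(interim["A"], 3), round(full_a["A"], 3), round(full_b["A"], 3))
print(isclose(full_a["A"], full_b["A"]))
```

The interim result uses one finding only, while the two full results agree because multiplication does not depend on order. Note also that the reference pattern holds one entry per option of each variable, so it grows linearly as variables are added.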

How the program should be used and the results interpreted are best discussed in the examples section, and some of the issues in its use are discussed in the technical considerations section.