Paired Enumeration Data

The term "paired data" generally refers to two measurements on an experimental unit, both on the same scale. Our interest lies in testing whether these two measurements have the same distribution. This is to be contrasted with a test of independence of two measurements made on the same experimental unit, each one measured on a different scale (such as coffee consumption and pancreatic cancer).

Here we consider only dichotomous data. Paired dichotomous data can be gotten in many ways. Some examples:

How does this situation differ from the one for which the paired t-test is appropriate? Just as in the discussion leading to the paired t-test, you make two measurements on the same experimental unit, but in this case the only possible outcomes for each measurement are 0 and 1 (representing "no" and "yes", or equivalently representing "failure" and "success"). Except for this, the two situations are the same.

Data: (X1,Y1),...,(Xn,Yn) a simple random sample of paired observations, where the X's are dichotomous 0 or 1, and the Y's are dichotomous 0 or 1. Let p1 be the proportion of 1's (successes) among the X's, and let p2 be the proportion of 1's (successes) among the Y's.
Generally within a pair, Xi and Yi are dependent; we want to test whether the distributions for X and for Y are the same. This is the same as testing the null hypothesis that p1 and p2 are equal.

e.g. Cosmetic skin testing for hypoallergenicity. To show a new product is "hypoallergenic" you must prove it provokes fewer reactions than the current market leader. You can test subjects to see if they react to a cosmetic mix put on their skin. Suppose you use each subject in your experiment twice: once with the "Market Leader" and once with the "New Product". You recruit 40 subjects. Paint the backs of these 40 subjects with 2 patches: one patch using the Market Leader, the other using the New Product. Look for reactions: 0=none and 1=any reaction. Suppose that 45% (18/40) react to the Market Leader, 20% (8/40) react to the New Product, 7.5% (3/40) to both, 42.5% (17/40) to neither. Summarize the data:


     Market Leader         New Product       # cases
         0 (no)              0 (no)             17
         0 (no)              1 (yes)             5
         1 (yes)             0 (no)             15
         1 (yes)             1 (yes)             3
                                                ----
                                                 40

Note that there are no other possibilities when the measurement is dichotomous. The question: Is there a difference between the Market Leader and the New Product in the probability of provoking a reaction (i.e. is p1 equal to p2)?
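The summary table can be checked with a few lines of Python. This is only a sketch: the names `counts`, `pairs`, `p1_hat` and `p2_hat` are made up for illustration, and the pairs are rebuilt from the table rather than read from raw data.

```python
from collections import Counter

# Counts from the summary table: (market_leader, new_product) -> number of subjects
counts = {(0, 0): 17, (0, 1): 5, (1, 0): 15, (1, 1): 3}

# Expand to one (x, y) pair per subject, x = Market Leader, y = New Product
pairs = [pair for pair, k in counts.items() for _ in range(k)]
n = len(pairs)

p1_hat = sum(x for x, _ in pairs) / n  # sample proportion reacting to the Market Leader
p2_hat = sum(y for _, y in pairs) / n  # sample proportion reacting to the New Product

# The differences x - y can only take the values -1, 0, or +1
diffs = Counter(x - y for x, y in pairs)

print(n, p1_hat, p2_hat)
print(dict(diffs))
```

The sample proportions come out as 18/40 and 8/40, matching the percentages quoted in the example, and 20 of the 40 differences are 0.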

Can you use a paired t-test here? It's not justified since the differences are not normally distributed - the only possible differences are -1, 0, or +1. The test we do get, however, uses only these differences, so the idea of a paired t-test is not off the wall.

What about ties, which show up as 0 differences? A no-no response and a yes-yes response tell you nothing about whether the two treatments are the same. So discard the ties (i.e. drop all 0 differences), leaving n'=20 untied observations: 15 where the New Product was better, and 5 where the Market Leader was better. Count each of these as a success if the new is "better" than the old, as a failure if not. Then under the null hypothesis of no difference, you are sampling from a dichotomous population with p=0.5 which leads us to a Binomial Distribution.

Let B=#pairs where Yi < Xi among the 20 untied cases=15. Note that Yi < Xi means that for person i, the New Product was "better" than the Market Leader.
With no ties, under the null hypothesis of no difference, it is just as likely for the New Product to be better than the Market Leader as it is for the Market Leader to be better than the New Product. So if the null hypothesis is true, B ~ Bin(20,.5). Since 20(.5)=10, the sample size is large enough to justify using a normal approximation to test the null hypothesis. Formally, we write

Ho: Probability of a reaction is the same for the Market Leader and the New Product.
Ha: Not the same
Test Statistic: z=(15-10)/2.236=2.236
Critical Region: Reject Ho in favor of Ha at 5% if |z| >= 1.96
P-value=P(z<=-2.236)+P(z>=2.236)=.0125+.0125=.025
Conclusion: Reject Ho in favor of Ha at 5%.
There is evidence of a difference.
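The z-test above can be reproduced as follows. This is a minimal sketch using only the standard library; the function name `discordant_z_test` is hypothetical, and the two-sided p-value is computed from the normal approximation rather than read from a table, so it will differ slightly from the rounded .025.

```python
import math

def discordant_z_test(n_better, n_worse):
    """Normal-approximation test of p = 0.5 using only the untied pairs."""
    n_untied = n_better + n_worse
    mean = n_untied * 0.5                   # mean of Bin(n', .5)
    sd = math.sqrt(n_untied * 0.5 * 0.5)    # standard deviation of Bin(n', .5)
    z = (n_better - mean) / sd
    # Two-sided p-value for a standard normal Z: P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# 15 pairs where the New Product was better, 5 where the Market Leader was better
z, p_value = discordant_z_test(15, 5)
print(round(z, 3), round(p_value, 4))
```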

McNemar's Test Statistic

This test is equivalent to the z-test above, but is computationally simpler. Write the data in a 2x2 table:


                  New Product Reaction
                     NO      YES
                  -----------------
     Market    NO |  17   |   5   |
     Leader       |-------|-------|
     Reaction YES |  15   |   3   |
                  -----------------

Ho: Probability of a reaction is the same for the Market Leader and the New Product.
Ha: Not the same
Test Statistic: M=(5-15)^2/(5+15)=100/20=5
Critical Region: Reject Ho in favor of Ha at 5% if M >= 3.84
Conclusion: Reject Ho in favor of Ha at 5%.
There is evidence of a difference.

P-value: Using Chi-square tables with df=1, P(M>=5) lies between .025 and .05 (it is actually almost precisely .025).
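To see that M is just the square of the z statistic, and to get the p-value without tables, you can try something like the sketch below. The function name `mcnemar` is made up for illustration; the p-value uses the fact that a chi-square variable with df=1 is the square of a standard normal.

```python
import math

def mcnemar(b, c):
    """McNemar statistic from the two off-diagonal counts, with its chi-square (df=1) p-value."""
    m = (b - c) ** 2 / (b + c)
    # For df=1: P(chi-square >= m) = P(|Z| >= sqrt(m)) = erfc(sqrt(m / 2))
    p_value = math.erfc(math.sqrt(m / 2))
    return m, p_value

m, p_value = mcnemar(5, 15)

# z statistic from the untied pairs: (15 - 10) / sqrt(20 * .5 * .5)
z = (15 - 10) / math.sqrt(20 * 0.25)

print(m, round(z ** 2, 6), round(p_value, 4))
```

The values of M and z squared agree, which is why the two tests always reach the same conclusion.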

Some comments:

The test described here (either the z or McNemar's) is often called a test for "correlated proportions", reminding you that the observations on a pair are not independent. Putting the data in a 2x2 table can be problematic, since you can easily get confused over whether to use the ordinary Chi-square test or McNemar's test. How do you decide which to use? The answer lies in the design of your study and its purpose. Remember

Simple formula for McNemar's statistic

Often you will see the formula for McNemar's test statistic written as M=(b-c)^2/(b+c). This is correct as long as your data for Treatments "1" and "2" are written as

                    Treatment 1
                     NO      YES
                  -----------------
     Treat-    NO |   a   |   b   |
     ment         |-------|-------|
       2      YES |   c   |   d   |
                  -----------------

or written as

                    Treatment 1
                     YES     NO
                  -----------------
     Treat-   YES |   a   |   b   |
     ment         |-------|-------|
       2       NO |   c   |   d   |
                  -----------------

and we let p1 and p2 be the probabilities of a YES response under Treatment 1 and Treatment 2, respectively.
Then the test of the null hypothesis proceeds as follows:

Ho: p1 = p2
Ha: p1 <> p2
Test Statistic: M=(b-c)^2/(b+c)
Critical Region: Reject Ho in favor of Ha at 5% if M >= 3.84
P-value: Using the right hand tail of the Chi-square tables with df=1, find where M lies.

In particular, note that we must write no & yes symmetrically in the rows and columns of the table (both no-yes, no-yes, or the other way around, both yes-no, yes-no). The test statistic M uses only b and c from the table, i.e. tied observations do not affect it, and it has the same value regardless of which way (indicated above) you write your data.
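The point about writing the table symmetrically can be checked numerically with the cosmetic data. This is a sketch only; `mcnemar_from_table` is a hypothetical helper that takes the table as two rows, [[a, b], [c, d]].

```python
def mcnemar_from_table(table):
    """McNemar statistic M = (b - c)^2 / (b + c) for a table laid out as [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (b - c) ** 2 / (b + c)

# Cosmetic data, rows = Market Leader, columns = New Product
no_yes = [[17, 5], [15, 3]]   # rows NO, YES and columns NO, YES
yes_no = [[3, 15], [5, 17]]   # rows YES, NO and columns YES, NO

# WRONG layout: rows NO, YES but columns YES, NO, so b and c are now the ties
mixed = [[5, 17], [3, 15]]

# Changing the number of ties (a and d) leaves M unchanged
fewer_ties = [[1, 5], [15, 0]]

print(mcnemar_from_table(no_yes))      # 5.0
print(mcnemar_from_table(yes_no))      # 5.0
print(mcnemar_from_table(mixed))       # 9.8, not a valid McNemar statistic
print(mcnemar_from_table(fewer_ties))  # 5.0
```

Both symmetric layouts give M=5, while the mixed layout gives a different number because the formula ends up using the tied cells.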


How Bad is it to Ignore Pairing?

We have some data looking at allergic reaction to different kinds of penicillin, Type G and Type BT. These are done by skin tests.

Give 500 subjects type G, observe 52 reactions or 10.4%
Give the SAME 500 subjects type BT, observe 68 reactions or 13.6%

Let p1 be the probability of a reaction using type G.
Let p2 be the probability of a reaction using type BT.

This is paired dichotomous data, and you cannot analyze it unless you know, in addition to the above, how many subjects react to both G and BT. But suppose you decide to IGNORE the pairing and analyze the data as two independent samples. This is WRONG, and the calculation below will show you how much!

Incorrect Analysis follows:

Ho: p1 = p2
Ha: p1 <> p2
Test Statistic: z = (.104-.136)/sqrt[ (.12)(.88)(1/500 + 1/500) ] = -.032/.02055 = -1.56
THIS IS WRONG BECAUSE YOU ARE IGNORING PAIRING!
Critical Region: reject Ho for Ha at 5% if z >= 1.96 or z <= -1.96
P-value=.1188
Conclusion: fail to reject at 5%
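For reference, the incorrect two-independent-samples calculation can be reproduced like this. It is a sketch with a made-up function name, `two_sample_z`, and it is shown only to reproduce the numbers above, not as a recommended analysis for paired data. The p-value it prints comes from the unrounded z, so it differs slightly from the .1188 obtained from z = -1.56.

```python
import math

def two_sample_z(x1, n1, x2, n2):
    """Pooled two-sample z-test for proportions (treats the samples as independent)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)          # pooled proportion, here 120/1000 = .12
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p_value

# 52 of 500 react to type G, 68 of 500 react to type BT (pairing ignored!)
z, p_value = two_sample_z(52, 500, 68, 500)
print(round(z, 2), round(p_value, 3))
```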

THE CORRECT ANALYSIS FOLLOWS:

The additional information you need is 50 subjects reacted to both types. Now you can put your data in a 2x2 table:


    Skin Reactions to Penicillin:

              Type G
            yes      no
          -----------------
Type  yes |   50  |   18  |
          |-------|-------|
 BT    no |    2  |  430  |
          -----------------



Ho: p1 = p2
Ha: p1 <> p2
Test Statistic: M = (18-2)^2/(18+2) = 256/20 = 12.8
Critical Region: reject Ho for Ha at 5% if M >= 3.84
(under Ho M is approx Chi-Squared df=1)
P-value: from Table G, P-value<.0005
Conclusion: reject at 5% (or 1%, or even at .1%)
This represents very strong evidence against Ho.
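The correct analysis can be sketched in the same way, starting from the two totals and the number of subjects who reacted to both types. The variable names are illustrative only, and the p-value again uses the chi-square (df=1) relationship to the normal distribution instead of a table.

```python
import math

n, g_yes, bt_yes, both = 500, 52, 68, 50

# Fill in the rest of the 2x2 table from the totals
g_only = g_yes - both                    # reacted to G but not BT
bt_only = bt_yes - both                  # reacted to BT but not G
neither = n - both - g_only - bt_only    # reacted to neither

# McNemar's statistic uses only the two untied cells
m = (bt_only - g_only) ** 2 / (bt_only + g_only)

# For df=1: P(chi-square >= m) = erfc(sqrt(m / 2))
p_value = math.erfc(math.sqrt(m / 2))

print(g_only, bt_only, neither)
print(m, p_value)
```

With the pairing taken into account, the same 500 subjects give a p-value well below .0005, compared with about .12 from the analysis that ignores pairing.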

Ignoring pairing can be very costly.

