
Chapter 6: Categorical Data


From part (b), the distributions of ratings were conditioned on the gender of
the student. The computed conditional distributions show that the proportion
of favorable ratings is the same for male and female students, and likewise
the proportion of unfavorable ratings. By the definition of independence for
contingency tables, we again conclude that there is no association between
the gender of the students and their favorable or unfavorable responses. That
is, the two variables are independent of each other.

In conclusion, if the faculty member who did the survey knew the gender of
the student, he or she would not be at an advantage over one who did not
know the gender of the student in predicting the response.
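
The independence check described above can be sketched in a few lines of Python. This is only a minimal illustration: the counts are hypothetical and are not the survey data from the example, and the names table, conditional_distributions and is_independent are invented for the sketch. It computes the distribution of ratings within each gender and tests whether those conditional distributions are identical.

```python
from fractions import Fraction

# Hypothetical counts (NOT the survey data from the example):
# rows are genders, columns are ratings.
table = {
    "male":   {"favorable": 30, "unfavorable": 20},
    "female": {"favorable": 60, "unfavorable": 40},
}

def conditional_distributions(counts):
    """Return the distribution of ratings within each row, as exact fractions."""
    result = {}
    for group, row in counts.items():
        total = sum(row.values())
        result[group] = {rating: Fraction(n, total) for rating, n in row.items()}
    return result

def is_independent(counts):
    """True when every row has the same conditional distribution."""
    dists = list(conditional_distributions(counts).values())
    return all(d == dists[0] for d in dists[1:])

cond = conditional_distributions(table)
independent = is_independent(table)
print(cond)
print(independent)
```

Exact fractions are used so that the equality test is not affected by rounding. With sample data, conditional distributions are rarely exactly equal, so in practice one would compare them for closeness rather than strict equality.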

Note:

Contingency tables are not restricted to 2×2 classifications. Other areas
of statistics deal with much more complex tables.

6-5 Simpson’s Paradox

It is generally accepted that the larger the data set, the more reliable the
inferences drawn from it. Simpson's paradox, however, casts a different light
on this view. It shows that a great deal of thought must be given to
inferences when small data sets are combined into a larger one. Sometimes the
inferences from the combined data set contradict those from the smaller data
sets, and in such cases the inferences from the combined data set are usually
the incorrect ones. We will demonstrate with an illustration.
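
Before turning to the illustration, the reversal itself can be seen in a small numeric sketch. The counts below are made up purely to produce the effect and are not the Berkeley figures. The group labels and the names small_sets, rate and pooled are likewise hypothetical. Each small data set favors one group, yet the combined data set favors the other.

```python
from fractions import Fraction

# Hypothetical counts, chosen only to show the reversal.
# These are NOT the Berkeley admissions figures.
# Each entry is (admitted, applicants).
small_sets = {
    "major_1": {"women": (8, 10),   "men": (70, 100)},
    "major_2": {"women": (30, 100), "men": (2, 10)},
}

def rate(admitted, applicants):
    """Admission rate as an exact fraction."""
    return Fraction(admitted, applicants)

def pooled(sets, group):
    """Add up admitted and applicant counts for one group across all sets."""
    admitted = sum(s[group][0] for s in sets.values())
    applicants = sum(s[group][1] for s in sets.values())
    return admitted, applicants

# Within each small data set, women have the higher admission rate.
women_higher_in_each = all(
    rate(*s["women"]) > rate(*s["men"]) for s in small_sets.values()
)

# After combining the sets, the comparison reverses.
women_pooled = rate(*pooled(small_sets, "women"))
men_pooled = rate(*pooled(small_sets, "men"))

print(women_higher_in_each)
print(float(women_pooled), float(men_pooled))
```

The reversal arises in this sketch because the two groups apply in very different numbers to the major with the low admission rate, so the pooled rates are weighted differently for each group.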

Illustration

For a 1973 study on gender bias in admissions to the graduate school at the
University of California, Berkeley, Table 6-10 shows the information
obtained for the five largest majors on that campus.