Figure 8-1
At first glance, differences by RACE appear to be very important: overall, 58% of those surveyed said people cannot be trusted, and the epsilon statistic (the difference between the highest and lowest percentage) is 22. Also note that few respondents said “Depends”; most had a definite opinion here.
Let’s do some recoding: RACE should be recoded into a different variable called RACER (Race Recoded). Whites and Blacks will stay the same, but Other is eliminated by recoding it as missing (see Figures 8-2 and 8-3). Review Chapter 3 if you need to refresh your memory on how to recode.
Figure 8-2
Figure 8-3
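For readers who want to see the logic of this recode outside the SPSS dialog boxes, here is a minimal Python sketch. It is an illustration only, not SPSS syntax. The numeric codes (1 = White, 2 = Black, 3 = Other), the `recode` helper, and the sample values are assumptions made for this example, so check your own codebook before relying on them.

```python
# Assumed codes (check your codebook): 1 = White, 2 = Black, 3 = Other.
RACE_TO_RACER = {1: 1, 2: 2}  # any code not listed here becomes missing


def recode(value, mapping):
    """Return the recoded value, or None (missing) if the code is not kept."""
    return mapping.get(value)


# Made-up RACE values for five respondents, for illustration only.
race = [1, 2, 3, 1, 2]
racer = [recode(value, RACE_TO_RACER) for value in race]
print(racer)
```

The same pattern applies to any variable where one category is dropped: list the codes you want to keep in the mapping, and everything else is treated as missing.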
Let’s also recode TRUST into a different variable called TRUSTR to eliminate the “Depends” category. Don’t forget to create new value labels after you recode. Now run the crosstabs for TRUSTR and RACER. Your output should look like Figure 8-4.
Figure 8-4
When the percentage point (pp) difference, or epsilon, is this high, it’s “interesting” (actually, anything higher than 10-12 pp is interesting) even if you don't yet know whether it is statistically significant. Here you have a pp difference of 24. Here’s how you might describe what you’ve found so far: “Although most respondents (62%) say that other people cannot be trusted, over 80% of the Black respondents said this, compared to 58% of the Whites in this sample.” Or, “Fewer than one-fifth of Blacks said that people can be trusted, compared to more than two-fifths of Whites.”
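The arithmetic behind epsilon can be sketched in a few lines of Python. The counts below are made up (100 respondents per column) so that the column percentages match the 58% and 82% implied above; they are not the actual cell counts from Figure 8-4. The function names are also just illustrative.

```python
def column_percentages(table):
    """table: {column_label: {row_label: count}} -> same shape, in percent."""
    result = {}
    for column, cells in table.items():
        total = sum(cells.values())
        result[column] = {row: 100.0 * n / total for row, n in cells.items()}
    return result


def epsilon(percentages, row):
    """Difference between the highest and lowest column percentage in one row."""
    values = [cells[row] for cells in percentages.values()]
    return max(values) - min(values)


# Hypothetical counts chosen only to reproduce the percentages in the text.
counts = {
    "White": {"Can trust": 42, "Cannot trust": 58},
    "Black": {"Can trust": 18, "Cannot trust": 82},
}
percentages = column_percentages(counts)
pp_difference = epsilon(percentages, "Cannot trust")
print(pp_difference)
```

Note that the percentages are computed within each column (each category of the independent variable) and then compared across columns, which is the same way you read the SPSS table.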
Is this a strong relationship (statistically speaking)? There are many choices in the "Statistics" dialog box, but here we will look only at the gamma statistic (your instructor will probably have you look at other statistics, but gamma is almost always appropriate; see Figure 8-5). Yes, the relationship is statistically significant.
Figure 8-5
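If you are curious how gamma is calculated, the sketch below applies the standard Goodman and Kruskal formula, gamma = (C - D) / (C + D), where C is the number of concordant pairs and D the number of discordant pairs. The counts are hypothetical, not those in Figure 8-5, and the sketch does not compute the significance level that SPSS reports alongside gamma.

```python
def goodman_kruskal_gamma(table):
    """table: list of rows of counts, with rows and columns in a fixed order."""
    concordant = 0
    discordant = 0
    n_rows = len(table)
    n_cols = len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            # Pair each cell with every cell in a lower row.
            for k in range(i + 1, n_rows):
                for m in range(n_cols):
                    if m > j:    # lower and to the right: concordant pairs
                        concordant += table[i][j] * table[k][m]
                    elif m < j:  # lower and to the left: discordant pairs
                        discordant += table[i][j] * table[k][m]
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical counts. Rows: can trust, cannot trust. Columns: White, Black.
counts = [
    [42, 18],
    [58, 82],
]
gamma = goodman_kruskal_gamma(counts)
print(round(gamma, 3))
```

The sign of gamma depends on the order in which the categories are listed, so with two-category variables like these it is the size of the value, not its sign, that tells you how strong the relationship is.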
Can you have confidence that race is the causal factor here? While it may indeed be true that race is explanatory, you won't really have confidence in this conclusion until you have failed to account for this variation in any other way. To do this, we will need to do some elaboration analysis by running crosstabs of (i.e., "controlling for") other independent variables to see if something else might account for this variation among respondents.
Recall that your original crosstabs procedure produces one contingency table, with as many rows as there are categories (or values) of the dependent variable, and as many columns as there are categories of the independent variable. When you start using control (sometimes called test) variables, you will get as many separate tables as there are categories of the control variable. For instance, if you want to control for levels of education and simply use EDUC as the control variable, you end up with 20 separate tables. This is NOT a good idea. Try doing this to see what we mean. Notice how difficult it is to compare across this many tables. So before you do any further analysis, recode your variables into the smallest number of categories that are still logically useful. Review Chapter 3 if you have forgotten how to do this.
In the next example, EDUC was recoded as EDUC2 into two categories: those with high school or less (0-12 years) and those with more than high school (13+ years). After you have done these recodes, let's see what happens when we do crosstabs again. This time we will control for our recoded education variable. To do the appropriate crosstabs, go to the Analyze, Descriptive Statistics, Crosstabs menu. Enter TRUSTR into the Row box and RACER into the Column box. Now you are ready for the next step, the addition of a control variable. Choose EDUC2 from your variables list and enter it into the empty box at the bottom of the Crosstabs screen. Figure 8-6 shows you what the Crosstabs dialog box will look like.
Figure 8-6
The SPSS output for this procedure is shown in Figure 8-7.
Figure 8-7
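To make the idea of a control variable concrete, here is a rough Python sketch of what the procedure does: it collapses EDUC into the two EDUC2 categories described above, then builds one TRUSTR by RACER table for each EDUC2 category. The cases, the category labels, and the function names are invented for illustration, and skipping cases with a missing value is an assumption of this sketch rather than a description of SPSS internals.

```python
from collections import defaultdict


def recode_educ(years):
    """Collapse years of schooling into two groups; None stays missing."""
    if years is None:
        return None
    return "HS or less" if years <= 12 else "More than HS"


def layered_crosstab(cases, row_var, col_var, control_var):
    """Return {control_value: {col_value: {row_value: count}}}."""
    tables = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for case in cases:
        row, col, layer = case[row_var], case[col_var], case[control_var]
        if None in (row, col, layer):
            continue  # leave out cases with a missing value on any variable
        tables[layer][col][row] += 1
    return tables


# Made-up cases for illustration only.
cases = [
    {"TRUSTR": "Can trust", "RACER": "White", "EDUC": 16},
    {"TRUSTR": "Cannot trust", "RACER": "White", "EDUC": 12},
    {"TRUSTR": "Cannot trust", "RACER": "Black", "EDUC": 10},
    {"TRUSTR": "Can trust", "RACER": "Black", "EDUC": 14},
    {"TRUSTR": "Cannot trust", "RACER": None, "EDUC": 12},
]
for case in cases:
    case["EDUC2"] = recode_educ(case["EDUC"])

partial_tables = layered_crosstab(cases, "TRUSTR", "RACER", "EDUC2")
print(len(partial_tables))  # one table per category of the control variable
```

Because EDUC2 has only two categories, the result is two tables to compare. Using the unrecoded EDUC as the control would instead give one table for every distinct number of years of schooling in the data.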
Figure 8-4 shows the original, or zero-order, contingency table of the relationship between trust and race.
