A chi-square test is a nonparametric test used to determine if there is a relationship between two categorical variables. Let's take a simple example. Suppose a researcher brought male and female participants into the lab and asked them which color they prefer: blue or green. The researcher believes that color preference may be related to gender. Notice that both gender (male, female) and color preference (blue, green) are categorical variables. If there is a relationship between gender and color preference, we would expect that the proportion of men who prefer blue would be different from the proportion of women who prefer blue. In general, you have a relationship between two categorical variables when the distribution of people across the categories of the first variable changes across the different categories of the second variable.
To determine if a relationship exists between gender and color preference, the chi-square test computes the distribution across the combinations of your two factors that you would expect if there were no relationship between them. It then compares this to the actual distribution found in your data. In the example above, we have a 2 (gender: male, female) X 2 (color preference: green, blue) design. For each cell in the combination of the two factors, we would compute "observed" and "expected" counts. The observed counts are simply the actual number of observations found in each of the cells. The expected proportion in each cell can be determined by multiplying the corresponding marginal proportions of the table. For example, let us say that 52% of all the participants preferred blue and 48% preferred green, whereas 40% of all the participants were men and 60% were women. The expected proportions are presented in the table below.
Expected proportion table
| | Males | Females | Marginal proportion |
| Blue | 20.8% | 31.2% | 52% |
| Green | 19.2% | 28.8% | 48% |
| Marginal proportion | 40% | 60% | 100% |
As you can see, you get the expected proportion for a particular cell by multiplying the two marginal proportions together (for example, 52% × 40% = 20.8% for males who prefer blue). You would then determine the expected count for each cell by multiplying the expected proportion by the total number of participants in your study. The chi-square statistic is a function of the difference between the expected and observed counts across all your cells. Luckily, you do not actually need to calculate any of this by hand, since SPSS will compute the expected counts for each cell and perform the chi-square test.
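To make the arithmetic concrete, here is a minimal Python sketch of the calculation described above. The observed counts are hypothetical (they are not from any real study) and were chosen only so that the margins match the example: 100 participants, 52% preferring blue, 40% men. The statistic is computed with the standard Pearson formula, the sum over all cells of (observed − expected)² / expected.

```python
# Hypothetical observed counts for 100 participants (an assumption for
# illustration); the margins match the example: 52% blue, 40% men.
observed = {
    ("blue", "male"): 26, ("blue", "female"): 26,
    ("green", "male"): 14, ("green", "female"): 34,
}

total = sum(observed.values())

# Marginal totals for each color (rows) and each gender (columns).
row_totals = {}
col_totals = {}
for (color, gender), count in observed.items():
    row_totals[color] = row_totals.get(color, 0) + count
    col_totals[gender] = col_totals.get(gender, 0) + count

# Expected count = row proportion * column proportion * total participants.
expected = {
    (color, gender): (row_totals[color] / total) * (col_totals[gender] / total) * total
    for color in row_totals
    for gender in col_totals
}

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (observed[cell] - expected[cell]) ** 2 / expected[cell] for cell in observed
)

for cell in sorted(expected):
    print(cell, "observed:", observed[cell], "expected:", round(expected[cell], 1))
print("chi-square:", round(chi_square, 3))
```

With these made-up counts, the expected count for males who prefer blue is 20.8, which is the 20.8% from the table applied to 100 participants.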
To perform a chi-square test of independence in SPSS:
Choose Analyze → Descriptive Statistics → Crosstabs.
Put one of the variables in the Row(s) box.
Put the other variable in the Column(s) box.
Click the Statistics button.
Check the box next to Chi-square.
Click the Continue button.
Click the OK button.
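The steps above ask SPSS to cross-tabulate two categorical variables. For readers who want to see what that cross-tabulation amounts to, here is a rough Python sketch that counts participants in each gender/color combination. The records are invented for illustration, and this is not how SPSS works internally, only the same counting idea.

```python
from collections import Counter

# Hypothetical raw data (an assumption): one (gender, color) pair per participant.
records = [
    ("male", "blue"), ("female", "green"), ("female", "blue"),
    ("male", "green"), ("female", "green"), ("male", "blue"),
]

# Count how many participants fall into each gender/color combination.
observed = Counter(records)

genders = sorted({gender for gender, _ in records})
colors = sorted({color for _, color in records})

# Print a small table with colors as rows and genders as columns.
print("color", *genders)
for color in colors:
    print(color, *(observed[(gender, color)] for gender in genders))
```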
The output of this analysis will contain the following sections.
Case Processing Summary. Provides information about missing values in your two variables.
Crosstabulation. Provides you with the observed counts within each combination of your two variables.
Chi-Square Tests. The first row of this table will give you the chi-square value, its degrees of freedom, and the p-value associated with the test. Note that the p-values produced by a chi-square test are inappropriate if 20% or more of the cells have an expected count of less than 5. If you are in this situation, you should either redefine your coding scheme (combining the categories with low cell counts with other categories) or exclude categories with low cell counts from your analysis.
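The quantities in the Chi-Square Tests table, and the expected-count rule, can be sketched in a few lines of Python. The function names and both tables below are hypothetical, chosen only for illustration. The degrees of freedom use the standard (rows − 1) × (columns − 1) formula, and the p-value shortcut with `math.erfc` is valid only when there is 1 degree of freedom, as in a 2 X 2 table.

```python
import math

def expected_counts(table):
    """Expected counts under independence for a table given as a list of rows."""
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return [[r * c / total for c in col_totals] for r in row_totals]

def share_of_low_cells(table, threshold=5):
    """Fraction of cells whose expected count is below the threshold."""
    cells = [e for row in expected_counts(table) for e in row]
    return sum(e < threshold for e in cells) / len(cells)

def chi_square_2x2(table):
    """Return (statistic, df, p-value); the p-value formula assumes df == 1."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    if df != 1:
        raise ValueError("this sketch only computes p-values for 2 x 2 tables")
    # For 1 degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, df, p_value

# Hypothetical tables: rows are colors (blue, green), columns are genders.
large_sample = [[26, 26], [14, 34]]
small_sample = [[3, 2], [4, 11]]

statistic, df, p_value = chi_square_2x2(large_sample)
print("chi-square:", round(statistic, 3), "df:", df, "p:", round(p_value, 3))

# Flag tables where 20% or more of the cells have an expected count below 5.
for name, table in [("large", large_sample), ("small", small_sample)]:
    share = share_of_low_cells(table)
    print(name, "share of low cells:", share, "rule violated:", share >= 0.20)
```

In the small table, two of the four expected counts (1.75 and 3.25) fall below 5, so the rule is violated and the p-value should not be trusted.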