Survey analysis software comes with a variety of techniques for gathering insight from data, and one of those techniques may be correspondence analysis. Correspondence analysis is a tool for summarizing the data in tables, and it’s perhaps best illustrated using an example.
Starting with Crosstab
Cross-tabulation tables, or crosstabs, are common in data analysis when you have information that can be put into two distinct sets of categories. Let’s say you had data relating to people’s opinions of three different types of pets.
Moving to Correspondence Analysis
While crosstabs are handy for reviewing the numbers, charting them on a graph can make the results even easier to understand while providing additional insights about the data. The technique that produces this graphical representation of the crosstab is known as correspondence analysis.
If you wanted to manually map out the above crosstab data into a correspondence analysis, you could follow a series of steps.
Calculate the averages of the rows and the columns.
Averages were rounded to whole numbers for simplicity.
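As a minimal sketch of this step, the snippet below computes row, column, and overall averages for a small table. The counts are hypothetical placeholders invented for illustration (they are not the article's crosstab), and the variable names are assumptions.

```python
from statistics import mean

# Hypothetical counts, invented for illustration only.
# Each inner list is one row category; each position is one column category.
table = [
    [10, 8, 6],
    [2, 4, 6],
    [3, 3, 12],
]

# Average of each row (across its columns).
row_averages = [mean(row) for row in table]

# Average of each column (across the rows); zip(*table) transposes the table.
column_averages = [mean(column) for column in zip(*table)]

# Average of every cell in the table.
overall_average = mean(value for row in table for value in row)
```

With these made-up counts the averages happen to be whole numbers, so no rounding is needed; with real data you would decide whether to round, as the article does.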
Calculate the expected values for each cell.
The expected value for each cell can be computed by:
- Taking the row average for that cell
- Multiplying that row average by the column average of the chosen variable
- Dividing the result by the overall average
If we took dog and trustworthy, our calculation would involve:
- The dog row average of 8
- Multiplied by the trustworthy column average of 4
- Divided by the overall average of 6
- 8 × 4 / 6 ≈ 5.3, which we’ll round down to 5
The expected value of each cell in the crosstab would be calculated using the same formula.
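The formula above can be sketched as a small function. This is only an illustration using the dog and trustworthy averages quoted in the text; the function and variable names are assumptions.

```python
def expected_value(row_average, column_average, overall_average):
    """Row average times column average, divided by the overall average."""
    return row_average * column_average / overall_average


# Averages quoted in the text for dog (row) and trustworthy (column).
dog_trustworthy_expected = expected_value(
    row_average=8, column_average=4, overall_average=6
)

# The text rounds the result to a whole number for simplicity.
rounded_expected = round(dog_trustworthy_expected)
```

Applying the same function to every row and column pair would fill in the full table of expected values.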
Calculate the residuals.
The residual is calculated by subtracting the expected value from the original (observed) value in each cell. The residuals illustrate the association between the row categories and column categories. Large positive numbers indicate a strong positive relationship, numbers near zero indicate a weak relationship, and large negative numbers indicate that the combination occurs less often than expected.
If we subtracted the expected value of 5 for dog and trustworthy from the original value of 10, our result would be a residual of 5.
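The subtraction can be written out as a short sketch, using the dog and trustworthy values from the text. The function name is an assumption made for illustration.

```python
def residual(observed, expected):
    """Observed (original) cell value minus the expected value."""
    return observed - expected


# Values quoted in the text: original value 10, rounded expected value 5.
dog_trustworthy_residual = residual(observed=10, expected=5)

# A positive residual means the cell is higher than expected;
# a negative one means it is lower than expected.
is_higher_than_expected = dog_trustworthy_residual > 0
```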
Plot the different row and column categories, placing those with similar residuals near each other.
You’ll begin to see associations develop as you place the residuals of different categories on the graph. Those with similar residuals will be most alike, and therefore placed close together. Residuals with the greatest difference between them will be placed the greatest distance apart, putting them on opposite ends of the graph.
Interpret the relationship between the row categories and the column categories.
Here’s where things can get tricky. It’s true you’ll have all the row categories with similar residuals close together. You’ll also have all the column categories with similar residuals close together.
But this does not mean the row categories share similar residuals with the column categories, even if they are placed near each other on the graph. The similarity holds between one row category and another row category, and between one column category and another column category.
But it does not work between row categories and column categories.
If you wanted to compare a row category with a column category, you would need to work through another series of computations that ultimately determine the angle formed at the origin by:
- The line connecting your desired row category to the origin
- The line connecting your desired column category to the origin
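To make the angle idea concrete, here is a minimal sketch that measures the angle at the origin between two plotted points. The coordinates are hypothetical, since the article does not give plotted positions, and the function name is an assumption. It does not cover the computations needed to obtain the coordinates themselves.

```python
import math


def angle_at_origin(point_a, point_b):
    """Angle in degrees between the lines from the origin to two points."""
    ax, ay = point_a
    bx, by = point_b
    dot_product = ax * bx + ay * by
    lengths = math.hypot(ax, ay) * math.hypot(bx, by)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot_product / lengths))
    return math.degrees(math.acos(cosine))


# Hypothetical plotted positions for one row category and one column category.
row_point = (1.0, 0.0)
column_point = (1.0, 1.0)

angle = angle_at_origin(row_point, column_point)  # about 45 degrees
```

A common reading of such plots is that a small angle suggests the row and column categories are positively associated, an angle near 90 degrees suggests little association, and an angle approaching 180 degrees suggests a negative association. How far each point sits from the origin also matters, so treat the angle as one part of the interpretation.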
An Easier Path to Correspondence Analysis
As you may have guessed, there’s an easier way to get from crosstab to correspondence analysis. As you may also have guessed, the solution is survey analysis software.
Even if you never have to think about expected values, residuals, or row and column averages because your survey analysis software does the work, it’s useful to know what work is being done. This not only helps you understand the computations that go into different analysis techniques, but also gives you a hearty appreciation of the full extent of what survey analysis software can do for you.