Friday, July 31, 2009

103 degrees (of freedom)

In honor of the record 103 degree heat in Seattle this week, I write about degrees of freedom, which are reported with the results of every inferential statistic. Degrees of freedom are the number of observed data points in an inferential calculation minus the number of parameters estimated from those data points. Suppose we compared 50 users of original Excel with 50 users of redesigned Excel on the time required to create and save a new spreadsheet. We would have 100 observed time scores and 2 estimated means (1 for the group of original Excel users and 1 for the redesigned users). Thus we have 100 - 2 = 98 degrees of freedom. The results of an independent samples t-test would look like t(98) = 2.53, p < 0.05, where the number in parentheses is the degrees of freedom.
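For readers who like to see the arithmetic, here is a minimal sketch of that calculation in Python, assuming the standard pooled-variance form of the independent samples t-test. The function name independent_t and the sample numbers are my own illustration, not real Excel timing data.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Return (t, df) for a pooled-variance independent samples t-test.

    Hypothetical helper for illustration; assumes each group has at
    least 2 scores and the two groups have similar variances.
    """
    n1, n2 = len(group_a), len(group_b)
    # Observed scores minus the two estimated group means.
    df = n1 + n2 - 2
    # Pool the two sample variances, weighting each by its own df.
    pooled_var = ((n1 - 1) * variance(group_a)
                  + (n2 - 1) * variance(group_b)) / df
    # Standard error of the difference between the two means.
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group_a) - mean(group_b)) / se, df


# Made-up times on task (seconds), 3 users per group.
t, df = independent_t([1, 2, 3], [4, 5, 6])
print(f"t({df}) = {t:.2f}")
```

With 50 scores in each list, the same function would report 98 degrees of freedom, matching the example above.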

Degrees of freedom account for the fact that one statistical estimate (the 5% or smaller probability of a false positive) is calculated from another estimate (the mean score of each group). The more degrees of freedom, the more statistical power you have to find a significant effect. To see this, consider comparing original Excel vs. redesigned Excel time on task with only 2 users in each group. The degrees of freedom would be 4 observed scores minus 2 means = 2. With so few degrees of freedom, the t statistic must exceed about 4.30 to be significant at the 0.05 level (two-tailed), so the redesigned and original Excel times on task would have to be hugely different. With 98 degrees of freedom the cutoff falls to about 1.98, so the difference between means would not have to be nearly so large to reach statistical significance.
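To make the effect of degrees of freedom on the significance cutoff concrete, here is a rough sketch that finds the two-tailed critical t value numerically, using only the Python standard library. The function names (t_pdf, two_tailed_p, critical_t) are my own, and the approach (Simpson's rule plus bisection) is just one simple way to do it; a stats package would normally handle this for you.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * (log_df_pi(df))
    return exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)


def log_df_pi(df):
    """log(df * pi), split out to keep t_pdf readable."""
    from math import log
    return log(df * pi)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability between 0 and |t|
    return max(0.0, 1 - 2 * area)


def critical_t(df, alpha=0.05):
    """Smallest |t| that is significant at alpha (two-tailed), by bisection."""
    lo, hi = 0.0, 50.0  # assumes the cutoff lies below 50
    for _ in range(40):
        mid = (lo + hi) / 2
        if two_tailed_p(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for df in (2, 98):
    print(f"df = {df}: critical t is about {critical_t(df):.2f}")
```

The cutoff for 2 degrees of freedom should come out more than twice as large as the cutoff for 98, which is why the tiny study needs such an exaggerated difference between groups.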
