The Aligned Rank Transform for Nonparametric Factorial Analyses Using Only ANOVA Procedures
Jacob O. Wobbrock, Leah Findlater, Darren Gergle, James J. Higgins
Presented at CHI 2011, May 7-12, 2011, Vancouver, British Columbia, Canada
Author Bios
- Jacob O. Wobbrock is an Associate Professor in the Information School and an Adjunct Associate Professor in the University of Washington's Department of Computer Science and Engineering. He focuses on novel interaction techniques.
- Leah Findlater is currently in the Information School at the University of Washington and will be affiliated with the University of Maryland's College of Information Studies.
- Darren Gergle is an Associate Professor in Northwestern University's Department of Communication Studies and Department of Electrical Engineering. His work is in HCI related to visual information.
- James J. Higgins is a Professor in Kansas State University's Department of Statistics.
Summary
Hypothesis
How well do aligned rank transform (ART) analyses of existing data sets correspond with the results the original authors obtained through other methods?
Methods
The authors ran their software on data sets from published HCI work and compared the results with the original authors' findings. The first case evaluated the use of ART to reveal interaction effects. The second showed that ART is not bound to the distributional assumptions of ANOVA. The third demonstrated nonparametric testing of repeated measures data.
Results
The first case study revealed a possible interaction that the Friedman test used in the original could not have found. It also found interactions that even an inappropriate test revealed.
The second study originally found minimal interactions because the data was lognormal. The ART test revealed that the interactions were far more significant.
The third study found that ART reduced the skew in the data and revealed all of the significant interactions, which could not be found in the original study.
Contents
Nonparametric data appears frequently in multi-factor HCI experiments, but current methods are likely to violate ANOVA assumptions or do not allow for the examination of interaction effects. Methods exist to solve this problem but are not widely available or easy to use. The authors developed a generalizable system that relies on ART to align and rank data before performing F-tests. ARTool is the desktop version and ARTweb runs online.
ART is usable in situations similar to those for parametric ANOVA, except that the response variable may be continuous or ordinal and does not need to be normally distributed. Rank tests replace the data with ranks, and alignment adjusts the data for one effect at a time so that all effects except that one are removed.
The procedure follows five steps. The first is to compute residuals: each response minus the mean of all responses whose factor levels match those of the response in question. Second, the estimated effects are calculated for all main and interaction effects; the authors presented a generalized formula for an n-way interaction. Third, the aligned response is found by adding the results of the previous two steps. Fourth, averaged ranks are assigned, with the smallest aligned response receiving a rank of 1 and so on, and tied values sharing the average of their ranks. Fifth, a full-factorial ANOVA is performed on the ranks.
There are two opportunities to assess correctness. Each column of aligned responses from the third step should sum to 0, and an ANOVA on the aligned responses should show all effects stripped out except the effect for which the data was aligned.
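To make the steps concrete, here is a minimal sketch of the align-and-rank part of the procedure for a two-factor design, written in plain Python. It is an illustration under stated assumptions, not the ARTool implementation: the function names, the `(a, b, y)` row layout, the effect labels "A", "B", and "AxB", and the rounding used to make ties exact are all hypothetical choices made for this example. The final ANOVA on the ranks is left to a statistics package.

```python
from collections import defaultdict
from statistics import mean


def group_means(rows, key):
    """Mean response for each group, where key(a, b) names the group."""
    groups = defaultdict(list)
    for a, b, y in rows:
        groups[key(a, b)].append(float(y))
    return {k: mean(v) for k, v in groups.items()}


def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def align_and_rank(rows, effect):
    """Return (aligned, ranks) for one effect: 'A', 'B', or 'AxB'.

    rows is a list of (level_of_A, level_of_B, response) tuples.
    """
    grand = mean(float(y) for _, _, y in rows)
    cell = group_means(rows, lambda a, b: (a, b))
    mean_a = group_means(rows, lambda a, b: a)
    mean_b = group_means(rows, lambda a, b: b)

    aligned = []
    for a, b, y in rows:
        # Step 1: residual = response minus its cell mean.
        residual = y - cell[(a, b)]
        # Step 2: estimated effect for the effect being aligned.
        if effect == "A":
            estimate = mean_a[a] - grand
        elif effect == "B":
            estimate = mean_b[b] - grand
        elif effect == "AxB":
            estimate = cell[(a, b)] - mean_a[a] - mean_b[b] + grand
        else:
            raise ValueError(f"unknown effect: {effect}")
        # Step 3: aligned response = residual + estimated effect.
        aligned.append(residual + estimate)

    # Step 4: averaged ranks. Rounding (an assumption of this sketch) keeps
    # floating-point noise from breaking values that should be tied.
    ranks = average_ranks([round(v, 9) for v in aligned])
    return aligned, ranks


# Small made-up 2x2 data set with two observations per cell.
rows = [
    ("a1", "b1", 1), ("a1", "b1", 3),
    ("a1", "b2", 5), ("a1", "b2", 7),
    ("a2", "b1", 2), ("a2", "b1", 4),
    ("a2", "b2", 10), ("a2", "b2", 12),
]
aligned_a, ranks_a = align_and_rank(rows, "A")
```

The first correctness check corresponds to `sum(aligned_a)` being zero up to floating-point error. Step 5 would then run a full-factorial ANOVA on `ranks_a` and read off only the effect for which the data was aligned.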
ARTool parses long-format data tables and produces aligned and ranked responses for all main and interaction effects. It produces descriptive error messages in case of a problem, and for N factors its output contains (2+N)+2(2^N-1) columns. The system does not work well when the data is extremely skewed and is best suited to randomized designs.
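The column count is easier to see with small numbers. The sketch below assumes one reading of the formula: 2+N original columns (taken here to be a subject column, the N factors, and the response), plus one aligned and one ranked column for each of the 2^N-1 main and interaction effects. The function names and the "A*B" effect labels are hypothetical and are not ARTool's own.

```python
from itertools import combinations


def effect_names(factors):
    """Every main effect and interaction for the given factor names."""
    names = []
    for size in range(1, len(factors) + 1):
        for combo in combinations(factors, size):
            names.append("*".join(combo))
    return names


def output_column_count(n_factors):
    """Evaluate (2 + N) + 2 * (2^N - 1) for N factors."""
    original = 2 + n_factors        # assumed: subject, N factors, response
    effects = 2 ** n_factors - 1    # main effects plus all interactions
    return original + 2 * effects   # one aligned and one ranked column each


effects = effect_names(["A", "B"])      # ['A', 'B', 'A*B']
columns = output_column_count(2)        # 4 + 2 * 3 = 10
```

With three factors the same formula gives 5 + 2 * 7 = 19 columns, so the table grows quickly as factors are added.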
Discussion
The authors developed a system that applies a vetted statistical method and then validated it against previously published papers. Their method confirmed the original results but also surfaced interesting avenues for future work that the original analyses had missed. Because of this, I found their results believable.
I had some concerns about ART, because such a useful method ought to be standard in the average statistical package. On the other hand, it may just be that the method has gone unnoticed in the CHI community.
Possible future work could include comparisons of the various statistical methods the authors discussed with ART on the same data set. This could help to validate it as a means of evaluating CHI results.