2008-05-24

Obtaining the same ANOVA results in R as in SPSS - the difficulties with Type II and Type III sums of squares

I calculated the ANOVA results for my recent experiment with R. In brief, I assumed that women perform more poorly than men in a simulation game (a so-called microworld) when under stereotype threat. My students, who assisted in the experiments, used SPSS for their calculations. I realized that they obtained different results than I did, with the same model on the same data set. As I was new to R, my initial calculation, an analysis of covariance (ANCOVA) with the dependent variable microworld performance (MWP), the treatment factors gender and stereotype threat, and the covariate reasoning ability, looked like this:
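A minimal sketch of that call (the variable names mwp, gender, stthreat and reasonz are assumed from the SPSS syntax below; mydata is a hypothetical data frame holding them):

```r
# ANCOVA: covariate first, then the two treatment factors and their
# interaction. summary() on an aov fit reports Type I (sequential) SSs,
# so the order of the terms in the formula matters.
model <- aov(mwp ~ reasonz + gender * stthreat, data = mydata)
summary(model)
```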

I see two significant main effects of the treatment factors, a significant effect of the covariate, and a significant interaction effect. However, Quick-R tells me this:
WARNING: R provides Type I sequential SS, not the default Type III marginal SS reported by SAS and SPSS. In a nonorthogonal design with more than one term on the right hand side of the equation order will matter (i.e., A+B and B+A will produce different results)! We will need use the drop1( ) function to produce the familiar Type III results.
I do not want order to matter and adjust my calculation accordingly:
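A sketch of the order-independent version (model call and variable names assumed as above):

```r
# drop1() refits the model with each term removed in turn, so the
# resulting F-tests do not depend on the order of terms in the formula
model <- aov(mwp ~ reasonz + gender * stthreat, data = mydata)
drop1(model, ~., test = "F")
```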
What a difference: the main effect of the participants' gender on their microworld performance no longer reaches statistical significance. However, that is still not what SPSS produces:

UNIANOVA MWP BY GENDER STTHREAT WITH reasonz
/METHOD=SSTYPE(3)
/INTERCEPT=INCLUDE
/CRITERIA=ALPHA(0.05)
/DESIGN=reasonz GENDER STTHREAT GENDER*STTHREAT.

In SPSS, the main effect of gender is still significant. I dug a little deeper and found another line I needed to add to the R command in order to get exactly the same result:
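A sketch of the full recipe (the contrasts option must be set before the model is fitted; names assumed as above):

```r
# sum-to-zero contrasts for unordered factors, polynomial contrasts
# for ordered ones - required for meaningful Type III tests
options(contrasts = c("contr.sum", "contr.poly"))
model <- aov(mwp ~ reasonz + gender * stthreat, data = mydata)
drop1(model, ~., test = "F")
```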

As you can see, these results are identical. But why all these differences? What does options(contrasts=c("contr.sum", "contr.poly")) actually do and what the heck are Type-III sums of squares? I surely did not learn about these things at my university. I thus did a little reading.

It turns out that the decision about which type of sums of squares to use is based on the question whether it is reasonable to report main effects in the presence of an interaction. Let's review the hypothesis of the experiment: It assumes that women exhibit a decrease in microworld performance under stereotype threat. This is an interaction hypothesis. An error bar plot (lines representing 1 SE) reveals that this is the case:
The plot indicates a significant interaction between gender and stereotype threat. The main effect of stereotype threat is obtained by averaging the performance scores of all participants (both male and female) over the two stereotype threat conditions. Because of the interaction, this yields a low average score under the stereotype threat condition: the female participants score extremely low under stereotype threat and pull the average down. Thus, it makes no sense to look at the main effect of stereotype threat if an interaction of stereotype threat * gender is present.

Looking for a main effect of stereotype threat in the presence of a significant interaction violates the marginality principle, which requires that all terms to which a particular term is marginal be zero. Lower-order terms are marginal to higher-order terms, i.e. the main effects of two factors A and B are marginal to the interaction effect A*B. Thus, in this case, the marginality principle would require that if we inspect and report main effects of gender and stereotype threat, the interaction of stereotype threat and gender be zero. That is not the case, and the above example illustrates that - under the given hypothesis - it is useless to report the main effect of stereotype threat.

Now, the problem with Type-III sums of squares (also referred to as marginal sums of squares) is that they are "obtained by fitting each effect after all the other terms in the model, i.e. the Sums of Squares for each effect corrected for the other terms in the model. The marginal (Type III) Sums of Squares do not depend upon the order in which effects are specified in the model" (source). In the stereotype threat case, that clearly does not make any sense: reporting the Type III sum of squares (as SPSS does by default) for the main effect of stereotype threat means doing so while correcting for the interaction. But it is precisely this interaction that caused the main effect in the first place! Thus, Type-III sums of squares violate the principle of marginality and make no sense in the stereotype threat case. What is more, Type-III sums of squares do "... NOT sum to the Sums of Squares for the model corrected for the mean". I wonder whether this also renders useless the usual way of calculating a factor's effect size eta-squared, i.e. dividing the SS of the factor by the total SS?

Anyway, coming back to the ominous contrasts=c("contr.sum", "contr.poly"): in order to obtain the correction for the other terms in the model that Type-III SSs deliver, R needs to know how to balance the factors in the calculation of the SSs. It therefore requires a contrast matrix with zero-sum columns (see here). The R help for the options() command (?options) tells us:
contrasts:
the default contrasts used in model fitting such as with aov or lm. A character vector of length two, the first giving the function to be used with unordered factors and the second the function to be used with ordered factors. By default the elements are named c("unordered", "ordered"), but the names are unused.
As the treatment factors gender and stereotype threat are unordered factors, R will use contr.sum to construct a contrast matrix of the appropriate order (i.e., 2), because contrasts=c("contr.sum", "contr.poly") was specified. contr.sum(2) produces

  [,1]
1    1
2   -1
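For comparison, a factor with three levels gets a matrix with two zero-sum columns; contr.sum(3) produces:

```r
contr.sum(3)
#   [,1] [,2]
# 1    1    0
# 2    0    1
# 3   -1   -1
```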


My first attempt at Type-III SSs in R above produced nonsense and differed from SPSS because this contrast setting was not specified. Without going into too much detail here (basically because I have not yet understood everything myself): there is an alternative to the sequence-dependent Type-I SSs and the marginality-violating Type-III SSs. Type II sums of squares preserve the marginality principle. This is how to get them, and this example illustrates that they are different from Type-III SSs and that they are - at least in this case - order independent:
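One way to obtain them is via the car package's Anova() function, which supports type = "II" (a sketch; model call and names assumed as above):

```r
library(car)  # provides Anova() with Type II and Type III tests
model <- aov(mwp ~ reasonz + gender * stthreat, data = mydata)
Anova(model, type = "II")  # order-independent, marginality-preserving
```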
SPSS can do the same by specifying /METHOD=SSTYPE(2) in the UNIANOVA syntax.

The remaining problem in the present case is the main effect of gender. It does make sense to investigate the effect of gender in the presence of the interaction with stereotype threat, because it could be that women are generally poorer complex problem solvers than men and perform especially poorly under stereotype threat on top of that general difference. In fact, the error bar plot above indicates that this is the case. This leaves me with one main effect that cannot be interpreted (stereotype threat) and another one that can. Which SSs should I use? I am a bit lost.



14 comments:

  1. As someone who has struggled with this issue, I feel your pain. I don't fully understand all the issues myself, but I've learned a couple of things by trial and error that may be useful to you. First, forget about "Types" of sums of squares. It's more trouble than it's worth. Just specify the model based on your hypothesis. I've written a paper about this that I'd be happy to send you if you are interested. Second, I've found it easier to keep track of things if I forget about using pre-defined contrasts like "contr.sum" and "contr.poly" and instead specify my contrasts by hand. For example, to set contrasts for a two-level factor X1 in a data frame "Data" you can use "contrasts(Data$X1) <- matrix(c(1,0), 2, 1)". If X1 has three levels, you can use "contrasts(Data$X1) <- matrix(c(0,1,0, 0,0,1), 3, 2)". The advantage of doing it this way is that you know exactly what the contrast codes are.

    With respect to the specific analysis you describe here: as I understand it, it really doesn't make sense to talk about main effects for either gender or stereotype threat. The gender "main effect" here refers to the effect of gender averaged across levels of stereotype threat. In other words, it is telling you the effect of gender at a non-observed value of stereotype threat that is halfway between Yes and No. Likewise, the stereotype threat main effect is giving you the effect of stereotype threat at a value halfway between male and female, which makes even less sense. What I think you want to do is test the effect of stereotype threat separately at male and female, and to test the effect of male vs. female at high and low stereotype threat.

    You can test the stereotype threat effect for males as follows:

    contrasts(gender) <- matrix(c(1,0), 2, 1)

    summary(lm(mwp ~ gender * stthreat + reasonings))

    And you can test the stereotype threat effect for females:

    contrasts(gender) <- matrix(c(0,1), 2, 1)

    summary(lm(mwp ~ gender * stthreat + reasonings))

    Note that if the gender variable is in a data frame you will need to detach and re-attach it after changing the contrasts.

    You can use exactly the same procedure to test the effect of gender at high and low stereotype threat.

    Hope this helps,
    Ista

  2. I'm currently trying to bridge the gap between SPSS and R (I've never used SPSS).

    It doesn't appear that aov in the version of R that I'm using reports results based on Type I SS for a 3-factor ANOVA I'm running. At least, changing the order of the terms in the model doesn't change the output, which seems to contradict what you're saying. I'm using version 2.4.0 on a Mac, so maybe they've changed it in newer versions?

  3. Anonymous, 3:58 AM

    Thanks myowelt, that was helpful. I seem to be getting the Type III SS decomposition without changing the contrasts and without using drop1, but instead by breaking down the factorial anova into its constituent contrasts, and using those on the interaction.

    e.g.

    in a 2(A) by 2(B), I create a contrast matrix (call it "cont"):

    -1 -1 -1
    -1 +1 +1
    +1 -1 +1
    +1 +1 -1

    Then if I use lm(dv~C(A:B,cont)), I get the proper p's. Of course it's a little more complicated when you go beyond a 2 by 2, but those contrasts are informative anyway.

    in a 3 by 2:

    -2 0 -1 +2 0
    +1 -1 -1 -1 +1
    +1 +1 -1 -1 -1
    -2 0 +1 -2 0
    +1 -1 +1 +1 -1
    +1 +1 +1 +1 +1

    Again I would just call this "cont" and use it alone in the model, dv~C(A:B,cont). Now the F's (which were basically omnibuses in the ANOVA) are broken down into contrasts when they have more than 1 df, but I think the SS gets sliced the same way as the Type III approach.

    Tuberite

  4. Anonymous, 11:53 PM

    If you have a significant interaction you cannot talk about "main effects". The significant interaction tells you that the effect of factor A DEPENDS on which level of factor B you are talking about. You need to split your data and run separate ANOVAs for each factor level of interest.

    Engqvist, L. 2005. The mistreatment of covariate interaction terms in linear model analyses of behavioural and evolutionary ecology studies. Anim. Behav. 70:967-971.

    Also, R doesn't make a "Type III" option easily available because it will produce nonsense output unless you really know when to apply it.
    Type III SS is a problem of SAS and SPSS, not of R.

    Venables also says on a mailing list:
    "The objection to Type III sums of squares is that they encourage naive users to do silly things such as test main effects in the presence of interactions, without really asking whether the test makes sense or not, that is, whether it really addresses a question of any interest."

    http://tolstoy.newcastle.edu.au/R/help/03a/2434.html

    Read more about it here (page 12):
    www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf

    P. Andreas Svensson, Melbourne

  5. Nice post. I recently had a lengthy and nasty argument with a professor who denied that these different tests existed and always made reference to a "standard ANOVA".

    I'm a believer in explicit model comparison. Though I understand sometimes people want a massive table with lots of p-values.

  7. Anonymous, 10:43 PM

    Thank you, so very much, for this post. I have a question related to this topic, and I'm hoping you can help. Is there a function to obtain the estimated marginal means or least squares means that are associated with the type III sum of squares method? SPSS will display that information with the output, but I'm not sure how to do it in R. Just using the by(), aggregate(), or with() commands will not produce the group means associated with the grand mean and F-statistic associated with the type III SS in R. Any advice?
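    (A hedged base-R sketch of one way to get covariate-adjusted cell means, assuming the variable names from the post - mydata, mwp, gender, stthreat, reasonz: predict from the fitted model on a grid of factor levels with the covariate held at its sample mean.)

```r
fit <- lm(mwp ~ reasonz + gender * stthreat, data = mydata)
# all combinations of the two factors, covariate fixed at its mean
grid <- expand.grid(gender = levels(mydata$gender),
                    stthreat = levels(mydata$stthreat))
grid$reasonz <- mean(mydata$reasonz)
grid$adj.mean <- predict(fit, newdata = grid)  # adjusted cell means
grid
```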

