# Load packages
library(pwr)
library(effectsize)
library(broom)
library(afex)
library(emmeans)
library(performance)
library(tidyverse)
# Read data and add new column
james_data <- read_csv("data/James_2015.csv") %>%
  mutate(PID = row_number(),
         Condition = as.factor(Condition)) %>%
  select(PID,
         Condition,
         intrusions = Days_One_to_Seven_Image_Based_Intrusions_in_Intrusion_Diary)
12 One-way ANOVA
This chapter marks the beginning of the content you cover in Research Methods 2. You should be familiar with all the content from Research Methods 1 we covered in Chapters 1 to 11, so please revisit previous chapters if you need a refresher.
In the course materials, you have worked through manually calculating an ANOVA to gain a conceptual understanding. However, when you run an ANOVA, the computer typically does all of these calculations for you. In this chapter, we will show you how to run a one-way ANOVA using the afex package.
Chapter Intended Learning Outcomes (ILOs)
By the end of this chapter, you will be able to:
Apply and interpret a one-way ANOVA.
Break down the results of a one-way ANOVA using post-hoc tests and apply a correction for multiple comparisons.
Conduct a power analysis for a one-way ANOVA.
12.1 Chapter preparation
12.1.1 Introduction to the data set
For this chapter, we are using open data from experiment 2 in James et al. (2015). The abstract of their article is:
Memory of a traumatic event becomes consolidated within hours. Intrusive memories can then flash back repeatedly into the mind’s eye and cause distress. We investigated whether reconsolidation - the process during which memories become malleable when recalled - can be blocked using a cognitive task and whether such an approach can reduce these unbidden intrusions. We predicted that reconsolidation of a reactivated visual memory of experimental trauma could be disrupted by engaging in a visuospatial task that would compete for visual working memory resources. We showed that intrusive memories were virtually abolished by playing the computer game Tetris following a memory-reactivation task 24 hr after initial exposure to experimental trauma. Furthermore, both memory reactivation and playing Tetris were required to reduce subsequent intrusions (Experiment 2), consistent with reconsolidation-update mechanisms. A simple, non-invasive cognitive-task procedure administered after emotional memory has already consolidated (i.e., > 24 hours after exposure to experimental trauma) may prevent the recurrence of intrusive memories of those emotional events.
In summary, they were interested in whether you can reduce intrusive memories associated with a traumatic event. Participants were randomly allocated to one of four groups and watched a video designed to be traumatic:
Control
Reactivation + Tetris
Tetris
Reactivation
They measured the number of intrusive memories prior to the start of the study, then participants kept a diary to record intrusive memories about the film in the 7 days after watching it. The authors were interested in whether the combination of reactivation and playing Tetris would lead to the largest reduction in intrusive memories. You will recreate their analyses using a one-way ANOVA.
12.1.2 Organising your files and project for the chapter
Before we can get started, you need to organise your files and project for the chapter, so your working directory is in order.
In your folder for research methods and the book ResearchMethods1_2/Quant_Fundamentals, create a new folder called Chapter_12_ANOVA. Within Chapter_12_ANOVA, create two new folders called data and figures.
Create an R Project for Chapter_12_ANOVA as an existing directory for your chapter folder. This should now be your working directory.
Create a new R Markdown document and give it a sensible title describing the chapter, such as 12 ANOVA. Delete everything below line 10 so you have a blank file to work with and save the file in your Chapter_12_ANOVA folder.
We are working with a new data set, so please save the following data file: James_2015.csv. Right click the link and select “save link as”, or click the link to save the file to your Downloads folder. Make sure that you save the file as “.csv”. Save or copy the file to your data/ folder within Chapter_12_ANOVA.
You are now ready to start working on the chapter!
12.1.3 Activity 1 - Read and wrangle the data
As the first activity, test yourself by completing the following task list to practise your data wrangling skills. Create a final object called james_data to be consistent with the tasks below.
To wrangle the data, complete the following tasks:
Load the following packages (several of these are new, so revisit Chapter 1 if you need a refresher on installing R packages, but remember not to install packages on the university computers / online server): pwr, effectsize, broom, afex, emmeans, performance, tidyverse.
Read the data file data/James_2015.csv to the object name james_data.
Create a new variable called PID that equals row_number() to act as a participant ID, which is currently missing from the data set.
Convert Condition to a factor.
Select and rename the following three variables as we do not need the others:
PID
Condition
Rename Days_One_to_Seven_Image_Based_Intrusions_in_Intrusion_Diary to intrusions.
You should have the following in a code chunk:
12.1.4 Activity 2 - Create summary statistics
Next, we want to calculate some descriptive statistics to see some overall trends in the data. We are really interested in the scores from each experimental group rather than overall.
Summarise the data to show the mean, standard deviation, and standard error for the number of intrusive memories (intrusions) grouped by Condition.
Your table should have four columns: Condition, mean, sd, and se.
Hint: You can calculate the standard error through sd/sqrt(n) or sd/sqrt(length(some_variable_name)).
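If you want to check your approach before applying it to james_data, the sketch below shows the group_by() and summarise() pattern on a small made-up data set. The object names toy_data and toy_summary and the scores themselves are hypothetical and are not the James et al. (2015) data; only the column names follow the task above.

```r
library(dplyr)

# Hypothetical toy data: two groups of four scores (NOT the James et al. data)
toy_data <- data.frame(
  Condition = factor(rep(c("1", "2"), each = 4)),
  intrusions = c(2, 4, 6, 8, 1, 1, 3, 3)
)

# One row per condition with the mean, standard deviation, and standard error
toy_summary <- toy_data %>%
  group_by(Condition) %>%
  summarise(mean = mean(intrusions),
            sd = sd(intrusions),
            se = sd / sqrt(n())) # n() is the number of rows in the current group

toy_summary
```

Because se is defined after sd inside the same summarise() call, it can reuse the sd column that was just created.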
12.1.5 Activity 3 - Visualisation
Now we can visualise the data. In the original paper they use a bar plot, but let’s use a better plot that gives us more information about the data.
Create a violin-boxplot with the number of intrusive memories on the y-axis and condition on the x-axis (See Chapter 7 if you need a reminder).
Change the labels on the x-axis to something more informative for the condition names.
Your plot should look like this:
You should have the following in a code chunk:
We can see from this plot that there are a few potential outliers in each of the groups. This information is not present in the bar plot, which is why it’s not a good idea to use them for this kind of data.
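As a reminder of the general pattern, here is a minimal sketch of a violin-boxplot with relabelled conditions. It uses simulated counts in a hypothetical object called toy_data, so it will not reproduce the plot for the real data, and the label order assumes the conditions are numbered in the order listed in the introduction to the data set.

```r
library(ggplot2)

# Simulated data for illustration only (NOT the James et al. data)
set.seed(12)
toy_data <- data.frame(
  Condition = factor(rep(c("1", "2", "3", "4"), each = 18)),
  intrusions = rpois(72, lambda = 4)
)

toy_plot <- ggplot(toy_data, aes(x = Condition, y = intrusions)) +
  geom_violin() +
  geom_boxplot(width = 0.2) + # narrow boxplot drawn inside each violin
  scale_x_discrete(labels = c("1" = "Control",
                              "2" = "Reactivation + Tetris",
                              "3" = "Tetris",
                              "4" = "Reactivation")) +
  labs(x = "Condition", y = "Number of intrusive memories")

toy_plot
```

Naming each label by its factor level, rather than relying on position, guards against labels being attached to the wrong group.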
12.2 One-way ANOVA
Now we can run the one-way ANOVA using aov_ez() from the afex package and save the result to the object mod. The model object is also what we use to check the remaining assumptions, including homogeneity of variance.
12.2.1 Activity 4 - Running a one-way ANOVA using afex
aov_ez() will likely produce some messages that look like errors. Do not worry about these; they are just letting you know what it has done. Run the code below to view the results of the ANOVA.
mod <- aov_ez(id = "PID", # the column containing the participant IDs
              dv = "intrusions", # the DV
              between = "Condition", # the between-subject variable
              es = "pes", # sets effect size to partial eta-squared
              type = 3, # this affects how the sum of squares is calculated, set this to 3
              include_aov = TRUE, # keep the aov component for assumption checks and follow-up tests
              data = james_data)
mod
Anova Table (Type 3 tests)
Response: intrusions
Effect df MSE F ges p.value
1 Condition 3, 68 10.09 3.79 * .143 .014
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '+' 0.1 ' ' 1
Just like with the t-tests and correlations, we can use tidy() to make the output easier to work with. Pipe the ANOVA table (mod$anova_table) into tidy() and save the result as mod_output, the object name used in the write-up later in this chapter. Do not worry about the warning message; it is just telling you that tidy() does not know how to automatically rename some of the columns, so it keeps the original names.
Warning in tidy.anova(.): The following column names in ANOVA output were not
recognized or transformed: num.Df, den.Df, MSE, ges
term | num.Df | den.Df | MSE | statistic | ges | p.value |
---|---|---|---|---|---|---|
Condition | 3 | 68 | 10.08578 | 3.794762 | 0.1434073 | 0.0140858 |
term = the IV
num.Df = degrees of freedom for the effect
den.Df = degrees of freedom for the residuals
MSE = mean squared error
statistic = F-statistic
ges = effect size (generalised eta-squared, which equals partial eta-squared in a one-way between-subjects design)
p.value = p-value
You should refer to the lecture for more information on what each variable means and how it is calculated.
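To connect the table back to the manual calculations, the sketch below rebuilds the sums of squares, the effect size, and the p-value from the F-statistic, degrees of freedom, and MSE reported in the tidied output above. The variable names are our own; the only assumption is the standard one-way ANOVA relationships (F is the effect mean square divided by the MSE, and the effect size is the effect sum of squares divided by the total).

```r
# Values copied from the tidied ANOVA table above
f_stat <- 3.794762
df_effect <- 3
df_resid <- 68
mse <- 10.08578

# Work backwards from F and MSE to the sums of squares
ms_effect <- f_stat * mse           # F = MS_effect / MSE
ss_effect <- ms_effect * df_effect  # SS = MS * df
ss_resid <- mse * df_resid

# Proportion of variance explained by Condition
eta_sq <- ss_effect / (ss_effect + ss_resid)

# p-value: area of the F distribution above the observed F
p_value <- pf(f_stat, df_effect, df_resid, lower.tail = FALSE)

round(c(eta_sq = eta_sq, p_value = p_value), 4)
```

Both values should agree with the ges and p.value columns of the table to rounding error.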
Is the overall effect of Condition significant?
What is the F-statistic to 2 decimal places?
According to the rules of thumb, the effect size is
12.2.2 Activity 5 - Checking assumptions for ANOVA
To test the assumptions, we must use the model we created with aov_ez(). For a one-way independent ANOVA, the assumptions are the same as for a Student t-test / regression model with a categorical predictor:
1. The DV is interval or ratio data.
2. The observations should be independent.
3. The residuals should be normally distributed.
4. There should be homogeneity of variance between the groups.
We know that 1 and 2 are met because of the design of our study. To test 3, we can look at the qq-plot of the residuals. Instead of passing the whole model object, we must specifically select the aov component and run our diagnostic plots on that.
The qq-plot shows the assumption of normality might not be ideal. Is this a problem? If the sample sizes for each group are equal, then ANOVA is robust to violations of both normality and of homogeneity of variance. If you are interested, there is a good discussion of these issues in Blanca et al. (2018) and Knief & Forstmeier (2021). We can check how many participants are in each condition using count():
Thankfully, the sample sizes are equal, so we should be OK to proceed with the ANOVA. It is not clear whether normality was checked in the original paper.
For the last assumption, we can test homogeneity of variance by checking the third diagnostic plot, the scale-location plot.
Compared to normality, this assumption looks closer to being supported as the variance of each group is approximately equal. James et al. (2015) seem to have suspected issues with this assumption, as they mention that their ANOVAs do not assume equal variance. However, the ANOVA results they report are identical to our results above, where no correction has been made. Their post-hoc tests, on the other hand, are Welch t-tests (you can tell this because the degrees of freedom have been adjusted and are not whole numbers).
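If you want to see how the two diagnostic plots are requested, here is a minimal sketch using base R's aov() on simulated data. The objects toy_data and toy_aov are hypothetical; with the chapter's model you would presumably pass the aov component (mod$aov) to plot() in the same way.

```r
# Simulated data for illustration only (NOT the James et al. data)
set.seed(2015)
toy_data <- data.frame(
  Condition = factor(rep(1:4, each = 18)),
  intrusions = rpois(72, lambda = rep(c(5, 2, 4, 5), each = 18))
)

# Fit a one-way ANOVA with base R so there is an aov object to plot
toy_aov <- aov(intrusions ~ Condition, data = toy_data)

plot(toy_aov, which = 2) # qq-plot of the residuals (normality)
plot(toy_aov, which = 3) # scale-location plot (homogeneity of variance)
```

The which argument selects individual plots from the standard set of regression diagnostics, so you do not have to page through all of them.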
While all of this might seem very confusing - we imagine you might be wondering what the point of assumption testing is given that it seems to be ignored - we are showing you this for three reasons:
To reassure you that sometimes the data can fail to meet the assumptions and it is still ok to use the test. To put this in statistical terms, many tests are robust to mild deviations of normality and unequal variance, particularly with equal sample sizes.
As a critical thinking point, to remind you that just because a piece of research has been published does not mean it is perfect and you should always evaluate whether the methods used are appropriate.
To reinforce the importance of pre-registration where these decisions could be made in advance, and/or open data and code so that analyses can be reproduced exactly to avoid any ambiguity about exactly what was done. In this example, given the equal sample sizes and the difference in variance between the groups isn’t too extreme, it looks like it is still appropriate to use an ANOVA but the decisions and justification for those decisions could have been more transparent.
12.2.3 Activity 6 - Post-hoc tests
For post-hoc comparisons, the paper appears to have computed Welch t-tests but there is no mention of any multiple comparison correction. We could reproduce these results by using t.test() for each of the contrasts.
For example, to compare condition 1 (the control group) with condition 2 (the reactivation plus Tetris group) we could run:
james_data %>%
  filter(Condition %in% c("1", "2")) %>%
  droplevels() %>% # ignore unused factor levels
  t.test(intrusions ~ Condition,
         data = .)
Welch Two Sample t-test
data: intrusions by Condition
t = 2.9893, df = 22.632, p-value = 0.006627
alternative hypothesis: true difference in means between group 1 and group 2 is not equal to 0
95 percent confidence interval:
0.990331 5.454113
sample estimates:
mean in group 1 mean in group 2
5.111111 1.888889
Because Condition has four levels, we cannot just specify intrusions ~ Condition: a t-test compares two groups and it would not know which of the four to compare. So first we have to filter the data and use a new function, droplevels(). It’s important to remember that when it comes to R there are two things to consider: the data you can see and the underlying structure of that data. In the above code we use filter() to select only conditions 1 and 2 so that we can compare them. However, that does not change the fact that R “knows” that Condition has four levels. It does not matter if two of those levels do not have any observations any more; the underlying structure still says there are four groups. droplevels() tells R to remove any unused levels from a factor. Try running the above code but without droplevels() and see what happens.
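To see the difference between the visible data and the underlying structure, here is a small sketch on made-up data. The object names are hypothetical, and it uses base R subsetting rather than filter() so it runs without any packages.

```r
# Hypothetical toy data with a four-level factor (NOT the James et al. data)
toy <- data.frame(
  Condition = factor(c("1", "1", "2", "2", "3", "4")),
  intrusions = c(5, 6, 1, 2, 4, 5)
)

# Keep only conditions 1 and 2
toy_subset <- toy[toy$Condition %in% c("1", "2"), ]

# The rows for conditions 3 and 4 are gone, but the factor still records them
levels_before <- nlevels(toy_subset$Condition)

# droplevels() removes the levels that no longer have any observations
levels_after <- nlevels(droplevels(toy_subset)$Condition)

c(before = levels_before, after = levels_after)
table(toy_subset$Condition) # the empty groups still appear with a count of 0
```

The table() output is a quick way to spot unused levels, because they show up as groups with zero observations.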
However, a quicker and better way of doing this, which makes it easy to apply a correction for multiple comparisons, is to use emmeans(), which computes all possible pairwise comparison t-tests and applies a correction to the p-values.
First, we use emmeans() to run the comparisons and then we can pull out the contrasts and use tidy() to make them easier to work with.
Run the code below. Which conditions are significantly different from each other? Are any of the comparisons different from the ones reported in the paper now that a correction for multiple comparisons has been applied?
mod_pairwise <- emmeans(mod,
                        pairwise ~ Condition,
                        adjust = "bonferroni")
mod_contrasts <- mod_pairwise$contrasts %>%
  tidy()
mod_contrasts
term | contrast | null.value | estimate | std.error | df | statistic | adj.p.value |
---|---|---|---|---|---|---|---|
Condition | Condition1 - Condition2 | 0 | 3.2222222 | 1.058604 | 68 | 3.0438406 | 0.0199179 |
Condition | Condition1 - Condition3 | 0 | 1.2222222 | 1.058604 | 68 | 1.1545602 | 1.0000000 |
Condition | Condition1 - Condition4 | 0 | 0.2777778 | 1.058604 | 68 | 0.2624001 | 1.0000000 |
Condition | Condition2 - Condition3 | 0 | -2.0000000 | 1.058604 | 68 | -1.8892804 | 0.3787128 |
Condition | Condition2 - Condition4 | 0 | -2.9444444 | 1.058604 | 68 | -2.7814405 | 0.0419783 |
Condition | Condition3 - Condition4 | 0 | -0.9444444 | 1.058604 | 68 | -0.8921602 | 1.0000000 |
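To make the table less of a black box, the sketch below rebuilds two of its columns in base R from numbers reported in this chapter. It assumes 18 participants per group (the sample size quoted from the paper, consistent with 68 residual degrees of freedom across four groups) and that the Bonferroni correction multiplies each p-value by the number of comparisons, capped at 1. The variable names are our own.

```r
# Values copied from the ANOVA output and the contrasts table
mse <- 10.08578
n_per_group <- 18 # assumption: equal groups of 18
df_resid <- 68
t_stats <- c(3.0438406, 1.1545602, 0.2624001,
             -1.8892804, -2.7814405, -0.8921602)

# Every contrast shares one standard error because the MSE pools all four groups
std_error <- sqrt(mse * (1 / n_per_group + 1 / n_per_group))

# Unadjusted two-tailed p-values from the t distribution
p_raw <- 2 * pt(abs(t_stats), df = df_resid, lower.tail = FALSE)

# Bonferroni: multiply by the number of comparisons (6), capped at 1
p_bonf <- p.adjust(p_raw, method = "bonferroni")

round(std_error, 4)
round(p_bonf, 4)
```

The pooled standard error is also why the first comparison here has 68 degrees of freedom and a slightly different t-value from the Welch t-test earlier, which used only the two groups being compared.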
The inquisitive among you may have noticed that mod is a list of 5 and seemingly contains the same thing three times: anova_table, aov and Anova. The reasons behind the differences are too complex to go into detail on this course (see The R Companion website for more information) but the simple version is that anova_table and Anova use one method of calculating the results (type 3 sum of squares) and aov uses a different method (type 1 sum of squares). What’s important for your purposes is that you need to use anova_table to view the overall results (and replicate the results from papers) and aov to run the follow-up tests and to get access to the residuals (or lm() for factorial ANOVA). As always, precision and attention to detail are key.
12.2.4 Activity 7 - Power and effect sizes
Finally, we can replicate their power analysis using pwr.anova.test() from the pwr package. The paper describes it as follows:
On the basis of the effect size of d = 1.14 from Experiment 1, we assumed a large effect size of f = 0.4. A sample size of 18 per condition was required in order to ensure an 80% power to detect this difference at the 5% significance level.
Balanced one-way analysis of variance power calculation
k = 4
n = 18.04262
f = 0.4
sig.level = 0.05
power = 0.8
NOTE: n is number in each group
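Output like this would come from a call along the lines of pwr.anova.test(k = 4, f = 0.4, sig.level = 0.05, power = 0.8). If you are curious what happens underneath, the sketch below is our own base R approximation rather than the pwr source code: it assumes power is the area of a noncentral F distribution beyond the critical F, with noncentrality k * n * f^2, and searches for the n that gives 80% power. The names anova_power and n_required are hypothetical.

```r
k <- 4         # number of groups
f <- 0.4       # Cohen's f
alpha <- 0.05  # significance level
target <- 0.80 # desired power

# Power of a balanced one-way ANOVA with n participants per group
anova_power <- function(n) {
  df1 <- k - 1
  df2 <- k * (n - 1)
  ncp <- k * n * f^2                              # noncentrality parameter
  crit <- qf(alpha, df1, df2, lower.tail = FALSE) # critical F under the null
  pf(crit, df1, df2, ncp = ncp, lower.tail = FALSE)
}

# Find the (fractional) n per group at which power equals the target
n_required <- uniroot(function(n) anova_power(n) - target,
                      interval = c(2, 100), tol = 1e-8)$root

n_required
ceiling(n_required) # whole participants needed per group
```

Under these assumptions the result should land close to the n shown in the output above. Because you cannot recruit a fraction of a participant, rounding up is the cautious choice if you want power to be at least 80%.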
We have already got the effect size for the overall ANOVA. However, we should also really calculate Cohen’s d for each pairwise comparison using cohens_d() from the effectsize package and add the values to mod_contrasts. The code below repeats the same steps for each comparison; just make sure you understand which bits of the code you would need to change to run this on different data. As we are binding rows and columns rather than joining, it is critical the comparisons are already in the correct order.
# Calculate Cohen's d for all comparisons
d_1_2 <- cohens_d(intrusions ~ Condition,
                  data = filter(james_data,
                                Condition %in% c(1, 2)) %>%
                    droplevels())
d_1_3 <- cohens_d(intrusions ~ Condition,
                  data = filter(james_data,
                                Condition %in% c(1, 3)) %>%
                    droplevels())
d_1_4 <- cohens_d(intrusions ~ Condition,
                  data = filter(james_data,
                                Condition %in% c(1, 4)) %>%
                    droplevels())
d_2_3 <- cohens_d(intrusions ~ Condition,
                  data = filter(james_data,
                                Condition %in% c(2, 3)) %>%
                    droplevels())
d_2_4 <- cohens_d(intrusions ~ Condition,
                  data = filter(james_data,
                                Condition %in% c(2, 4)) %>%
                    droplevels())
d_3_4 <- cohens_d(intrusions ~ Condition,
                  data = filter(james_data,
                                Condition %in% c(3, 4)) %>%
                    droplevels())
# Bind all the comparisons in the order of mod_contrasts
pairwise_ds <- bind_rows(d_1_2,
                         d_1_3,
                         d_1_4,
                         d_2_3,
                         d_2_4,
                         d_3_4)
# Bind this object to the mod_contrasts object
mod_contrasts <- mod_contrasts %>%
  bind_cols(pairwise_ds)
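To see what is being calculated for each pair, here is a hand-rolled sketch of Cohen's d in base R. It assumes the pooled standard deviation version of d, which we understand to be the default in cohens_d(); the function cohens_d_manual and the toy data are hypothetical. The sketch also uses combn() to generate the pairs, which happens to produce them in the same order as the contrasts table.

```r
# Cohen's d with a pooled standard deviation (assumed to match the default)
cohens_d_manual <- function(x, y) {
  n_x <- length(x)
  n_y <- length(y)
  pooled_sd <- sqrt(((n_x - 1) * var(x) + (n_y - 1) * var(y)) /
                      (n_x + n_y - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Simple check: means of 2 and 3 with a pooled SD of 1
cohens_d_manual(c(1, 2, 3), c(2, 3, 4)) # -1

# Hypothetical toy data (NOT the James et al. data)
set.seed(1)
toy <- data.frame(Condition = factor(rep(1:4, each = 5)),
                  intrusions = rpois(20, lambda = 4))

# All pairs of conditions: 1-2, 1-3, 1-4, 2-3, 2-4, 3-4
pairs <- combn(levels(toy$Condition), 2)
pairwise_d <- apply(pairs, 2, function(p) {
  cohens_d_manual(toy$intrusions[toy$Condition == p[1]],
                  toy$intrusions[toy$Condition == p[2]])
})
names(pairwise_d) <- paste(pairs[1, ], pairs[2, ], sep = " - ")

pairwise_d
```

Naming each value by its comparison makes it easier to confirm the order before binding anything to another table.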
What are your options if the data do not meet the assumptions and it’s really not appropriate to continue with a regular one-way ANOVA? As always, there are multiple options and it is a judgement call.
You could run a non-parametric test, the Kruskal-Wallis for between-subject designs and the Friedman test for within-subject designs.
If normality is the problem, you could try transforming the data.
You could use bootstrapping, which is not something we will cover in this course at all.
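As an illustration of the first option, the sketch below runs a Kruskal-Wallis test with base R's kruskal.test() on simulated data. The objects toy_data and kw are hypothetical, and we are not suggesting the test is needed for the James et al. (2015) data.

```r
# Simulated data for illustration only (NOT the James et al. data)
set.seed(7)
toy_data <- data.frame(
  Condition = factor(rep(1:4, each = 18)),
  intrusions = rpois(72, lambda = rep(c(5, 2, 4, 5), each = 18))
)

# Rank-based alternative to the one-way between-subjects ANOVA
kw <- kruskal.test(intrusions ~ Condition, data = toy_data)

kw
```

The formula interface matches the one you used for t.test(), so swapping in your own data only requires changing the variable names and the data argument.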
12.3 Reporting the results of your ANOVA
The code below replicates the write-up in the paper, although it replaces the Welch t-tests with the pairwise comparisons from emmeans().
Second, and critically, for the 7-day diary postintervention, there was a significant difference between groups in overall intrusion frequency in daily life, F(`r mod_output$num.Df`, `r mod_output$den.Df`) = `r mod_output$statistic %>% round(2)`, p = `r mod_output$p.value %>% round(3)`, ηp2 = `r mod_output$ges %>% round(2)`. Planned comparisons demonstrated that relative to the no-task control group, only those in the reactivation-plus-Tetris group, t(`r mod_contrasts$df[1]`) = `r mod_contrasts$statistic[1] %>% round(2)`, p = `r mod_contrasts$adj.p.value[1] %>% round(2)`, d = `r mod_contrasts$Cohens_d[1] %>% round(2)`, experienced significantly fewer intrusive memories; this finding replicated Experiment 1. The reactivation-plus-Tetris group had significantly fewer intrusive thoughts than the reactivation-only group, t(`r mod_contrasts$df[5]`) = `r mod_contrasts$statistic[5] %>% round(2)`, p = `r mod_contrasts$adj.p.value[5] %>% round(2)`, d = `r mod_contrasts$Cohens_d[5] %>% round(2)`. Further, there were no significant differences between the reactivation-plus-Tetris group and the Tetris-only group, t(`r mod_contrasts$df[4]`) = `r mod_contrasts$statistic[4] %>% round(2)`, p = `r mod_contrasts$adj.p.value[4] %>% round(2)`, d = `r mod_contrasts$Cohens_d[4] %>% round(2)`, the no-task control group and the reactivation-only group, t(`r mod_contrasts$df[3]`) = `r mod_contrasts$statistic[3] %>% round(2)`, p = `r mod_contrasts$adj.p.value[3] %>% round(2)`, or between the no-task control group and the Tetris-only group, t(`r mod_contrasts$df[2]`) = `r mod_contrasts$statistic[2] %>% round(2)`, p = `r mod_contrasts$adj.p.value[2] %>% round(2)`.
If you add that code to your R Markdown document, knitting it should create the following:
Second, and critically, for the 7-day diary postintervention, there was a significant difference between groups in overall intrusion frequency in daily life, F(3, 68) = 3.79, p = 0.014, ηp2 = 0.14. Planned comparisons demonstrated that relative to the no-task control group, only those in the reactivation-plus-Tetris group, t(68) = 3.04, p = 0.02, d = 1, experienced significantly fewer intrusive memories; this finding replicated Experiment 1. The reactivation-plus-Tetris group had significantly fewer intrusive thoughts than the reactivation-only group, t(68) = -2.78, p = 0.04, d = -1.11. Further, there were no significant differences between the reactivation-plus-Tetris group and the Tetris-only group, t(68) = -1.89, p = 0.38, d = -0.84, the no-task control group and the reactivation-only group, t(68) = 0.26, p = 1, or between the no-task control group and the Tetris-only group, t(68) = 1.15, p = 1.
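You might notice that some values are reported as d = 1 or p = 1 rather than to two decimal places. That is because round() returns a number, and numbers are printed without trailing zeros. If you would prefer a fixed number of decimals, one option is to format the value as text with sprintf(). The helper fmt() below is a hypothetical sketch and not part of any package used in this chapter.

```r
# Format a number as text with a fixed number of decimal places
fmt <- function(x, digits = 2) {
  sprintf(paste0("%.", digits, "f"), x)
}

fmt(1)            # "1.00"
fmt(0.0140858, 3) # "0.014"
```

In inline code you could then wrap a value in fmt(), for example fmt(mod_contrasts$Cohens_d[1]), instead of piping it into round(2).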
12.4 End of chapter
Well done! You have now covered how to run a one-way ANOVA using the afex package.
In the next chapter, we will extend this to when you have multiple independent variables and you want to investigate the interaction between them for how they affect your dependent variable / outcome.