APA-style manuscripts with R Markdown and papaja

Dr James Bartlett and Dr Gaby Mahrholz

Workshop overview

Overview of reproducible documents.
papaja walkthrough.
Work on your own project.

Preparation

Have you downloaded the workshop materials?
Are tidyverse, papaja, and pwr installed?
Do you have a LaTeX system installed?
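
The checklist above can be run as a quick script. This is a minimal sketch using base R only; it assumes that finding `pdflatex` on the PATH is a good enough sign of a LaTeX system, so an installation that is not on the PATH will be reported as missing.

```r
# Packages named in the preparation checklist
required <- c("tidyverse", "papaja", "pwr")

# One TRUE/FALSE per package; requireNamespace() does not attach anything
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}

# Assumption: a LaTeX system exposes pdflatex on the PATH.
# Sys.which() returns "" when the executable cannot be found.
latex_found <- nzchar(Sys.which("pdflatex"))
message("pdflatex found: ", latex_found)
```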

What problems are we trying to solve?

Computational reproducibility

31% reproducibility success rate in one journal (Hardwicke et al., 2018)
58% reproducibility success rate in registered reports (Obels et al., 2020)

Reporting errors

49.6% of articles contained an inconsistency between the test statistic, degrees of freedom, and p-value (Nuijten et al., 2016)
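
To make this kind of inconsistency concrete, here is a minimal sketch of the check: recompute the p-value from the reported test statistic and degrees of freedom, then compare it with the reported p-value. The values are the ones reported for the mock t-test later in this workshop; the variable names are illustrative only.

```r
# Reported values: t(272.23) = -3.82, p < .001
t_value <- -3.82
df <- 272.23

# Two-sided p-value implied by the test statistic and degrees of freedom
p_recomputed <- 2 * pt(-abs(t_value), df = df)

# The report is internally consistent if the recomputed p agrees with "p < .001"
consistent <- p_recomputed < .001
consistent
```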

What problems are we trying to solve?

Connection between code and output

Even when authors share data and reproducible code, there might not be a clear connection between the code and relevant output in the manuscript (anecdote)

Reproducible manuscripts

Potential to avoid these errors as you combine code, results, and prose in one document
When there are errors, they are reproducible errors

James’ journey

It took me several years and projects to adopt this workflow:

Copying results from SPSS output
Copying results from R output
Using R Markdown to create reproducible results
Using papaja to write reproducible manuscripts

papaja

papaja (Aust & Barth, 2022) adds templates when you create a new R Markdown document

# Install latest CRAN release
install.packages("papaja")

In this tutorial, we will walk through:

YAML options
Citations/references via a .bib file
Inline code
Tables / figures

Mock example: Error-free vs error-full code

We have created a mock example using papaja and simulated data to show its capabilities
Building on Hoffman and Elmi (2021): What is the effect of teaching debugging skills on students’ data wrangling ability?
Randomly allocate students to an error-full or error-free lecture (IV) and measure performance on a data skills assignment (DV)
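
A design like this could be simulated along the following lines. This is only a sketch, not the code behind the workshop materials: the group size, means, and standard deviation are assumptions chosen for illustration.

```r
set.seed(2022)

# Assumed values for illustration only
n_per_group <- 149
group_means <- c("Error-Free" = 50, "Error-Full" = 55)
score_sd <- 11

# Random allocation: shuffle a balanced vector of group labels
Group <- sample(rep(names(group_means), each = n_per_group))

# Simulate test scores (DV) around each group's mean, bounded to 0-100%
DV <- rnorm(length(Group), mean = group_means[Group], sd = score_sd)
DV <- pmin(pmax(DV, 0), 100)

mock_data <- data.frame(id = seq_along(Group), Group = Group, DV = DV)
head(mock_data)
```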

YAML options

Author information

Name and affiliation for each author, but only one corresponding author
Option to include contributorship roles, such as CRediT.

---
author: 
  - name          : "James Bartlett"
    affiliation   : "1"
    corresponding : yes    # Define only one corresponding author
    address       : "62 Hillhead Street, Glasgow"
    email         : "james.bartlett@glasgow.ac.uk"
    role:         
      - "Conceptualization"
      - "Writing - Original Draft Preparation"
      - "Writing - Review & Editing"
---

YAML options

Adding a .bib file

I recommend Zotero as a reference manager: https://www.zotero.org/

Create a collection
Export the collection
Choose BibTeX as the format and click OK
Save the file in your document's working directory

    ---
    bibliography      : ["references.bib", "r-references.bib"]
    ---

YAML options

Changing the reference style

Once you have a .bib file, you can easily change the style by selecting a different citation style language (CSL)
There are over 10,000 styles in the Zotero style repository; save the one you want with a .csl file extension: https://www.zotero.org/styles

E.g., APA 7th edition

    ---
    csl               : apa7.csl
    ---

E.g., Vancouver

    ---
    csl               : vancouver.csl
    ---

YAML options

Manuscript options

Depending on the journal submission guidelines, you can change different features like:

Floating figures/tables in-text or at the end
Being kind to your reviewer and adding line numbers
Masking the manuscript and omitting author information

---
floatsintext      : yes # Figures and tables floating or at the end?
linenumbers       : yes # Add line numbers? 
draft             : no # Add draft watermark on every page? 
mask              : no # Hide author details for blind submission? 
---

YAML options

Output options

The default output for the knitted document is a PDF:

---
output            : papaja::apa6_pdf 
---

However, you can also knit a Word document if you (or collaborators) need it:

---
output            : papaja::apa6_word
---

Citations and references

citr

Alongside papaja, Aust created a great helper package called citr, which makes it easy to browse a .bib file and insert citations.

# Not currently on CRAN
devtools::install_github("crsh/citr")

Inline code

Power analysis

# Inputs hidden on slides to save space

sample_size <- ceiling(
  pwr.t.test(d = small_telescopes,
             sig.level = alpha, 
             power = power,
             type = "two.sample",
             alternative = "two.sided")$n
)

Using an effect size of d = 0.38, we aimed to recruit 149 participants per group for an independent samples t-test ($\alpha$ = 0.05, power = 0.9).
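
Since the inputs are hidden on the slide, here is a self-contained sketch of the same calculation using base R's `power.t.test()` rather than pwr. The input values are assumptions read off the rendered sentence above. The displayed effect size is presumably rounded, so the result need not match the sample size shown on the slide exactly.

```r
# Assumed inputs, taken from the rendered sentence (d is likely rounded)
small_telescopes <- 0.38
alpha <- 0.05
power <- 0.90

# With sd = 1, delta is the standardised effect size (Cohen's d)
sample_size <- ceiling(
  power.t.test(delta = small_telescopes,
               sd = 1,
               sig.level = alpha,
               power = power,
               type = "two.sample",
               alternative = "two.sided")$n
)
sample_size
```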

Inline code

Power analysis

Behind the scenes…

Using an effect size of d = `r small_telescopes`, we aimed to recruit `r sample_size` participants per group for an independent samples t-test ($\alpha$ = `r alpha`, power = `r power`).

Adding external images

papaja supports adding external images via knitr: http://frederikaust.com/papaja_man/reporting.html#figures

knitr::include_graphics("Screenshots/procedure_diagram.png")

Figures

You can display reproducible graphs from your code chunks

Figures

Behind the scenes…

mock_data %>% 
  ggplot(aes(x = Group, y = DV, fill = Group)) + 
  geom_violin() +
  # remove the median line with fatten = NULL
  geom_boxplot(width = .2, 
               fatten = NULL, colour = "black") +
  stat_summary(fun = "mean", geom = "point") +
  stat_summary(fun.data = "mean_se", 
               geom = "errorbar", 
               width = .1) + 
  scale_fill_viridis_d(option = "D", begin = 0.3, end = 0.6) + 
  theme_classic() + 
  theme(legend.position = "none") + 
  labs(x = "Lecture Group",
       y = "Data skills test score (%)")

Figures

In the code chunk settings, you can do things like reference a caption and control the size of figures

Figure \@ref(fig:violin-plot) shows...

(ref:violin-plot-caption) Violin and boxplot of... 

```{r violin-plot, fig.cap="(ref:violin-plot-caption)", out.width="100%"}

mock_data %>% 
  ggplot(aes(x = Group, y = DV, fill = Group)) + 
  geom_violin()...
  
```

Tables

papaja has some helper functions for creating APA style tables (which don’t play nicely with html…): http://frederikaust.com/papaja_man/reporting.html#tables

Descriptive statistics of…

| Group      | Mean  | SD    | Min   | Max   |
|------------|-------|-------|-------|-------|
| Error-Free | 49.94 | 11.03 | 14.91 | 75.80 |
| Error-Full | 54.88 | 10.42 | 24.13 | 92.22 |

Note. Test scores could range from 0 to 100%.

Tables

Behind the scenes…

# Calculate descriptives
mock_descriptives <- mock_data %>% 
  group_by(Group) %>% 
  summarise(Mean = mean(DV),
            SD = sd(DV),
            Min = min(DV),
            Max = max(DV))

# papaja function to round and save as character
descriptives <- printnum(mock_descriptives)

# papaja function to create an APA table
apa_table(descriptives,
          caption = "Descriptive statistics of...",
          note = "Test scores could range from 0-100%")
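
To see what the "round and save as character" step does, here is a rough base R approximation. It is not papaja's implementation, and the unrounded input values are made up for illustration. Converting to character is what preserves trailing zeros such as 75.80.

```r
# Hypothetical unrounded descriptives (illustrative values only)
mock_descriptives <- data.frame(
  Group = c("Error-Free", "Error-Full"),
  Mean  = c(49.9418, 54.8803),
  SD    = c(11.0312, 10.4179),
  Max   = c(75.8, 92.2183)
)

# Format every numeric column to two decimals, stored as character
is_num <- vapply(mock_descriptives, is.numeric, logical(1))
descriptives <- mock_descriptives
descriptives[is_num] <- lapply(mock_descriptives[is_num],
                               formatC, digits = 2, format = "f")

descriptives
```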

Inline code

Statistical tests

papaja has helper functions for creating APA style result formatting: http://frederikaust.com/papaja_man/reporting.html#statistical-models-and-tests

“Consistent with our hypothesis, a Welch t-test shows that participants in the error-full group produced significantly higher data skills assignment scores than those in the error-free group, $\Delta M = -4.94$, 95% CI $[-7.49, -2.40]$, $t(272.23) = -3.82$, $p < .001$.”

Inline code

Statistical tests

Behind the scenes…

# Save ttest as object
mock_ttest <- t.test(DV ~ Group, 
                     data = mock_data)

# papaja helper function for printing results in APA style
apa_ttest <- apa_print(mock_ttest)$full_result
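
For intuition, the pieces that end up in the formatted string can be pulled from the test object by hand. The sketch below uses base R only and freshly simulated data (the group sizes, means, and SDs are assumptions), so the numbers will differ from the result quoted above. It is a hand-rolled illustration, not how `apa_print()` works internally.

```r
set.seed(42)

# Simulated stand-in for mock_data (all values are assumptions)
mock_data <- data.frame(
  Group = rep(c("Error-Free", "Error-Full"), each = 100),
  DV    = c(rnorm(100, mean = 50, sd = 11), rnorm(100, mean = 55, sd = 10))
)

# t.test() defaults to the Welch test (var.equal = FALSE)
mock_ttest <- t.test(DV ~ Group, data = mock_data)

# Components of the htest object: df, t statistic, and confidence interval
manual_result <- sprintf(
  "t(%.2f) = %.2f, 95%% CI [%.2f, %.2f]",
  mock_ttest$parameter,
  mock_ttest$statistic,
  mock_ttest$conf.int[1],
  mock_ttest$conf.int[2]
)
manual_result
```

Building the string by hand like this is error-prone, which is the motivation for using a helper function instead.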

Model objects

Saving objects

If you have code which takes a long time to run, you can save model objects:

write_rds(mock_ttest, 
          file = "Model1.Rds")

Loading objects

You can then load in the objects quickly within a code chunk:

mock_ttest <- read_rds(file = "Model1.Rds")
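
Saving and loading can be combined so that the slow code only runs when no saved object exists yet. This is a sketch using base R's `saveRDS()` and `readRDS()`, which the readr functions wrap. The temporary file path and the built-in `sleep` data set are stand-ins chosen so the example runs on its own.

```r
# Stand-in path; in a real project this would sit in the project folder
model_path <- file.path(tempdir(), "Model1.Rds")

if (file.exists(model_path)) {
  # Fast path: reuse the saved object
  mock_ttest <- readRDS(model_path)
} else {
  # Slow path: fit the model once, then save it for next time
  # (a quick t-test on the built-in sleep data stands in for a slow model)
  mock_ttest <- t.test(extra ~ group, data = sleep)
  saveRDS(mock_ttest, model_path)
}
```

If the data or model code changes, delete the saved file so the object is refitted rather than silently reused.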

Where to learn more?

Slides and folder containing mock example available on GitHub: https://github.com/BartlettJE/papaja_demo
Full example from James’ recent publication: https://osf.io/gm4jr/
papaja manual (section 7 includes published manuscripts using papaja): http://frederikaust.com/papaja_man/