APA-style manuscripts with RMarkdown and papaja

Dr James Bartlett and Dr Gaby Mahrholz

Workshop overview

  1. Overview of reproducible documents.

  2. papaja walkthrough.

  3. Work on your own project.

Preparation

  • Have you downloaded the workshop materials?

  • Are tidyverse, papaja, and pwr installed?

  • Do you have a LaTeX system installed?
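
One way to run through this checklist from the R console is sketched below. It only reports what is missing and installs nothing; checking for pdflatex or xelatex on the PATH is an assumption about how your LaTeX system is set up.

```r
# Packages named on the preparation slide
required_packages <- c("tidyverse", "papaja", "pwr")

# TRUE for each package that can be loaded from the current library paths
is_installed <- vapply(required_packages, requireNamespace,
                       logical(1), quietly = TRUE)
missing_packages <- required_packages[!is_installed]

if (length(missing_packages) > 0) {
  message("Still to install: ", paste(missing_packages, collapse = ", "))
}

# Sys.which() returns an empty string for executables it cannot find
latex_paths <- Sys.which(c("pdflatex", "xelatex"))
if (!any(nzchar(latex_paths))) {
  message("No LaTeX found; tinytex::install_tinytex() is one way to get one.")
}
```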

What problems are we trying to solve?

Computational reproducibility

Reporting errors

  • 49.6% of articles contained at least one inconsistency between the reported test statistic, degrees of freedom, and p-value (Nuijten et al., 2016)

What problems are we trying to solve?

Connection between code and output

  • Even when authors share data and reproducible code, there might not be a clear connection between the code and relevant output in the manuscript (anecdote)

Reproducible manuscripts

  • Potential to avoid these errors as you combine code, results, and prose in one document

  • When there are errors, they are reproducible errors

James’ journey

It took me several years and projects to adopt this workflow:

  1. Copying results from SPSS output

  2. Copying results from R output

  3. Using R Markdown to create reproducible results

  4. Using papaja to write reproducible manuscripts

papaja

papaja (Aust & Barth, 2022) adds templates when you create a new R Markdown document

# Install latest CRAN release
install.packages("papaja")

In this tutorial, we will walk through:

  • YAML options

  • Citations/references via a .bib file

  • Inline code

  • Tables / figures

Mock example: Error-free vs error-full code

  • We have created a mock example using papaja and simulated data to show its capabilities

  • Building on Hoffman and Elmi (2021): What is the effect of teaching debugging skills on students’ data wrangling ability?

  • Randomly allocate students to an error-full or error-free lecture (IV) and measure performance on a data skills assignment (DV)
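
The simulation code is not shown on the slides. As a minimal sketch of what a two-group data set like this could look like, the block below draws scores for each group from a normal distribution. The seed, group size, means, and standard deviation are all assumptions for illustration, not the values used in the workshop.

```r
set.seed(2022)  # arbitrary seed so the sketch is reproducible

n_per_group <- 149  # assumption: the target from the power analysis

# Long format: one row per student, with the IV (Group) and DV (test score)
mock_data <- data.frame(
  Group = rep(c("Error-Free", "Error-Full"), each = n_per_group),
  DV = c(rnorm(n_per_group, mean = 50, sd = 10),   # error-free lecture
         rnorm(n_per_group, mean = 55, sd = 10))   # error-full lecture
)

head(mock_data)
```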

YAML options

Author information

  • Name and affiliation for each author, but only one corresponding author

  • Option to include contributorship roles, such as CRediT.

---
author: 
  - name          : "James Bartlett"
    affiliation   : "1"
    corresponding : yes    # Define only one corresponding author
    address       : "62 Hillhead Street, Glasgow"
    email         : "james.bartlett@glasgow.ac.uk"
    role:         
      - "Conceptualization"
      - "Writing - Original Draft Preparation"
      - "Writing - Review & Editing"
---

YAML options

Adding a .bib file

I recommend Zotero as a reference manager: https://www.zotero.org/

  • Create a collection

  • Export the collection

  • Choose BibTeX as the format and click OK

  • Save the .bib file in your document's working directory

    ---
    bibliography      : ["references.bib", "r-references.bib"]
    ---
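
The second file in this example, r-references.bib, holds references for R and the packages you use. papaja's r_refs() can write that file for you; the sketch below assumes you run it from the document's working directory, typically in a setup chunk.

```r
# File name matches the YAML entry above
r_bib <- "r-references.bib"

# Guarded so the sketch does nothing if papaja is not installed
if (requireNamespace("papaja", quietly = TRUE)) {
  # Writes BibTeX entries for R and the packages in use in this session
  papaja::r_refs(file = r_bib)
}
```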

YAML options

Changing the reference style

  • Once you have a .bib file, you can easily change the style by selecting a different citation style language (CSL)

  • Over 10,000 styles are available in the Zotero style repository; download one and make sure the file is saved with a .csl extension: https://www.zotero.org/styles

E.g., APA 7th edition

    ---
    csl               : apa7.csl
    ---

E.g., Vancouver

    ---
    csl               : vancouver.csl
    ---

YAML options

Manuscript options

Depending on the journal submission guidelines, you can change different features like:

  • Floating figures/tables in-text or at the end

  • Being kind to your reviewer and adding line numbers

  • Masking the manuscript and omitting author information

---
floatsintext      : yes # Figures and tables floating or at the end?
linenumbers       : yes # Add line numbers? 
draft             : no # Add draft watermark on every page? 
mask              : no # Hide author details for blind submission? 
---

YAML options

Output options

  • The default output for the knitted document is a PDF:

---
output            : papaja::apa6_pdf
---

  • However, you can also knit a Word document if you (or collaborators) need it:

---
output            : papaja::apa6_word
---
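
If you would rather not edit the YAML each time, the output format can also be chosen when rendering from the console. The helper names and the file name "manuscript.Rmd" below are hypothetical; rmarkdown::render() and its output_format argument are the real pieces.

```r
# Hypothetical helper: pick one of the two papaja formats named above
manuscript_format <- function(word = FALSE) {
  if (word) "papaja::apa6_word" else "papaja::apa6_pdf"
}

# Hypothetical helper: knit the manuscript to the chosen format
render_manuscript <- function(input, word = FALSE) {
  rmarkdown::render(input, output_format = manuscript_format(word))
}

# Usage, assuming a file called manuscript.Rmd in the working directory:
# render_manuscript("manuscript.Rmd", word = TRUE)
```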

Citations and references

citr

Alongside papaja, Aust created a great helper package called citr which makes it easy to browse a .bib file and insert citations.

# Not currently on CRAN
devtools::install_github("crsh/citr")

Inline code

Power analysis

# Inputs (small_telescopes, alpha, power) are hidden on slides to save space
library(pwr)

sample_size <- ceiling(
  pwr.t.test(d = small_telescopes,
             sig.level = alpha, 
             power = power,
             type = "two.sample",
             alternative = "two.sided")$n
)

Using an effect size of d = 0.38, we aimed to recruit 149 participants per group for an independent samples t-test (\(\alpha\) = 0.05, power = 0.9).

Inline code

Power analysis

Behind the scenes…

Using an effect size of d = `r small_telescopes`, we aimed to recruit `r sample_size` participants per group for an independent samples t-test ($\alpha$ = `r alpha`, power = `r power`).
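
The inputs are hidden on the slides, so the values below are placeholders read off the rounded numbers in the rendered sentence (an assumption, not the workshop's actual code). Because the displayed d is rounded, the resulting sample size may not match the 149 shown earlier. The sprintf() call is only a way to preview in the console roughly what the inline code would produce.

```r
library(pwr)

# Placeholder inputs (assumptions based on the rounded values on the slide)
small_telescopes <- 0.38  # smallest effect size of interest, Cohen's d
alpha <- 0.05
power <- 0.90

# Participants per group, rounded up to a whole person
sample_size <- ceiling(
  pwr.t.test(d = small_telescopes, sig.level = alpha, power = power,
             type = "two.sample", alternative = "two.sided")$n
)

# Console preview of the sentence the inline code would build
sprintf("d = %.2f, %d participants per group (alpha = %.2f, power = %.1f)",
        small_telescopes, as.integer(sample_size), alpha, power)
```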

Adding external images

papaja supports adding external images via knitr: http://frederikaust.com/papaja_man/reporting.html#figures

knitr::include_graphics("Screenshots/procedure_diagram.png")

Figures

You can display reproducible graphs directly from your code chunks

Figures

Behind the scenes…

mock_data %>% 
  ggplot(aes(x = Group, y = DV, fill = Group)) + 
  geom_violin() +
  # remove the median line with fatten = NULL
  geom_boxplot(width = .2, 
               fatten = NULL, colour = "black") +
  stat_summary(fun = "mean", geom = "point") +
  stat_summary(fun.data = "mean_se", 
               geom = "errorbar", 
               width = .1) + 
  scale_fill_viridis_d(option = "D", begin = 0.3, end = 0.6) + 
  theme_classic() + 
  theme(legend.position = "none") + 
  labs(x = "Lecture Group",
       y = "Data skills test score (%)")

Figures

In the code chunk options, you can reference a caption and control the size of figures

Figure \@ref(fig:violin-plot) shows...

(ref:violin-plot-caption) Violin and boxplot of... 

```{r violin-plot, fig.cap="(ref:violin-plot-caption)", out.width="100%"}

mock_data %>% 
  ggplot(aes(x = Group, y = DV, fill = Group)) + 
  geom_violin()...
  
```

Tables

papaja has some helper functions for creating APA style tables (which don’t play nicely with html…): http://frederikaust.com/papaja_man/reporting.html#tables

Descriptive statistics of…

Group         Mean     SD    Min    Max
Error-Free   49.94  11.03  14.91  75.80
Error-Full   54.88  10.42  24.13  92.22

Note. Test scores could range from 0-100%


Tables

Behind the scenes…

# Calculate descriptives
mock_descriptives <- mock_data %>% 
  group_by(Group) %>% 
  summarise(Mean = mean(DV),
            SD = sd(DV),
            Min = min(DV),
            Max = max(DV))

# papaja function to round and save as character
descriptives <- printnum(mock_descriptives)

# papaja function to create an APA table
apa_table(descriptives,
          caption = "Descriptive statistics of...",
          note = "Test scores could range from 0-100%")

Inline code

Statistical tests

papaja has helper functions for creating APA style result formatting: http://frederikaust.com/papaja_man/reporting.html#statistical-models-and-tests

“Consistent with our hypothesis, a Welch t-test shows that participants in the error-full group produced significantly higher data skills assignment scores than those in the error-free group, \(\Delta M = -4.94\), 95% CI \([-7.49, -2.40]\), \(t(272.23) = -3.82\), \(p < .001\).”

Inline code

Statistical tests

Behind the scenes…

# Save ttest as object
mock_ttest <- t.test(DV ~ Group, 
                     data = mock_data)

# papaja helper function for printing results in APA style
apa_ttest <- apa_print(mock_ttest)$full_result

“Consistent with our hypothesis, a Welch t-test shows that participants in the error-full group produced significantly higher data skills assignment scores than those in the error-free group, `r apa_ttest`.”
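
apa_print() returns more than the full result, which helps when you want to report the estimate and the test statistic in different sentences. The sketch below uses its own small simulated data set (illustrative values only, not the workshop data) so it runs on its own.

```r
library(papaja)

set.seed(1)  # illustrative data only, not the workshop's simulation
example_data <- data.frame(
  Group = rep(c("Error-Free", "Error-Full"), each = 50),
  DV = c(rnorm(50, mean = 50, sd = 10), rnorm(50, mean = 55, sd = 10))
)

example_print <- apa_print(t.test(DV ~ Group, data = example_data))

example_print$estimate     # mean difference and confidence interval
example_print$statistic    # t, degrees of freedom, and p
example_print$full_result  # estimate and statistic combined
```

Each element is a character string, so it can be dropped into the text with inline code in the same way as the full result.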

Model objects

Saving objects

If you have code which takes a long time to run, you can save model objects:

write_rds(mock_ttest, 
          file = "Model1.Rds")

Loading objects

You can then load in the objects quickly within a code chunk:

mock_ttest <- read_rds(file = "Model1.Rds")
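
Putting the two steps together, a common pattern is to fit the model only when no saved copy exists. This is a minimal sketch under a few assumptions: it uses base R's saveRDS() and readRDS() instead of the readr functions, a temporary directory, and the built-in sleep data as a stand-in for a slow model.

```r
# Temporary location for the sketch; in a project this would be a file
# such as "Model1.Rds" in the working directory
model_path <- file.path(tempdir(), "Model1.Rds")

if (file.exists(model_path)) {
  # Fast path: reuse the saved object
  mock_ttest <- readRDS(model_path)
} else {
  # Slow path: fit the model once, then save it for next time
  mock_ttest <- t.test(extra ~ group, data = sleep)
  saveRDS(mock_ttest, model_path)
}
```

Remember to delete the saved file whenever the data or model specification changes, otherwise the document will keep reporting the old results.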

Where to learn more?