Reproducible Data Processing and Visualization
in R and tidyverse
Introduction

This Open Source eBook provides materials for the semester-long Master’s seminar course “Reproducible Data Processing and Visualization in R” that I deliver at the University of Bern’s Institute of Psychology.
How to use this book
This book is made up of individual Quarto (.qmd) workbooks. Many of the exercises are easiest to complete in your own local copy of these .qmd files.
I suggest that you download the .zip file of all the .qmd files and their supporting data and cheatsheets and use the eBook as a reference book. If the .zip file does not contain the most recent versions, please contact me and I’ll update it. If I’m slow or you’re feeling impatient, you can also download the most recent versions from GitHub, along with the other files that create this eBook.
You can also copy and paste the code for any chapter directly from the website. Click the “</> Code” button on the top right of each page to see the full .qmd file’s code. You can copy and paste this into a .qmd file. However, it’s probably easier to download all the .qmd files and data as mentioned above.
Coding is a skill learned through practice. Almost anyone can become competent in writing reproducible code for data processing and visualization with practice. More than anything else, completing this course requires that you practice in your own time, using not only the examples provided but also ones you create yourself. Practice on real data sets from your own studies, on the thousands available on the Open Science Framework (osf.io), or on simulated messy datasets created with my R package {truffle}.

Other learning resources
There are many excellent Open Source resources to learn R and {tidyverse} for data processing and visualization. Readers are encouraged to seek them out to support the materials already provided in this book. I can particularly recommend the following ones:
- Lisa DeBruine et al.’s (2021) Open Source textbook Data Skills for Reproducible Research
- Allison Horst’s interactive web app for learning {dplyr}
- Hadley Wickham’s Open Source textbook R For Data Science (aka R4DS)
  - Particularly the section on data transformation
  - You can complete all the exercises in R4DS with interactive feedback by installing the {r4ds.tutorials} package (install.packages("r4ds.tutorials")), loading the library (library(r4ds.tutorials)), and going to the “Tutorial” tab in the Environment pane of RStudio. Individual tutorials can be completed alongside the R4DS book, in parallel to this book.
- Garrick Aden-Buie’s tidyexplain gifs for understanding how {tidyr}’s pivot functions work
- datasciencebox.org
  - Interactive tutorials that are very useful for practicing processing and visualization skills once you’ve already learned the functions (i.e., don’t start here)
  - Recorded presentations on many relevant topics
Find me elsewhere
You can find my recent and current research interests, a summary of the courses I teach, and occasional blog posts on my personal website mmmdata.io.
You can find me on Bluesky at @ianhussey.mmmdata.io.
Contributing
If you are interested in contributing to or adapting this eBook, all code and data are available on GitHub.