

PhD Camp 2019: Tidyverse and ggplot2 for Beginners


This workshop is based on Hadley Wickham's "Many Models" tutorial, and most of the material in this workshop is covered in much more depth there. I mainly ran the workshop because I found tidyverse hard to get my head around (it took a few months), and I feel like going to a workshop would have made things slightly quicker.

Why am I here?

Existentially: not sure.

If you want to learn either tidyverse or ggplot2, though, this might be an OK place.

What do I need to know before I do this?

Some basic things about R wouldn't hurt. There are a few resources below to help with this, but honestly, the code is all in the .Rmd file in the repository, so you might be ok just running stuff and seeing what happens (back yourself!). There is information below on how to run the code.

How do I download the materials?

Navigate to Downloads and click Download Repository - if you hover over the words you will see it's a hyperlink and you can click on them.

Screen Shot 2019-04-08 at 12.09.09 pm.png

What is contained in the repository?

Go to the folder called PhDCamp_2019:

  • Tidyverse - ggplot2 : A presentation to accompany the code. There are versions in PowerPoint (.pptx), Keynote (.key) and PDF (.pdf).

    • I would use either the PowerPoint or Keynote version as there are videos in there, but the PDF is there just in case.
  • tidyverse_for_beginners.Rmd : The accompanying code for the presentation

    • Open this file with RStudio
  • A pdf version of the code from the tidyverse_for_beginners.Rmd file

    • I have knitted the code into a pdf document if you would just like to read through it and not run any code. Also good to pop on your tablet and annotate. Fancy.
  • raw_data.csv : a file with the data for the ggplot2 workshop section

  • markdown_figures : a folder containing all of the figures which appear in the tidyverse_for_beginners.Rmd file.

  • figure1.png : when you run the tidyverse_for_beginners.Rmd file, it will produce one saved plot as a demo. This is the plot. If you run the code, it will save over this file, so if you want to check that saving works OK, just delete this file from your folder and see if running the code produces another one!

Before you start

You need a copy of the following on your computer:

  • The R coding language

  • RStudio

    • Note: a common beginner's error is thinking RStudio and R are the same thing. They aren't! R is a language, like French, which allows you to "speak stats". RStudio is a program which runs R (the language) and has stuff which makes it easier to use. It's like walking around Paris with a French interpreter, but you know, without any of the good stuff like pastry and I'm really selling this aren't I?
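If you're not sure whether R itself (and not just RStudio) made it onto your computer, one quick check is to ask R for its own version. This is just a small sketch and not part of the workshop materials; `R.version.string` is built into R, and `r_version` is a name I made up for the example.

```r
# Ask R (the language) to report its own version.
r_version <- R.version.string
print(r_version)

# RStudio (the program) has its own, separate version number,
# which is shown in RStudio's "About" window rather than here.
```

If this prints a version number in the RStudio console, then RStudio has found R and the two are talking to each other.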

Packages in R

  • Once you have downloaded R, we need to install a few extra bits of software to do some stats in R.

  • These are called "packages", which you can think of as special boxes which contain scripts (instructions) called "functions" which are like boxes that take our data, do something to it, and throw it out of the box, nice and shiny.

  • Note, R has a lot of built in packages (and functions) but we want some extra ones.

  • After you have installed R and RStudio, double click on the RStudio program on your computer.

  • Open it up and then install the following packages in one of the following ways:


METHOD 1: Use the Packages tab in RStudio:

  • Click on "Install" in the Packages tab (see screen shot below). In the box which pops up, type the names of the packages you want to install (they are listed under METHOD 2) and make sure they are separated by commas.

  • Click the "Install dependencies" box (see screenshot below):

Screen Shot 2019-04-08 at 12.07.25 pm.png

METHOD 2: In the console, type the following:

install.packages(c("tidyverse", "gapminder", "broom", "sjstats", "tibble", "gridExtra"), dependencies = TRUE)

  • You might be prompted to answer "yes" or "no" to stuff in the console. Try "yes" first.
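Whichever method you used, it can be reassuring to check that the packages actually installed. The sketch below is one possible way to do that (it is not from the workshop code): it uses base R's `requireNamespace()` to test each package in turn. The names `pkgs`, `installed` and `still_missing` are made up for this example.

```r
# Package names copied from the install.packages() call in METHOD 2.
pkgs <- c("tidyverse", "gapminder", "broom", "sjstats", "tibble", "gridExtra")

# requireNamespace() returns TRUE if a package can be loaded and FALSE if not,
# without attaching it to your session.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Names of anything that still needs installing (empty if all went well).
still_missing <- pkgs[!installed]
print(still_missing)
```

If `still_missing` is not empty, try installing just those packages again and read any error messages in the console.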

How to organise everything on your computer

  • You should have the following things in a folder called PhDCamp_2019 somewhere on your computer where you can easily navigate to it. Keep everything in a folder; don't have things floating around on your Desktop in some sort of anti-Kondo stance.

Screen Shot 2019-04-08 at 11.39.33 am.png
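Once the folder is set up, a rough way to check that R can see the files is sketched below. The two file names come from the repository listing; `check_materials()` is a hypothetical helper written for this example, and it assumes R's working directory is your PhDCamp_2019 folder unless you pass it a different path.

```r
# File names taken from the repository listing; adjust if yours differ.
expected <- c("tidyverse_for_beginners.Rmd", "raw_data.csv")

# Hypothetical helper: reports which of the expected files exist in a folder.
check_materials <- function(folder = getwd()) {
  found <- file.exists(file.path(folder, expected))
  names(found) <- expected
  found
}

# Where R is currently looking, and whether the files are there.
print(getwd())
print(check_materials())
```

A `FALSE` next to a file name usually just means R is looking in a different folder from the one you expect, so compare the printed working directory with where you saved things.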

How do I get started once I install everything?

  • Open the tidyverse_for_beginners.Rmd file using RStudio.

  • This will open an R Markdown file. You can play sections of the code (called Code Chunks) using the "play" button shown below, the tiny little green button on the right-hand side.

Screen Shot 2019-04-08 at 11.49.23 am.png


If you want to learn more about R:

  • If you already know some basics regarding the tidyverse but would like to practise your skills on more datasets, Tidy Tuesday might be useful for you.

  • Danielle Navarro has an excellent book for psychological scientists, and there are some online resources.

  • Hadley Wickham has an excellent book called R for Data Science, and if you want more you can even read (cue dramatic music) Advanced R, which is also truly delightful!

  • Danielle also has a tidyverse for beginners workshop and the links to slides are available on this website and also here, have them now!

  • Software Carpentry have an excellent tutorial on the specific dataset used in this tutorial too.

This tutorial is just a basic summary of some of the stuff here to get you started quickly with your data, and it's not intended to be comprehensive in any sense.

What if I find errors in the code?

If you could email me that would be lovely!

Feel free to include memes with your email, I love 'em.

Will this spark joy?

It should, but I guess there are no guarantees in life. The figure below shows you how you can organise data by country, and store each country's data as a separate dataset, so you can then run models on them individually, which is pretty cool!

Screen Shot 2019-04-08 at 11.34.08 pm.png
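To give a flavour of how that works, here is a minimal sketch in the style of the "Many Models" approach. It assumes the gapminder dataset from the gapminder package (one of the packages installed earlier) and fits a simple straight-line model per country; `by_country` and `country_model` are names chosen for this example, and the code in the .Rmd file may do things differently.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(gapminder)

# One row per country; each country's rows are tucked into a
# list-column called `data`, so every country has its own dataset.
by_country <- gapminder %>%
  group_by(country, continent) %>%
  nest()

# Illustrative model (an assumption): life expectancy as a straight line over time.
country_model <- function(df) {
  lm(lifeExp ~ year, data = df)
}

# Fit the model to each country's dataset and store it alongside the data.
by_country <- by_country %>%
  mutate(model = map(data, country_model))

# Peek at one country's stored dataset and its fitted coefficients.
print(by_country$data[[1]])
print(coef(by_country$model[[1]]))
```

The nice part of this design is that the data and the model for each country stay on the same row, so nothing gets out of sync when you filter or sort.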