Instructors and helpers


Data scientists, according to interviews and expert estimates, spend from 50 percent to 80 percent of their time mired in the mundane labor of collecting and preparing data, before it can be explored for useful information. - NYTimes (2014)

What to expect

This is going to be a fun workshop.

The plan is to expose you to a lot of great tools that you can use with confidence in your research. You’ll be working hands-on, doing the same things on your own computer as we do live up on the screen. We’re going to cover a lot in these two days, and it is hard to remember it all at once, but you’ll know you can do it and where to look for help as you go forward with your analyses. Googling is a big part of coding!

In this workshop we’ll be talking about:

Workshop materials

Data science workflow

The tidy data workflow will help you think deliberately about data and your analyses. In our workshop we will be focusing on Tidy, Transform, and Visualise.

This graphic is from Wickham & Grolemund’s R for Data Science, which is a must-read (read it for free online or order a hardcopy from Amazon). It is a way to think deliberately and reproducibly about the way you work with data.

By the end of the course

You’ll have hands-on experience with importing and wrangling data in OpenRefine and R. In R, you’ll have created scripts with analysis and visualizations. You’ll have seen how Git handles version control on these scripts for you (no more ‘my_script_v2_Aug_17.R’) and how you can interact with them online through your GitHub account. It’s going to be great!