Data Analysis and Visualization in R for Ecologists

Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. The lessons below were designed for those interested in working with ecology data in R.

This is an introduction to R designed for participants with no programming experience. These lessons can be taught in a day (~ 6 hours). They start with basic information about R syntax and the RStudio interface, then move through how to import CSV files, the structure of data frames, how to deal with factors, how to add and remove rows and columns, how to calculate summary statistics from a data frame, and a brief introduction to plotting. The last lesson demonstrates how to work with databases directly from R.

This lesson assumes no prior knowledge of R or RStudio and no programming experience.

Episodes

  1. Before we start
  2. Introduction to R
  3. Starting with data
  4. Manipulating, analyzing and exporting data with tidyverse
  5. Data visualization with ggplot2
  6. SQL databases and R

Preparations

Data Carpentry’s teaching is hands-on, and to follow this lesson learners must have R and RStudio installed on their computers. They also need to be able to install a number of R packages, create directories, and download files. To avoid troubleshooting during the lesson, learners should follow the instructions below to download and install everything beforehand. If they are using their own computers this should be no problem, but if a computer is managed by their organization’s IT department they might need help from an IT administrator.

Install R and RStudio

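R and RStudio are two separate pieces of software: R is the programming language and the software that runs your code, while RStudio is an integrated development environment (IDE) that makes working with R easier.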

If you don’t already have R and RStudio installed, follow the instructions for your operating system below. You have to install R before you install RStudio. Once both are installed, open RStudio to make sure it works and you don’t get any error messages.

Windows

MacOS

Linux

Update R and RStudio

If you already have R and RStudio installed, check if your R and RStudio are up to date:
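One way to see which versions are currently installed is to ask R directly from the console. The sketch below is only an illustration: `R.version.string` is part of base R, while `RStudio.Version()` is assumed to be available only inside an RStudio session, so the call is guarded. The variable name `r_version` is chosen here for the example.

```r
# Version of R that is currently running
r_version <- R.version.string
print(r_version)

# RStudio.Version() is only defined inside an RStudio session,
# so check that it exists before calling it
if (exists("RStudio.Version")) {
  print(RStudio.Version()$version)
}
```

Comparing these values with the latest releases listed on the R and RStudio download pages shows whether an update is needed.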

Install required R packages

During the course we will need a number of R packages. Packages contain useful R code written by other people. We will use the packages tidyverse, hexbin, patchwork, and RSQLite.

To install these packages, open RStudio and copy and paste the following command into the console window (look for a blinking cursor on the bottom left), then press Enter (Windows and Linux) or Return (MacOS) to execute the command.

install.packages(c("tidyverse", "hexbin", "patchwork", "RSQLite"))

Alternatively, you can install the packages using RStudio’s graphical user interface by going to Tools -> Install Packages and typing the names of the packages separated by commas.

R tries to download and install the packages on your machine. When the installation has finished, you can try to load the packages by pasting the following code into the console:
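A minimal sketch of that loading step, assuming the four packages named above were installed without problems:

```r
# library() attaches a package and stops with an error
# if the package is not installed
library(tidyverse)
library(hexbin)
library(patchwork)
library(RSQLite)
```

Startup messages listing attached packages or name conflicts are expected when loading tidyverse and are not errors.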

If you do not see an error like there is no package called ‘...’, you are good to go!
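If it is unclear which package caused an error, a check along the following lines can list the ones that are still missing. This is a sketch using base R's `requireNamespace()`, which returns TRUE or FALSE instead of stopping with an error; the variable names `pkgs`, `is_installed`, and `missing_pkgs` are chosen here for illustration.

```r
pkgs <- c("tidyverse", "hexbin", "patchwork", "RSQLite")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that still need to be installed (empty if none)
missing_pkgs <- pkgs[!is_installed]
print(missing_pkgs)
```

An empty result, shown as `character(0)`, means all four packages are available. Any names that are printed can be passed to `install.packages()` again.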

Download the data

We will download the data directly from R during the lessons. However, if you are expecting problems with the network, it may be better to download the data beforehand and store it on your machine. The data files for the lesson can be downloaded manually here: https://doi.org/10.6084/m9.figshare.1314459
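For learners who prefer to fetch a file ahead of time from R rather than through the browser, something like the following could work. It is a sketch under stated assumptions: `download_if_missing()` is a hypothetical helper written for this example, not part of the lesson, and both the URL and the destination path in the commented call are placeholders to be replaced with a direct file link from the figshare page and a folder of your choice.

```r
# Hypothetical helper: download a file only if it is not already on disk.
# Returns TRUE (invisibly) if a download was attempted, FALSE if the file existed.
download_if_missing <- function(url, destfile) {
  if (file.exists(destfile)) {
    return(invisible(FALSE))
  }
  # Create the destination folder if needed
  dir.create(dirname(destfile), recursive = TRUE, showWarnings = FALSE)
  # mode = "wb" avoids corrupting files on Windows
  download.file(url = url, destfile = destfile, mode = "wb")
  invisible(TRUE)
}

# Example call (placeholder URL and path, replace before running):
# download_if_missing("<direct file URL from the figshare page>",
#                     "data_raw/<file name>.csv")
```

Because the helper skips files that already exist, it can be run again during the lesson without downloading the data a second time.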

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/datacarpentry/R-ecology-lesson, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".