Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. The lessons below were designed for those interested in working with ecology data in R.
This is an introduction to R designed for participants with no programming experience. These lessons can be taught in a day (~6 hours). They start with some basic information about R syntax and the RStudio interface, then move through how to import CSV files, the structure of data frames, how to deal with factors, how to add/remove rows and columns, how to calculate summary statistics from a data frame, and a brief introduction to plotting. The last lesson demonstrates how to work with databases directly from R.
This lesson assumes no prior knowledge of R or RStudio and no programming experience.
Data Carpentry’s teaching is hands-on, so to follow this lesson learners must have R and RStudio installed on their computers. They also need to be able to install a number of R packages, create directories, and download files. To avoid troubleshooting during the lesson, learners should follow the instructions below to download and install everything beforehand. On a personal computer this should be no problem, but if the computer is managed by their organization’s IT department, they might need help from an IT administrator.
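R and RStudio are two separate pieces of software: R is the programming language, and RStudio is an integrated development environment (IDE) that makes working with R easier.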
If you don’t already have R and RStudio installed, follow the instructions for your operating system below. You have to install R before you install RStudio. Once it’s installed, open RStudio to make sure it works and you don’t get any error messages.
Windows: Run the .exe file that was just downloaded.
MacOS: Select the .pkg file for the latest R version.
Linux: You can install R with your package manager (e.g., for Debian/Ubuntu run sudo apt-get install r-base, and for Fedora sudo yum install R), but we don’t recommend this approach as the versions provided this way are usually out of date. In any case, make sure you have at least R 3.3.1. Install RStudio with your preferred method (e.g., on Debian/Ubuntu, run sudo dpkg -i rstudio-x.yy.zzz-amd64.deb at the terminal).
If you already have R and RStudio installed, check if your R and RStudio are up to date:
Type sessionInfo() into the console to see your R version. If your R version is 4.0.0 or later, you don’t need to update R for this lesson. If your version of R is older than that, download and install the latest version of R from the R project website for Windows, for MacOS, or for Linux.
During the course we will need a number of R packages. Packages contain useful R code written by other people. We will use the packages tidyverse, hexbin, patchwork, and RSQLite.
To install these packages, open RStudio and copy and paste the following command into the console window (look for a blinking cursor on the bottom left), then press Enter (Windows and Linux) or Return (MacOS) to execute the command.
install.packages(c("tidyverse", "hexbin", "patchwork", "RSQLite"))
Alternatively, you can install the packages using RStudio’s graphical user interface by going to Tools -> Install Packages and typing the names of the packages separated by a comma.
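If some of these packages are already installed, it can save time to install only the ones that are missing. The following is a minimal sketch and not part of the original lesson: missing_packages() is a hypothetical helper name, and it relies on base R's installed.packages() and setdiff().

```r
# Packages named in this lesson's setup instructions.
lesson_pkgs <- c("tidyverse", "hexbin", "patchwork", "RSQLite")

# Hypothetical helper: return the entries of `pkgs` that are not yet
# present in any package library on this machine.
missing_packages <- function(pkgs,
                             installed = rownames(utils::installed.packages())) {
  setdiff(pkgs, installed)
}

to_install <- missing_packages(lesson_pkgs)
print(to_install)

# Uncomment to install only what is missing (requires network access):
# if (length(to_install) > 0) install.packages(to_install)
```

The install call is left commented out so that the check itself can be run without touching the network.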
R tries to download and install the packages on your machine. When the installation has finished, you can try to load the packages by pasting the following code into the console:
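The loading step can be sketched as one library() call per package. This block assumes the four packages named above have already been installed; if one is missing, library() stops with an error naming that package.

```r
# Attach each package used in the lesson.
# library() stops with an error if the package is not installed.
library(tidyverse)
library(hexbin)
library(patchwork)
library(RSQLite)
```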
If you do not see an error like "there is no package called ‘...’", you are good to go!
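The two checks described above (R version 4.0.0 or later, and all four packages loadable) can also be combined into a single report. This is a sketch under stated assumptions: check_setup() is a hypothetical helper, not something the lesson provides. It uses getRversion() and requireNamespace() from base R, which report problems as values instead of stopping with an error.

```r
# Hypothetical helper: check the R version and whether each package can be loaded.
check_setup <- function(pkgs, min_version = "4.0.0") {
  # getRversion() returns a version object that compares correctly
  # against a version string such as "4.0.0".
  r_ok <- getRversion() >= min_version

  # requireNamespace() returns FALSE instead of stopping when a
  # package is not available.
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

  list(r_ok = r_ok, missing = pkgs[!available])
}

result <- check_setup(c("tidyverse", "hexbin", "patchwork", "RSQLite"))
str(result)
```

If r_ok is TRUE and missing is empty, the machine should be ready for the lesson. Otherwise, missing lists the packages to reinstall.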
We will download the data directly from R during the lessons. However, if you are expecting problems with the network, it may be better to download the data beforehand and store it on your machine. The data files for the lesson can be downloaded manually here: https://doi.org/10.6084/m9.figshare.1314459
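For learners who prefer to script the download ahead of time, a minimal sketch using base R's download.file() follows. Everything specific here is an assumption for illustration: fetch_data() is a hypothetical helper, and the URL, folder name, and file name in the example call are placeholders, not values taken from this page.

```r
# Hypothetical helper: download `url` to `destfile` unless the file already exists.
fetch_data <- function(url, destfile) {
  # Create the target folder if needed (no warning if it already exists).
  dir.create(dirname(destfile), showWarnings = FALSE, recursive = TRUE)

  if (!file.exists(destfile)) {
    # mode = "wb" avoids corrupting files on Windows.
    download.file(url = url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)
}

# Example call with placeholder values (replace with a real file link):
# fetch_data("https://example.org/portal_data.csv", "data_raw/portal_data.csv")
```

Skipping the download when the file already exists means the same script can be rerun safely during the lesson, even without a network connection.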
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/datacarpentry/R-ecology-lesson, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".