Workshop 4. Introduction to R

Reasons why you should learn R

Runtime: ~7 min. Created by Gracia Bonilla

RStudio interface

Here is an introduction to the main features of RStudio, the user-friendly IDE (integrated development environment) preferred by many researchers. This video introduces the help operator ?, a very useful feature of R that lets you search the documentation pages for R functions and other objects. It also demonstrates how to assign values to variables using the <- operator, and it introduces the setwd() command, which changes the working directory.

Runtime: ~10 min. Created by Katherine Norwood
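
The commands mentioned above can be tried at the console with a minimal sketch like the one below. The variable names and the use of tempdir() as the target directory are illustrative assumptions, not taken from the video.

```r
# Open the documentation page for a function with the help operator.
# Guarded with interactive() because it opens the help viewer.
if (interactive()) {
  ?mean
}

# Assign values to variables with the <- operator.
n_samples <- 12
lab_name <- "genomics"

# getwd() reports the current working directory; setwd() changes it.
# tempdir() is used here only so the example runs on any machine.
old_dir <- getwd()
setwd(tempdir())
print(getwd())

# Restore the original working directory.
setwd(old_dir)
```

In practice you would pass setwd() the path of your own project folder.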

Installing packages

As we mentioned earlier, a great thing about R is that many packages already exist for analyzing your data. Installing packages is therefore an essential task when working with R, since it lets you incorporate sophisticated analyses into your work very easily. This video introduces the install.packages() command and covers the steps required to install a package from the Bioconductor repository.

Runtime: ~7 min. Created by Katherine Norwood
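
As a rough sketch of the two installation routes, the example below installs one package from CRAN and one from Bioconductor. The package names (ggplot2, limma) and the CRAN mirror are illustrative choices. The Bioconductor step assumes the BiocManager package, which is the currently recommended installer; the video may show a different procedure.

```r
# CRAN mirror used for the downloads (an illustrative choice).
cran <- "https://cloud.r-project.org"

# Install a package from CRAN if it is not already installed.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2", repos = cran)
}

# Bioconductor packages are installed through BiocManager,
# which is itself distributed on CRAN.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager", repos = cran)
}
BiocManager::install("limma")

# Installing is done once; loading is done in every session that needs the package.
library(ggplot2)
```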

Console and Working Environment Basics

The following video introduces the basics of R scripting using the interactive prompt.

Runtime: ~8 min. Created by Katherine Norwood

Atomic data types in R

This video introduces four of the atomic data types in R:

  • logical
  • numeric
  • complex
  • character

This video also introduces the typeof() and mode() functions.

Runtime: ~7 min. Created by Katherine Norwood
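
The four types listed above, and the difference between typeof() and mode(), can be illustrated with a short sketch. The variable names and values are made up for the example.

```r
is_control <- TRUE          # logical
expression_level <- 3.14    # numeric
z <- 2+3i                   # complex
gene_name <- "TP53"         # character

# typeof() reports the internal storage type of an object.
print(typeof(is_control))        # "logical"
print(typeof(expression_level))  # "double"
print(typeof(z))                 # "complex"
print(typeof(gene_name))         # "character"

# mode() groups integer and double values together as "numeric".
print(mode(expression_level))    # "numeric"
print(typeof(1L))                # "integer"
print(mode(1L))                  # "numeric"
```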

This video covers aspects of missing values in R.

Runtime: ~3 min. Created by Katherine Norwood
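
As a small illustration, R marks missing values with NA. The sketch below, which uses made-up measurements, shows how to detect them and how to exclude them from a calculation. The specific functions shown are common choices and may not match the video exactly.

```r
# A numeric vector with one missing value.
measurements <- c(4.2, NA, 5.1, 3.8)

# is.na() flags the missing entries.
print(is.na(measurements))       # FALSE TRUE FALSE FALSE
print(sum(is.na(measurements)))  # 1

# Most summaries return NA if any input is missing...
print(mean(measurements))        # NA

# ...unless missing values are explicitly removed.
print(mean(measurements, na.rm = TRUE))

# Keep only the observed values.
complete <- measurements[!is.na(measurements)]
print(complete)
```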

Multidimensional Data Types in R

Introduction to vectors

Runtime: ~7 min. Created by Katherine Norwood
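
A minimal sketch of creating and subsetting vectors follows. The data and names are invented for illustration.

```r
# Create a numeric vector with c() and give its elements names.
weights <- c(61.5, 72.0, 58.3)
names(weights) <- c("s1", "s2", "s3")

# Subset by position, by name, or with a logical condition.
print(weights[2])
print(weights["s3"])
heavy <- weights[weights > 60]
print(heavy)

# Arithmetic is applied element by element.
print(weights * 2)

# Other common ways to build vectors.
ids <- 1:5
steps <- seq(0, 1, by = 0.25)

# All elements of a vector share one type, so mixed input is coerced.
mixed <- c(1, "a", TRUE)
print(typeof(mixed))   # "character"
```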

Introduction to Matrices, Arrays, Lists, and Data Frames, and how to obtain subsets of the data.

Runtime: 9:30 min. Created by Katherine Norwood
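
The four structures named above can be sketched as follows, with one or two subsetting examples each. All object names and values are made up for the example.

```r
# Matrix: two dimensions, one data type. Values fill column by column.
m <- matrix(1:6, nrow = 2, ncol = 3)
print(m[1, ])   # first row: 1 3 5
print(m[, 2])   # second column: 3 4

# Array: like a matrix, but with any number of dimensions.
a <- array(1:24, dim = c(2, 3, 4))
print(a[1, 2, 3])   # 15

# List: elements can have different types and lengths.
sample_info <- list(id = "s1", counts = c(10, 20, 30), treated = TRUE)
print(sample_info$counts)
print(sample_info[["id"]])

# Data frame: a table whose columns can have different types.
df <- data.frame(
  sample = c("s1", "s2", "s3"),
  age = c(34, 51, 47),
  treated = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)
print(df$age)                    # one column as a vector
older <- df[df$age > 40, ]       # rows that meet a condition
treated_ids <- df[df$treated, "sample"]
print(older)
print(treated_ids)
```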

For more information on how you can manipulate vectors visit this page

For examples on how to manipulate multi-dimensional data objects in R, check out this tutorial

Basics of Flow Control

This video covers the basics of scripting, including if statements and logical operators.

Runtime: ~5:30 min
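
As an illustration of if statements combined with logical operators, consider the sketch below. The thresholds and variable names are invented for the example.

```r
p_value <- 0.03
fold_change <- 2.4

# && and || compare single values and are the usual choice inside if().
if (p_value < 0.05 && fold_change > 2) {
  result <- "significant and strongly changed"
} else if (p_value < 0.05) {
  result <- "significant"
} else {
  result <- "not significant"
}
print(result)

# &, | and ! work element by element on whole vectors.
p_values <- c(0.01, 0.20, 0.04)
fold_changes <- c(1.5, 3.0, 2.5)
both <- p_values < 0.05 & fold_changes > 2
either <- p_values < 0.05 | fold_changes > 2
print(both)                 # FALSE FALSE TRUE
print(either)               # TRUE TRUE TRUE
print(!(p_values < 0.05))   # FALSE TRUE FALSE
```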

This video covers the apply() function.

Runtime: ~7 min
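
A minimal sketch of apply() on a small made-up matrix is shown below, alongside an equivalent for loop for comparison. The second argument selects the dimension to work over: 1 for rows, 2 for columns.

```r
# A small matrix of counts: genes in rows, samples in columns.
counts <- matrix(
  c(10, 20, 30, 40, 50, 60),
  nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3"))
)

# Apply mean() to every row.
gene_means <- apply(counts, 1, mean)
print(gene_means)     # geneA 30, geneB 40

# Apply sum() to every column.
sample_totals <- apply(counts, 2, sum)
print(sample_totals)  # s1 30, s2 70, s3 110

# The same row means computed with an explicit for loop.
loop_means <- numeric(nrow(counts))
for (i in seq_len(nrow(counts))) {
  loop_means[i] <- mean(counts[i, ])
}
print(loop_means)
```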

For more information on the apply family of functions, visit

For more information on the apply versus for loop debate, visit:

Data Wrangling

Tidyverse is a suite of R packages that are designed to work with each other by sharing principles in the way data is structured. Most of the tidyverse packages were written by Hadley Wickham, a key figure in the R(studio) development world. Here, we present a brief intro to some of these packages, highlighting their main functionalities using example datasets, as well as how one can tie these concepts together to go from data in its more raw or ‘messy’ form to pretty visualizations in R!

The following three videos illustrate data manipulation and plotting. The code used can be found here and the data is available here (cancer_samples.csv).

Introduction to reshape. This video also demonstrates how to read a data table into your R environment.

Runtime: ~6 min. Created by Vinay Kartha
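
The sketch below reads a small table and reshapes it from wide to long format. It does not use cancer_samples.csv: the inline data is made up so that the example runs on its own. It also assumes the reshape2 package and its melt() function, which may differ from the exact package shown in the video.

```r
# read.csv() normally takes a file path, e.g. read.csv("my_data.csv").
# Here the table is supplied as text so the example is self-contained.
csv_text <- "sample,geneA,geneB
s1,1.2,3.4
s2,2.5,0.7"
expr_wide <- read.csv(text = csv_text, stringsAsFactors = FALSE)
print(expr_wide)

# Wide to long: one row per sample-gene combination.
if (requireNamespace("reshape2", quietly = TRUE)) {
  expr_long <- reshape2::melt(
    expr_wide,
    id.vars = "sample",
    variable.name = "gene",
    value.name = "expression"
  )
  print(expr_long)
}
```

The long format, with one observation per row, is the layout that plotting and summarizing functions generally expect.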

Introduction to the plyr package

Runtime: 8:30 min. Created by Vinay Kartha
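
As a rough illustration of the split-apply-combine idea behind plyr, the sketch below computes a per-group summary with ddply(). The data frame is made up, and the choice of ddply() with summarise is an assumption about which functions the video demonstrates.

```r
# Made-up expression values for two tissues.
samples <- data.frame(
  tissue = c("lung", "lung", "breast", "breast"),
  expression = c(2, 4, 10, 20),
  stringsAsFactors = FALSE
)

if (requireNamespace("plyr", quietly = TRUE)) {
  # Split by tissue, summarise each piece, and combine into one data frame.
  tissue_summary <- plyr::ddply(
    samples,
    "tissue",
    plyr::summarise,
    mean_expression = mean(expression),
    n = length(expression)
  )
  print(tissue_summary)
}
```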

Plotting and ggplot

A great thing about R is that it makes it very easy to create high-quality graphics from complex data. The following video introduces the basics of the ggplot2 plotting system, which many data scientists prefer over base R plots because of the flexibility it provides when dealing with complex data sets.

Runtime: 9 min. Created by Vinay Kartha
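
A ggplot2 plot is built by adding layers to a base object. The following minimal sketch uses made-up data; the column names and the chosen geom and theme are illustrative only.

```r
# Made-up expression values for two tissues.
samples <- data.frame(
  tissue = c("lung", "lung", "breast", "breast"),
  expression = c(2, 4, 10, 20),
  stringsAsFactors = FALSE
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Map columns to aesthetics, then add a geometry, labels, and a theme.
  p <- ggplot(samples, aes(x = tissue, y = expression, colour = tissue)) +
    geom_point(size = 3) +
    labs(title = "Expression by tissue", x = "Tissue", y = "Expression") +
    theme_bw()

  # Printing the object draws the plot.
  print(p)
}
```

Swapping geom_point() for another geometry, such as geom_boxplot(), changes the plot type without touching the rest of the code.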

Here are some good resources for beginners:

Once you have some experience, great resources are:

Bonus: R Markdown for Reproducibility

Composing reproducible manuscripts using R Markdown

Introduction to R Markdown

For an example, you can view the R Markdown document that generated the R Workshop: Intro to reshape, plyr and ggplot document. You can open it in RStudio and select File -> Knit (Ctrl + Shift + K).
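
Knitting can also be started from the console. The sketch below writes a minimal R Markdown file to a temporary folder and renders it with rmarkdown::render(). The file name and contents are made up for illustration, and rendering only runs if the rmarkdown package and pandoc are available.

```r
# Three backticks, built programmatically, delimit the code chunk in the .Rmd file.
fence <- strrep("`", 3)

# Contents of a minimal R Markdown document: a header, text, and one R chunk.
rmd_lines <- c(
  "---",
  "title: \"Minimal example\"",
  "output: html_document",
  "---",
  "",
  "The mean of the measurements is computed below.",
  "",
  paste0(fence, "{r}"),
  "mean(c(4.2, 5.1, 3.8))",
  fence
)

rmd_file <- file.path(tempdir(), "minimal_example.Rmd")
writeLines(rmd_lines, rmd_file)

# Render the document to HTML, which is what the Knit button does.
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  html_file <- rmarkdown::render(rmd_file, quiet = TRUE)
  print(html_file)
}
```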