Class goals

Class format

Introduction to R and RStudio

Why do I have to learn to program in R?

Data science is a valuable, often necessary, skill for many jobs in environmental science and policy. Over the past 20 years, R has become the leading analysis tool in the environmental sciences. Python and other programming languages have also become increasingly prevalent. Scripting an analysis in any of these languages increases its replicability (or reproducibility), which is important for the advancement of science and policy work. Documenting your analysis in R, using R scripts or Rmd files, allows you to re-run your analysis whenever the need arises and to share your analysis workflow with others. We are learning R in this program because it is fairly standard in our fields, and learning this language will give you a good foundation for learning other languages and tools as needed.

Getting started

Difference between R and RStudio

In this class, we will be working primarily in RStudio. So what is the difference between R and RStudio? R is both a programming language (specifically a statistical analysis programming language) and the software that runs that language, while RStudio is an Integrated Development Environment, or IDE for short. RStudio offers a number of features, mostly related to visual presentation of information, that make writing and working with R code easier.

Overall layout

There are four panels in the RStudio interface (though you may only have three open when you first start it), and each holds valuable information.

  • Console / Terminal panel (lower-left)
  • Environment / History / Git (upper-right)
  • Files / Plots / Packages / Help (lower-right)
  • Source / Editor (upper-left)

File management

Before we do anything in R/RStudio, let’s make a new folder on our computers where our class data can reside. You can use your operating system’s file manager (e.g., Finder on Mac and File Explorer on Windows) to create a new folder wherever suits you.

  • Set up an ENS-623-Research-Stats directory
  • Make a data directory
  • Make a scripts directory
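
The same folders can also be created from R instead of the file manager. The sketch below is only an illustration: it builds the structure inside R’s temporary directory (an assumption, so the example is safe to run anywhere), and course_dir is a name chosen here rather than anything from the class materials. Replace tempdir() with the location you actually want.

```r
# Assumption: build the folders under R's temporary directory so that
# this example does not touch your real files.
course_dir <- file.path(tempdir(), "ENS-623-Research-Stats")

# recursive = TRUE also creates the parent folder if it is missing;
# showWarnings = FALSE keeps R quiet if a folder already exists.
dir.create(file.path(course_dir, "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(course_dir, "scripts"), recursive = TRUE, showWarnings = FALSE)

# List what is inside the course folder.
list.files(course_dir)
```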

Making an R Project

Go back to RStudio. Let’s make a new R Project associated with your ENS-623-Research-Stats directory. To make a new project, go to the upper right-hand side of the RStudio interface, where it says Project: (None). Click the little downward arrow, select “New Project”, then select “Existing Directory” from the window that pops up. Use the graphical user interface (GUI) to navigate to the ENS-623-Research-Stats directory, then select “Create Project”. Once you’ve created your project, your RStudio session will restart in this project.

There is also now a file in the folder you just made that ends in *.Rproj. If you double-click this file, RStudio will open with this project already loaded. This is a good way to re-open RStudio for your next session.

Setting your working directory

If you open a project using the *.Rproj file, then your R session will automatically set the working directory to your ENS-623-Research-Stats folder. If you need to set the working directory manually, here are two ways to do that.

Point-and-click method - Use ‘Session’ > ‘Set Working Directory’ > ‘Choose Directory’.

Using the R Console:

setwd("/Users/maiellolammens/Dropbox/Pace/Teaching/ENS-623-Research-Stats/ENS-623-Research-Stats-SP25/")
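
A path that works on one computer will usually not exist on another, and setwd stops with an error if the folder is missing. One way to guard against that, sketched below, is to check the folder with dir.exists first. The location used here (a course folder directly inside your home directory) is purely hypothetical; substitute your own path.

```r
# Hypothetical location; replace it with the path to your own course folder.
course_dir <- path.expand(file.path("~", "ENS-623-Research-Stats"))

if (dir.exists(course_dir)) {
  # The folder is there, so it is safe to move into it.
  setwd(course_dir)
  message("Working directory is now: ", getwd())
} else {
  # The folder is missing, so leave the working directory alone.
  message("Folder not found, working directory unchanged: ", course_dir)
}
```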

Getting help

  • Help panel (lower right corner)
  • help.search
help.search("bar plot")

Challenge

Use the help.search function to search for something in statistics that you think should be in R. Did you find anything?

  • I know my function - just give me the details - ?barplot
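
Two other base R helpers can be handy when you only half-remember a function. Neither is mentioned in the notes above, so treat this as an optional sketch: apropos searches the names of loaded objects, and args shows a function’s arguments without opening its help page.

```r
# List every loaded object whose name starts with "bar".
apropos("^bar")

# Show the arguments a function accepts without opening its help page.
args(barplot)
```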

R as calculator

We can use R just like any other calculator.

3 + 5
## [1] 8

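R follows the standard order of operations (Please Excuse My Dear Aunt Sally: parentheses, exponents, multiplication and division, addition and subtraction), so the two expressions below give the same result.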

(3 * 5) + 7
## [1] 22
3 * 5 + 7
## [1] 22
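
The same rules apply to the other operators. As a small illustration (these particular expressions are not from the original notes), exponents are evaluated before multiplication, and division before subtraction:

```r
# Exponent first: 3^2 is 9, then 2 * 9.
2 * 3^2
## [1] 18

# Division first: 4 / 2 is 2, then 10 - 2.
10 - 4 / 2
## [1] 8
```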

Challenge

Write an example where adding parentheses matters.

Internal functions

R has a ton of built-in functions, and many more are available through add-on packages.

sqrt(4)
## [1] 2
abs(-5)
## [1] 5
sqrt(-5)
## Warning in sqrt(-5): NaNs produced
## [1] NaN
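
NaN stands for “Not a Number”. The sketch below shows two ways you might follow up on that result; it goes slightly beyond the notes. is.nan tests for NaN, suppressWarnings hides the warning without changing the result, and as.complex asks R for a complex-valued answer instead.

```r
# sqrt(-5) still returns NaN; suppressWarnings() only hides the warning.
is.nan(suppressWarnings(sqrt(-5)))
## [1] TRUE

# Converting the input to a complex number gives a complex square root.
sqrt(as.complex(-4))
## [1] 0+2i
```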

R script file

To ensure replicability, we often use R script files. They are easy to return to, and they let you document your work as you go.

Important: within an R file, you can use the # sign to add comments. Anything written after the # is not interpreted when you run the code.

Important: we can write commands in our R script file, then run/execute those commands by pressing CMD + RETURN (Macs) or CTRL + ENTER (Windows). We can also execute only a section of code by highlighting it and pressing these same keystrokes.
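
A few lines like the following, which are only an illustration, could go in a script file to try out comments and the run shortcut:

```r
# A full-line comment: R skips this line entirely.
3 + 5  # A comment can also follow code on the same line.

# The next line is "commented out", so it does not run:
# sqrt(-5)
abs(-5)
```

Placing the cursor on a line and pressing the run shortcut sends just that line to the Console.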

Challenge

Create a new R script file in your scripts directory. Run a few basic commands in the new file.

Basic file management in R

# What working directory am I in?
getwd()

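# Move to a different directory? Note that "." means the current
# directory, so this call changes nothing; swap in another path to move.
setwd(".")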

# What files are in the current directory?
dir()
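
These commands can be combined into a round trip: move somewhere, look around, and come back. In the sketch below, tempdir() stands in for a real folder (an assumption, so the example runs on any computer), and old_wd is a name chosen here for illustration.

```r
# setwd() invisibly returns the directory you were in before the move,
# so it can be saved and used to get back.
old_wd <- setwd(tempdir())  # assumption: tempdir() stands in for a real folder
getwd()

# dir() is an alias for list.files(); pattern filters file names by
# regular expression, here those ending in ".R".
list.files(pattern = "\\.R$")

# Return to where we started.
setwd(old_wd)
```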

Things to cover

  • Navigating the file path
  • Tab completion of file paths
  • Tab completion of R commands

Challenge

  • Try to auto-complete fil. What do you find?
  • Use the brief help menu that comes up to find a function that starts with file, and describe what you think it does.

Rmd file

Using R Markdown, we can integrate descriptive text, R code, and the output from that code into a seamless document that can be easily reproduced. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. While some of the details of using Markdown (and thus R Markdown) can get tricky, the basics are very easy to pick up. RStudio even comes with two very useful tools to learn and use Markdown. First, go to the Help menu and select Markdown Quick Reference. Second, a more detailed reference can be found in the Help -> Cheatsheets section.

Markdown vs R Markdown

R Markdown is a tool that allows you to make a simple text document using the Markdown formatting syntax with R code and associated output embedded.

Starting with R Markdown

Go to the new file button (upper left corner of the RStudio interface, or File -> New File) and choose the R Markdown option. A screen will pop up requesting information such as Title, Author, and Output Format. Give your new doc a title and select HTML as the output format. A new file that is pre-populated with a bunch of text will open in the Source panel.

We will go over each section briefly:

  • Header
  • Sections
  • Code Chunks (The code chunk syntax is very specific and important. As a beginner, you may choose to use the GUI button to make empty chunks - it’s the little green box with a C in it and a plus sign.)

When you click the Knit button, a document will be generated that includes both the content and the output of any embedded R code chunks within the document. Click Knit now to see what happens.
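
Knitting can also be started from code. The sketch below is an assumption-laden illustration, not the class workflow: it writes a minimal R Markdown file (header, one section, one code chunk) to R’s temporary directory, then calls the render function from the rmarkdown package, which is roughly what the Knit button does for you. The file name example.Rmd and the names fence, rmd_lines, and rmd_file are made up for this example. The render step is skipped if rmarkdown or pandoc is not installed.

```r
# Three backticks open and close a code chunk in an R Markdown file.
fence <- strrep("`", 3)

# A minimal R Markdown document: header, one section, one code chunk.
rmd_lines <- c(
  "---",
  "title: \"My first R Markdown document\"",
  "output: html_document",
  "---",
  "",
  "## A section heading",
  "",
  "Some descriptive text.",
  "",
  paste0(fence, "{r}"),
  "3 + 5",
  fence
)

# Assumption: write the file to R's temporary directory.
rmd_file <- file.path(tempdir(), "example.Rmd")
writeLines(rmd_lines, rmd_file)

# Knit from code, but only if rmarkdown and pandoc are installed.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  rmarkdown::render(rmd_file, quiet = TRUE)
}
```

If the render step runs, the HTML output is written next to the Rmd file, in the same temporary directory.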