Section Goals


User defined functions

What is a function?

R code that does some generic “computation” task.

Why write a function?

To make repeating a process easier. Rule of thumb - if you are copying and pasting your code, to use in someplace else in a script, you should probably write a function.

Example: Converting from Fahrenheit to Kelvin

Note: the example below is borrowed from the Software Carpentry Introduction to Programming materials.

If I give you a single temperature measurement in Fahrenheit, how would you convert it to the Kelvin scale?

my_temp <- 41
((my_temp - 32) * (5 / 9)) + 273.15
## [1] 278.15

If I give you another value, you would probably copy-paste the above, and change the number accordingly.

Instead, make a generic function.

fahr_to_kelvin <- function(temp) {
  kelvin <- ((temp - 32) * (5 / 9)) + 273.15
  return(kelvin)
}

Aside: Indentations and white space

You will notice in R Studio, when you are making a function, the editor will automatically indent the code lines inside the function. Technically this indentation (white space) is not necessary in R (though it is for other languages, such as Python), however, it is meant to make reading your code easier. Generally, lines that are indented similarly are “at the same level” (i.e., in the function or in the loop).

Convert from Kelvin to Celsius

kelvin_to_celsius <- function(temp) {
  celsius <- temp - 273.15
  return(celsius)
}

Challenge

Write a function to convert from Fahrenheit to Celsius.


for loops

How do I do the same thing many times?

Generic for loop

for (variable in collection) {
  do things with variable
}

Note: I created a short YouTube video several years ago presenting for loops, which you might find helpful too https://youtu.be/tTOTZtcoo1c

Let’s get more specific. Say we took a bunch of measurements of temperature in Fahrenheit, but want to convert them. How might we do it?

Make our temperature data set

set.seed(8)
temp_data <- runif(n = 20, min = -5, max = 5) + 45

Challenge

What did we just do in the code above?


Aside: Resetting the seed so the numbers are random again

To reset your seed, you can execute the following command:

rm(.Random.seed, envir=globalenv())

Use our fahr_to_kelvin function on each element

Iteration 1.

for( x in temp_data){
  fahr_to_kelvin(x)
}

Iteration 2. By element

for( x in temp_data){
  print(fahr_to_kelvin(x))
}
## [1] 280.185
## [1] 278.749
## [1] 282.037
## [1] 281.216
## [1] 279.3806
## [1] 281.5885
## [1] 279.2104
## [1] 282.7737
## [1] 281.8675
## [1] 281.175
## [1] 280.1336
## [1] 278.0906
## [1] 279.9966
## [1] 280.622
## [1] 278.3624
## [1] 282.749
## [1] 277.6017
## [1] 279.0637
## [1] 279.1307
## [1] 280.4895

Iteration 3. By element location

for( x in 1:length(temp_data)){
  print(fahr_to_kelvin(temp_data[x]))
}
## [1] 280.185
## [1] 278.749
## [1] 282.037
## [1] 281.216
## [1] 279.3806
## [1] 281.5885
## [1] 279.2104
## [1] 282.7737
## [1] 281.8675
## [1] 281.175
## [1] 280.1336
## [1] 278.0906
## [1] 279.9966
## [1] 280.622
## [1] 278.3624
## [1] 282.749
## [1] 277.6017
## [1] 279.0637
## [1] 279.1307
## [1] 280.4895

Iteration 4. By element location with output in a new vector

temp_data_kelvin <- vector()
for( x in 1:length(temp_data)){
  temp_data_kelvin[x] <- fahr_to_kelvin(temp_data[x])
}

print(temp_data_kelvin)
##  [1] 280.1850 278.7490 282.0370 281.2160 279.3806 281.5885 279.2104 282.7737
##  [9] 281.8675 281.1750 280.1336 278.0906 279.9966 280.6220 278.3624 282.7490
## [17] 277.6017 279.0637 279.1307 280.4895

Flow control with conditionals

We can use conditional statements to control the flow of our code, and to “make choices” as it progresses.

if and else

The if and else statements are key to making choices in your code. Before understanding if/else statements, we need to review booleans - i.e., TRUE and FALSE

Aside: TRUE and FALSE values

TRUE
## [1] TRUE
20 == 20
## [1] TRUE
20 > 40
## [1] FALSE

A ! sign can be used as a logical negation

!(20 > 40)
## [1] TRUE

There are many logical operators to consider.


Challenge: Describe what each fo the following operators are doing.

x <- TRUE
y <- FALSE
x & y
## [1] FALSE
x | y
## [1] TRUE
xy <- c(x,y)
any(xy)
## [1] TRUE
all(xy)
## [1] FALSE

Now that we know a bit about booleans, let’s get into if/else statements.

Essentially, an if is a conditional that says, “do this thing after the if statement, if the conditional was TRUE.”

Here’s a simple example (taken from Software Carpentry’s lessons)

num <- 37
if (num > 100) {
  print("greater than 100")
} else {
  print("not greater than 100")
}
## [1] "not greater than 100"

Challenge

Re-define num so you get the other option.


We don’t need an else statement for this to work -

num <- 37
if (num > 100) {
  print("The number was greater than 100")
}

if/else cascades

We can also write a “cascade” of if/else statements

if (num > 0) {
  print("positive")
} else if (num == 0) {
  print("equal to 0")
} else {
  print("negative")
}
## [1] "positive"

Challenge

Make the above code into a function, that takes in any value, and returns whether it is positive, negative, or equal to 0.


while loops

while loops are not necessarily as common as for loops, but they are quite important. One key difference between for loops and while loops is that the former is usually set to repeat a given number of times, while the later repeats until some condition is met. Incidentally, this also means it’s easy to get into an infinite loop when writing while loops. Be careful!

Here is a basic while loop.

foo <- 0
while(foo < 10){
  print(foo)
  foo <- foo + 1
}
## [1] 0
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9

We’re going to use a while loop to address the challenge below.

Writing Pseudocode

When faced with a problem such as that in the challenge below, I will often write out the flow of my simulation before I start coding. This is called pseudocode in computer programming. For this challenge, I might write out the pseudocode in several steps.

Challenge: Number of coin flips it takes to get to 100 heads

Imagine you are flipping a fair coin (i.e., there is a 0.5 chance of getting a heads and a 0.5 chance of getting a tails). You want to continue flipping the coin until you get 100 heads. Write a script that simulates coin flips, and counts the number of times you need to flip the coin until you get 100 heads. Note that since the coin flip is a random process, it will not necessarily take you the same number of flips if you re-run the script.

Pseudo code

  1. Begin by writing down the main things you need to do
- Flip a coin
- Determine if the coin is "heads" or "tails"
- Repeat the flip until you get 100 heads
- Tally the number of heads and the number of total flips
  1. Start putting these things in order

  2. Add in key components of the flow-control