A theory is a proposed explanation for some event or process, based on the available evidence related to that event. We also call this a model. A model does not necessarily involve math.
A hypothesis is a prediction based on our model (or theory). For a hypothesis to be considered scientific, it must be falsifiable.
Traditional view:
Check out Understanding Science
The process of analyzing data is often iterative. That means we may start by looking at our data and applying analysis techniques that are not the ones we end up using in our final reports and manuscripts. Don’t be afraid to start exploring your data, and applying analyses, before you think you’re ready.
In short, R is the “industry standard”. The longer answer has to do with the replicability of data analyses. Documenting your analysis in R, using R scripts or Rmd files, allows you to re-run your analysis whenever the need arises and to share your analysis workflow with others.
Biostatistics directory
data directory
scripts directory
help.search
help.search("bar plot")
Challenge
Use the help.search function to search for something in statistics that you think should be in R. Did you find anything?
?barplot
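Alongside help.search and ?, base R has a couple of other lookup tools that can be handy when you only remember part of a name. This is a small sketch; the search pattern "plot" is just an illustration.

```r
# List every object on the search path whose name contains "plot"
plot_functions <- apropos("plot")
head(plot_functions)

# Show a function's arguments without opening the full help page
args(sqrt)
```

apropos matches against object names only, while help.search looks through the text of the help pages, so the two can return quite different results.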
We can use R just like any other calculator.
3 + 5
## [1] 8
R follows the standard order of operations (PEMDAS: Please Excuse My Dear Aunt Sally).
(3 * 5) + 7
## [1] 22
3 * 5 + 7
## [1] 22
Challenge
Write an example where adding parentheses matters.
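After you have tried the challenge yourself, here is one possible illustration. The variable names are made up for this sketch.

```r
# Multiplication happens before addition unless parentheses say otherwise
without_parens <- 3 * 5 + 7     # (3 * 5) + 7 = 22
with_parens <- 3 * (5 + 7)      # 3 * 12 = 36

# Exponentiation binds more tightly than the minus sign in front
neg_square <- -2^2              # evaluated as -(2^2) = -4
square_of_neg <- (-2)^2         # parentheses force the negation first = 4

c(without_parens, with_parens, neg_square, square_of_neg)
```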
R has many built-in functions, and many more are available through add-on packages.
sqrt(4)
## [1] 2
abs(-5)
## [1] 5
sqrt(-5)
## Warning in sqrt(-5): NaNs produced
## [1] NaN
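NaN means "not a number". The sketch below shows one way to spot NaN values after a calculation, and one way to get a complex root if that is what you actually want. The vector x is made up for illustration.

```r
x <- c(4, -5, 9)

# sqrt() warns and returns NaN for the negative element;
# suppressWarnings() is used here only to keep the output tidy
roots <- suppressWarnings(sqrt(x))
roots

# is.nan() flags which results are not valid numbers
is.nan(roots)

# Converting to a complex number first gives a complex root instead of NaN
complex_root <- sqrt(as.complex(-5))
complex_root
```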
Use a script file for your work. A script is easy to return to later and easy to document.
Important: within an R file, you can use the # sign to add comments. Anything written after the # is not interpreted when you run the code.
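A short sketch of how comments behave in a script. The variable names radius and area are hypothetical.

```r
# A full-line comment: R skips this line entirely
radius <- 2            # a comment can also follow code on the same line
area <- pi * radius^2  # only the part before the # is run
# area <- 0            # this assignment is commented out, so it never runs
area
```

Commenting out a line, as in the second-to-last line, is a common way to switch code off temporarily without deleting it.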
Challenge
Create a new R script file in your scripts directory.
# What working directory am I in?
getwd()
## [1] "/Users/maiellolammens/Dropbox/Pace/Teaching/Web/ENS-623-Research-Stats/lectures"
# Move to a different directory ("." means the current directory, so this call leaves it unchanged)
setwd(".")
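A slightly fuller sketch of moving between directories. It assumes a sub-directory called scripts may exist inside the current working directory; if it does not, the code simply stays where it is.

```r
# Remember where we started so we can return later
start_dir <- getwd()

# Build a path in a way that works on any operating system
scripts_dir <- file.path(start_dir, "scripts")

# Only move if the directory actually exists
if (dir.exists(scripts_dir)) {
  setwd(scripts_dir)
}
getwd()

# Go back to the starting directory
setwd(start_dir)
```

Using file.path instead of pasting strings together avoids having to think about whether the path separator is / or \ on a given computer.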
Challenge
fil, what do you find?
file, and describe what you think it does.
Use an Rmd (R Markdown) file to integrate text and R code into the same document. I will expect most of your homework assignments as an Rmd file.
Practice with Rmd file
There are several basic types of data structures in R.
str()
class()
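As a quick sketch of what these two functions report, here they are applied to a few simple values chosen only for illustration.

```r
# class() reports the type of a value
class(8)        # "numeric"
class("cat")    # "character"
class(TRUE)     # "logical"

# str() prints a compact summary: the type followed by the value(s)
str(8)
str("cat")
```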
Define a variable
my_var <- 8
And another
my_var2 <- 10
Work with vars
my_var + my_var2
## [1] 18
Make a new variable
my_var_tot <- my_var + my_var2
Challenge
Change the value of my_var2
my_var2 <- 3
What is the value of my_var_tot now?
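Once you have an answer, the sketch below shows the same idea with different, hypothetical names (a, b, and total), so you can check your reasoning.

```r
a <- 8
b <- 10
total <- a + b   # total stores the number 18, not the formula "a + b"

b <- 3           # changing b afterwards does not touch total
total

total <- a + b   # re-run the assignment to pick up the new value of b
total
```

This is one reason to keep your work in a script: re-running the script from the top keeps every variable consistent with the others.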
Combining values into a vector
# Vector of variables
my_vect <- c(my_var, my_var2)
# Numeric vector
v1 <- c(10, 2, 8, 7, 11, 15)
# Char vector
pets <- c("cat", "dog", "rabbit", "pig")
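Once values are in a vector, you can ask how long it is, pull out individual elements, and do arithmetic on all elements at once. A brief sketch using the vectors defined above; the name mixed is hypothetical.

```r
v1 <- c(10, 2, 8, 7, 11, 15)
pets <- c("cat", "dog", "rabbit", "pig")

length(v1)      # number of elements: 6
v1[2]           # second element: 2
pets[c(1, 3)]   # first and third elements: "cat" "rabbit"
v1 * 2          # arithmetic is applied to every element

# Mixing types in one vector converts everything to a single type
mixed <- c(1, "cat")
class(mixed)    # "character"
```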
Making a vector of numbers in sequence
v2 <- 1:10
v3 <- seq(from = 1, to = 10)
Challenge
seq function, and use this to make a vector from 1 to 100, by steps of 5.
length.out argument.
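To show how the two arguments differ without giving away the challenge, here is a sketch on a different range. The names by_step and by_length are made up for illustration.

```r
# Step size given explicitly with `by`
by_step <- seq(from = 0, to = 1, by = 0.25)
by_step

# Number of elements given with `length.out`; R works out the step size
by_length <- seq(from = 0, to = 1, length.out = 5)
by_length

# Both calls give the same five values: 0.00 0.25 0.50 0.75 1.00
all.equal(by_step, by_length)
```

Use by when you care about the spacing between values, and length.out when you care about how many values you end up with.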