ggplot2
This section uses ggplot2 to create data visualizations. In section 03 we learned to use ggplot2 to make an x-y scatter plot comparing Sepal.Length and Petal.Length in the iris data set.
Recall that you must first install ggplot2 if you have not done so already. ONLY DO THIS IF YOU HAVE NOT ALREADY INSTALLED ggplot2:
install.packages("ggplot2")
Then load it into your environment:
library(ggplot2)
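If you are not sure whether ggplot2 is already installed, the two steps above can be combined into one chunk that is safe to rerun. This is only a sketch of one common pattern, not part of the original exercise: it relies on base R's requireNamespace(), which returns FALSE when a package is not available, so install.packages() runs only when needed.

```r
# Install ggplot2 only when it is not already available (assumed pattern,
# not from the original lab), then load it as usual.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}
library(ggplot2)
```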
The code to create one of the plots from section 03 looks like this:
data(iris)
ggplot(data = iris, aes( x = Sepal.Length, y = Petal.Length, colour = Species )) +
  geom_point()
We also learned that you can add the term geom_smooth(method = "lm") to add linear regression lines onto your plot, as such:
# You will modify this code, as per description below
ggplot(data = iris, aes( x = Sepal.Length, y = Petal.Length, colour = Species )) +
  geom_point() +
  geom_smooth(method = "lm")
## `geom_smooth()` using formula = 'y ~ x'
Use the code above, but replace Petal.Length with Sepal.Width. Make a new x-y scatter plot with linear regression lines. Describe how having separate lines for each species versus one line for all of the data combined influences your interpretation of these data, and specifically the relationship between Sepal.Length and Sepal.Width.
Hint: You should make another plot where you do not include any color arguments to see what happens. That is, a plot where you ignore the species identifications.
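As an illustration of the hint, here is a minimal sketch that drops the colour aesthetic from the section 03 plot (Sepal.Length versus Petal.Length, not the variables asked for in this exercise). The object name p_pooled is made up for this example. Without a colour mapping, ggplot2 treats all of the flowers as a single group, so geom_smooth() fits one regression line instead of one per species.

```r
library(ggplot2)
data(iris)

# No colour aesthetic: every observation belongs to one group,
# so a single linear regression line is fitted to all of the data.
# p_pooled is a hypothetical name used only in this sketch.
p_pooled <- ggplot(data = iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point() +
  geom_smooth(method = "lm")

p_pooled
```

The same change, applied to your Sepal.Width plot, gives you the pooled line to compare against the per-species lines.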
Put your written answer to the question here.
Below is a code chunk to create a histogram of petal length values using the iris data set.
ggplot() +
  geom_histogram(data = iris, aes(x = Petal.Length, fill = Species),
                 position = "dodge") +
  theme_bw()
Modify this code to make a histogram for Sepal.Length instead.
# Put your code here.
What differences do you notice about the values, as presented in these histograms, for petal length versus sepal length?
Put your written answer to the question above, here.