You should see a plot appear if setup is successful.
2016-06-22
The qplot() function is the basic workhorse of ggplot2.
The qplot() function has a basic syntax:
qplot(variables, plot type, dataset, options)
We will explore the diamonds data set (preloaded along with ggplot2) using qplot for basic plotting.
The data set was scraped from a diamond exchange company database by Hadley Wickham. It contains the prices and attributes of over 50,000 diamonds.
What does the data look like?
Let's look at the top few rows of the diamonds data frame to find out!
head(diamonds)
##   carat       cut color clarity depth table price    x    y    z
## 1  0.23     Ideal     E     SI2  61.5    55   326 3.95 3.98 2.43
## 2  0.21   Premium     E     SI1  59.8    61   326 3.89 3.84 2.31
## 3  0.23      Good     E     VS1  56.9    65   327 4.05 4.07 2.31
## 4  0.29   Premium     I     VS2  62.4    58   334 4.20 4.23 2.63
## 5  0.31      Good     J     SI2  63.3    58   335 4.34 4.35 2.75
## 6  0.24 Very Good     J    VVS2  62.8    57   336 3.94 3.96 2.48
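Beyond the first few rows, it can help to check the size and column types of the data frame. A minimal sketch, assuming ggplot2 is installed so that diamonds is available; the variable names n_diamonds and n_columns are only illustrative:

```r
library(ggplot2)

# Number of rows (one per diamond) and number of columns
n_diamonds <- nrow(diamonds)
n_columns <- ncol(diamonds)
print(c(rows = n_diamonds, columns = n_columns))

# Compact overview of each column's type and first few values
str(diamonds)

# clarity is stored as a factor; list its levels in order
print(levels(diamonds$clarity))
```

The row count is a quick way to confirm the "over 50,000 diamonds" figure quoted above.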
Basic scatter plot of diamond price vs carat weight
qplot(carat, price, geom = "point", data = diamonds)
Scatter plot of diamond price vs carat weight showing the versatility of options in qplot
qplot(carat, log(price), geom = "point", data = diamonds, alpha = I(0.2), colour = color, main = "Log price by carat weight, grouped by color") + xlab("Carat Weight") + ylab("Log Price")
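In the call above, alpha = I(0.2) wraps the value in I() so that it is treated as a fixed setting rather than a variable to map, while colour = color maps the color column to point colour. For comparison, here is a sketch of what should be an equivalent full ggplot() call. qplot() is deprecated as of ggplot2 3.4.0, so this form may be preferable in newer code; the object name p is only illustrative:

```r
library(ggplot2)

# Mapped aesthetics go inside aes(); fixed settings go outside it
p <- ggplot(diamonds, aes(x = carat, y = log(price), colour = color)) +
  geom_point(alpha = 0.2) +   # constant transparency, not mapped to data
  labs(
    title = "Log price by carat weight, grouped by color",
    x = "Carat Weight",
    y = "Log Price"
  )
print(p)
```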
All of the "Your Turn" exercises for this section will use the tips data set:
tips <- read.csv("http://heike.github.io/rwrks/summerschool/data/tips.csv")
Basic histogram of price
qplot(price, geom = "histogram", data = diamonds)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Histogram of price with binwidth set to $50
qplot(price, geom = "histogram", binwidth = 50, data = diamonds)
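To get a feel for how binwidth relates to the default of 30 bins, the arithmetic can be sketched directly from the price range. This is a rough approximation (the exact bin edges chosen by stat_bin may differ slightly), and the variable names are only illustrative:

```r
library(ggplot2)

price_range <- range(diamonds$price)   # smallest and largest price
span <- diff(price_range)              # width of the price range

default_binwidth <- span / 30          # approximate bin width when bins = 30
bins_at_50 <- ceiling(span / 50)       # approximate bin count when binwidth = 50

print(c(default_binwidth = default_binwidth, bins_at_50 = bins_at_50))
```

A much smaller binwidth means many more bins, which is why finer structure becomes visible in the second histogram.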
The gap in prices at around $1,500 is due to the scraping procedure.
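One way to locate the gap numerically is to tabulate how many diamonds fall into each $100 price band in the lower part of the range. A minimal sketch; the $1,000 to $2,500 window and the names low, bins and gap_counts are assumptions chosen for illustration:

```r
library(ggplot2)

# Keep only diamonds priced from $1,000 up to (but not including) $2,500
low <- subset(diamonds, price >= 1000 & price < 2500)

# Assign each price to a $100-wide band: [1000,1100), [1100,1200), ...
bins <- cut(low$price, breaks = seq(1000, 2500, by = 100), right = FALSE)

# Count diamonds per band; an unusually small count marks the gap
gap_counts <- table(bins)
print(gap_counts)
```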
Price histograms faceted by clarity
qplot(price, geom = "histogram", data = diamonds, binwidth = 100, facets = .~clarity)
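The facets formula reads rows ~ columns, so .~clarity lays the panels out side by side in a single row. As a sketch of the alternative, swapping the two sides stacks one panel per clarity level in its own row, which can make the histograms easier to compare along a shared price axis. The object name p_rows is only illustrative:

```r
library(ggplot2)

# clarity ~ . puts each clarity level in its own row
p_rows <- qplot(price, geom = "histogram", data = diamonds,
                binwidth = 100, facets = clarity ~ .)
print(p_rows)
```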
Side by side boxplot of diamond prices within clarity groupings
qplot(clarity, log(price), geom = "boxplot", data = diamonds)
Why does price decrease as the quality of the diamonds increases?
Side by side boxplot of log prices within clarity groupings with jittered values overlaid
qplot(clarity, log(price), geom = "boxplot", data = diamonds, main = "Boxplots of log Diamond Prices Grouped by Clarity") + geom_jitter(alpha = I(.025))
There are two groups of prices … maybe related to size?
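One way to check the size hunch is to compare the typical carat weight within each clarity level. A minimal sketch using base R's tapply; the name median_carat is only illustrative:

```r
library(ggplot2)

# Median carat weight within each clarity level
# (levels are ordered from I1, the lowest clarity, to IF, the highest)
median_carat <- tapply(diamonds$carat, diamonds$clarity, median)
print(round(median_carat, 2))
```

If the lower clarity levels show larger median weights, that would suggest size is confounding the price comparison across clarity groups.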
Side by side boxplot of log price divided by carat within clarity groupings
qplot(clarity, log(price)/carat, geom = "boxplot", data = diamonds)
To investigate bar plots we will switch over to the Titanic data set:
titanic <- as.data.frame(Titanic)
The data include passenger characteristics and survival outcomes for those aboard the RMS Titanic's ill-fated maiden voyage.
Basic bar plot of survival outcomes
qplot(Survived, geom = "bar", data = titanic, weight = Freq)
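The weight = Freq argument matters because the converted Titanic table holds one row per combination of characteristics, with the number of people in the Freq column, rather than one row per person. A minimal sketch of the totals that the bars represent, using base R only; the name survival_totals is only illustrative:

```r
titanic <- as.data.frame(Titanic)

# One row per Class x Sex x Age x Survived combination, with a count in Freq
str(titanic)

# Summing Freq within each Survived level gives the bar heights
survival_totals <- xtabs(Freq ~ Survived, data = titanic)
print(survival_totals)
```

Without the weight, each bar would simply count rows, so both bars would have the same height.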
Bar plot faceted by gender and class
qplot(Survived, geom = "bar", data = titanic, weight = Freq, facets = Sex~Class)