2016-06-21

R Packages

  • Commonly used R functions are installed with base R
  • R packages containing more specialized functions can be installed for free from CRAN servers using the function install.packages()
  • After packages are installed, their functions can be loaded into the current R session using the function library()
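
As a minimal sketch of the install-then-load workflow above, here is one way it might look, using ggplot2 purely as an example package. The install step is commented out because it needs a network connection and only has to run once per machine; the variable name has_ggplot2 is made up for illustration.

```r
# Install once per machine (needs a network connection), so it is
# left commented out in this sketch:
# install.packages("ggplot2")

# Check whether the package is available without attaching it
has_ggplot2 <- requireNamespace("ggplot2", quietly = TRUE)

# Attach the package for the current session only if it is installed
if (has_ggplot2) {
    library(ggplot2)
}
```

Installing is a one-time step, while library() has to be called again in every new R session.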

Finding R Packages

  • How do I locate a package with the desired function?
  • Google: searching for "R project" plus the search term works well
  • CRAN Task Views list packages by subject area: http://cran.r-project.org/web/views/
  • ??searchterm searches the help pages of installed packages for the search term
  • The sos package adds helpful features for finding packages related to a particular topic
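
A short sketch of these search tools, assuming only base R is installed. The search terms and the variable names hits and mean_like are arbitrary choices for illustration. The sos call is commented out because it needs the package and a network connection.

```r
# Search the help pages of installed packages by keyword;
# ??"linear model" is shorthand for this call
hits <- help.search("linear model")

# List objects on the search path whose names start with "mean"
mean_like <- apropos("^mean")
print(mean_like)

# Searching across CRAN with the sos package (not run here):
# library(sos)
# findFn("mixed model")
```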

Handy R Packages

  • ggplot2: Statistical graphics
  • dplyr/tidyr: Manipulating data structures
  • lme4: Mixed models
  • knitr: Integrate LaTeX, HTML, or Markdown with R for easy reproducible research

Creating Your Own Functions

Code Skeleton:

foo <- function(arg1, arg2, ...) {
    # Code goes here
    return(output)
}

Example:

mymean <- function(data) {
    ans <- sum(data) / length(data)
    return(ans)
}
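
To illustrate how arguments with default values fit into the skeleton, here is a hedged variant of the example. The name mymean_narm and its na.rm argument are hypothetical additions, not part of the original function.

```r
# Hypothetical variant of mymean() with an optional na.rm argument
mymean_narm <- function(data, na.rm = FALSE) {
    if (na.rm) {
        data <- data[!is.na(data)]  # drop missing values before averaging
    }
    ans <- sum(data) / length(data)
    return(ans)
}

mymean_narm(c(1, 2, 3, 4))              # 2.5
mymean_narm(c(1, NA, 3))                # NA, because sum() propagates NA
mymean_narm(c(1, NA, 3), na.rm = TRUE)  # 2
```

Because na.rm defaults to FALSE, callers who do not pass it get the same behavior as the original mymean().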

If/Else Statements

Skeleton:

if (condition) {
    # Some code that runs if condition is TRUE
} else {
    # Some code that runs if condition is FALSE
}

Example:

mymean <- function(data) {
    if (!is.numeric(data)) {
        stop("Numeric input is required")
    } else {
        ans <- sum(data) / length(data)
        return(ans)
    }
}
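
To see both branches in action, the sketch below repeats the function so it runs on its own, then calls it with valid and invalid input. Catching the error with tryCatch() is an addition for illustration; the variable names ok and msg are made up.

```r
mymean <- function(data) {
    if (!is.numeric(data)) {
        stop("Numeric input is required")
    } else {
        ans <- sum(data) / length(data)
        return(ans)
    }
}

# Numeric input takes the else branch
ok <- mymean(c(2, 4, 6))   # 4

# Character input triggers stop(); tryCatch() captures the message
# instead of halting the script
msg <- tryCatch(
    mymean(c("a", "b")),
    error = function(e) conditionMessage(e)
)
print(msg)   # "Numeric input is required"
```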

Looping

  • Reducing the amount of typing we do can be nice
  • If we have a lot of code that is essentially the same we can take advantage of looping.
  • R offers several loops: for, while, repeat.
  • R also supports implicit looping over vectors and lists (the apply family of functions … more later)

for (i in 1:3) {
    print(i)
}
## [1] 1
## [1] 2
## [1] 3
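
As a rough comparison of explicit and implicit looping, the sketch below squares each element of a small vector both ways. The vector x and the result names are made up for illustration.

```r
x <- c(2, 4, 6)

# Explicit loop: pre-allocate the result, then fill it in by index
squares <- numeric(length(x))
for (i in seq_along(x)) {
    squares[i] <- x[i]^2
}

# Implicit loop: sapply() applies the function to each element
squares_implicit <- sapply(x, function(v) v^2)

identical(squares, squares_implicit)  # TRUE
```

seq_along(x) is used instead of 1:length(x) because the latter yields c(1, 0) when x is empty, so the loop body would run on indices that do not exist.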

For Loops

tips <- read.csv("http://heike.github.io/rwrks/01-r-intro/data/tips.csv")

id <- c("total_bill", "tip", "size")
for (colname in id) {
    print(colname)
}
## [1] "total_bill"
## [1] "tip"
## [1] "size"

for (colname in id) {
    print(paste(colname, mymean(tips[, colname])))
}
## [1] "total_bill 19.7859426229508"
## [1] "tip 2.99827868852459"
## [1] "size 2.56967213114754"
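
Printing is fine for a quick look, but results are often easier to reuse when stored. The sketch below collects the column means in a named vector. To stay runnable without a download it uses a tiny made-up data frame, tips_small, rather than the real tips data, and base mean() rather than mymean().

```r
# Hypothetical stand-in for the tips data (values are invented)
tips_small <- data.frame(
    total_bill = c(10, 20, 30),
    tip        = c(1, 2, 3),
    size       = c(2, 2, 5)
)

id <- c("total_bill", "tip", "size")

# Pre-allocate a named vector, then fill it inside the loop
means <- numeric(length(id))
names(means) <- id
for (colname in id) {
    means[colname] <- mean(tips_small[, colname])
}
print(means)

# The same result with an implicit loop over the selected columns
means_sapply <- sapply(tips_small[id], mean)
```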

While Loops

i <- 1
while (i <= 5) {
    print(i)
    i <- i + 1
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
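
The repeat loop mentioned earlier has no condition of its own, so it needs an explicit break. As a sketch, the while loop above could be rewritten like this:

```r
i <- 1
repeat {
    print(i)
    i <- i + 1
    # Leave the loop once the counter passes 5
    if (i > 5) {
        break
    }
}
```

This prints the numbers 1 through 5, the same as the while version. Forgetting the break would make the loop run forever.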

Your Turn

  1. Create a function that takes numeric input and returns the mean and a 95% confidence interval for the mean (the t.test function could be useful)
  2. Add checks to your function to make sure the input is either numeric or logical. If it is logical, convert it to numeric.
  3. Loop over the columns of the diamonds data set (from the ggplot2 package) and apply your function to all of the numeric columns.