class: center, middle, inverse, title-slide # Packages and Basic Programming ### Haley Jeppson, Sam Tyner --- ## R Packages - Commonly used R functions are installed with base R - R packages containing more specialized R functions can be installed freely from CRAN servers using function `install.packages()` - After packages are installed, their functions can be loaded into the current R session using the function `library()` --- ## Finding R Packages - How do I locate a package with the desired function? - Google ("R project" + search term works well) - R website task views to search relevent subjects: http://cran.r-project.org/web/views/ - ??searchterm will search R help for pages related to the search term <!-- `sos` package adds helpful features for searching for packages related to a particular topic --> --- ## Handy R Packages - `ggplot2`: Statistical graphics - `dplyr`/`tidyr`: Manipulating data structures - `lme4`: Mixed models - `knitr`: integrate LaTeX, HTML, or Markdown with R for easy reproducible research - `vegan`: Ordination methods, diversity analysis and other functions for community and vegetation ecologists. - `phyloseq`: Handling and analysis of high-throughput microbiome census data - `ggtree`: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data - `Biostrings`: Efficient manipulation of biological strings <!-- this needs updated!! --> --- ## Creating Your Own Functions Code Skeleton: ```r foo <- function(arg1, arg2, ...) { # Code goes here return(output) } ``` Example: ```r mymean <- function(data) { ans <- sum(data) / length(data) return(ans) } ``` --- ## If/Else Statements Skeleton: ```r if (condition) { # Some code that runs if condition is TRUE } else { # Some code that runs if condition is FALSE } ``` Example: ```r mymean <- function(data) { if (!is.numeric(data)) { stop("Numeric input is required") } else { ans <- sum(data) / length(data) return(ans) } } ``` --- ## Looping - Reducing the amount of typing we do can be nice - If we have a lot of code that is essentially the same we can take advantage of looping. - R offers several loops: `for`, `while`, `repeat`. ```r for (i in 1:3) { print(i) } ``` ``` ## [1] 1 ## [1] 2 ## [1] 3 ``` --- ## For Loops ```r final_shed <- read.csv("http://heike.github.io/rwrks/01-r-intro/data/final_shedding.csv") id <- c("pig_weight", "total_shedding", "daily_shedding") for (colname in id) { print(colname) } ``` ``` ## [1] "pig_weight" ## [1] "total_shedding" ## [1] "daily_shedding" ``` ```r # this does NOT work if the csv is read in using read_csv for(colname in id) { print(paste(colname, mymean(final_shed[, colname]))) } ``` ``` ## [1] "pig_weight 28.8230508474576" ## [1] "total_shedding 40.6754134817021" ## [1] "daily_shedding 3.44509255080266" ``` --- ## While Loops ```r i <- 1 while (i <= 5) { print(i) i <- i + 1 } ``` ``` ## [1] 1 ## [1] 2 ## [1] 3 ## [1] 4 ## [1] 5 ``` --- class: inverse ## Your Turn 1. Create a function that takes numeric input and provides the mean and a 95% confidence interval for the mean for the data (the t.test function could be useful) 2. Add checks to your function to make sure the data is either numeric or logical. If it is logical convert it to numeric. 3. The diamonds data set is included in the `ggplot2` package and is well known as a convenient data set for examples. It can be read into your environment with the function `data("diamonds", package = "ggplot2")`. Loop over the columns of the diamonds data set and apply your function to all of the numeric columns.