2016-06-21

R Markdown

We have been using R Markdown in the slides all along, so now we take a closer look at the different output formats and parameters.

Follow along (copy & paste the code into the console):

curl::curl_download(
  "https://raw.githubusercontent.com/heike/rwrks/gh-pages/summerschool/01-Introduction-to-R/knitr/5-rmarkdown.Rmd",
  "5-rmarkdown.Rmd"
)
file.edit("5-rmarkdown.Rmd")

Hello R Markdown!

Choose your output format!

Why R Markdown?

  • It's simple. Focus on writing, rather than debugging silly errors (I'm looking at you, LaTeX).
  • It's flexible. Markdown was created to simplify writing HTML, but thanks to pandoc, Markdown converts to many different formats!
  • It's dynamic. Find a critical error? Get a new dataset? Regenerate your report without copy/paste hell!
  • Encourages transparency. Collaborators (including your future self) will thank you for integrating your analysis & report.
  • Enables interactivity/reactivity. Allow your audience to explore the analysis (rather than passively read it).

First things first, what is Markdown?

  • Markdown is a particular type of markup language.
  • Markup languages are designed to produce documents from plain text.
  • Some of you may be familiar with LaTeX. This is another (less human-friendly) markup language for creating PDF documents.
  • LaTeX gives you much greater control, but it is restricted to PDF and has a much steeper learning curve.
  • Markdown is becoming a standard. Many websites will generate HTML from Markdown (e.g. GitHub, Stack Overflow, reddit).

Who is using R Markdown, and for what?

What is R Markdown?

R Markdown is an authoring format that enables easy creation of dynamic documents, presentations, and reports from R. It combines the core syntax of markdown (an easy-to-write plain text format) with embedded R code chunks that are run so their output can be included in the final document. R Markdown documents are fully reproducible (they can be automatically regenerated whenever underlying R code or data changes).

Your Turn

Study the first page of the R Markdown Reference Guide.

Yes, the entire markdown syntax can be described in one page!

Can you think of anything that is missing from the syntax (that you might want when creating documents)?

Markdown doesn't natively support…

  • Unfortunately, quite a lot:
    • Figure/table referencing (Well, sort of… you can use it for PDFs)
    • Table of contents (Support added recently!)
  • Many, many appearance-related things (image/figure alignment, coloring, font families, etc.)

There is hope…

  • You don't have to restrict yourself to markdown. You can always include HTML/LaTeX markup, but don't expect it to convert between output formats.
  • There are many efforts to extend Markdown (but, then again, keeping it simple is the point!)
  • More features are being added almost daily
  • Templates are being created

Your Turn

Have a look at R Markdown presentations and templates.

Pro tip: run devtools::install_github("rstudio/rticles") to get more templates

YAML Front Matter

The stuff at the top of the .Rmd file (called YAML front matter) tells rmarkdown what output format you want.

---
title: "Untitled"
date: "June 21, 2016"
output: html_document
---

In this case, when you click "Knit HTML", RStudio calls rmarkdown::render("file.Rmd", html_document()). You can certainly change these default values (see the source of this presentation).
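The same rendering can be triggered from the console. The sketch below writes a small file with the front matter shown above and renders it twice. The file name demo.Rmd and its one-line body are made up for illustration, and rendering is skipped unless the rmarkdown package and pandoc are available.

```r
# Write a minimal .Rmd file to a temporary directory (demo.Rmd is a made-up name).
rmd_file <- file.path(tempdir(), "demo.Rmd")
writeLines(c(
  "---",
  "title: \"Untitled\"",
  "date: \"June 21, 2016\"",
  "output: html_document",
  "---",
  "",
  "Some *Markdown* text."
), rmd_file)

# Render only if rmarkdown and pandoc are available on this machine.
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  # Without an output_format argument, render() uses the format named in the YAML.
  html_out <- rmarkdown::render(rmd_file, quiet = TRUE)

  # Passing output_format overrides the YAML, here to produce a Word document.
  docx_out <- rmarkdown::render(rmd_file, output_format = "word_document",
                                quiet = TRUE)

  print(basename(c(html_out, docx_out)))
}
```

Changing the output: line in the front matter and overriding output_format in the call are two routes to the same result; the second leaves the file untouched.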

What is a code chunk?

A code chunk is a concept borrowed from the knitr package (which, in turn, was inspired by literate programming). In .Rmd files, you start a code chunk with three back-ticks followed by {r} and end it with three back-ticks.

1 + 1
## [1] 2
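To see the chunk delimiters themselves, here is a minimal sketch that builds a tiny document as a character vector and knits it from the console. The document text is invented for illustration, and the knitting step assumes the knitr package is installed.

```r
# An .Rmd document is plain text; here it is built line by line.
# The chunk starts with three back-ticks plus {r} and ends with three back-ticks.
chunk_doc <- c(
  "Some text before the chunk.",
  "",
  "```{r}",
  "1 + 1",
  "```"
)

if (requireNamespace("knitr", quietly = TRUE)) {
  # knit(text = ...) runs the chunks and returns the resulting Markdown.
  chunk_md <- paste(knitr::knit(text = chunk_doc, quiet = TRUE), collapse = "\n")
  cat(chunk_md)
}
```

The returned Markdown contains both the source of the chunk and its printed result, which is what ends up in the final document.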

Want to run a command in another language?

print("a" + "b")
## ab

Code chunk options

knitr has a plethora of chunk options (engine is one of them). Here are some that I typically use:

  • echo: Show the code?
  • eval: Run the code?
  • message: Relay messages?
  • warning: Relay warnings?
  • fig.width and fig.height: Change size of figure output.
  • cache: Save the output of this chunk (so we don't have to run it next time)?
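As a rough illustration of two of these options, the sketch below knits a made-up two-chunk document from the console, assuming knitr is installed. The first chunk sets echo=FALSE and the second sets eval=FALSE; the logical checks at the end describe what each option did to the output.

```r
# Options go inside the braces of the chunk header, separated by commas.
options_doc <- c(
  "```{r, echo=FALSE}",
  "1 + 1",
  "```",
  "",
  "```{r, eval=FALSE}",
  "2 + 2",
  "```"
)

if (requireNamespace("knitr", quietly = TRUE)) {
  options_md <- paste(knitr::knit(text = options_doc, quiet = TRUE),
                      collapse = "\n")
  cat(options_md)

  # echo=FALSE: the source "1 + 1" is hidden, but its result is still printed.
  first_code_hidden  <- !grepl("1 + 1", options_md, fixed = TRUE)
  first_result_shown <- grepl("[1] 2", options_md, fixed = TRUE)

  # eval=FALSE: the source "2 + 2" is shown, but it is never run.
  second_code_shown <- grepl("2 + 2", options_md, fixed = TRUE)
  second_not_run    <- !grepl("[1] 4", options_md, fixed = TRUE)
}
```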

Your Turn

Study the second page of the R Markdown Reference Guide and go back to the Hello R Markdown example we created.

Easy: Modify the figure sizing and alignment.

Medium: Add a figure caption.

Hard: Can you create an animation? (Hint: look at the fig.show chunk option – you might need the animation package for this)

Pro Tip: Don't like the default chunk option value? Change it at the top of the document:

knitr::opts_chunk$set(message = FALSE, warning = FALSE)
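To check what such a call changes, you can query the option before and after. This is a small sketch that assumes knitr is installed; the restore() call at the end puts knitr's defaults back so the experiment does not leak into later work.

```r
if (requireNamespace("knitr", quietly = TRUE)) {
  # Value of the message option before changing anything.
  message_before <- knitr::opts_chunk$get("message")

  # Change the defaults for every chunk that follows.
  knitr::opts_chunk$set(message = FALSE, warning = FALSE)
  message_after <- knitr::opts_chunk$get("message")

  print(c(before = message_before, after = message_after))

  # Reset all chunk options to knitr's built-in defaults.
  knitr::opts_chunk$restore()
}
```

Individual chunks can still override the document-wide default in their own header.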

Formatting R output

m <- lm(mpg ~ disp, data = mtcars)
summary(m) # output isn't very attractive
## 
## Call:
## lm(formula = mpg ~ disp, data = mtcars)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.8922 -2.2022 -0.9631  1.6272  7.2305 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 29.599855   1.229720  24.070  < 2e-16 ***
## disp        -0.041215   0.004712  -8.747 9.38e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.251 on 30 degrees of freedom
## Multiple R-squared:  0.7183, Adjusted R-squared:  0.709 
## F-statistic: 76.51 on 1 and 30 DF,  p-value: 9.38e-10

pander is one great option.

library(pander)
pander(m)
Fitting linear model: mpg ~ disp

                Estimate   Std. Error   t value    Pr(>|t|)
disp            -0.04122     0.004712    -8.747    9.38e-10
(Intercept)         29.6         1.23     24.07   3.577e-21

a <- anova(m)
a
## Analysis of Variance Table
## 
## Response: mpg
##           Df Sum Sq Mean Sq F value   Pr(>F)    
## disp       1 808.89  808.89  76.513 9.38e-10 ***
## Residuals 30 317.16   10.57                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

pander(a)
Analysis of Variance Table

             Df   Sum Sq   Mean Sq   F value     Pr(>F)
disp          1    808.9     808.9     76.51   9.38e-10
Residuals    30    317.2     10.57        NA         NA

Pander knows about a lot of different methods!

methods(pander)
##  [1] pander.anova*           pander.aov*            
##  [3] pander.aovlist*         pander.Arima*          
##  [5] pander.call*            pander.cast_df*        
##  [7] pander.character*       pander.clogit*         
##  [9] pander.coxph*           pander.cph*            
## [11] pander.CrossTable*      pander.data.frame*     
## [13] pander.Date*            pander.default*        
## [15] pander.density*         pander.describe*       
## [17] pander.evals*           pander.factor*         
## [19] pander.formula*         pander.ftable*         
## [21] pander.function*        pander.glm*            
## [23] pander.Glm*             pander.gtable*         
## [25] pander.htest*           pander.image*          
## [27] pander.irts*            pander.list*           
## [29] pander.lm*              pander.lme*            
## [31] pander.logical*         pander.lrm*            
## [33] pander.manova*          pander.matrix*         
## [35] pander.microbenchmark*  pander.mtable*         
## [37] pander.name*            pander.nls*            
## [39] pander.NULL*            pander.numeric*        
## [41] pander.ols*             pander.orm*            
## [43] pander.polr*            pander.POSIXct*        
## [45] pander.POSIXlt*         pander.prcomp*         
## [47] pander.randomForest*    pander.rapport*        
## [49] pander.rlm*             pander.sessionInfo*    
## [51] pander.smooth.spline*   pander.stat.table*     
## [53] pander.summary.aov*     pander.summary.aovlist*
## [55] pander.summary.glm*     pander.summary.lm*     
## [57] pander.summary.lme*     pander.summary.manova* 
## [59] pander.summary.nls*     pander.summary.polr*   
## [61] pander.summary.prcomp*  pander.summary.rms*    
## [63] pander.summary.survreg* pander.summary.table*  
## [65] pander.survdiff*        pander.survfit*        
## [67] pander.survreg*         pander.table*          
## [69] pander.tabular*         pander.ts*             
## [71] pander.zoo*            
## see '?methods' for accessing help and source code

Your Turn

  • Look through the list of pander methods. Can you apply any of the methods that we haven't discussed? We just saw pander.lm and pander.anova.
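One possible starting point, assuming the pander package is installed: the sketch below tries two entries from the list, pander.htest and pander.table. The choice of test and of variables from mtcars is arbitrary and only meant as an illustration.

```r
if (requireNamespace("pander", quietly = TRUE)) {
  # Welch two-sample t-test: mpg for automatic (am = 0) vs. manual (am = 1) cars.
  tt <- t.test(mpg ~ am, data = mtcars)

  # tt has class "htest", so pander() dispatches to pander.htest.
  pander::pander(tt)

  # A two-way contingency table has class "table", handled by pander.table.
  cyl_by_gear <- table(mtcars$cyl, mtcars$gear)
  pander::pander(cyl_by_gear)
}
```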