August 28, 2014

knitr engines

  • knitr has built-in support for some "engines" including (but not limited to):
    • "popular" languages (python, perl, fortran, haskell, etc.)
    • various "command-line tools" (sed, awk, bash, sh, zsh, etc.)
  • Unfortunately, for each knitr chunk that does not have 'R' as its engine option:
    • a new session is opened
    • code is parsed and evaluated in that session (unless eval = FALSE)
    • results are returned and the session is closed
  • Opening and closing sessions can be expensive, so ideally we open one session up front and only ship code to it when necessary.
    • Yihui wrote the runr package with this idea in mind.
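
The cost of one session per chunk can be seen in miniature from R itself. This is only a sketch: it launches a fresh `sh` process per call with `system2()`, which approximates (but is not) what happens for a command-line engine chunk. The variable name `KNITR_DEMO_X` is made up for the illustration, and a Unix-like shell is assumed.

```{r session-demo}
# Each system2() call starts a brand-new shell process.
first <- system2("sh", c("-c", shQuote("KNITR_DEMO_X=1; echo $KNITR_DEMO_X")),
                 stdout = TRUE)

# The variable set above is gone here, because this is a different process.
second <- system2("sh", c("-c", shQuote("echo ${KNITR_DEMO_X:-unset}")),
                  stdout = TRUE)

print(first)   # the value set inside the first process
print(second)  # falls back to "unset": no state was carried over
```

A persistent session avoids both the startup cost and this loss of state, which is the motivation for runr.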

A julia engine for knitr!

```{r setup}
library(knitr)
library(runr)
j = proc_julia()
j$start()
knit_engines$set(julia = function(options) {
    knitr:::wrap(j$exec(options$code), options)
})
```
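
The setup chunk calls `knitr:::wrap`, an internal function (hence the `:::`). A possible variant, sketched here under assumptions, uses knitr's exported `engine_output()` helper instead and skips the external session when `eval = FALSE`. The helper name `make_engine` is hypothetical, and `proc` stands for any runr-style object with an `exec()` method, such as `j` above.

```{r engine-sketch}
library(knitr)

# Hypothetical helper: build an engine function around a process object
# `proc` that exposes exec(), e.g. the `j` created by proc_julia().
make_engine <- function(proc) {
  function(options) {
    # Only ship code to the external session when the chunk is evaluated
    out <- if (isTRUE(options$eval)) proc$exec(options$code) else ""
    # engine_output() formats the source and output according to chunk options
    engine_output(options, options$code, out)
  }
}

# Usage, assuming `j` was started as in the setup chunk:
# knit_engines$set(julia = make_engine(j))
```

Passing the process in as an argument keeps the engine definition independent of any particular global variable, so the same helper could wrap other runr processes.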


```{r hello, engine = 'julia'}
string("Hello", " World!")
```
#> "Hello World!"

Now using Gadfly!

using Gadfly, RDatasets
diamonds = dataset("ggplot2", "diamonds");
p = plot(diamonds, x = "Carat", y = "Price", color = "Clarity")
  • It's probably best to generate plots in an interactive session, then save to your favorite file format:
draw(PDF("diamonds.pdf", 8inch, 6inch), p);
draw(PNG("diamonds.png", 8inch, 6inch), p);
draw(SVG("diamonds.svg", 8inch, 6inch), p);
draw(SVGJS("diamonds.js.svg", 8inch, 6inch), p);
  • SVGJS output supports panning and zooming (include it in your HTML docs like so):
<div align = "center">
  <object data="diamonds.js.svg" type="image/svg+xml"></object>
</div>

Some notes

Geom.subplot_grid() =~ facet_grid()

p = plot(diamonds, x = "Carat", y = "Price", color = "Clarity",
         xgroup = "Cut", Geom.subplot_grid(Geom.point))

which is analogous to

library(ggplot2)
p = qplot(data = diamonds, x = carat, y = price, colour = clarity, 
          facets = ~cut, geom = "point")

The subtle difference is that subplot_grid is specific to a geometry. The reason for this is to eventually support embedded plots: two-tiered graphics that embed subplots within a set of axes.

For the sake of comparison…

[Figure: plot of chunk diamonds]

  • There is currently no facet_wrap equivalent, only facet_grid (so you can't specify nrow or ncol).

Drawbacks to using engine = 'julia'

  • cache = TRUE doesn't really work
    • This is a big headache when doing expensive computations.
    • Saving/loading objects is fairly trivial for R, but it's not clear if and how this should work for Julia (or other languages, for that matter).
  • IPython/IJulia has much better support for saving work.
  • Make sure to call j$stop() on the Julia session that we started at the beginning!
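
As a minimal sketch, a final chunk along these lines would do the cleanup. It assumes `j` is the process object created in the setup chunk and that the session is still running; the chunk label `teardown` is arbitrary.

```{r teardown}
# Shut down the Julia process that was started with j$start() in the setup chunk
j$stop()
```

Placing this at the very end of the document means every julia chunk above it has already been evaluated before the session closes.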