
Interactive graphics for data analysis

Rocking out, reading Theus & Urbanek

I got a copy of Martin Theus and Simon Urbanek’s Interactive Graphics for Data Analysis a couple of years ago, and it has sat on my bookshelf ever since. Since I’ve recently become a self-proclaimed expert on interactive graphics, I thought it was about time I read the thing. Which is exactly what I did last weekend at the Leeds Festival (in between rocking out).

It’s a book of two halves, and despite the title the interactivity isn’t really the focus. The book is actually a guide on how to do exploratory data analysis. The first half of the book works like an advanced chart chooser, explaining which plots are useful for which types of data, and what types of interactivity they can benefit from. For me, it was worth it for the many rare plots, like spineplots and interaction plots and mosaic plots and fluctuation diagrams. If you’re bored of barcharts, this is a great way to expand your graphical vocabulary. The second half of the book consists entirely of case studies, where you can practice a workflow for exploring data, which is something that’s always worthwhile doing.

The really big takeaway that I got is that exploratory graphics have different priorities to publication graphics. When you are in the courting stage with a dataset, just getting to know each other, you don’t really care so much about whether the Greek letters in your axis label are formatted correctly or whether the shade of pink in your dots is quite right. All you really need is to be able to generate lots and lots of plots quickly, and to be able to see the relationships between them.

It is this last point that the authors claim interactivity is most useful for. Perhaps the canonical example of this is clicking a bar on a histogram or barchart, and having the corresponding points on a scatterplot highlighted. To demonstrate this, here’s an example using Simon’s Acinonyx package (shortly to be renamed ix for “iplots Extreme”). Acinonyx isn’t yet available on CRAN; see its home page for installation details.

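library(Acinonyx)  # not on CRAN; install from its home page first
library(MASS)      # provides the Cars93 dataset
data(Cars93)
# Open two linked plots: selecting in one highlights the same cars in the other
interactive_scatter <- with(Cars93, iplot(Horsepower, MPG.city))
interactive_histo <- with(Cars93, ihist(EngineSize))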

Click a bar in the histogram and the corresponding points in the scatterplot are highlighted. Likewise, drag to select points in the scatterplot and fractions of the histogram bars are highlighted.

The equivalent static version would be to use trellising and draw each possible graph combination. Splitting a scatterplot into different groups depending upon bars of a histogram works something like this:

library(ggplot2)
Cars93$EngineSizeGroup <- cut(Cars93$EngineSize, 11)
(static_trellis_scatter <- ggplot(Cars93, aes(Horsepower, MPG.city)) +
  geom_point() +
  facet_wrap(~ EngineSizeGroup)
)

(We don’t actually need to bother with the histograms, since they are a little boring.) The reverse operation, going from a selected region of the scatterplot to a highlighted region of the bar chart, is also possible, but trickier. In this case, we do need both graphs.

Cars93 <- within(Cars93, 
{
  selected <- ifelse(
    Horsepower < 200 & MPG.city > 20 & MPG.city < 30, 
    "selected", 
    "unselected"
  )
})
(static_scatter_with_highlight <-
  ggplot(Cars93, aes(Horsepower, MPG.city, colour = selected)) +
  geom_point()
)
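(static_histo_with_highlight <- 
  ggplot(Cars93, aes(EngineSizeGroup, fill = selected)) +
  # EngineSizeGroup is already binned by cut(), so count the rows in each group
  geom_bar() + 
  # Rotate the long interval labels so that they don't overlap
  theme(axis.text.x = element_text(angle = 30, hjust = 1, vjust = 1))
)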
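If you want the numbers behind the highlighted bars, a cross-tabulation gives the same information as the stacked bar chart. This is only a sketch: it rebuilds the binning and the selection rule from the code above so that it runs on its own, and the names engine_size_group and selection_counts are mine rather than anything from the book or from Acinonyx.

```r
library(MASS)  # provides the Cars93 dataset

data(Cars93)

# Same binning and selection rule as in the ggplot2 code above.
engine_size_group <- cut(Cars93$EngineSize, 11)
selected <- with(
  Cars93,
  factor(
    ifelse(
      Horsepower < 200 & MPG.city > 20 & MPG.city < 30,
      "selected",
      "unselected"
    ),
    levels = c("selected", "unselected")
  )
)

# Rows are the histogram bars; columns split each bar into
# highlighted and unhighlighted cars.
selection_counts <- table(engine_size_group, selected)
print(selection_counts)

# Fraction of each bar that is highlighted (NaN for empty bars).
print(round(prop.table(selection_counts, margin = 1), 2))
```

The row proportions are what the interactive histogram shows as the highlighted fraction of each bar.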

My conclusion from reading the book, and from my initial experimentation with Acinonyx, is that anything you can do interactively is also possible by drawing many static graphs, but the interaction can let you see things more quickly.
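To give a flavour of the “many static graphs” route, here is a rough sketch that draws the same scatterplot once per grouping variable. The choice of Type, Origin and DriveTrain is an arbitrary assumption on my part (any categorical columns of Cars93 would do), and static_plots is just an illustrative name.

```r
library(MASS)     # provides the Cars93 dataset
library(ggplot2)

data(Cars93)

# Arbitrary choice of grouping variables, for illustration only.
grouping_vars <- c("Type", "Origin", "DriveTrain")

# Build one faceted scatterplot per grouping variable.
static_plots <- lapply(
  grouping_vars,
  function(g) {
    ggplot(Cars93, aes(Horsepower, MPG.city)) +
      geom_point() +
      facet_wrap(as.formula(paste("~", g))) +
      ggtitle(paste("Split by", g))
  }
)
names(static_plots) <- grouping_vars

# Draw each plot in turn on the active graphics device.
for (p in static_plots) print(p)
```

Each extra question you want to ask of the data means another entry in grouping_vars and another plot to page through, which is where clicking on a linked plot starts to feel quicker.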

  1. 2nd September, 2011 at 12:48 pm

    A useful book review, thanks Richi.

  2. statisticalgraphics
    13th September, 2011 at 21:25 pm

    Richie,

    first of all, thanks for the hilarious picture. I probably never saw someone reading our book with such devotion – was the music really bad ;-) ? It sort of inspires me to call out a competition for pictures of really devoted readers – even better than planking …

    Other than that, yes, I agree that we certainly can generate all graphics we get in an interactive setting as static graphics as well. But on the other hand, we must be fair and acknowledge, that we often would never get to the “right” static graphics if it weren’t for interaction. Life is always easy when you already know where to go.

    Cheers

    Martin

    btw, there is no single interactive graphic in the whole book anyways …
