Archive for the ‘Uncategorized’ Category

Legendary Plots

12th March, 2011

I was recently pointed in the direction of a thermal comfort model by the engineering company Arup (p27–28 of this pdf). Figure 3 at the top of p28 caught my attention.

Arup thermal comfort model, figure 3

It’s mostly a nice graph; there’s not too much junk in it. One thing that struck me was that there is an awful lot of information in the legend, and that I found it impossible to retain all that information while switching between the plot and the legend.

The best way to improve this plot, then, is to find a way to simplify the legend. Upon closer inspection, a lot of the information is repeated. For example, there are only two temperature combinations and three levels of direct solar energy. Humidity and diffused solar energy are kept the same in all cases. That makes it really easy for us: our five legend options are

Outdoor temp (deg C)    Direct solar energy (W/m^2)
32                      700
32                      150
32                      500
29                      500
29                      150

Elsewhere we can explain that the mezzanine and platform temperatures are always 2 and 4 degrees higher than outdoors respectively, that the humidity is always 50%, and that the diffused solar energy is always 100 W/m^2.

Living in Buxton, one of the coldest, rainiest towns in the UK, it amuses me to see that their “low” outdoor temperature is 29°C.

The other thing to note is that we have two variables mapped to the hue. For just five cases, this is just about acceptable, but it isn’t the best option and it won’t scale to many more categories. It’s generally considered best practice to work in HCL colour space when mapping variables to colours. I would be tempted to map temperature to hue – whether you pick red as hot and blue as cold or the other way around depends upon how many astronomers you have in your target audience. Then I’d map luminance (lightness) to solar energy: more sunlight = lighter line.

I don’t have the values to exactly recreate the dataset, but here are some made-up numbers with the new legend. Notice the combined outdoor temp/direct solar energy variable.

time_points <- 0:27
n_time_points <- length(time_points)
n_cases <- 5
comfort_data <- data.frame(
  time = rep.int(time_points, n_cases),
  comfort = jitter(rep(-2:2, each = n_time_points)),
  outdoor.temperature = rep(
    c(32, 29),
    times = c(3 * n_time_points, 2 * n_time_points)
  ),
  direct.solar.energy = rep(
    c(700, 150, 500, 500, 150),
    each = n_time_points
  )
)
comfort_data$combined <- with(comfort_data,
  factor(paste(outdoor.temperature, direct.solar.energy, sep = ", "))
)

We manually pick the colours to use in HCL space (using str_detect to examine the factor levels).

library(stringr)
cols <- hcl(
  h = with(comfort_data, ifelse(str_detect(levels(combined), "29"), 0, 240)),
  c = 100,
  l = with(comfort_data,
    ifelse(str_detect(levels(combined), "150"), 20,
    ifelse(str_detect(levels(combined), "500"), 50, 80))
  )
)
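The same mapping can also be written without string matching. The sketch below is one possible alternative, not the code used for the plot: the `cases` lookup table and the names `luminance_lookup` and `cols_named` are hypothetical, and the hue and luminance values are simply copied from the code above.

```r
# Hypothetical lookup table of the five legend cases (values from the table above)
cases <- data.frame(
  outdoor.temperature = c(32, 32, 32, 29, 29),
  direct.solar.energy = c(700, 150, 500, 500, 150)
)

# Hue from temperature: 29 -> 0 (red), 32 -> 240 (blue), as in the code above
cases$hue <- ifelse(cases$outdoor.temperature == 29, 0, 240)

# Luminance from solar energy: more sunlight = lighter line
luminance_lookup <- c("150" = 20, "500" = 50, "700" = 80)
cases$luminance <- unname(
  luminance_lookup[as.character(cases$direct.solar.energy)]
)

# One HCL colour per case, named by the combined "temp, solar" label
cols_named <- setNames(
  hcl(h = cases$hue, c = 100, l = cases$luminance),
  paste(cases$outdoor.temperature, cases$direct.solar.energy, sep = ", ")
)
cols_named
```

Because `cols_named` is a named vector, passing it as `values` to `scale_colour_manual` should match colours to factor levels by name rather than by position.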

Drawing the plot is very straightforward; it’s just a line plot.

library(ggplot2)
p <- ggplot(comfort_data, aes(time, comfort, colour = combined)) +
  geom_line(size = 2) +
  scale_colour_manual(
    name = expression(paste(
      "Outdoor temp (", degree, C, "), Direct solar (", W/m^2, ")"
    )),
    values = cols) +
  xlab("Time (minutes)") +
  ylab("Comfort")
p

My version of the plot, with an improved legend

Sensible people should stop here, and write the additional detail in the figure caption. There is currently no sensible way of writing annotations outside of the plot area (annotate only works inside panels). The following hack was devised by Baptiste Auguie; read this forum thread for other variations.

library(gridExtra)
caption <- tableGrob(
  matrix(
    expression(
      paste(
        "Mezzanine temp is 2", degree, C, " warmer than outdoor temp"
      ),
      paste(
        "Platform temp is 4", degree, C, " warmer than outdoor temp"
      ),
      paste("Humidity is always 50%"),
      paste(
        "Diffused solar energy is always 100", W/m^2
      )
    )
  ),
  parse = TRUE,
  theme = theme.list(
    gpar.corefill = gpar(fill = NA, col = NA),
    core.just = "center"
  )
)
grid.arrange(p, sub = caption)

The additional information is included in the plot's subcaption
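
For readers on a more recent ggplot2, a simpler route may be available: as far as I know, `labs()` gained a `caption` argument in version 2.2.0, which places text below the plot. The sketch below assumes that version or later; `demo_data`, `caption_text` and `p_captioned` are made-up names, and the data are placeholders rather than the Arup values.

```r
library(ggplot2)

# Placeholder data standing in for the comfort_data built earlier
demo_data <- data.frame(
  time = rep(0:27, 2),
  comfort = rep(c(-1, 1), each = 28),
  combined = factor(rep(c("29, 150", "32, 700"), each = 28))
)

# The four fixed conditions, one per line (plain text, no plotmath)
caption_text <- paste(
  "Mezzanine temp is 2 deg C warmer than outdoor temp",
  "Platform temp is 4 deg C warmer than outdoor temp",
  "Humidity is always 50%",
  "Diffused solar energy is always 100 W/m^2",
  sep = "\n"
)

# Assumes ggplot2 >= 2.2.0, where labs() accepts a caption
p_captioned <- ggplot(demo_data, aes(time, comfort, colour = combined)) +
  geom_line() +
  labs(caption = caption_text)
```

Printing `p_captioned` should draw the plot with the four lines of text underneath it, without needing gridExtra.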

My New Year’s Resolution: Be lazier

3rd January, 2011

I wrote this on New Year’s Eve but given the contents it seemed more appropriate to post a few days late. In many walks of life, laziness is a terrible vice but for programmers and statisticians it can be an unsung virtue. You don’t believe me? Then read on to hear my ideas for virtuous laziness.

Idea 1: Write less code
Writing less code means less code to maintain which means even less work – a virtuous circle of laziness. Of course, you can’t just stop doing your job, but you can use existing packages instead of reinventing wheels and you can write code that is reusable (write functions instead of scripts and packages instead of loose functions).

Idea 2: … but code instead of clicking
There are loads of little tasks that computers (and other machines) do better than humans. You just need to tell them to do it! If you find yourself typing the same thing over and over again, write a function, script or macro so you don’t need to bother next time. Jobs that complete themselves automatically are the best kinds of jobs!

Idea 3: If you can’t automate, then simplify
Of course, many tasks are tricky to automate. In that case, can you simplify your problem? If you’re doing bleeding edge research, can you make your task straightforward enough that other researchers can do it? If you’re doing something more routine, can you simplify it enough that the intern could take over? If you’re the intern, can you simplify your job so that a kid could do it? Now we’re nearly at the point of automation.

Idea 4: Give the gift of laziness
I know that a lot of you readers are coders and will probably have heard something like this idea before. Out there in the wider world, I don’t think the concept of automating tasks is as common. This doesn’t always mean programming things either. Over the coming days, teach your grandma about keyboard shortcuts or take the time to introduce a less technical colleague to the idea of network shortcuts or the button that minimizes all your windows. (It’s amazing how few people seem to know about that!)

Two amigos: follow up

13th October, 2010

Brett and Jiro have announced the results of the competition to make a Bob-free image. There were five entries, two prizes and … I didn’t win either. Still, it was a fun challenge and a useful learning experience so I’m consoling myself with cliches like “it’s not the winning that’s important but the taking part”. I’m certainly not using MATLAB to construct a voodoo-doll image of Brett and Jiro.
Jiro and Brett with pins in their heads

%% Read in image and display
theAmigos = imread('the amigos better blur.jpg');
image(theAmigos)

%% Add lines
pinColour = [.5 .5 .5];

xcoords = { ...
   [130 180] ...
   [132 182] ...
   [136 184] ...
   [140 186] ...
   [148 190] ...
   [165 195] ...
   [182 200] ...
   [200 205] ...
   [215 214] ...
   [230 223] ...
   [243 228] ...
   [255 234] ...
   [270 237] ...
   [283 244] ...
   [295 247] ...
   [300 246] ...
   [303 248] ...
   ...
   [465 515] ...
   [465 516] ...
   [466 517] ...
   [469 519] ...
   [475 522] ...
   [487 526] ...
   [505 534] ...
   [528 540] ...
   [548 546] ...
   [567 551] ...
   [588 554] ...
   [606 557] ...
   [621 560] ...
   [628 563] ...
   [633 566] ...
   [633 567] ...
   [634 568] ...
};

ycoords = { ...
   [295 300] ...
   [275 290] ...
   [260 280] ...
   [240 275] ...
   [225 274] ...
   [220 274] ...
   [215 273] ...
   [212 273] ...
   [212 273] ...
   [214 273] ...
   [217 274] ...
   [221 274] ...
   [230 275] ...
   [240 277] ...
   [250 280] ...
   [275 285] ...
   [290 292] ...
   ...
   [320 322] ...
   [305 315] ...
   [288 310] ...
   [272 304] ...
   [253 300] ...
   [240 296] ...
   [233 292] ...
   [230 291] ...
   [230 291] ...
   [232 292] ...
   [236 294] ...
   [246 297] ...
   [262 300] ...
   [280 302] ...
   [296 307] ...
   [309 312] ...
   [320 320] ...
};

xstart = cellfun(@(x) x(1), xcoords);
ystart = cellfun(@(x) x(1), ycoords);

hold on
cellfun(@(x, y) line(x, y, 'Color', pinColour), xcoords, ycoords);
arrayfun(@(x, y) plot(x, y, '.', 'Color', pinColour), xstart, ystart);
hold off

%% Remove the extra bits created by plot calls and write to file
set(gca, 'Visible', 'off')
set(gca, 'Position', [0 0 1 1])

print(gcf, '-djpeg', 'the amigos voodoo.jpg')

The truth is stranger than fiction

27th September, 2010

It turns out that when you search Google for the phrase 4D pie chart, this blog is only the second link.  The number one spot is taken by a tutorial on how to create animated pie charts with Maxon Computer’s Cinema 4D package.  I can’t detect any hint of sarcasm in the instructions.  Hang your head in shame, Maxon Computer employee.

Dotplot on xkcd

27th September, 2010

There’s a nice example of a dotplot on xkcd today.  Hand-drawn lines aside, it’s well drawn.  The only obvious thing wrong with it is the scale.  If you’re going to log-transform your data, then the axis labels should show the untransformed amounts.  So the numbers -17 to -5 along the top x-axis should really be e^-17 to e^-5, or even better, numbers that are powers of 10, because they are much easier to understand.  For example, compare:

“Stochastic is used e to the power 14 times as often as f*cking stochastic.”

“Stochastic is used 10 to the power 6 times as often as f*cking stochastic.”
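
To see how the two statements relate, converting between bases is a one-liner. This is a small sketch in R using the exponents quoted above; the variable names are made up for illustration.

```r
# Natural-log exponents taken from the text above
ln_exponents <- c(-17, -5, 14)

# e^x = 10^(x / ln(10)), so dividing by log(10) gives base-10 exponents
log10_exponents <- ln_exponents / log(10)
round(log10_exponents, 2)

# Both forms describe the same untransformed amounts
all.equal(10^log10_exponents, exp(ln_exponents))
```

So e to the power 14 is roughly 10 to the power 6, and the axis running from -17 to -5 in natural-log units covers roughly 10^-7 to 10^-2.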

Variability of biomarkers in volunteer studies

9th September, 2010

Variability of biomarkers in volunteer studies: The biological component, of which I am co-author, is now published in Toxicology Letters.  It’s a simple, straightforward look at calculating variability in half-lives of chemicals.

Pie charts over at Juice

3rd September, 2010

The Juice Analytics blog (incidentally, one of the few corporate blogs worth reading) has a makeover of the Federal IT dashboard, including a discussion of 3D pie charts.  Only 3 dimensions?  Bah!
