Home > R > Introducing the Lowry Plot

Introducing the Lowry Plot

Here at the Health and Safety Laboratory* we’re big fans of physiologically-based pharmacokinetic (PBPK) models (say that 10 times fast) for predicting concentrations of chemicals around your body based upon an exposure. These models take the form of a big system of ODEs.

Because they contain many equations and consequently many parameters (masses of organs and blood flow rates and partition coefficients), it is important to do a sensitivity analysis on the model to determine which parameters are important and which are not. After all, it is tricky to accurately measure many of these parameters, so you need to know whether or not an error in a parameter will throw off the whole model.

How to do sensitivity analysis of an ODE-based model is a worthy topic of discussion, but not one I want to cover today. For now, it is sufficient to know the basic idea: you input different parameter combinations and see which parameters cause the model output to change the most.

For each parameter, the sensitivity analysis returns a main effect, which is the fraction of variance in model output accounted for by that parameter. Each parameter also has an interaction effect, which is the fraction of variance accounted for by that parameter’s interaction with all the other parameters. The main effect + interactions give the total effect for that parameter. Got it?

When I was working on this problem recently, I searched the literature for ways to display the results of the sensitivity analysis. “Draw a pie chart” it said. Not with less than 4 dimensions, thank you! “How about a bar chart?”, the literature suggested. How boring. It was clear that something better needed to be done.

Note that the plotting technique I’m about to describe is suitable for any sensitivity analysis, not just those of PBPK models.

The first step from the basic bar chart was to switch to a series of 3 Pareto plots for main effects, interactions and total effects. This was a big improvement, but having three graphs to look at was a pain.

I also tried a stacked bar chart for the total effects, but this still didn’t answer the most important question:

How many parameters should I include in my model?

This is a rather fuzzy question, but it is possible to restate it in a way that becomes solvable

What is the least number of parameters that I need to include to account for a given percentage of variance in my model?

It turns out that you don’t know an exact value for the amount of variance accounted for – you just know a range. The lower bound is the sum of the main effects of the parameters you include in the model. The upper bound is the sum of that total effects that you included. Actually we can do better than that. The effect of interactions are double counted (since each interaction is the interaction with all other parameters). So our second upper bound is that the included vairance can never exceed 100% minus the sum of all the main effects that we didn’t include.

I realise that there have been far too many long sentences in this post already, so I’ll skip to my solution. After some pondering, I realised that I could combine the Pareto plot with the stacked bar chart, which I call the Lowry Plot, named for the Lancashire artist – the resultant graph looks a lot like a smoke stack, of which he painted many.

A Lowry plot of some m-xylene data

If you include the first 4 parameters then, reading up from the fourth bar, you account for between 85% and 90% of the variance. Conversely, to include at least 90% of the variance, you find the bar where the whole ribbon is above 90%. In this case you need the first six parameters.

To recreate this, here’s some sample data taken from a model of m-xylene. The important thing is that you need 3 columns: parameter name, main effect and interaction.

library(ggplot2)

m_xylene_data <- data.frame(
  Parameter = c(
    "BW", "CRE", "DS", "KM", "MPY", "Pba", "Pfaa", 
    "Plia", "Prpda", "Pspda", "QCC", "QfaC", "QliC", 
    "QPC", "QspdC", "Rurine", "Vfac", "VliC", "Vmax"),
  "Main Effect" = c(
    1.03E-01, 9.91E-02, 9.18E-07, 3.42E-02, 9.27E-3, 2.82E-2, 2.58E-05, 
    1.37E-05, 5.73E-4, 2.76E-3, 6.77E-3, 8.67E-05, 1.30E-02, 
    1.19E-01, 4.75E-04, 5.25E-01, 2.07E-04, 1.73E-03, 1.08E-03),
  Interaction = c(
    1.49E-02, 1.43E-02, 1.25E-04, 6.84E-03, 3.25E-03, 7.67E-03, 8.34E-05, 
    1.17E-04, 2.04E-04, 7.64E-04, 2.84E-03, 8.72E-05, 2.37E-03, 
    2.61E-02, 6.68E-04, 4.57E-02, 1.32E-04, 6.96E-04, 6.55E-04
  )
)

Sometimes it is easier to use the data in wide format, other times in long format. This data fortification process returns both, plus some extra columns.

fortify_lowry_data <- function(data, 
  param_var = "Parameter", 
  main_var = "Main.Effect", 
  inter_var = "Interaction")
{
  #Convert wide to long format
  mdata <- melt(data, id.vars = param_var)

  #Order columns by main effect and reorder parameter levels
  o <- order(data[, main_var], decreasing = TRUE)
  data <- data[o, ]
  data[, param_var] <- factor(
    data[, param_var], levels = data[, param_var]
  )

  #Force main effect, interaction to be numeric
  data[, main_var] <- as.numeric(data[, main_var])
  data[, inter_var] <- as.numeric(data[, inter_var])

  #total effect is main effect + interaction
  data$.total.effect <- rowSums(data[, c(main_var, inter_var)])

  #Get cumulative totals for the ribbon
  data$.cumulative.main.effect <- cumsum(data[, main_var])
  data$.cumulative.total.effect <- cumsum(data$.total.effect)

  #A quirk of ggplot2 means we need x coords of bars
  data$.numeric.param <- as.numeric(data[, param_var])

  #The other upper bound
  #.maximum =  1 - main effects not included
  data$.maximum <- c(1 - rev(cumsum(rev(data[, main_var])))[-1], 1)

  data$.valid.ymax <- with(data, 
    pmin(.maximum, .cumulative.total.effect)
  )
  
  mdata[, param_var] <- factor(
    mdata[, param_var], levels = data[, param_var]
  ) 
  list(data = data, mdata = mdata)
}

The plot is essentially a bar chart plus a cumulative frequency ribbon, with some tweaks to make it prettier.

lowry_plot <- function(data,
  param_var = "Parameter",
  main_var = "Main.Effect",
  inter_var = "Interaction",
  x_lab = "Parameters",
  y_lab = "Total Effects (= Main Effects + Interactions)",
  ribbon_alpha = 0.5,
  x_text_angle = 25)
{
  #Fortify data and dump contents into plot function environment
  data_list <- fortify_lowry_data(data, param_var, main_var, inter_var)
  list2env(data_list, envir = sys.frame(sys.nframe()))

  p <- ggplot(data) +
    geom_bar(aes_string(x = param_var, y = "value", fill = "variable"),
      data = mdata) +
    geom_ribbon(
      aes(x = .numeric.param, ymin = .cumulative.main.effect, ymax = .valid.ymax),
      data = data,
      alpha = ribbon_alpha) +
    xlab(x_lab) +
    ylab(y_lab) +
    scale_y_continuous(formatter = "percent") +
    opts(axis.text.x = theme_text(angle = x_text_angle, hjust = 1)) +
    scale_fill_grey(end = 0.5) +
    opts(legend.position = "top",
      legend.title = theme_blank(),
      legend.direction = "horizontal"
    )
  p
}

m_xylene_lowry <- lowry_plot(m_xylene_data)
m_xylene_lowry

EDIT: I’ve tweaked to code to make it a littler simpler; the resampling is no longer needed.

*Dear legal folk, this is a personal blog and I’m not officially speaking on behalf of my employer. That said, a substantial chunk of the people who care about these things love PBPK.

About these ads
Tags: ,
  1. Jim Maas
    12th January, 2011 at 12:40 pm

    Thanks for this. Should it run in R 2.11.1 or do we need to have 2.12? I’m getting an error.

    Error in lowry_plot(m_xylene_data) : could not find function “list2env”

    Thanks for posting! J

    • 12th January, 2011 at 14:43 pm

      list2env was introduced in R2.12.0. (It’s well worth playing with.)

      For those of you running legacy versions, the easiest thing to do is to

      1. Move the call to fortify outside of the lowry_plot function.

      2. Change the signature of lowry_plot to lowry_plot(data, mdata, data_interpolated, _other_args_).

      3. Pass the three data frames returned in step 1 into the plotting routine.

      Hope this helps.

  2. Rainer
    17th March, 2011 at 14:06 pm

    Hi

    I am thinking about creating a lowry plot from the sensitivity analysis I did using the package tgp, but I am struggling, with the terminology: What is the “main effect” and “interaction effect” you are referring to when having an object returned from the sens function (in the tgp package)? The their “main effect” is definitely not your “main effect…
    The “first order sensitivity index” seems to be what you refer to as “main effect”, but the “total effect sensitivity index” does not seem to be your “total effect”, as they do not summ up to one – actually, they are scaled to a scale from 0 to one.
    So how can I translate the results from a sensitivity analysis conducted with sens() in tgp so that I can use the lowry plot?

    Thanks,

    Rainer

    • 21st March, 2011 at 17:15 pm

      The short answer is that I’m not sure. I’m not familiar with the tgp package (but thanks for making me aware of it) and I’m fairly new to sensitivity analysis myself. I’ve passed the query on to someone who knows more about these things than I do; I’ll let you know if there’s any progress.

    • 22nd March, 2011 at 12:14 pm

      Having consulted with my sensitivity analysis expert (colleague Dr Kevin McNally) you are right about “first order sensitivity index” being the same as our main effect.

      We’re both reasonably certain that “total effect sensitivity index” _is_ the same as our total effect. Perhaps I should have made it clear in the text above, total effects should sum to a number greater than one. The reason for this is that you have double counting of interactions (parameter A interacts with parameter B just as parameter B interacts with parameter A). I don’t think that the total effect sensitivity indexes are scaled from zero to one – if s is the return value from a call to sens, then look at max(s$sens$T) – it isn’t equal to one.

      Assuming that I’m right (more likely than not but not guaranteed), then this function will create data in a suitable format.

      create_lowry_data <- function(x, …)
      {
      UseMethod("create_lowry_data")
      }

      create_lowry_data.tgp <- function(s, …)
      {
      with(s,
      data.frame(
      Parameter = names(X),
      Main.Effect = apply(sens$S, 2, median),
      Interaction = apply(sens$T – sens$S, 2, median)
      )
      )
      }

      Hope this helps.

  3. Rainer
    22nd March, 2011 at 15:52 pm

    Thanks – that’s more or less what I thought. Makes perfect sense. The aspect which irritated me was that in your plot, the smoke stack (total effects) do not extend past one – which they (as you mentioned now) will usually do.

    Thanks a lot,

    Rainer

    • 23rd March, 2011 at 11:17 am

      The reason that the smoke stacks don’t extend past one is that it doesn’t make any physical sense. What does more that 100% of variance mean?

      As I mentioned in the post and previous comment, when you add up all the total effects, you get a number greater than one because you count the contribution from interactions more than once. That means that you should think of total effects as an upper bound for the contribution to variance from that parameter, rather than an exact value. (In a similar way, the main effect should be thought of as a lower bound.)

  4. Rainer
    23rd March, 2011 at 11:27 am

    Agreed – it does not make much sense, but only because of the *possible* double counting. But even in a number of 0.9 for the total effects, we could have double counting – but for how much? We don’t know. So – if we take the multiple counting into consideration, we probably *should* extend the smoke stacks, but not interpret them as proportion variance, but rather of what they are: sum of possibly overlapping proportions of variance, which can add up to more then one. Because a value of 2 would indicate a better *chance* that we can explain the variance with the subset of parameters then a value of 1.

    So if we divide the total effects by the sum of the total effects – shouldn’t this give us a more reliable idea about the upper bound?

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 217 other followers

%d bloggers like this: