
Trading secrets

Recently I had the opportunity to do a job swap with one of the guys in the laboratory here at HSL.  I helped out with the mass-spectrometry and James helped me with the data analysis.  Two very useful things came out of this.

Firstly, it’s been very informative to see how the data I get is created.  I tend to assume that the numbers given to me are either correct or mistakes.  The reality, though, is more subtle.  One thing that surprised me was the lengths the chemists go to in order to make sure that their instruments give sensible answers.  As well as testing urine samples, you need to test blank samples (to clean out the spectrometer’s tubes), standard samples (to calibrate the machine) and quality control samples (to check that the calibration is correct).  Even then, it wasn’t entirely clear that you would get the same answer if you ran the samples twice.
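To make the quality-control idea concrete, here is a rough sketch in Python of the kind of check involved; the concentrations, repeat readings and acceptance thresholds are entirely invented for illustration:

```python
import statistics

# Hypothetical repeat measurements of a QC sample with a known
# concentration; all numbers here are made up for illustration.
true_value = 0.50                       # known QC concentration
runs = [0.48, 0.52, 0.47, 0.51, 0.49]  # invented repeat readings

mean = statistics.mean(runs)
bias_pct = (mean - true_value) / true_value * 100  # accuracy: % off truth
cv_pct = statistics.stdev(runs) / mean * 100       # precision: % scatter

# A made-up acceptance rule: recalibrate if bias or scatter is too big.
ok = abs(bias_pct) < 10 and cv_pct < 15
print(f"bias {bias_pct:+.1f}%, CV {cv_pct:.1f}%, calibration ok: {ok}")
```

If the QC sample drifts outside the acceptance window, the calibration is redone before any real samples are trusted.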

The project was based around testing thallium levels in the general population.  To give an idea of how much we could trust the data, I re-analysed 50 of the samples that James had run.  The tricky bit was the pipetting; there’s a surprising art to avoiding air bubbles.

[Figure: comparison of thallium levels between Richie’s and James’s measurements]

As you can see, my results were consistently lower than James’s.  Taking James as the gold standard in mass-spectrometry skill and myself as the worst-case scenario, you can see that we should only trust the results to the nearest order of magnitude.  This is not a trivial exercise – it demonstrates what would happen if James were replaced by an idiot.  (All too possible, depending on what George Osborne says later today.)
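A comparison like this takes only a few lines of code.  A minimal sketch, with made-up readings standing in for the real thallium measurements:

```python
# Hypothetical paired readings of the same samples; the real data
# from the study are not reproduced here.
james = [0.21, 0.18, 0.25, 0.30, 0.22]   # invented "gold standard" runs
richie = [0.15, 0.13, 0.19, 0.24, 0.16]  # invented re-runs of the same samples

# Per-sample ratio: how far each re-run falls below the gold standard.
ratios = [r / j for r, j in zip(richie, james)]
mean_ratio = sum(ratios) / len(ratios)
print(f"mean ratio (re-run / gold standard): {mean_ratio:.2f}")
```

A mean ratio well below one would show exactly the kind of consistent underestimate described above.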

The second really good thing to come out of this was that I managed to drill into James the importance of manipulating data with code instead of manually editing spreadsheets.  He in turn passed on this message when we presented our findings to the lab.  (Main finding: no-one is about to die of thallium poisoning.)  After the presentation, one of our toxicologists came up to me and said

“I finally get it.  I understand why mathematicians keep saying that you shouldn’t use Excel.  It’s because in order for your work to be reproducible and auditable, you need the trail of code to see what you’ve done.”

Major win.

  1. Matt
    21st October, 2010 at 18:24

    This was a serendipitous post for me… I just got done explaining to my team lead why we need to have our analysis coded rather than “eyeballing” the data… put much more succinctly in the second to last paragraph than I was able to convey…

  2. 29th October, 2010 at 11:03 am

    I come from the Pharmaceutical R&D industry, where each assay has to be ‘validated’, which involves specific tests to define how accurate and precise it is, and under what conditions those numbers might change.

    For your example I would expect the lab person to be able to say ‘within this range of concentrations, your result is accurate to +/- x% of the “true” value, and I would expect it to vary with approx y% coefficient of variation’.  The statistician (or anyone else using that data) should totally be aware of those numbers, otherwise he’s just turning garbage into garbage.

