Archive for the ‘MATLAB’ Category

Have my old job!

14th November, 2012 Leave a comment

My old job at the Health & Safety Laboratory is being advertised, and at a higher pay grade to boot.  (Though it is still civil service pay, and thus not going to make you rich.)

You’ll need to have solid mathematical modelling skills, particularly solving systems of ODEs, and be proficient at writing scientific code, preferably R or MATLAB or acslX. From chats with a few people at the lab, management are especially keen to get someone who can bring in money so grant writing and blagging skills are important too.

It’s a smashing place to work and the people are lovely.  Also, you get flexitime and loads of holiday.  If you are looking for a maths job in North West* England then I can heartily recommend applying.

*Buxton is sometimes North West England (when we get BBC local news) and sometimes in the East Midlands (like when we vote in European elections).

Tags: , , , , ,

Viewing the internals of MATLAB Matrices

31st January, 2012 Leave a comment

A cool undocumented trick I just learnt from The MathWorks’ Bob Gilmore. If you type

format debug

Then printing any vector reveals information about its internal representation. For example:

x = magic(3)

x =

Structure address = 6bc1ab0 
m = 3
n = 3
pr = d8dccf0 
pi = 0
     8     1     6
     3     5     7
     4     9     2

The structure address is the address in memory where the matrix is stored, m and n are the number of rows and columns respectively of the matrix, and pr and pi are pointers to the addresses of the matrices storing the real and imaginary components of the matrix.

One interesting thing to look at is the representation of scalar numbers.

 y = 1

y =

Structure address = 6bc31e0 
m = 1
n = 1
pr = d790b90 
pi = 0

Yep: they are stored in exactly the same way as matrices: in the same way the “everything in R is a vector”, everything in MATLAB is a matrix. To finish up, here are some more examples for you to explore:

% higher dimensional arrays
rand(2, 3, 4)
% cell arrays (unfortunately not that revealing)
{1, magic(3)}
% sparse matrices (very interesting)

MATLAB’s stand out new feature

6th October, 2011 Leave a comment

It’s been a while since my last MATLAB post, not because I don’t love the language, but more because I do most of my blogging from home, where I have no license, and because (mostly thanks to R-bloggers) I get ten times as many page views for the R posts. (TODO: Create MATLAB-bloggers service.)

Having returned from holiday (it was lovely, thanks for asking) I’ve been trying out the latest release of MATLAB – R2011b. So far, the standout new feature is the automatic variable renaming. If you change the name of a variable at the point where it was declared, then pressing Shift+Enter lets MATLAB rename all other instances. IDEs for statically-typed languages have had this feature for years, but to see it in a dynamically-typed language is very impressive.

MATLAB's variable auto-renaming in action

Nomograms everywhere!

30th August, 2011 Leave a comment

At useR!, Jonty Rougier talked about nomograms, a once popular visualisation that has fallen by the wayside with the rise of computers. I’d seen a few before, but hadn’t understood how they worked or why you’d want to use them. Anyway, since that talk I’ve been digging around in biology books from the 60s and 70s, and it seems they are full of them. So for those of you who haven’t seen the talk, here’s how they work.

A basic nomogram consists of three scales. By reading off known values from two of the scales, you can estimate a third one. Here’s an example I found in the ICRP‘s reference manual.

Nomogram estimating skin surface area from height and bodyweight

It’s difficult to measure people’s skin surface area, but height and bodyweight are very straightforward. To use the nomogram, you place a ruler (or other straight edge) on the height* and weight scales and read of the point where the ruler crosses the surface area scale. I’m 177cm tall and weigh 72kg, so according to this, my estimated skin surface area is 1.89m2.

Of course nowadays, the standard way to solve this sort of problem is to write a function. Jonty suggested that the main modern use of nomograms is in fieldwork situations, where computers aren’t handily available. (His case study was Kenyan vets trying to estimate the weight of donkeys form there height and girth.)

Altman and Dittmer’s Respiration and Circulation has many more pretty nomograms. I was particularly impressed by those on blood pH, reproduced below for your pleasure.

Nomogram estimating excess base in blood from pH and CO2 pressure via Siggard Andersen

Nomogram estimating excess base in blood from pH and CO2 pressure via Singer Hastings

Your homework is to dig out a pre-1980 textbook and hunt for more nomograms.

*Gruesomely, the fact that the scale is labelled “length” rather than “height” makes me suspect that the bodies that provided the data were in a permanent lying down position – that is, they were corpses.

Friday Function: nclass

6th May, 2011 2 comments

When you draw a histogram, an important question is “how many bar should I draw?”. This should inspire an indignant response. You didn’t become a programmer to answer questions, did you? No. The whole point of programming is to let your computer do your thinking for you, giving you more time to watch videos of fluffy kittens.

Fortunately, R contains three functions to automate the answer, namely nclass.Sturges, nclass.scott and nclass.FD. (FD is short for Freedman-Diaconis; watch out for the fact that scott isn’t capitalised.)

The differences depend upon length and spread of data. For longer vectors, Scott and Freedman-Diaconis tend to give bigger answers.

short_normal <- rnorm(1e2) 
nclass.Sturges(short_normal)      #8
nclass.scott(short_normal)        #8
nclass.FD(short_normal)           #12
long_normal <- rnorm(1e5) 
nclass.Sturges(long_normal)       #18
nclass.scott(long_normal)         #111
nclass.FD(long_normal)            #144

For strongly skewed data, you are best to use some sort of transformation before you draw a histogram, but for the record, Freedman-Diaconis again gives bigger answers for highly skewed (and thus wider) vectors.

short_lognormal <- rlnorm(1e2) 
nclass.Sturges(short_lognormal)   #8
nclass.scott(short_lognormal)     #9
nclass.FD(short_lognormal)        #20
long_lognormal <- rlnorm(1e5) 
nclass.Sturges(long_lognormal)    #18
nclass.scott(long_lognormal)      #443
nclass.FD(long_lognormal)         #1134

My feeling is that since each of the three algorithms is rather dumb, it is safest to calculate all three, then pick the middle one.

nclass.all <- function(x, fun = median)

hist(log_islands, breaks = nclass.all(log_islands))

I also wrote a MATLAB implementation of this a couple of years ago.

It is worth noting that ggplot2 doesn’t accept a number-of-bins argument to geom_histogram, because

In practice, you will need to use multiple bin widths to
discover all the signal in the data, and having bins with
meaningful widths (rather than some arbitrary fraction of the
range of the data) is more interpretable.

That’s fine if you are interactively exploring the data, but if you want a purely automated solution, then you need to make up a number of bins.

calc_bin_width <- function(x, ...)
  rangex <- range(x, na.rm = TRUE)
  (rangex[2] - rangex[1]) / nclass.all(x, ...)

p <- ggplot(movies, aes(x = votes)) +
  geom_histogram(binwidth = calc_bin_width(log10(movies$votes))) + 
Tags: , ,

supercalifragilisticexpialidocious = 1

21st April, 2011 2 comments

I notice that the latest version of R has upped the maximum length of variable names from 256 characters to a whopping 10 000! (See ?name.) It makes the 63 character limit in MATLAB look rather pitiful by comparison. Come on MathWorks! Let’s have the ability to be stupidly verbose in our variable naming!

Presentation on testing frameworks

10th February, 2011 Leave a comment

It’s testing week in the Software Carpentry course. To commemorate the occasion, I’ve uploaded a presentation I gave a couple of years ago to evangelise the use of testing frameworks to make your life easier. Just for you, I’ve recorded an all new audio track to accompany the slides. The example uses MATLAB but the principle is applicable to any language.

Why you should use a testing framework