Archive for January, 2012

Viewing the internals of MATLAB Matrices

31st January, 2012

A cool undocumented trick I just learnt from The MathWorks’ Bob Gilmore. If you type

format debug

then printing any array reveals information about its internal representation. For example:

x = magic(3)

x =


Structure address = 6bc1ab0 
m = 3
n = 3
pr = d8dccf0 
pi = 0
     8     1     6
     3     5     7
     4     9     2

The structure address is the address in memory of the header structure that describes the matrix, m and n are the number of rows and columns respectively, and pr and pi are pointers to the arrays storing the real and imaginary parts of the data. Here pi is 0 (a null pointer) because the matrix has no imaginary part.

One interesting thing to look at is the representation of scalar numbers.

 y = 1

y =


Structure address = 6bc31e0 
m = 1
n = 1
pr = d790b90 
pi = 0
     1

Yep: they are stored in exactly the same way as matrices. In the same way that “everything in R is a vector”, everything in MATLAB is a matrix. To finish up, here are some more examples for you to explore:

% higher dimensional arrays
rand(2, 3, 4)
% cell arrays (unfortunately not that revealing)
{1, magic(3)}
% sparse matrices (very interesting)
sparse(ones(3))
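
As an aside, the R half of that comparison is easy to check from an R session. This is only a sketch of the analogy, not part of the MATLAB trick; the variable names are made up for illustration.

```r
y <- 1
is.vector(y)   # TRUE: a "scalar" is just a vector
length(y)      # 1
dim(y)         # NULL: a plain vector has no dim attribute

# Unlike MATLAB, R only treats it as a matrix if you ask for one
m <- matrix(y)
dim(m)         # 1 1
```

So the two slogans are not quite symmetric: in R the basic unit is a dimensionless vector, whereas the MATLAB output above always reports m and n.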

Exploring the functions in a package

26th January, 2012

Sometimes it can be useful to list all the functions inside a package. This is done in the same way that you would list variables in your workspace. That is, using ls. The syntax is ls(pos = "package:packagename"), which is easy enough if you can remember it. Unfortunately, I never can, and have to type search() first to see what the format of that string is.
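
As a quick illustration of where that string comes from, the sketch below prints the search path and then lists the contents of one attached package. It uses base because that package is always attached; the variable names are just for illustration.

```r
# search() returns the attached environments; packages show up as "package:<name>"
search_path <- search()
print(search_path)

# The same string is what ls() expects in its pos argument
base_objects <- ls(pos = "package:base")
head(base_objects)
```

Note that this form only works for packages that appear in the output of search(), that is, ones that are currently attached.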

Today, that problem is solved with a tiny utility function to save remembering things, and to save typing.

lsp <- function(package, all.names = FALSE, pattern) 
{
  package <- deparse(substitute(package))
  ls(
      pos = paste("package", package, sep = ":"), 
      all.names = all.names, 
      pattern = pattern
  )
}

all.names and pattern behave in the same way as they do in regular ls. You use it like this:

lsp(base)
lsp(base, TRUE)
lsp(base, pattern = "^is")


EDIT: I’ve had a couple of questions about the use case, and there are some interesting comments on alternatives. My thinking behind this function was that I sometimes know I’ve seen a function in a package but can’t remember what it’s called. If you can hazard a guess at the name, then apropos is probably better, though it looks everywhere on the search path rather than in a particular package. Autocompletion is also useful for this, but you need to know the first few characters of what you are looking for. (Activate autocompletion by pressing TAB in R GUI or RStudio, or CTRL+space in Eclipse. I can’t remember what the shortcut is in Emacs, but you probably just mash CTRL+META until you have RSI.) Finally, the unknownR package is useful for finding functions that you hadn’t heard of yet.
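
To make the comparison concrete, here is a rough sketch of two of those alternatives. The apropos call searches the whole search path by regular expression. The lsp_ns helper is hypothetical (it is not from the post or from any package); it assumes you want the exported names of a package that is installed but not necessarily attached, and uses getNamespaceExports for that.

```r
# apropos() matches a regular expression against everything on the search path
is_functions <- apropos("^is\\.")
head(is_functions)

# Hypothetical variant of lsp: lists a package's exports without attaching it
lsp_ns <- function(package, pattern)
{
  package <- deparse(substitute(package))
  exports <- sort(getNamespaceExports(package))  # loads the namespace if needed
  if(missing(pattern)) exports else grep(pattern, exports, value = TRUE)
}

tools_file_functions <- lsp_ns(tools, pattern = "^file")
tools_file_functions
```

The trade-off is that getNamespaceExports only returns exported names, so there is no equivalent of all.names here.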

Adding metadata to variables

6th January, 2012

There are only really two ways to preserve your statistical analyses. You either save the variables that you create, or you save the code that you used to create them. In general the latter is much preferred because at some point you’ll realise that your model was wrong, or your dataset has changed, and you need to re-run your analysis. If you only stored your variables then you are now stuck rewriting your code in order to create new versions, which is really not fun. On the other hand, if you saved your code, all you have to do is tweak it and run it.

Occasionally though, just keeping the code and rerunning an analysis isn’t practical, most obviously when it takes a long time. If your model takes more than ten minutes to run, it can be really useful to save its variables as well as the source code.

The problem with saving variables is that when you come back and load them six months later, it isn’t always obvious what they are or where they came from. With code, we solve this by using comments to jog our memory, so it would be nice to have an equivalent for variables. In fact, in R, such a facility exists with the – you guessed it – comment function.

library(lattice)
comment(barley) <- "Immer's barley data, 1934.  The data from the Morris site may have the wrong years."
comment(barley)

The comment function simply stores the string as an attribute of the variable, with one special rule: unlike other attributes, the comment is not shown when the variable is printed. Other common attributes that you may be familiar with are names for vectors and lists, and dim and dimnames for matrices.
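
A minimal sketch of that behaviour, using a made-up vector and comment string rather than the barley data:

```r
scores <- c(10, 12, 9)
comment(scores) <- "Pilot study, hypothetical values"

# Printing shows only the values, not the comment
printed <- capture.output(print(scores))
printed

# The comment is still there, stored as an ordinary attribute
comment(scores)
attributes(scores)
```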

You can find the names of all the attributes of a variable with the attributes function, and get and set individual attributes with attr.

x <- c(apple = 1, banana = 2)
attr(x, "type") <- "fruit"
attributes(x)
attr(x, "names") #same as names(x)

Attributes are really great for storing contextual metadata about a variable. For starters, when you come back to your saved workspace after those six months you might want to know who created the variable and when. To get this facility, we need an enhanced version of assign.

# Look up the current user's login name from the environment.
get_user <- function()
{
  env <- if(.Platform$OS.type == "windows") "USERNAME" else "USER"
  unname(Sys.getenv(env))
}

# Like assign(), but stamps the value with creator and creation-time
# attributes, plus any extra named attributes passed via ...
assign_with_metadata <- function(x, value, ..., pos = parent.frame(), inherits = FALSE)
{
  attr(value, "creator") <- get_user()
  attr(value, "time_created") <- Sys.time()
  more_attr <- list(...)
  attr_names <- names(more_attr)
  for(i in seq_along(more_attr))
  {
    attr(value, attr_names[i]) <- more_attr[[i]]
  }
  assign(x, value, pos = pos, inherits = inherits)
}

assign_with_metadata("x", 1:3, monkey = "chimp")

Notice the ... that allows you to add arbitrary attributes to the variable.
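
The point of all this is that the attributes travel with the variable when it is saved. The sketch below checks that with saveRDS and readRDS; it uses structure() with made-up values as a stand-in for a variable created by assign_with_metadata, so that it runs on its own.

```r
# Stand-in for a variable created by assign_with_metadata(); the values are made up
x <- structure(1:3, creator = "some_user", time_created = Sys.time(), monkey = "chimp")

# Save to a temporary file and read it back, as you might six months later
path <- tempfile(fileext = ".rds")
saveRDS(x, path)
restored <- readRDS(path)

attr(restored, "creator")
attr(restored, "monkey")
names(attributes(restored))

unlink(path)  # tidy up the temporary file
```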

While this is great, and solves the problem, typing assign_with_metadata is way too clunky. It would be much easier if we could just use <- to assign variables and get the metadata for free.

Actually, overriding <- itself is going to lead to slowness and likely errors. Since we don’t want to store metadata for every variable (just the important ones), it is better to define our own operators to do so.

`%<-%` <- function(x, value)
{
  xname <- deparse(substitute(x))
  pos <- parent.frame()
  assign_with_metadata(xname, value, pos = pos)
}

`%<<-%` <- function(x, value) 
{
  xname <- deparse(substitute(x))
  pos <- globalenv()
  assign_with_metadata(xname, value, pos = pos)
}

m %<-% "foo"    #local assignment with metadata
f <- function()
{
  n %<<-% "bar" #global assignment with metadata
}
f()

With these operators, if you want to save your variables for later, simply swap <- for %<-%.
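
One caveat worth checking before relying on this: base R drops most attributes when you subset a vector, so metadata attached this way describes the object as it was saved, not everything derived from it. A minimal sketch, using structure() and a made-up creator value so that it runs on its own:

```r
x <- structure(c(apple = 1, banana = 2, cherry = 3), creator = "some_user")
attr(x, "creator")           # "some_user"

first_two <- x[1:2]
names(first_two)             # names survive subsetting: "apple" "banana"
attr(first_two, "creator")   # NULL: the custom attribute is gone
```

If a derived variable matters enough to keep, it is probably worth assigning it with the metadata-aware operator again.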
