Friday, April 23, 2010

Timing Tests

OK - I'm not a timing test expert. I'm not a program profiling expert. etc etc etc

I know timing tests are 'hard to do right'. I know that there are all kinds of consideration. I know that 'to do it right, you have to . . .'

But I don't really care about 'doing it "right"' according to some picky standard.

What I do care about is not writing really slow code.

For that, the rules are simple:

Rule 1: Don't do things that take a long time

Rule 2: If you have to do repeat something a lot, check out alternative ways to do it and pick the method which is both clear to read / understand and takes the least time.

Rule 1 - expanded:

You do this by knowing how long things take and which operations are blocking. You don't need to be accurate, because for most things, 'takes a long time' is measured in orders of magnitude.

Here are the relevant cases:
  1. Monolithic program doing in-memory data access/processing
  2. Multi-threading/parallel processing/whatever - any form of parallel processing which is executed inside a single process context. Here you tend to lose because of blocking and communication - one thread needs to access shared data and so blocks reads, etc OR one thread needs results from another to continue OR etc.
  3. Self generating code - aka Metaprogramming. This is a cool way to impress your friends, but it costs orders of magnitude in performance. The idea is to write code which traps function calls to functions which don't exist, then parse the function name and build a function 'on the fly' to do the task encoded in the function name, dynamically build the call sequence, execute the function and return the result. It's not hard to do, but it's pretty much unnecessary (almost all the time) and really slows things down, because parsing strings takes a lot of repetitive work. That's why we have compilers!
  4. Disk reads are much longer - at a minimum they require a context switch as you make a system call. Then it depends on file size, caching in the operating system, memory size, etc. The Rule is: for repeated reads, try to do only Once and cache the result in a variable
  5. Run a subprocess - this requires a context switch, process invocation, lots of disk reads, etc etc followed by receiving result, parsing it, etc etc. Much more expensive than disk, but less expensive than Network reads.
  6. Network reads are longest - not only do they require a system call, you typically have to run another process someplace. If that process is on a different host, then the cost is astronomical relative to in-memory and disk i/o.
So Rule 1, says - if you don't really need to do the Slow Thing, then don't. And Do the Slow Thing as seldom as you can get away with.

Rule 2 - expanded

Right now I'm writing a lot of PHP (don't groan, it seemed like a good idea at the time) and in this code I have lots of places where I need to do things based on the value of a string. For example, I'm writing a lot of PHP5 objects where I put guards on attribute access so that I can find spelling errors (my High School English teachers understand why I need to do this).

So I have lots of functions that look like:

function __get($name) {
if (in_array($name, array('foo', 'bar', 'baz'))) {
return $this->$name;
}else {
throw new Exception("$name is not a valid attribute name");
}
}


or
function __get($name) {
if (($name == 'foo' || $name == 'bar' || $name == 'baz'))) {
return $this->$name;
}else {
throw new Exception("$name is not a valid attribute name");
}
}

or
function __get($name) {
switch ($name) {
case 'foo':
case 'bar':
case 'baz':
return $this->$name;
default:
throw new Exception("$name is not a valid attribute name");
}
}
I do this a lot, so I need to know which one is the fastest. I don't need to know precisely, I just need to know 'more or less'.

To do this, I need to build a test case and run it to get some timing numbers.

The test case doesn't have to be perfect, but it does need to put the emphasis on the differences between the three different methods. It also has to be large enough to be able to distinguish run times between the methods.

In this case, I built the three functions each with about 150 alternatives and the built a list of trials which would fail about 1/2 the time. I then executed each function a bunch of times.

How many is the right bunch? I'm lazy, so I start small for the number of repetitions and then crank it up until the total run time per method is around 10 to 60 seconds.

Here's what I got:
  • switch: 49.7731 seconds
  • in_array method: 86.3004 seconds
  • if with complex conditional: 57.0134 seconds
Guess which method I'm going with.

[guess how I'm going to refactor a lot of my code (sigh - I should have tested first)