Friday, April 23, 2010

Timing Tests

OK - I'm not a timing test expert. I'm not a program profiling expert. etc etc etc

I know timing tests are 'hard to do right'. I know that there are all kinds of considerations. I know that 'to do it right, you have to . . .'

But I don't really care about 'doing it "right"' according to some picky standard.

What I do care about is not writing really slow code.

For that, the rules are simple:

Rule 1: Don't do things that take a long time

Rule 2: If you have to repeat something a lot, check out alternative ways to do it and pick the method which is both clear to read / understand and takes the least time.

Rule 1 - expanded:

You do this by knowing how long things take and which operations are blocking. You don't need to be accurate, because for most things, 'takes a long time' is measured in orders of magnitude.

Here are the relevant cases:
  1. Monolithic program doing in-memory data access/processing
  2. Multi-threading/parallel processing/whatever - any form of parallel processing which is executed inside a single process context. Here you tend to lose because of blocking and communication - one thread needs to access shared data and so blocks reads, etc OR one thread needs results from another to continue OR etc.
  3. Self generating code - aka Metaprogramming. This is a cool way to impress your friends, but it costs orders of magnitude in performance. The idea is to write code which traps function calls to functions which don't exist, then parse the function name and build a function 'on the fly' to do the task encoded in the function name, dynamically build the call sequence, execute the function and return the result. It's not hard to do, but it's pretty much unnecessary (almost all the time) and really slows things down, because parsing strings takes a lot of repetitive work. That's why we have compilers!
  4. Disk reads take much longer - at a minimum they require a context switch as you make a system call. Then it depends on file size, caching in the operating system, memory size, etc. The Rule is: for repeated reads, try to read only Once and cache the result in a variable
  5. Run a subprocess - this requires a context switch, process invocation, lots of disk reads, etc etc followed by receiving result, parsing it, etc etc. Much more expensive than disk, but less expensive than Network reads.
  6. Network reads are longest - not only do they require a system call, you typically have to run another process someplace. If that process is on a different host, then the cost is astronomical relative to in-memory and disk i/o.
So Rule 1 says: if you don't really need to do the Slow Thing, then don't. And do the Slow Thing as seldom as you can get away with.
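
Here's a rough sketch of the 'read Once and cache it' idea from case 4. The function name cached_file_contents is made up for illustration (it is not a PHP built-in), and it assumes the file doesn't change while the script is running.

```php
<?php
// Hypothetical helper: read each file from disk at most once per run
// and serve every later request from a static array.
function cached_file_contents($path) {
    static $cache = array();   // keeps its contents between calls

    if (!array_key_exists($path, $cache)) {
        $data = file_get_contents($path);   // the Slow Thing: a real disk read
        if ($data === false) {
            throw new Exception("cannot read $path");
        }
        $cache[$path] = $data;
    }
    return $cache[$path];      // every later call is an in-memory lookup
}
```

The first call for a given path pays for the disk read; every call after that just pulls the string out of the array. If the file can change underneath you, this is the wrong tool.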

Rule 2 - expanded

Right now I'm writing a lot of PHP (don't groan, it seemed like a good idea at the time) and in this code I have lots of places where I need to do things based on the value of a string. For example, I'm writing a lot of PHP5 objects where I put guards on attribute access so that I can find spelling errors (my High School English teachers understand why I need to do this).

So I have lots of functions that look like:

function __get($name) {
    if (in_array($name, array('foo', 'bar', 'baz'))) {
        return $this->$name;
    } else {
        throw new Exception("$name is not a valid attribute name");
    }
}

or

function __get($name) {
    if ($name == 'foo' || $name == 'bar' || $name == 'baz') {
        return $this->$name;
    } else {
        throw new Exception("$name is not a valid attribute name");
    }
}

or

function __get($name) {
    switch ($name) {
        case 'foo':
        case 'bar':
        case 'baz':
            return $this->$name;
        default:
            throw new Exception("$name is not a valid attribute name");
    }
}

I do this a lot, so I need to know which one is the fastest. I don't need to know precisely, I just need to know 'more or less'.

To do this, I need to build a test case and run it to get some timing numbers.

The test case doesn't have to be perfect, but it does need to put the emphasis on the differences between the three different methods. It also has to be large enough to be able to distinguish run times between the methods.

In this case, I built the three functions, each with about 150 alternatives, and then built a list of trials which would fail about 1/2 the time. I then executed each function a bunch of times.

How many is the right bunch? I'm lazy, so I start small for the number of repetitions and then crank it up until the total run time per method is around 10 to 60 seconds.
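
A cut-down sketch of this kind of test looks something like the following. It is not the actual test code: the names check_in_array, check_if, check_switch and time_it are invented for illustration, each check has 3 alternatives instead of about 150, and the checks return true/false instead of throwing, so the loop can run straight through.

```php
<?php
// Hypothetical stand-ins for the three guard styles.
function check_in_array($name) {
    return in_array($name, array('foo', 'bar', 'baz'));
}

function check_if($name) {
    return ($name == 'foo' || $name == 'bar' || $name == 'baz');
}

function check_switch($name) {
    switch ($name) {
        case 'foo':
        case 'bar':
        case 'baz':
            return true;
        default:
            return false;
    }
}

// Call $func on every trial, $reps times over; return elapsed wall-clock seconds.
function time_it($func, $trials, $reps) {
    $start = microtime(true);
    for ($i = 0; $i < $reps; $i++) {
        foreach ($trials as $name) {
            $func($name);
        }
    }
    return microtime(true) - $start;
}

// Half of these trials match and half don't.
$trials = array('foo', 'nope', 'bar', 'zip', 'baz', 'zap');
$reps = 100000;   // start small, then crank it up until the times separate

foreach (array('check_switch', 'check_in_array', 'check_if') as $func) {
    printf("%-15s %.4f seconds\n", $func, time_it($func, $trials, $reps));
}
```

This measures wall-clock time only, so anything else running on the machine shows up in the numbers. That's fine when all you want is 'more or less'.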

Here's what I got:
  • switch: 49.7731 seconds
  • in_array method: 86.3004 seconds
  • if with complex conditional: 57.0134 seconds
Guess which method I'm going with.

[guess how I'm going to refactor a lot of my code (sigh - I should have tested first)]

Monday, April 19, 2010

Self Image, Self Identity and All That

Who am I?

Or, more to the point, what is the 'idea' of myself that I identify with?

Or, even more to the point, how do I 'like' to think about myself?

I put 'like' in quote marks because 'like' doesn't necessarily mean 'enjoy' or 'makes me happy', but here it means 'what I keep coming back to because I believe it's true'.

In other words, the way I 'like' to think about myself might not be very nice - if I'm convinced I don't measure up to my ideals.

Everybody 'thinks' of themselves as something - has an expectation of who and what they are. In other words, Everybody 'likes' to think of themselves in a particular way.

That's the setup. Now, here are some questions:
  • Am I nothing more than an opinion? Or am I real?
  • Can I change 'Who I am' by changing my opinion?
  • Is my 'World View' a result of 'Who I am' or is it something I create for my 'Who I am' to live in?
  • Do I see and hear the world around me OR do I pick and choose what I hear and know?
  • Do I really know 'Who I am'?
  • Do I really know my friends? Or do I make them supporting actors for myself?
  • Can I live without knowing 'Who I am'?
  • How can I avoid living in an Illusion if I continue to 'know' 'Who I am'?
First Hypothetical

Let's suppose that 'Who I am' is an opinion.

Opinions are just ideas that can be changed. They aren't 'facts'.

If I hold one opinion today and another one tomorrow, it's unlikely I will be arrested, burst into flame, or that anything else substantial might happen.

I'll just have a different 'opinion'.

So, with my different opinion, won't the World be different?

Won't my friends become different people?

Won't the boundaries between good and bad and Right and Wrong shift? Won't they have changed just enough so I can make my 'opinion' work - at least as well as my old one did?

All I need do to test this is genuinely change my opinion once and see if this is what happens.

If it works like this, doesn't this mean that 'Who I am' is an opinion? An Illusion? and that I am living in a false world of my own creation?

Do I want to know?

Second Hypothetical

Let's assume 'Who I am' is somehow 'real'. It doesn't matter what this means other than that it is something other than an opinion which can be changed at a whim.

Now one part of my World View ranks everything by how 'good' and how 'bad' it is. There is usually a sliding scale from 'good' to 'bad' with 'saintly' on one end and 'absolute evil' on the other.

Naturally, I will think of myself as more 'good' than 'bad' - no matter how I think about how I live up to my expectations. [for example, if I think of myself as falling far short, then I will still think of myself as 'better' for having noticed this and for admitting it to myself]

So how will this affect my World View? How will I tend to filter and interpret that which I see?

Isn't it natural for me to look for the evil and bad - so I am - in contrast - much better 'than average'?

Won't I go out of my way to do so? Won't I respond with much satisfied emotion to my discoveries of the evil in others? Satisfied in my own 'goodness by contrast'?

How can we test this?

Isn't this consistent with the continuous litany of complaint and criticism - in the press, in entertainment, and in our own wool-gathering minds?

What happens if I see lots of goodness around me? Doesn't that push my 'Who I am' down into the muck of badness - or at least shift me down a little?

If I can't change my 'Who I am', then I will not 'like' myself (and remember 'like' means what I said it means up above). Isn't that hard to tolerate? 'Who do those "goody, goodies" think they are anyway?' Doesn't it seem natural for Cain to kill Abel?

Third Hypothetical

Again, suppose 'Who I am' is an opinion.

Then there must be something which has that opinion.

That something must be able to observe - inasmuch as it has thoughts, the 'opinion' being one of them.

So can this 'something' watch its opinion and the thoughts its opinion is thinking? (or maybe the thoughts it is thinking for its opinion).

If this is true, then 'Who I am' is an opinion and the 'something' can become aware of this.

How can we test this?

Can we watch our own thoughts? As we think them?

If we can, then this is true and it opens the _possibility_ that the 'Who I am' is an opinion and that it can be changed.

If this is true, then can't psychic trauma be impermanent? And if impermanent, can't it dissipate? And if dissipated, hasn't it been healed?

Further, how is an opinion maintained? It isn't made of wood or metal. It has no substance other than thought. If thought isn't thinked, then it isn't. It's not there. It's gone.

So, if psychic trauma is thought, isn't it impermanent and has to be 'thinked' over and over again in order to be? So isn't not-thinking it the path to its dissolution?

Is the dwelling on 'the bad things' and 'how sick I am' the cure or the cause of disease and despair?

Fourth Hypothetical

'Who I am' and 'Who You are' are different.

It doesn't matter if they are real or just opinions.

You see the world differently from how I see the world because you must shape your 'world' so it fits your 'Who I am' and I must do the same.

But mine is different from yours, so our 'worlds' are different.

Can I really see 'Who You are'?

Can I do more than guess?

Suppose your 'Who I am' world conflicts with my 'Who I am', from my point of view. Won't I filter and squash what I see and hear to fit my 'Who I am' instead of yours (no matter how honest, just, and polite I think I am)?

So how can I ever see where you don't make my 'Who I am' good and right? And can you see me?

Deep down don't you think you're a little better than me? I know I am a little better than you - or at least a little righter.

Doesn't that prove we can never know each other?

How can we converse?

Aren't we having two meaningless conversations with ourselves while we pretend that the other is there?

Fifth Hypothetical

I find that knowing 'Who I am' leads to an expectation that I will continue to be 'Who I am' and that I interpret and bend everything I see, hear, taste, smell, feel and think so that that will be true. I insist on continuing my existence as I envision it.

Doesn't this mean that I've been living in an illusion?

Can I escape the illusion without giving up this expectation, this prediction of the future?

Realizing this, can I continue to maintain the illusion - knowing it is a lie?

If I give up my expectation that I will continue to be 'Who I am', won't this mean I will change 'Who I am' into something else? And can I tolerate replacing one 'Who I am' with another?

Copyright Mike Howard, 2010. All rights reserved.

Saturday, April 10, 2010

News Flash: PHP Documenters Insane!!!!

The PHP documentation has gone from very useful to hideously obstructive.

The people who are rearranging the doc into little, tiny chunks which are hyperlinked all over the place obviously never write code.

I just spent 10 minutes trying to find the name of an IO Exception so I can use it in some code I'm writing.

Old Doc:

  1. I would go to the index, click on Exceptions and then scroll down the page (or do a find on IO) and there it would be. 10 seconds tops.

New Doc:
  1. Go to the index, click on Predefined Exceptions
  2. Click on Exception - find description of Exception Object - info not there
  3. Back Button
  4. Click on Error Exception - find description of Generic ErrorException object
  5. Back Button
  6. Click on SPL Exceptions (what the hell is this? - something new?)
  7. Look at Table of contents: 13 Exception Categories - none of which looks like an IOException
  8. Click on Predefined Exceptions in the See Also -
  9. Back to Previous Useless Page - And Repeat

First they completely screw up the Perl Regular Expression page by chopping it into tiny, obscure chunks and now they destroy the exception documentation.

To the PHP Documentation Project:

PLEASE put it back the way it was.

Or get somebody who actually uses this stuff like a handbook while writing code to fix it.

Or shoot somebody.

To Everybody Else:

Maybe the documentation people have stock in a book company and want the reference books to succeed by making the online Doc unusable?

All I can say is that the way they are going is really going to help Rails and Django.

What do you think?

P.S. Please Send a Nasty Note to the Gods of PHP