Sunday, January 27, 2008

DSL Design - that's Domain Specific Language

A Domain Specific Language - for those who don't know - is a bunch of functions named so that writing function calls 'reads' like natural language. DSLs seem to be sprouting wildly in Ruby - most probably because Ruby doesn't require parentheses around function arguments and Ruby programmers are kind of rebels anyway.

For example:
rabbit_jumps_in_the_hole :hole_size => 10

Even if you don't understand Ruby Syntax, you know what the function call does.
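
Under the hood there is nothing magic going on: the 'sentence' is an ordinary method call with an options hash. Here is a minimal sketch of what might sit behind that call. The method body, the default hole size, and the returned message are all made up for illustration.

```ruby
# Hypothetical definition behind the DSL call; the body and the default
# value are invented, only the method name and option come from the post.
def rabbit_jumps_in_the_hole(options = {})
  hole_size = options.fetch(:hole_size, 5)
  "The rabbit jumps in a hole of size #{hole_size}"
end

# The DSL-style call: no parentheses, no braces around the hash.
message = rabbit_jumps_in_the_hole :hole_size => 10
puts message

# Exactly the same call, written the 'boring' way.
puts rabbit_jumps_in_the_hole({ :hole_size => 10 })
```

Both calls do the same thing; the DSL flavor comes purely from the naming and from Ruby letting you drop the punctuation.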

Think of it this way:
  1. Function/methods are really 'verbs'.
  2. Function Options are 'adverbs'
  3. Object Identifiers are 'nouns'
  4. Object Attributes are 'adjectives'
  5. Class Names are 'class names' [gotcha!!!!]
I think we should think of learning one of these DSL things the same way we think about learning a new Language.

This can be either a good thing or very bad.

Size Matters

Which is easier to learn: A language with 10 verbs or one with 1,000?

Just for the heck of it, I recently tried to get a count of the 'verbs' in Rails 2.0.2. I ran egrep -r 'def [a-z]' on all lib directories and came up with:
  • actionmailer/lib 694
  • actionpack/lib 1393
  • activerecord/lib 1134
  • activeresource/lib 125
  • activesupport/lib 577

  • Total 3923
In contrast, the Merb Framework is much smaller:
  • merb 713
  • merb.rb 19
  • tasks.rb 0

  • Total 732
Of course, this isn't fair because Merb doesn't come with an ORM [Object Relational Mapper library (bunch of database access functions - for those really out of it)] [or Active Record Pattern Implementation, for those . . . - well, you know who you are], so you have to add that in.

But Merb gives you a choice of ActiveRecord - with its 1,100 verbs; DataMapper - with about 500 methods; or Sequel - with about 600.

So learning Merb should be easier than Rails because the vocabulary is about 1/3 to 1/2 the size.
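
If you want to try the same kind of verb count without egrep, a rough Ruby equivalent is sketched below. The name count_verbs is made up, and I'm assuming only *.rb files matter (the egrep run looked at every file), so the numbers may not match exactly.

```ruby
# Count lines that look like lowercase method definitions, roughly what
# piping `egrep -r 'def [a-z]' DIR` into a line counter would report.
# Assumption: only *.rb files are scanned.
def count_verbs(dir)
  Dir.glob(File.join(dir, '**', '*.rb')).sum do |path|
    # binread avoids encoding errors on files with odd bytes
    File.binread(path).each_line.count { |line| line.match?(/def [a-z]/) }
  end
end
```

Calling count_verbs('activerecord/lib') from the top of a Rails checkout would give a number comparable to the ones above. Like the grep, it also counts 'def' lines inside comments and strings, so treat it as a ballpark figure.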

Synonyms are Bad

Which is easier to Learn: a language with one word for each concept or with two or more?

A programming language or environment isn't meant for composing poetry, novels, or movies. It's supposed to precisely express a procedure. Period. It should be concise. That makes it easier for Programmers to understand.

Case closed. DSLs should be concise, singular, and boring - but very, very accurate.

Corollary:

The Rails Inflector is a mistake in every possible way:
  • It Expands rather than Tightens the vocabulary of the Rails DSL
  • It injects confusion because Programmers now have to worry about singular and plural forms depending on context
  • It doesn't work:
    • 'XMLClass'.underscore => 'xml_class'
    • 'XMLClass'.underscore.camelize => 'XmlClass'
    • 'slave'.pluralize == 'slaves'
    • 'slave'.pluralize.singularize == 'slafe'
  • It wastes lots of cycles doing it - machine, programmer, and learning
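
To see why the XMLClass round trip can't work, here is a naive sketch of the two inflections. This is not the Rails code - underscore and camelize below are simplified stand-ins I wrote for illustration - but any pair that lowercases everything on the way down has the same problem: the casing of the acronym is thrown away and can't be recovered.

```ruby
# Naive stand-ins for the two inflections, for illustration only.
def underscore(name)
  name.gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2') # split an acronym from the next word
      .gsub(/([a-z\d])([A-Z])/, '\1_\2')     # split lower/digit from the next capital
      .downcase                              # this is where the information is lost
end

def camelize(name)
  name.split('_').map(&:capitalize).join
end

original   = 'XMLClass'
round_trip = camelize(underscore(original))

puts underscore(original)    # xml_class
puts round_trip              # XmlClass
puts round_trip == original  # false
```

Once 'XML' has become 'xml', nothing in the string says it used to be an acronym, so the only fix is a lookup table of special cases - which expands the vocabulary again.
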
Distance is Good

I'm talking about the distance between words. For example frog is very close to frogs but far from toads. That makes it easier to tell a frog from a toad in print than in real life.

Good DSL design should not only use expressive and concise identifiers, but should also keep them far apart, especially when the referents do significantly different things.

Again, picking on Rails, the methods update_attribute(name, value) and update_attributes(attributes) are very close together, but one bypasses attribute Validation. Can you tell which one by the names? Don't you think it's important to know?
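
One way to put a number on 'distance' is plain edit distance: how many single-character inserts, deletes, or swaps it takes to turn one name into the other. That is my assumption about a reasonable measure, not anything Rails or Ruby provides, and edit_distance below is a made-up helper.

```ruby
# Levenshtein edit distance between two strings, computed row by row.
def edit_distance(a, b)
  prev = (0..b.length).to_a            # distance from '' to each prefix of b
  a.each_char.with_index(1) do |ca, i|
    curr = [i]                         # distance from a[0, i] to ''
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1,            # delete ca
               curr[j - 1] + 1,        # insert cb
               prev[j - 1] + cost].min # keep or substitute
    end
    prev = curr
  end
  prev.last
end

puts edit_distance('frog', 'frogs')                            # 1
puts edit_distance('frog', 'toads')                            # 5
puts edit_distance('update_attribute', 'update_attributes')    # 1
```

By this measure the two Rails methods are as close together as frog and frogs, even though their behavior differs in a way that matters.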

DSL is Not Documentation

Most DSLs seem to grow more or less organically. The Ruby universe is filled with a lot of apparently useful packages with virtually no documentation. Almost all of them have fairly reasonable API documentation - which allows 'one' to learn what each of the 'verbs' in the DSL does, but that's like learning to drive a car by reading a Glossary of the Parts! It Just Don't Work.

It's hard as hell to learn a system without some sense of what the thing is supposed to be doing and how it's put together.

Don't believe me?

Figure out a Car from stuff like this:

Wheel - 1. circular object in contact with ground; 2. circular object interfacing driver to directional controls.
Nut - 1. Device for attaching wheel; 2. driver in other automobile; 3. nutritious snack
etc.

That's API doc and that's what you've got when all there is is the DSL.

'nuff for now

Friday, January 25, 2008

Is Java Bad for You?

How do Heavily Constrained programming environments - such as Java, C#, and friends - affect our thinking and creativity?

I think the goal of heavy constraints and requirements started out as a way to get better code by automatically checking as much stuff as possible mechanically. It all started with compile time type checking and has extended into things I don't want to know about.

Anyway, the result is that it's hard to write code with these tools. From what I hear - and I don't do Java, C#, and friends - the 'programmers' spend most of their time figuring out what API's and Design Patterns to use. I don't find that fun at all.

I usually spend most of my time trying to better understand the problems I'm trying to solve and creating software structures which mimic the nature of the problem. After I've coded and tested one of these structures, I call it a 'solution'.

In other words: I don't use Design Patterns and don't think in terms of API's.

Does that make me a dinosaur?

I don't think so.

I gravitate toward unrestricted programming environments. My first introduction was SCO Xenix around 1986 or 87. I became really excited as I realized how easy it was to do mundane tasks by stringing filters together in pipelines. I could accomplish more useful work in 1/10th the time [or so it seemed] than I could writing special programs to do the same thing.

In addition to being faster, it was more fun. I spent more and more of my energy solving problems rather than conforming to code writing rules.

The same thing happened when I discovered Python - and now to a similar extent Ruby. Scripting languages with good support for dynamic strings, arrays, hashes and objects are wonderful. They handle all the details of what I need - as a coder - to do the job.

How do I keep from hurting myself when the Programming Environment doesn't keep tabs on me?

Well, it's not a problem. I just test as I go and keep rewriting my code so it works, is more succinct, and tighter. [I guess you call that Refactoring now - we used to call it rewriting]

The facts are that Anyone can write bad code - and restrictive frameworks don't stop them. Anyone can also write good code - if they take the time to learn how and pay attention to what they are doing. And restrictive languages don't help with that either.

When I need a solution - I just think one (or two or three) up and try it out. It's easy to write the code, change it, test it, and refine it. In a verbose, API laden environment like Java, that cycle isn't so easy - or at least is a heck of a lot more verbose.

I suspect that restrictive environment programmers get dulled down by the drudgery of just writing the code and learning all the API's. There is so little room for creativity that they lose it - creativity is something you have to practice and cultivate.

As a result, they get used to solving problems by applying packages and patterns. They don't really design: they apply old designs to new problems and hope that they work. [BTW, that's the reality behind Anti-Patterns]

So, I guess it makes sense: if everything you do is a cut-and-paste of something somebody else thought up, you would apply that to Design as well.

God, that's boring!

If I'm right, then Design Patterns are a result of Boring Programming Tools which create Bored, Dull Programmers and more Bad Code.

I don't think that's a good thing.

What do you think?

Wednesday, January 9, 2008

Code Efficiency in Ruby

I got interested in the quality of Ruby code in Rails when I noticed what appears to me to be a useless method in the ActiveRecord code. Specifically, ActiveRecord::Base#save is a public method which calls the private method ActiveRecord::Base#create_or_update. That doesn't make a lot of sense to me, because it could be replaced by making 'create_or_update' public and then aliasing 'save' to it.
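
To make the two shapes concrete, here is a toy sketch. WrappedRecord and AliasedRecord are made-up classes, not ActiveRecord code, and the method bodies just return a symbol so the difference in structure is all there is to see.

```ruby
# Shape 1: a public method that only forwards to a private one
# (hypothetical stand-in for the save/create_or_update pair).
class WrappedRecord
  def save
    create_or_update
  end

  private

  def create_or_update
    :saved
  end
end

# Shape 2: the worker method is public and 'save' is just an alias for it.
class AliasedRecord
  def create_or_update
    :saved
  end
  alias_method :save, :create_or_update
end

puts WrappedRecord.new.save   # saved, after two method dispatches
puts AliasedRecord.new.save   # saved, after one
```

The trade-off is that the alias version exposes create_or_update as a public name, which adds one more verb to the vocabulary.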

So I decided to check to see what the superfluous method call costs.

I wrote a test which performed a simple task [incrementing an instance variable by a random number between 1 and 10] using five (5) different ways of accessing the instance variable and invoking the action.

The class definition is at the bottom of this post.

I then ran these methods 10,000,000 times using four different ways of invoking the methods:

  • directly calling the methods - e.g. foo.inc_instance_variable()

  • invoking via the Method object's 'call' method - e.g. foo.method(:inc_instance_variable).call()

  • invoking the method via 'send' - e.g. foo.send('inc_instance_variable')

  • invoking via 'eval'ing the string - e.g. eval 'foo.inc_instance_variable'
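
Roughly, the four styles look like this in Ruby. This is a sketch, not the timing program linked in this post: Counter and bump are made-up names, the loop count is tiny so it finishes quickly, and I'm assuming the Method object is fetched once outside the loop.

```ruby
require 'benchmark'

# Hypothetical stand-in for the class under test.
class Counter
  attr_reader :count

  def initialize
    @count = 0
  end

  def bump
    @count += 1
  end
end

counter = Counter.new
bound   = counter.method(:bump)   # Method object, created once
n       = 10_000                  # far fewer iterations than the real test

timings = {
  'direct' => Benchmark.realtime { n.times { counter.bump } },
  'call'   => Benchmark.realtime { n.times { bound.call } },
  'send'   => Benchmark.realtime { n.times { counter.send(:bump) } },
  'eval'   => Benchmark.realtime { n.times { eval('counter.bump') } }
}

timings.each { |label, secs| puts format('%-6s %.6f', label, secs) }
```

All four loops end up running the same method; only the route to it differs, which is what the timings are meant to isolate.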

The Percent Results are each run time divided by the minimum run time across all tests, expressed as a percent increase over that minimum.
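
As a worked example of that arithmetic, the sketch below uses the 'total' column (the third number) from the Direct Calls timings listed further down in this post. That this is the column the percentages were computed from is my assumption, but it does reproduce the published numbers.

```ruby
# Total times in seconds, copied from the 'Result using Direct Calls' block.
totals = {
  'foo.inc_var_as_instance'          => 7.11,
  'foo.inc_var_as_instance_alias'    => 7.04,
  'foo.inc_var_as_method'            => 8.44,
  'foo.inc_var_as_func_and_instance' => 8.82,
  'foo.inc_var_as_func_and_method'   => 10.22
}

fastest = totals.values.min   # 7.04, which is also the minimum over all tests

# percent increase over the fastest run
percent = totals.transform_values { |t| ((t / fastest - 1.0) * 100).round(2) }

percent.each { |name, pct| puts format('%-34s %6.2f', name, pct) }
```

For example, 8.44 / 7.04 is about 1.1989, which is the 19.89 percent shown for inc_var_as_method.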

Here's the summary:

  • Invoking via an Alias doesn't cost anything

  • unnecessarily accessing an instance Variable via an accessor slows down about 20%

  • The unnecessary method/function call slows down by about 25%

  • Combining the unnecessary call with accessor access slows down about 45% - so the effect is linear

  • Invoking by the 'call' method slows it down by about 13%

  • Using 'send' slows down about an additional 45%

  • Using eval slows the process down by something on the order of 400%, but the effect is not linear, so 'eval' must be doing some additional mucking about.

So, what's the point? None, if you're satisfied with glacial execution speeds.

On the other hand, it's something you should know if you are writing critical code and have to make choices about how to implement it.

As usual, your mileage may vary. The full program code is at http://www.clove.com/downloads/method-call-timing-tests.rb.

Here are the detailed Percentage Results

Percent Results for direct method of invocation
foo.inc_var_as_instance 0.99
foo.inc_var_as_instance_alias 0.00
foo.inc_var_as_method 19.89
foo.inc_var_as_func_and_instance 25.28
foo.inc_var_as_func_and_method 45.17

Percent Results for call method of invocation
foo.inc_var_as_instance 14.06
foo.inc_var_as_instance_alias 12.93
foo.inc_var_as_method 32.53
foo.inc_var_as_func_and_instance 42.19
foo.inc_var_as_func_and_method 59.52

Percent Results for send method of invocation
foo.inc_var_as_instance 42.05
foo.inc_var_as_instance_alias 46.02
foo.inc_var_as_method 60.65
foo.inc_var_as_func_and_instance 75.57
foo.inc_var_as_func_and_method 94.03

Percent Results for eval method of invocation
foo.inc_var_as_instance 393.89
foo.inc_var_as_instance_alias 410.94
foo.inc_var_as_method 420.03
foo.inc_var_as_func_and_instance 458.10
foo.inc_var_as_func_and_method 577.84

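Here are the raw timing results (the columns follow the layout of Ruby's Benchmark output: user, system, total, and real time in seconds):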

Result using Direct Calls
foo.inc_var_as_instance 7.070000 0.040000 7.110000 ( 7.306318)
foo.inc_var_as_instance_alias 7.000000 0.040000 7.040000 ( 7.155283)
foo.inc_var_as_method 8.390000 0.050000 8.440000 ( 8.585970)
foo.inc_var_as_func_and_instance 8.780000 0.040000 8.820000 ( 8.950350)
foo.inc_var_as_func_and_method 10.160000 0.060000 10.220000 ( 10.378127)

Result using .call
foo.inc_var_as_instance 7.990000 0.040000 8.030000 ( 8.204646)
foo.inc_var_as_instance_alias 7.910000 0.040000 7.950000 ( 8.061320)
foo.inc_var_as_method 9.280000 0.050000 9.330000 ( 9.502992)
foo.inc_var_as_func_and_instance 9.940000 0.070000 10.010000 ( 10.249010)
foo.inc_var_as_func_and_method 11.170000 0.060000 11.230000 ( 11.439927)

Result using 'foo.send '
foo.inc_var_as_instance 9.950000 0.050000 10.000000 ( 10.220120)
foo.inc_var_as_instance_alias 10.220000 0.060000 10.280000 ( 10.462063)
foo.inc_var_as_method 11.250000 0.060000 11.310000 ( 11.514329)
foo.inc_var_as_func_and_instance 12.290000 0.070000 12.360000 ( 12.583919)
foo.inc_var_as_func_and_method 13.590000 0.070000 13.660000 ( 13.891333)

Result using 'eval '
foo.inc_var_as_instance 34.550000 0.220000 34.770000 ( 35.387702)
foo.inc_var_as_instance_alias 35.740000 0.230000 35.970000 ( 36.670451)
foo.inc_var_as_method 36.390000 0.220000 36.610000 ( 37.296297)
foo.inc_var_as_func_and_instance 39.050000 0.240000 39.290000 ( 40.033456)
foo.inc_var_as_func_and_method 47.420000 0.300000 47.720000 ( 48.638035)

Here's the class definition:

class Foo
  attr_accessor :var

  def initialize
    @var = 0
  end

  # access the instance variable directly
  def inc_var_as_instance
    @var = @var + 1 + rand(10)
  end
  # access instance variable directly, but use an alias
  alias_method :inc_var_as_instance_alias, :inc_var_as_instance

  # access instance variable via accessor method, even though inside class instance
  def inc_var_as_method
    self.var = self.var + 1 + rand(10)
  end

  public
  # add an additional method call to accessing via direct access to instance variable
  def inc_var_as_func_and_instance
    inc_var_as_instance
  end

  # add an additional method call to accessing via accessor
  def inc_var_as_func_and_method
    inc_var_as_method
  end
end

Saturday, January 5, 2008

Design Patterns and the Fall of S/W

One nice thing about having a blog nobody reads is that I can say anything I want without worrying about it biting my butt.

I hate Design Patterns.

It's that simple.

I hate the guys who promote them.

Most of all, I hate the s/w industry - especially the programmers - for being duped by these guys.

And I'm qualified.

I received an engineering education and am a self-taught computer 'something'. I'm not exactly a programmer, although I've written an awful lot of code in a variety of environments - all the way down to machine code on microprocessors up to hokey database/user interface stuff. I've done device drivers and created my own little languages using lex & yacc and 'in the raw' in C, Python, and awk. And I've been doing this over 40 years - so I've earned the right to be a grouch.

The Design Pattern guys are the current generation of Yourdon, Coad, and Booch: Consultants who watch other people create software while telling them how to do it right. None of them actually do anything, but they sure create a lot of bad advice.

I remember buying three (3!) books by Peter Coad on Object oriented programming and design only to find out that he admitted that he didn't really know anything about it. The idiot had gotten excited about the idea, so he and his group spent a year puttering with it and writing books - probably giving lectures and doing expensive consulting with BIG companies at the same time.

It takes years of experience doing something to understand the concepts. [things move quickly, but our brains take a while to catch up]. Hell, it takes 10 years plus to design, implement, and knock most of the bugs out of any programming language.

The Design Pattern guys are the absolute worst! They claim that they are defining these things to add clarity to the software process and then they write the vaguest crap imaginable. Don't believe me?

Grady Booch: "In the world of software, a pattern is a tangible manifestation of an organization's tribal memory." - Core J2EE Patterns (introduction) [clipped from PHP 5 Objects, Patterns, and Practice - by Matt Zandstra]


Merriam-Webster Dictionary: "1. an ideal model. 2. something used as a model for making things. 3. Sample"


Which is clearer? If you like Booch - then you need to join a consulting company and stop pretending to write code.

I'm starting to froth at the mouth, so it's time to simplify things. Here's a simple procedure to see for yourself.

1. Go to a bookstore and pick up one of Martin Fowler's many books on patterns. WARNING: Do not buy the book.

2. Select a pattern at random and read the first paragraph describing it.

3. Answer yourself honestly: Do I know what this 'pattern' is well enough to describe it in a single sentence?

if No - read the rest of the description and try again

if still No - replace book on shelf.

if Yes, please send that sentence to me.

Thanks,
Mike