Wednesday, December 15, 2010

Code Line Numbers and Productivity

What sparked this post was a homework assignment for my artificial intelligence class.  We were instructed to write a program that could solve instances of Knuth's conjecture (more info at: http://www.perlmonks.org/?node_id=443037).  That is, given a positive integer, it would produce the series of mathematical operations necessary to transform 4 into that integer.  We were to use breadth-first search, and had our choice of Lisp, Scheme, and Python.
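For readers who want to see the shape of the problem, here is a minimal sketch of such a breadth-first search in Python.  It is not my homework solution (that was in Lisp).  It assumes the usual statement of the conjecture, where the allowed operations are factorial, square root, and floor, and it folds square root and floor into a single integer step, which is a simplification of mine.  The names `solve` and `max_fact_arg` are made up for this example, and the cap on factorial arguments is an arbitrary way of treating huge numbers as dead search paths.

```python
from collections import deque
from math import factorial, isqrt


def solve(target, start=4, max_fact_arg=30):
    """Breadth-first search from `start` to `target`.

    Returns the list of operation names on a shortest path,
    or None if the target is not reachable under the cap.
    """
    # parent maps each visited value to (previous value, operation used)
    parent = {start: None}
    queue = deque([start])
    while queue:
        n = queue.popleft()
        if n == target:
            # Walk the parent links back to the start to recover the path.
            ops = []
            while parent[n] is not None:
                n, op = parent[n]
                ops.append(op)
            return ops[::-1]
        # floor(sqrt(n)) is always available.
        successors = [("floor-sqrt", isqrt(n))]
        # Factorial only for 3 <= n <= max_fact_arg: 1! and 2! go nowhere,
        # and anything above the cap is treated as a dead search path.
        if 3 <= n <= max_fact_arg:
            successors.append(("factorial", factorial(n)))
        for op, m in successors:
            if m not in parent:
                parent[m] = (n, op)
                queue.append(m)
    return None
```

Under these assumptions, `solve(5)` finds the classic answer: two factorials followed by five floored square roots.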

Initially, this sounded like a lot of work.  I have implemented solvers for similarly simple problems in both C++ and MIPS assembly.  In C++, given a good design, such a problem takes several hundred lines.  I assume it would be shorter in a language with a garbage collector, perhaps Java, as a lot of the trickiness had to do with memory management.  With MIPS...well, ouch.  That was over 1,000 lines of C that I hand-translated into about 3,200 lines of assembly.  Plus, the MIPS version was only a very simple DFS as opposed to a BFS.  My hands were cramping before I even started to write anything.

I'm way more familiar with Lisp than Python, so I went ahead with that language.  To my surprise, I went from design to completely working code in about two hours.  Stripping out whitespace and comments, it was only 52 lines in the end.  Not only that, the solution it printed was actual Lisp code, which could then be executed to verify that it was indeed correct (it was).

"But it's slow!"  Hmm...not really.  For every number I gave it, it had a solution within a second.  Because of the factorials involved, numbers tended to overflow quickly, which my design treated as a dead search path.  I'd have had to deal with the same overflow in C++ or any other language.  Perhaps I should have used a special data type that can handle arbitrarily large numbers, but this was just a toy.  Plus, I could have improved its performance significantly if I had used destructive list operations as opposed to nondestructive ones.

I would really hate to implement this in a more traditional language.  Printing out actual executable code would require much more work than in Lisp.  (For comparison, in Lisp, say there is an expression E that generated the previous number.  Say that square root is used to get the next number.  Adding square root to the previous expression is simply `(sqrt ,E), which returns a new expression with square root wrapped around the old one.)  Let me translate: more code.  More code is more lines.  People often somehow translate this to better or more productive.  It's not.  It's just foolish.
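To make the comparison concrete, here is a rough Python sketch of the same idea.  It assumes expressions are stored as nested tuples; the helper names `wrap`, `to_lisp`, and `evaluate` are invented for this example.  Notice that I have to write both the printer and the evaluator myself, which Lisp gives me for free.

```python
import math

# Only the operations this example needs; the table is an assumption
# made for illustration.
OPS = {"sqrt": math.sqrt, "floor": math.floor}


def wrap(op, expr):
    """Wrap an operation around an existing expression, like `(op ,E)."""
    return (op, expr)


def to_lisp(expr):
    """Render a nested-tuple expression as Lisp-style text."""
    if isinstance(expr, tuple):
        op, inner = expr
        return f"({op} {to_lisp(inner)})"
    return str(expr)


def evaluate(expr):
    """Evaluate a nested-tuple expression by recursing from the inside out."""
    if isinstance(expr, tuple):
        op, inner = expr
        return OPS[op](evaluate(inner))
    return expr
```

For example, `wrap("floor", wrap("sqrt", 4))` is the tuple form of taking the square root of 4 and then flooring it, and `to_lisp` turns it back into text that looks like the Lisp version.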

I have two projects with my name on them that have passed the 10,000-line mark (note that my code/comment ratio is around 50/50, so 5,000 is a better estimate).  One is in Perl, and the other in Scala (a functional language with striking similarity to Java).  Without going into detail about what they are, rest assured that the Scala project is far more complicated, and it does far more.  (A little more detail: the Perl project is a Web interface over a database, and the Scala project is an implementation of my own programming language.)  With the Perl project, a significant amount of my time consists of actual coding.  There is very little thought involved; much of it is self-explanatory.  The limiting factor of that project is how fast I type.

The Scala project is the complete opposite.  For that, I have dozens of pages of notes describing design and thought processes.  The limiting factor for that project is my imagination.  For example, I spent two days trying to come up with a routine that ended up being around 100 lines.  After I wrote the code for it, I stared at it for 10 or 20 minutes, flabbergasted at what had just happened.  I exploited multiple language features of Scala that are just plain not often seen, both in terms of traditional languages and traditional thought.  I took full advantage of recursion, anonymous functions, closures, and pattern matching in addition to the more traditional objects.  The code was clean, made sense, and it worked the first time.  The difficult part was not the code, but the ideas that went into it.  I was making the language do the work that normally my fingers would have.  100 lines in two days?  It can be a lot more than you think.

Don't fall into the trap of equating line counts with effort!  If I had a job where I was paid by the line, I would implement everything in assembly.  I'd be rich, without having much of anything to show for it.
