Sunday, July 29, 2007

Dirty GOTO, vi Harry

Some Brown Thrasher (well, you can't choose your own name, can you) wrote that there are substantial "productivity gains" to be had from using these "higher level languages". Strange that I never experience them in my professional life. What I do experience are buggy programs that nobody seems to be able to fix, bloated executables with snail-like performance, and projects for trivial applications that never seem to end.

Which is quite understandable. If you don't factor well enough you end up with very large programs. If you rely on an already bloated compiler in an already bloated environment that is what you get. Worse, since there are so many layers on top of you, you have no idea what is going on. You don't have a clue where memory leaks come from since you never allocated that memory yourself.

I write code generators with a compiler I wrote myself. When I encounter a bug I always have to ask myself whether it is in the generated code I modified, in the code generator or, worse, in the compiler itself. That is about the complexity I can handle. But more than once I stumbled upon a bug in the browser or the scripting language.

A simple application in PHP takes me about a week, a more complex one two. Other programs, like simple converters, I do in about an hour. I'm not the kind of programmer who writes down the code and gets it right the first time. I always envied the guys that can do that. But I'm the king of the world at work. Very strange, since it is not even my job to write programs. Worse, I shouldn't even. But if you want to do a job well, you obviously have to do it yourself.

My whole life I've been a rebel; maybe that is why I chose to do this job. Hush, computer science has nothing to do with science. It is about doing neat things with the tools you've got. And that is actually a paraphrased quote from Richard Feynman, a notorious Nobel Prize winner. I always had a liking for Feynman, because he always stuck his nose where it shouldn't be. Computer science was only one of the victims, psychology was another one.

Like I said before, two things can't be right at the same time. Object orientation can't be right when relational databases are right. Actually, they are both wrong, because they are simply ways of doing things and of handling different kinds of problems.

Structured programming is quite neat, but sometimes a well-placed GOTO is much more elegant. There is a good reason that C kept it. It is also much faster. No switch() statement can ever achieve the speed of a computed GOTO. Nowadays when you use a GOTO people look at you as if you've been caught hiding a card up your sleeve at poker. GOTO is so unprofessional...

Another new trick is "template engines" and "separation of presentation and business logic". Basically, the idea is that you have one file with the HTML code and one file with PHP code. Of course, the idea is ridiculous. Why? Because the whole HTML development environment is a mess.

HTML was designed as a markup language. In markup languages you describe the layout of a document. Nothing more, nothing less. Then Netscape thought it was a neat idea to do some fancier stuff, so they added a procedural language on top of it. Since Sun had just introduced Java, they thought it was a good idea to make it look a bit like Java. This is the way JavaScript was born. In short, HTML was never designed to serve as a front end for business applications. Later they added CSS on top of it. That was a good move, because it was getting quite crammed in these little tags. XML was an even better move, but boy, we were getting quite far from home. This was not what the inventors had had in mind when they designed these little <B> and <I> tags, I guess.

SQL was never intended to be used by programmers. It was meant to enable end users to make their own little reports. Of course they failed miserably. That is why SQL is such a horror. Instead of redesigning the whole thing, tools were made on top of this flawed language. Did you know there are over 60 fundamental design errors in SQL? And of course, while SQL was designed to make procedural programming superfluous, procedural programming was still necessary, so a bunch of procedural dialects were glued on top of it, like Transact-SQL and PL/SQL.

Needless to say, it is always a nice surprise to see how the interpreter mangles your query into the most stupid scheme imaginable. That is why there are tools to show you how the data is finally collected and of course, there are several clever strategies you can choose from.

Ok, this is our web development environment. Some HTML, some JavaScript, some SQL, maybe a bit of PL/SQL. Are we happy now? No, let's add a preprocessor like PHP to complicate things a little further. Are we happy now? No, let's add some more fancy things like template engines. Ok, let's see. You got a template engine that cranks out PHP code. In some way, it is a code generator. Then we push that through PHP, which is also a code generator, disguised as an interpreter. Then we have to make a little visit to the SQL interpreter to collect some data. We push the whole thing over the line to the next interpreter, which is your browser. When we're lucky - and we almost always are - we also stroll along the JavaScript interpreter within your interpreter.

Ok, that is four different languages and at least as many interpreters one on top of another. How many points of failure is that? I sometimes wonder how I ever get the job done. What a gigantic waste of CPU power and storage! And that is called an architecture? In the old, primitive days there was no SQL. You had a buffer for each table. You put some data in it and called a routine to save or retrieve it. If there was a small table you used numerous times you stored it in memory and added some hashing for quick retrieval. You pulled it over the line only once and did the joining yourself. Of course you could use your own strategy to retrieve the data.

Of course you can do some of the same neat tricks with SQL too, but indirectly. I was once called in on a problem where Graphviz had to create a drawing out of some database data. PHP waited for 90 seconds and gave a timeout. Still no drawing though. I designed a query that selected the best candidate rows from the database and read them into memory, adding a little hashing to speed up the search. Even the most complex drawings were on screen within 15 seconds. The price? A megabyte of memory per query, max. Well, we had 4 gigs, so who was counting. Sure, you can never have enough hardware, but use it wisely.
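To make the trick a bit more concrete, here is a minimal sketch of it in Python rather than the PHP of the original job. The node and link rows, the column layout and all function names are made up for illustration; the real schema and query are not shown here. The idea is simply: pull the candidate rows over the line once, hash them on their key, do the join yourself, and only then hand the result to Graphviz as DOT text.

```python
# Hypothetical rows, standing in for the candidate rows fetched by ONE query each.
nodes = [(1, "router"), (2, "switch"), (3, "server")]   # (id, label)
links = [(1, 2), (2, 3), (1, 3)]                        # (source_id, target_id)


def build_index(rows):
    """Hash the rows on their first column so a lookup needs no table scan."""
    return {row[0]: row for row in rows}


def join_links(link_rows, node_index):
    """Resolve both ends of every link in memory instead of querying per link."""
    edges = []
    for source_id, target_id in link_rows:
        source_label = node_index[source_id][1]
        target_label = node_index[target_id][1]
        edges.append((source_label, target_label))
    return edges


def to_dot(edges):
    """Render the joined edges as Graphviz DOT text."""
    lines = ["digraph G {"]
    for source, target in edges:
        lines.append(f'  "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


node_index = build_index(nodes)
edges = join_links(links, node_index)
print(to_dot(edges))
```

The whole index lives in a plain dictionary, so the memory price is roughly the size of the rows you selected, which is why picking the candidate rows carefully matters more than the hashing itself.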

So, back to our template engines. You got two files where you first had one. Now, if you add a new column to a table you have to edit two files. If you don't use template engines you only have to edit one. If you just want to change the "presentation logic" (we used to call it "layout" in the old days), you have to edit one. If you have one file you still have to edit one file.

The problem is you haven't separated anything at all. You normally can't reuse any of the files because they are bound to the table layout. "But it is easier to maintain," proponents argue, "because you don't have to mix HTML with PHP". True. But what about JavaScript and SQL? I designed a simple PHP library that doesn't just encapsulate all HTML and JavaScript, but also most of the SQL code. So now I write almost pure PHP. Problem solved. No engine was harmed in making this application.

I've even got some nifty functions that allow me to read a table into an associative array or create a select box from a database query. Most of the time I don't even need to write the program by hand, since my code generator takes a CREATE TABLE statement and writes a fully functional program for it, complete with forward and backward buttons and without these horrible "edit" buttons. You can enter or update data right away.
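For the curious, a rough idea of what two such helpers could look like. This is a sketch in Python with the standard sqlite3 module, not the actual PHP library; the function names query_to_dict and select_box, the country table and its rows are all invented for the example.

```python
import sqlite3
from html import escape


def query_to_dict(conn, query, params=()):
    """Read a two-column query result into a dict: key column -> label column."""
    return {key: label for key, label in conn.execute(query, params)}


def select_box(name, options, selected=None):
    """Build an HTML <select> element from a dict of value -> label."""
    parts = [f'<select name="{escape(name)}">']
    for value, label in options.items():
        marker = " selected" if value == selected else ""
        parts.append(
            f'<option value="{escape(str(value))}"{marker}>'
            f"{escape(str(label))}</option>"
        )
    parts.append("</select>")
    return "\n".join(parts)


# Hypothetical table, kept in memory so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?)",
    [("NL", "Netherlands"), ("BE", "Belgium")],
)

countries = query_to_dict(conn, "SELECT code, name FROM country ORDER BY name")
print(select_box("country", countries, selected="NL"))
```

The calling code never touches a tag or a cursor: it names a query and a form field, and the HTML escaping happens in one place.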

Believe me, Brown Thrasher, productivity comes from clever tools, well-written abstract libraries and a good architecture, not from "higher level languages". The best tools in this line of work are still your head and a copy of Donald Knuth's "The Art of Computer Programming". This industry has been trying for forty years to make a boy do a man's job. It's like being a soldier. Every kid thinks he's invincible when you put a gun in his hands. The problem is: he isn't.

There are a lot of youngsters like you around that want to take on gramps with the fanciest Studio suite they can find. Go ahead, I'll take you on anytime, anywhere. But being as this is vi, the most powerful editor in the world, and would blow your head clean off, even over a 2400 baud line, you've got to ask yourself one question first: do I feel lucky? Well, do ya, punk? Go ahead, make my day!
