Saturday, June 19, 2010

Null: that little ugly Evil

When Java was created, one of the goals was to simplify C++ for the masses. And it certainly succeeded: Java is now among the most popular languages. In its C++-trimming task, however, one little ugly evil passed under the designers' noses and smuggled itself into Java: the null value. I can speculate about a couple of reasons, but I am more interested in first uncovering its ugly face in the light of programming types.

Null is a value that is assignable to all object types: it is a valid object reference in Java. This means that, as a value, it belongs to all object types. If you think about this for a minute you will realize that it doesn't make sense: it should be impossible for a programmer to handle a universal value for all types. It's like saying you could have one object reference that conforms to all possible APIs, type constraints and so forth. It's like putting a god in your program. But of course, such a value cannot exist, and instead of a god you are actually falling prey to a little evil. It promises to be assignable to any type, but it "conforms" to each of them by throwing the program into the bottom type: the type that has no values, the type of failing functions. It ruins your program.
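A minimal Java sketch of the promise and its betrayal (the variable names are mine, purely for illustration):

```java
public class NullDemo {
    public static void main(String[] args) {
        String s = null;                   // null claims to be a String
        Runnable r = null;                 // ...and a Runnable
        java.util.List<String> xs = null;  // ...and a List: one value, every type

        // The compiler accepts all of the above; the "contract" is honoured
        // only until the first use, which drops us into the bottom type:
        System.out.println(s.length());    // throws NullPointerException
    }
}
```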

What happens then, every Java and C# programmer knows: a NullPointerException is thrown, and the program can safely recover from the impasse - if he or she was careful enough to anticipate it. So, supposedly, Java programmers should thank the designers for creating a safe language that doesn't fall into a segmentation fault. Oh yes, thanks - you just hid the problem behind another level of indirection.
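Here is what that "safe recovery" typically looks like in practice; lookupName is a hypothetical method of mine that, like so many Java APIs, signals absence by returning null:

```java
public class Recovery {
    // Hypothetical lookup that reports "not found" as null.
    static String lookupName(int id) {
        return (id == 42) ? "Deep Thought" : null;
    }

    public static void main(String[] args) {
        String name = lookupName(7);
        try {
            System.out.println(name.toUpperCase());
        } catch (NullPointerException e) {
            // No segmentation fault, true - but the real question,
            // why was name null in the first place, is only deferred.
            System.out.println("unknown");
        }
    }
}
```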

Nowadays, as you can guess, there are proposed practices and patterns to avoid it. You can read about the Null Object pattern, or about practices to avoid null in C#. Could it have been solved from the start? Well, yes: there are languages out there that don't have a null value - Haskell or Clean, to name the ones I know - but they are not popular, because they were born in academia in the area of functional programming. In the object-oriented way of thinking it was probably too difficult to remove: can you force programmers to have a valid object reference all the time? Probably not, because those forced object references - say, objects built with a default constructor and void implementations, as suggested by the pattern mentioned above - would have been unusable anyway, wouldn't they? So why create objects that are not valid? It's inefficient anyway, isn't it?
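For the record, here is a sketch of that Null Object pattern in Java; the Logger names are illustrative, not taken from any particular library:

```java
interface Logger {
    void log(String message);
}

class ConsoleLogger implements Logger {
    public void log(String message) { System.out.println(message); }
}

// The Null Object: a valid reference that satisfies the API
// by doing nothing at all.
class NullLogger implements Logger {
    public void log(String message) { /* deliberately empty */ }
}

public class NullObjectDemo {
    static void run(Logger logger) {
        // No null check needed: the reference is always a valid Logger.
        logger.log("working...");
    }

    public static void main(String[] args) {
        run(new ConsoleLogger());  // prints the message
        run(new NullLogger());     // silently "works" - valid but useless
    }
}
```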

Well, Haskell and Clean use what are called Algebraic Data Types, which is a funny name for an ability you already had with C's type constructors (struct and union), with the caveat that the union was the part requiring improvement, and it got it: tagged, type-checked alternatives. All wrapped in a coherent type theory... and with no behaviour (at least through OO glasses).
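To make that concrete: Haskell's Maybe type (data Maybe a = Nothing | Just a) is the canonical null replacement, and it can be roughly approximated in Java as a small class hierarchy. This is my own sketch, not anything from the standard library:

```java
abstract class Maybe<T> {
    abstract boolean isPresent();
    abstract T get();

    // The two "constructors" of the algebraic data type.
    static <T> Maybe<T> just(T value) { return new Just<T>(value); }
    static <T> Maybe<T> nothing()     { return new Nothing<T>(); }
}

final class Just<T> extends Maybe<T> {
    private final T value;
    Just(T value) { this.value = value; }
    boolean isPresent() { return true; }
    T get() { return value; }
}

final class Nothing<T> extends Maybe<T> {
    boolean isPresent() { return false; }
    T get() { throw new IllegalStateException("no value"); }
}

public class MaybeDemo {
    public static void main(String[] args) {
        Maybe<String> m = Maybe.nothing();
        // Absence is now explicit in the type; no universal value
        // sneaks past the compiler pretending to be a String.
        System.out.println(m.isPresent() ? m.get() : "nothing here");
    }
}
```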

So, one bet is that this obsession with behaviour - something at which they also failed - is the reason why the language designers kept null in the improved version of C++ that they created. Another guess is that it is linked to object-oriented programming's obsession with names, but that's a topic for another post.