Nulls: Should they be allowed to be used

The_Doc_Man · Jul 9, 2007

{best Jack Nicholson imitation}Hello, folks, I'm baaa-aaack... {/best Jack Nicholson imitation}

I read the article on Wikipedia. My background includes a PhD with lots more math than I wanted to know, and I see where NULL in SQL (and the references to articles by Dr. Codd) have been clearly explained. I'm going to go more techie than I think I've ever gone here.

First, Dr. Codd is clearly thinking that NULL and INFINITY are similar in meaning. He suggests a multiplicity of nulls, just as there are many kinds of infinity. Most of us never see more than Aleph-0, the "garden variety" of infinity, for which you say "This number is too big to be expressed." But there are other types of infinity, depending on the method you used to arrive there. For instance, there is an infinity that says "increases without limit" and that is a different type of infinity than "too large to express" - for the latter implies a specific value exists, but we can't know it. The former says there is no value because there is no limit.

The arguments about the "flavors" of NULL have a problem in my book. The moment you say null has meaning - such as "Applicable but absent" vs. "Not Applicable" - you have turned null into something that is NOT based on the hard - and very specific - meaning of NULL in set theory. In the latter theory, NULL means the same thing as "empty set" - and it takes the same exact meaning whether you are doing simple sets, joined sets, unioned sets, or any other kind of set you can imagine. NULL = no members in the set. The population of the set is 0. More specifically for JOINS, it means that the INTERSECTION of the joining sets is empty.

Where I go bonkers is that people have a loose understanding of what NULL really means. It means one and only one thing in Access or any other form of SQL that abides by the original set theory meaning. It means - whatever the set is that we are examining, we have found no members.

My own view is the Murphy's law view. There are known problems with the use of NULL in a query or other Access function, action, VBA sequence, or whatever else you've got. If you know ahead of time that you have at least a decent chance of seeing a NULL and you don't protect yourself against the negative consequences of that situation, you have screwed up. I cannot be plainer or more direct than that.

A part of the article intrigued me because there is some discussion about what NULL does to SQL. It resembles what happens in Automata theory when you allow NULL transitions. It also impinges on the Goedel Completeness Theorem, which states (loosely) that a language can be correct or complete, but not both. Interrogative Logic (IL) theory is the basis for queries. In essense, IL can be used to show when a query correctly answers the question implied with it.

In the Wikipedia article, there is a reference to Closed World vs. Open World. This is the same in effect as the Goedel assertion or the basis of IL theory. A query return is TRUE in IL when it completely returns every set member implied by its conditions, and it is FALSE if it leaves things out. You can guess that a query language author can use IL theory to validate the SQL implementation. Well, when you introduce NULL in the equation, the problem becomes quite simple. NULL, like INFINITY, cannot be fully characterized because it represents a DEPARTURE from normal numbering or enumerating systems. NEITHER ONE can be exactly expressed.

Goedel's Theorem explored the "language" of finite automata and the way in which the quality of "finite" could be removed. At the risk of a terrible pun in this context, the NULL transition nullifies the "finite" attribute of any automaton. This is because NULL, being non-existent, cannot be counted or evaluated. Allowing automata to interact with input streams to exert a NULL TRANSITION instantly turns the automaton into a non-finite automaton. In practical terms, what happens isn't that the automaton goes into an infinite loop. It REALLY means that you can no longer decide whether the automaton will exit. Because, you see, Goedel was working on COMPUTABILITY theory among his other interests. The bit about automata is because many elegant proofs about formulas were made using Finite Automata theory to generalize the math. You could probably Google-search for FINITE AUTOMATA and get a lot more than you really wanted to see.

Where has this taken us? SQL has the same problem that Goedel found with other languages - including English. He found that English was COMPLETE therefore cannot be CORRECT. The goal of SQL is that all returns from it can be trusted. When you allow NULL, that is no longer true because NULL is outside of normal logic used for set theory. In case you are wondering why English is COMPLETE but not CORRECT, it is because English allows something called a self-referential statement AND it allows expressions to have truth value in ways that can be very confusing. Example: "This statement is FALSE." Tell me what that means. Go ahead, I dare you. I double-dog dare you.

SQL with poor NULL management becomes a nightmare. SQL with proper NULL management avoids the nightmare. Mucking about with NULL so that there are flavors of NULL (it's in the Wikipedia article) compound the nightmare. All NULL should EVER mean is "have no way of knowing in this context." If you make more of a statement than that, you are going far afield.

I'm gonna break here and think some more. My brain cramped on me and I need to let it cool down.

The_Doc_Man · Jul 9, 2007

I lied a little. I came back sooner.

In this forum, questions are often raised about the use of NULL. I tend to say that there is no excuse for leaving a NULL behind in a record once it is created. You can, if you wish, leave a state-value in the record such that you can define whether the record is ready for use. When you run into an outer join that has the unfortunate effect of leaving half a record NULL, that is where Access gets crazy. But if you check the status of the record and find that it contains nulls in places that just can't be null, you have shown that the outer join resulted in a empty INTERSECTION.

The question is what you should do with records resulting from outer joins. If you would have also gotten the record on an INNER join, you can process it with little or no problem. But if the record would NOT have been returned after an INNER JOIN, you have to decide what you allow to be done. To my personal viewpoint, if you find that you have a record that contains NULL in critical parts, you have something that must be handled very differently than other records. What you cannot do is let the NULL trip you and your code. If you do, you have no way of proving the quality of your algorithm any more.

Nulls: Should they be allowed to be used

The_Doc_Man

Immoderate Moderator

The_Doc_Man

Immoderate Moderator

Similar threads

Users who are viewing this thread