Quick Rant On Why Database NULLs Are Not Necessarily Evil

 

I was having this conversation with coworker last week, and it seemed like the 100th time I’ve said this.  Then it popped up on StackOverflow this morning, and that got me typing, so here’s yet another salty developer’s rant on the topic.

There is no question that NULLs should be avoided when possible, because of the problems they can introduce several problems with indexing and query syntax. 

Due to the problems that NULLs can introduce, there has been a push to stop using them.  However, like most such movements, it has gotten out of control, to the point that some people fanatically insist that NULLs should never be used in the database.  I was in this camp for a lot of years, before I found it to be a little overzealous.

The answer is somewhere in between.  NULLs should be avoided whenever possible, but there are indeed valid business reasons for storing them. 

Often times you need to store an optional numeric value, and many people will tell you to just store a zero to indicate "no value", but this is an even worse antipattern of storing a magical value that really means something else.  What if then you can’t use zero for one field, because zero is considered a meaningful value as well, or because this value is being used as the multiple of a divisor, so you need to use some other magical value (-1?) for that field in just this case, and now you have a worse problem, because all of your optional numeric fields are now behaving differently.  Oye.

Dates are an even more compelling candidate for nullable fields.  The standard "no value" that people use in .NET is the default unassigned DateTime value, which is DateTime.MinValue, or more specifically January 1, 0001.  However, you cannot write this magic value into SQL database because the default minimum value for a SQL Server DATETIME field is January 1, 1973.  You now have to have something that checks that you are translating those values properly as they are written to and read from the database, and you have to have defensive coding all over the place that checks for whether your date fields are less than SqlDateTime.MinValue, instead of just checking whether they are equal to DateTime.MinValue.  Double Oye.

My preference is to deal with the values as they truly are, and not to build a lot of artificial constructs to hide the true meaning and usage of the field.  If a field may very well not have a value, make it nullable, and make it nullable in your application objects as well (if you language supports such a thing).  Then, anytime you are using that field, you are required to consider what should be done in the case where it is NULL, but that is actually a good thing.  Generally I am opposed to making developers waste brain cycles on unnecessary code complexity, but that is because it steals focus away from the true business problem being solved; however, in this case, the lack of a value IS part of the true business problem and must be thought through.  If you are just defaulting these values, then the developer writing a formula or algorithm will be less likely to think through those edge conditions where the values are missing, and may not even realize at the time that it is a possibility that those values are missing.

11 thoughts on “Quick Rant On Why Database NULLs Are Not Necessarily Evil

  1. SeanJA

    So, use NULL when it actually means something… makes sense. The problem is when the system is designed without thinking about the possible meaning of null… What does a NULL value for a boolean field mean? Generally !NULL == false unless you are also doing type comparison… I hate when people make a BOOL field nullable… it just doesn’t make sense.

    Reply
    1. Mike Mooney Post author

      I’ve encountered several times where a NULL boolean value makes sense, such as indicating an unanswered question. For example, if you are developing a mortgage system, there are a list of “Declaration” questions that you are required to ask by law, and so we can’t really be defaulting it to false (the most common approach I’ve seen). If we were to use a default value in the field, we would lose the ability to track whether the question has been answered.

      Of course, another option for handling this type of ternary logic is to use an enumerated value instead, for 0=Unanswered/1=Yes/2=No, and default to 0=Unanswered, but then this also introduces some implementation complexity, because it is not really a boolean value any more.

      In the end, I think that both options (nullable boolean vs enumeration) end up being about the same amount of work and and maintenance. Either way, you are making a point that a field may have a non-value, and forcing the application developer to consider and address this as an unavoidable part of the business model. Either way is probably fine, as long as you are not just defaulting the field to some value because you don’t feel like dealing with NULLs, because this hides meaningful business logic and tricks the application developer into thinking that a field always has a meaning value.

      Reply
  2. Pingback: Tweets that mention The Mooney Project » Blog Archive » Quick Rant On Why Database NULLs Are Not Necessarily Evil -- Topsy.com

  3. Sean Patterson

    +1 for keeping the “right tool for the right job” approach to things. That’s the way it should be.

    I had a “null boolean” scenario before and in my case it wound up being more functional to design the column to be a varchar(1) field with a Y/N/NULL value set. Just something to think about if you’re in that kind of bind.

    Reply
  4. Mike Mooney Post author

    Hey Sean, thanks for the response. I’ll be honest, I really not a fan of the Y/N/NULL approach.

    I just finished up last month with an integration project with a legacy database that used varchar(1) for boolean values all over the place. It was a nightmare, because over the last 10 years, those fields came to include Y, N, y, n, NULL, 0, 1, and then just random characters like C and R because someone decided to repurpose it as a code field but these days who the hell knows what those values were intended to mean.

    What does a nullable varchar(1) get you that nullable doesn’t? If you are going to allow nulls, then the BIT at least restricts you to the truly meaningful values and keeps the other random stuff out of there.

    Reply
  5. Smithd552

    Pretty portion of content. I just stumbled upon your web site and in accession capital to assert that I acquire actually enjoyed account your blog posts. Any way I will be subscribing in your augment or even I fulfillment you access constantly quickly. dekkgeadcaaddacf

    Reply
  6. Smithd373

    Pretty great post. I just stumbled upon your blog and wished to say that I’ve really enjoyed surfing around your weblog posts. In any case I will be subscribing to your feed and I am hoping you write again soon! caedgedgeedbkddc

    Reply

Leave a Reply

Your email address will not be published.