I was having this conversation with coworker last week, and it seemed like the 100th time I’ve said this. Then it popped up on StackOverflow this morning, and that got me typing, so here’s yet another salty developer’s rant on the topic.
There is no question that NULLs should be avoided when possible, because of the problems they can introduce several problems with indexing and query syntax.
Due to the problems that NULLs can introduce, there has been a push to stop using them. However, like most such movements, it has gotten out of control, to the point that some people fanatically insist that NULLs should never be used in the database. I was in this camp for a lot of years, before I found it to be a little overzealous.
The answer is somewhere in between. NULLs should be avoided whenever possible, but there are indeed valid business reasons for storing them.
Often times you need to store an optional numeric value, and many people will tell you to just store a zero to indicate "no value", but this is an even worse antipattern of storing a magical value that really means something else. What if then you can’t use zero for one field, because zero is considered a meaningful value as well, or because this value is being used as the multiple of a divisor, so you need to use some other magical value (-1?) for that field in just this case, and now you have a worse problem, because all of your optional numeric fields are now behaving differently. Oye.
Dates are an even more compelling candidate for nullable fields. The standard "no value" that people use in .NET is the default unassigned DateTime value, which is DateTime.MinValue, or more specifically January 1, 0001. However, you cannot write this magic value into SQL database because the default minimum value for a SQL Server DATETIME field is January 1, 1973. You now have to have something that checks that you are translating those values properly as they are written to and read from the database, and you have to have defensive coding all over the place that checks for whether your date fields are less than SqlDateTime.MinValue, instead of just checking whether they are equal to DateTime.MinValue. Double Oye.
My preference is to deal with the values as they truly are, and not to build a lot of artificial constructs to hide the true meaning and usage of the field. If a field may very well not have a value, make it nullable, and make it nullable in your application objects as well (if you language supports such a thing). Then, anytime you are using that field, you are required to consider what should be done in the case where it is NULL, but that is actually a good thing. Generally I am opposed to making developers waste brain cycles on unnecessary code complexity, but that is because it steals focus away from the true business problem being solved; however, in this case, the lack of a value IS part of the true business problem and must be thought through. If you are just defaulting these values, then the developer writing a formula or algorithm will be less likely to think through those edge conditions where the values are missing, and may not even realize at the time that it is a possibility that those values are missing.