Sunday, December 28, 2008

SQL nullable data items and LINQ

While working on Malevich - my new code review program, - I was hit by a runtime error when doing LINQ operation on a table that contained null datetime items. In C#, DateTime is a structure - not a class - and therefore can not be null.

Classes in C# are always allocated in the heap, and the variable that holds class object is really a pointer, whereas structures can be allocated on stack (just like classes and structures in C++), and variable that references a structure is really the memory representing this structure.

If C# had a sizeof operator, the sizeof of a variable containing class instance would always be a sizeof of a pointer - 4 or 8 bytes, - whereas a sizeof of a variable containing a structure would be the number of bytes that it takes to store this structure in memory.

Which is why string x = null is valid, but DateTime x = null is not. You can, however, explicitly create a "pointer" to DateTime (or any structure or primitive data type) by using the question mark after the type, for example, DateTime? x = null. This is called a "nullable" data type.

It turns out that the logic in Visual Studio that imports the database model (dbml) does not properly recognize nullable data in SQL - it generates not nullable prototypes for it.

For most common data items it is not an issue. Strings are class instances, and so they can be null in C#. It is not usual I suppose to store nullable numeric types - it would likely be more expensive to store the fact that it is or is not null rather than ignore the value when it is not needed.

DateTime is the only important type that really has a problem - it is common for it to be null in the database, and if you have a row that has null value for the date column, and try to instantiate it using VisualStudio-generated database interop class, you will get a runtime exception saying that DateTime field cannot be assigned a null value.


In SQL:

CREATE TABLE [dbo].[DateTimeTest] (
[Id] INT IDENTITY (1, 1) NOT NULL,
[TimeStamp] DATETIME NULL
);

INSERT INTO [dbo].[DateTimeTest] (TimeStamp) VALUES(NULL)

In C#:

// Watch this crash!
DateTimeTest[] x = (from dt in context.DateTimeTests select dt).ToArray();



I've done a few searches, and there are quite a few cases of people reporting the problem, but no good solutions for it - most people tend to rebuild their classes by hand when they need nullability, or fudge the database.

The solution does exist, however, and it is quite simple - in the graphical representation of database dbml, right-click on the offending field, select properties, and change Nullable field to true.

This is all it takes! After that, the date field is modeled as DateTime?, and everything works just fine.

3 comments:

ajax16384 said...

Do you want to share "Malevich" with community or this code review program will be suitable only for internal projects?

Sergey Solyanik said...

It's going on CodePlex very soon, which is Microsoft's community source program.

As of right now, it only supports "Source Depot", which is Microsoft's internal version of perforce (where p4 is renamed to sd). So getting it to work with perforce is super straightforward - just renaming sd.exe to p4.exe.

I am split on whether I should add support for TFS before releasing it though... Opinions?

Anonymous said...

Thank you for this.