CDC Anomaly with Metadata-Only change with Default Constraint – SQL Server

Recently at work we had a requirement to change an existing column on a table from NULL to NOT NULL. Now the column in question (call it ColumnA) was belonged to a very large table, (over 1 billion rows) and for a lot of the rows in this table that column was null. Rather than running an UPDATE statement on this we came up with the idea of just putting a default constraint on ColumnA. This would be much quicker than the UPDATE statement as from SQL Server 2012 onwards this is a metadata only change.

A Metadata what? So rather than updating all of the rows which had the value NULL in ColumnA in our VLT, with a metadata-only change, the default value is stored in the metadata of that table. So no rows are updated at all and the change is instantaneous. Look in the below table to see where the metadata is actually stored.

sys.system_internals_partition_columns

I know what you’re thinking, doesn’t this cause an issue when querying that table? No, there is no impact for not having the data in the actual row. It can still be included when the table is queried and it can still be indexed, its basically the same as if every row was updated with that value without having the bear the cost of performing the physical update.

What happens if one of these rows gets updated? So if a row that has the default value for ColumnA gets updated, the actual default value will be physically stored in ColumnA after the update, (for an update that is not updating columnA itself.)

So that’s great, all good, BUT WAIT! There’s always a catch right. 🙂

So at work we happened to have CDC on the VTL and on ColumnA, after a while we started to notice some strange behavior, if a row was deleted, this was recorded in the capture instance but guess what? The value for ColumnA was null!!

WHAT? So it seems CDC reads the actual value in that column, and does not read from the metadata, and for most rows with that ColumnA the physically stored value would be null.

What can you do? It depends on your use case, for some people this may be acceptable, for us it wasn’t so we really only had one choice, we had to run that update statement after all, setting the default value for all rows where columnA was null. Of course this only impacted legacy rows, any row updated or any new row added the default value would be stored at the row level.

All in all, this to me doesn’t seem correct behavior, either metatdata-only changes work for all scenarios or they don’t at all would be my (probably rather simplistic view) of how it should work.

Until next time Slán!

Disclaimer: Never just believe everything you read in a post, test this out for yourself, we all can make mistakes! Happy coding.

Published by sqlrebel

Data Engineer.

Leave a comment