Big Data Series

SSDB – Part 11. ETL – Redefining Nullable Values for Enhanced Reporting

Handling Nulls

Null values represent a challenge in reporting structures and ETL process because there are so many ways they could be handled. Nulls are considered to be an unknown value, but what does unknown mean?

Does it mean that is unknown currently, but will soon be known?

Does it mean that it is unknowable and could never be known?

Does it mean that a value just does not apply to this particular occurrence?

In many situations, null may be truly “unknown,” but sometimes it can mean something else. Where possible, it is best practice to interpret null values as something meaningful, and often this is required for the reporting data to be useful to the end user.

Null values show up in a fact table as dimensional keys or measured values. When the nulls are measured values, it is often best to keep the null values, because of most database aggregate null values into totals and subtotals in a specific way. If you substitute values such as a zero for null, the calculations may become incorrect.

Next, we will see a learning activity that demonstrates a variety of ways that nulls can be interpreted, and how the end user is benefitting from  or perhaps becoming confused by, each type of allocation.

Note: Microsoft is the owner of the content and video in this blog post.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s