Nulls are evil (or at the very least mischievous)

If you have been involved in hands on database development for a while, or in any kind of BI / reporting initiative, you probably already know this. Nulls are evil (or at the very least mischievous).
A null value in a database column is different than a blank, a white space or any multitude of other invisible characters. It is truly void, nothingness, the abyss…
The implications of leaving null values can be very confusing for downstream reporting development. Null values will cause problems when trying to aggregate numeric values, they will make joins fall apart and complicate the querying of any table where null values are involved. As a case in point, I set out to demonstrate below how four major databases handle (or not handle..) null values. The answer is the same in each case.
The use case I created is simple:
I created a table with 8 records in it. 2 of the records are null values. Then I queried the database with three simple questions:
• How many values are there where the value is not ‘a’ – since one record has a value of a, I expected the answer to be 7
• Given a wild card search on the value, provide all records – I expected to get 8 records back from a “wild card” search on all records
• How many records are there in the table, counting the value field – I expected 8
All databases gave the same answer:
How many values are there where the value is not ‘a’ – 5. The 2 null values were not considered
Given a wild card search on the value, provide all records – 6. The null records were ignored.
How many records are there in the table, counting the value field – 6. Yes, the database simply does not count the null values, as if they did not exist. Real nothingness…
Below are the test results in Oracle, SAP HANA, mySql and Sql Server. Beware of nulls…
create table test_nulls (column_a varchar(255));
insert into test_nulls (column_a) values (‘a’);
insert into test_nulls (column_a) values (‘b’);
insert into test_nulls (column_a) values (‘c’);
insert into test_nulls (column_a) values (null);
insert into test_nulls (column_a) values (null);
insert into test_nulls (column_a) values (‘f’);
insert into test_nulls (column_a) values (‘g’);
insert into test_nulls (column_a) values (‘h’);

Oracle:
select * from test_nulls;
ora01
select * from test_nulls where column_a <> ‘a’;

ora02
select * from test_nulls where column_a like ‘%’;
ora03
select count(column_a) from test_nulls;
ora04
HANA:
create column table test_nulls (column_a varchar(255));
insert into test_nulls (column_a) values (‘a’);
insert into test_nulls (column_a) values (‘b’);
insert into test_nulls (column_a) values (‘c’);
insert into test_nulls (column_a) values (null);
insert into test_nulls (column_a) values (null);
insert into test_nulls (column_a) values (‘f’);
insert into test_nulls (column_a) values (‘g’);
insert into test_nulls (column_a) values (‘h’);

select * from test_nulls;
hana01
select * from test_nulls where column_a <> ‘a’;
hana02
select * from test_nulls where column_a like ‘%’;
hana03
select count(column_a) from test_nulls;
hana04
MySQL
select * from test_nulls;
mysql01
select * from test_nulls where column_a <> ‘a’;
mysql02
select * from test_nulls where column_a like ‘%’;
mysql03
select count(column_a) from test_nulls;
mysql04
Sql Server
select * from test_nulls;
sqlserver01
select * from test_nulls where column_a <> ‘a’;
sqlserver02
select * from test_nulls where column_a like ‘%’;
sqlserver03
select count(column_a) from test_nulls;

sqlserver04

This entry was posted in BI At Large, Data Warehousing, SAP HANA and tagged , . Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *