Indexing NULL table column

system · June 19, 2009, 11:38pm

I’m doing delta loads of 185 records using TC/HP/KG tranforms. Ive selected row by row comparison method against the target table which holds over 8 million records, but the job is taking over 4 hours to complete.

I have an index on the comparison table and use this as my input primary key.

The 185 delta records however all have null values in the indexed column so have recreated the index as
“create index on <nvl(indexed column),‘null’.”

I’ve re-analysed the index and re-imported the table, however this index doesn’t appear in the tables properties. Do I now also need to force the index and use, for example, the sql transform. If so how is this achieved with all the transforms mentioned above.

Many thanks in advance.

sajohal (BOB member since 2008-04-29)

system · June 22, 2009, 9:25am

Even if the index is not imported into the DS repo, it is still used by the database. So this is no K.O. criteria.
Why should that index be used at all? Nobody executes a select with a nvl() function in it, hence it is not used.
Null values in key columns is a bad, bad thing. How about adding the rule that the key columns always have to have a vlaue, e.g. -1 or “?”. I do not particularly care about the dataflow, there we can find a solution, but it is everything. The database indexes with null values, queries in the reporting tool have to mention the null values in the where clause explicitly or they do not get the result expected, joins do not work etc.
Why row-by-row method? Because the input dataset is so small comapred to the target table? Okay, that’s a good reason. You might want to try sorted-input, just for a try. Especially with DS 3.1 it might work very well.
With row-by-row mode enabled, I would also play around with the Degree of Parallelism setting of the dataflow. But not in your case, where the database is not using an index yet. But once that is solved you might want to give it a try.

Werner Daehn (BOB member since 2004-12-17)

system · June 22, 2009, 12:53pm

Thanks Werner. I’ve replaced the null values with 1 and re-analysed the index. The delta records are small in comparison to the target table, hence the use of row-by-row method and its now considerably quickly.

Thanks again.

sajohal (BOB member since 2008-04-29)