Outer join performance on large data sets.

Hi,

I am having performance problems using outer joins in DI, to build temporal data sets.

I am trying to collapse a number (5) temporal tables into a single temporal table. This is a technique i have used on a number of different tools.

I outer join the tables on their natural keys, and then join using ‘between’ on the dates on all the points of change ( using a union/pivot of all the ‘date_from’ from the source tables ).

I then use a self join ( outer ) on the natural key to the ‘next date’ row, update the ‘date_to’.

When trying to do this in DI, I am having performance problems with the outer joins. It seems to do a cartesian and then filter! I have 3million rows in the intermediate table. I had to abort at 300 million rows.

By eliminating all DI functions and using just surrogate keys using the temporal dates, I can get it to push down ( insert /+Append/ select ), and it completes in 5 minutes.

This is flaky though as it sometimes times out on the Data transfers not returning in time!

Any tips, tricks, techniques please ?
Do I cut my loses and just use DI to execute SQL statements ?

Frustrated !
Pat


pxbaker :uk: (BOB member since 2007-01-09)

By “pivot”, do you mean you are using DI’s pivot transform?

If you have outer joins and are using BETWEEN in those table joins, DI may not generate fully-pushed down SQL and may do the cartesian product in memory, as you experienced.

I can’t quite tell what you’re trying to do, but one general suggestion: consider doing some data transfer tables in the middle of the dataflow, rather than trying to do all of the date-based filtering in one operation.

By the way: I am moving this thread into the DI Performance sub-forum.


dnewton :us: (BOB member since 2004-01-30)

Px, I feel your pain. See an extended discussion of this at:

In some cases the tricks in the above thread provide a solution. With a self join I have not always been able to manage it, as you will see in later comments.

  • Ernie

eepjr24 :us: (BOB member since 2005-09-16)