Outer Join, null values and job optimisation

system · May 9, 2012, 10:00am

I am new to DS and running into a strange phenomenon:

I have two tables t1 and t2 that are completely identical for the time being (they will have different row counts in the future). I left outer join them in a first transform (tf1) and want to filter on the null values that originate from the outer join in a second transform (tf2). The resultset (now, not in the future) should be empty, since both tables are identical. Unfortunately, it’s not!
If I add another table (be it a template one) as second target for the first transform (tf1), all of a sudden my above mentionned resultset becomes correct. This occurs only in BODS 4 SP 1. In an 11.7 env the procedure would work, I was told. (The fixed issues guide of SP2 doesn’t indicate a solution.)

I am now wondering if BODS is applying some kind of optimisation, so that it “integrates” tf1 and tf2, what - considering we are talking about an outer join - wouldn’t work. May I be right and is there a way to switch off this optimisation?

Regards,

Frank

heckerf (BOB member since 2004-08-24)

system · May 9, 2012, 11:09am

I think that the is null check is pushed down to the database. You can check this in the optimized sql (Validation - Display optimized SQL).

Which database are you using?

I have exactly the same problem on BODS4 SP2. I have a customer message open for this as I consider this a bug.

Johannes Vink (BOB member since 2012-03-20)

system · May 9, 2012, 1:09pm

where do you join, in the where clause or the from clause of the query transform?
Are you using two queries, first does the join, second the filter? You said yes.
On what column do you use the filter on? Does it have NULL values in the table?

I would implement that in one query, the join in the from clause and the filter on the not-null in its where clause. Then it is well defined that the you want to join first and filter the result set and not join the filtered dataset.

Werner Daehn (BOB member since 2004-12-17)

system · July 5, 2012, 12:57pm

Hello everyone,

I had similar issues multiple times.

Example:

Input 1 = table
Input 2 = CSV file

Left outer join on column myColumn

CSV ->
----------X (LEFT OUTER) - - - -> X (FILTER: WHERE myColumn IS NULL) - - -> TAB (TARGET)
TAB ->

Results:

When I start the job, all rows will be written into the target table, which is very wrong in our case. It is, as if the WHERE filter is being ignored. Every value in the column “myColumn” is NULL, which makes it even worse.
If I leave everything as it is and use the debug mode, it shows the correct data coming out of query node 2, but writes the wrong results into the target table.

WORKAROUND 1: If I replace the NULLs from the outer join by “NA” and set the filter in the second query node to “WHERE myColumn = ‘NA’”, then it works correctly.

WORKAROUND 2: Use a pushdown node. => There seems to be a bug in the DS Jobengine.

WORKAROUND 3: Do as heckerf wrote: use an additional target table after the first query node.

When will this be fixed? This is dangerous, especially in migration situations. I cannot imagine that noone else had this error before.
I found many similar errors, e.g. when using joins and groupings in different nodes etc…

Best regards,

Sascha

S.Balensiefen-Sommer (BOB member since 2008-11-17)

system · July 5, 2012, 9:30pm

Which version?

Johannes Vink (BOB member since 2012-03-20)

system · July 6, 2012, 7:23am

We had these issues in any version from DS4 to DS4 SP2 Patch 2. The problem is not described in any of the patch release notes so far, at least I wasn’t able to find anything till yesterday.

S.Balensiefen-Sommer (BOB member since 2008-11-17)