Push down on 2 DB and DOP issue

system · April 22, 2010, 4:13pm

Hi,

Just came across this interesting scenario, want to make sure, is this how it should work!

So I have a DF, with following logic:

Table A and B are in DB A
Table C is in DB B.

Select * from Table A, Table B left outer join Table C
ON TableB.COL2 = TableC.COL1
where
TableA.COL1 = TableB.COL1

Table A and B are both Cached, while Table C is set to use No Cache mode.

So I can see it creates that “AIVariable_1”, just like it should be, but only if DOP for that DF is set to 0.

If you change the DOP > 0, that “AIVariable_1” is gone!

Is this a bug, or this is how it should work!

We are on 11.7.3 on Win server 2003 using Oracle 10g DB.

Thanks,

data_guy (BOB member since 2006-08-19)

system · April 23, 2010, 6:20am

Hmmm, not sure.

…DOP should not change the execution plan…for parallel processing hash joins might be better…still I need to be able to control the execution plan…

I would call it a bug. I cannot see a reason why a nested loop join would not be possible with DOP>1.

Werner Daehn (BOB member since 2004-12-17)

system · April 28, 2010, 6:51pm

So Werner,

Is this a bug, only when it’s display optimize SQL option ?

Or the functionality also impacted ?

Thanks,

data_guy (BOB member since 2006-08-19)

system · April 30, 2010, 10:19am

The optimize SQL screen is executing the same code as the engine itself, so very likely a bug. But you can try yourself, execute the dataflow and set the trace options in the execution dialog trace_sql_reader=yes; show_sql_only=yes.

This will print all SQLs submitted by the readers into the trace log.

Werner Daehn (BOB member since 2004-12-17)

system · May 6, 2010, 9:52am

Hi,

Even I faced similar situation.
The scenario was:
Table A from DB X
Table B from DB Y

My Query has to be like below:

select A.*, B.col3 from A,(select B.col1,B.col2,max(B.col3) as col3 from B group by B.col1,B.col2) B  where
A.col1=B.col1
and A.col2=B.col2;

When I designed the job and clicked on Optimized dataflow, I got 2 separate SQL from 2 datastores (because table A and B are from different DB). The 2 queries were

select * from A;
select B.col1,B.col2,max(B.col3) as col3 from B group by B.col1,B.col2;

I gave ranks and tried couple of things, but nothing worked. I was not getting Al_variable in the query (which i was expecting). Strangely, I gave distinct for Table B query and when i saw the optimized query, I could see the Al_variable1 and Al_variable2. Is this a bug, or is this the way it works?

rookie86 (BOB member since 2009-09-11)

system · May 6, 2010, 1:01pm

Good question. I guess the group by has precedence over the join rank? Frankly I don’t know.

Werner Daehn (BOB member since 2004-12-17)