I have a Dataflow that uses two tables as sources. Each table is from a different Datastore. The query transform that uses both of the tables has an outer join in it.
What I’m seeing is that DI is generating a single SQL statement that has BOTH tables in it. It’s blowing up, of course: Oracle returns a “Table does not exist” error.
I’ve tried setting different join ranks, cache/no-cache, etc.
DI should be generating two queries unless you specified that they are linked datastores in the properties of the datastore. If you did not, I would open a ticket with DI Support immediately.
I tweaked a lot of settings yesterday trying to get this to work. I think I changed the linked datastores, but just to make sure: this setting is on the properties of the Dataflow, right?
Nope. I was actually talking about the properties of the Datastore. Right click, Edit the datastore, then click the Advanced button and scroll down to Linked Datastores.
Neither of the Datastores involved had the Linked Datastore property checked on.
When I do the “Display Optimized SQL” I see only one Datastore in the left frame and the SELECT statement on the right frame has both tables/Datastores in there.
While in the Dataflow I can click on the magnifier button to view the data for each table and I see the correct results.
I think I understand what is going on. Both Datastores involved in this Dataflow use the same TNSNames entry. Data Integrator is making a fatal assumption: because they both use the same TNSNames entry, it assumes that connecting to one Oracle schema will give access to the other. These two schemas are distinctly different, and they may not even exist on the same server in the future.
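To make the failure concrete, here is a sketch of the kind of single pushed-down statement DI generates when it assumes the two datastores are linked. The schema, table, and column names are made up for illustration; they are not from the actual job:

```sql
-- Single query DI pushes down, referencing both schemas at once
-- (SCHEMA_A, SCHEMA_B, and the table/column names are hypothetical):
SELECT a.CUST_ID, b.ORDER_ID
  FROM SCHEMA_A.CUSTOMERS a
  LEFT OUTER JOIN SCHEMA_B.ORDERS b
    ON a.CUST_ID = b.CUST_ID;
-- Logged in as SCHEMA_A with no grant on SCHEMA_B's tables, Oracle
-- raises ORA-00942: table or view does not exist.
```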
I added a new TNSNames entry that uses the same connection information and switched one of the Datastores to use the new entry. Then everything started working right. Grrr.
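For reference, the workaround amounts to duplicating the existing TNS entry under a second alias, so DI no longer sees the two datastores as sharing a connection. A minimal tnsnames.ora sketch, where the host, port, and service name are placeholders:

```
# Original alias used by the first Datastore (connection details are
# placeholders, not from the real environment):
PROD_A =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost.example.com)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = prodsvc))
  )

# Identical connection details under a new alias for the second
# Datastore; DI now treats them as unrelated and generates two queries.
PROD_B =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost.example.com)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = prodsvc))
  )
```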
Ahhhh. Performing as designed. There are other folks who use this feature for logical separation of physical schemas. And if the TNS entry is the same, I was pretty sure the two cannot reside on different servers (since the TNS entry specifies the server name, port, and service)?
Right. But in this case connectivity to one schema doesn’t assure you of the ability to query the other schema. In my case being logged in to schema A does not give you any rights to query from schema B.
My options are:
Keep using two TNSNames entries
Set up a login that has rights to query both schemas
Figure out why DI is doing this when all the options I’ve set are supposed to tell it NOT to do it that way.
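If option 2 were chosen, the setup would look something like the following. The schema and table names are hypothetical, and the statements would be run as the owning schema or a DBA:

```sql
-- Let the login used by the Datastore (here SCHEMA_A) query
-- SCHEMA_B's tables, so a single pushed-down join can succeed:
GRANT SELECT ON SCHEMA_B.ORDERS TO SCHEMA_A;

-- Optionally, a synonym so references resolve without qualifying
-- the schema (requires CREATE SYNONYM privilege):
CREATE SYNONYM SCHEMA_A.ORDERS FOR SCHEMA_B.ORDERS;
```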
Within the DI Tech Manual it has this description for the job server option “Use Explicit Database Links”:
“Jobs with imported database links normally will show improved performance because Data Services uses these links to push down processing to a database. If you set this option to FALSE, all data flows will not use linked datastores. The use of linked datastores can also be disabled from any data flow properties dialog. The data flow level option takes precedence over this Job Server level option.”
However, I don’t think this applies in this situation. I’m still looking for other switches to flip.
I would call support and ask if there is perhaps a DSConfig setting for this. But using two TNS names seems to me to be both a harmless and easy-to-maintain solution. Why do you find it unworkable?
I don’t like the two TNSNames entry work around because it shouldn’t be necessary and I’ll have to update the TNSNames file for all developers and job servers.
Okay. Anyone else know a workaround? I would still ping support, they sometimes know of DSconfig entries that are not documented that help stuff like this.
Jim, could you please send me screenshots of the two datastores, the dataflow, and the Display Optimized SQL output? I want to confront the developer with that. They have been telling me we’ve had this behaviour for years.
The current behaviour is that if the tnsnames entry AND the effective user (after aliasing) are the same, then we implicitly link the datastores. The desired behaviour is that the user does not matter, but the dataflow property “use database link” is considered.
In 12.0.0.0 there is a bug: if aliases are used, even to alias the user to the same username, implicit database links are not used. Try that.
The TNSNames entry is the same but the user is definitely different. As far as I know I’m not using any aliasing - I’ll have to look that up. I had played around with the Dataflow property “use database link” and I had the same problem with both the ON and OFF setting.
I’ll have to get back to you on screen shots. I’m not at the client site today.
I believe this is the same issue I struggled with in my first posts. If memory serves me, the point at which it breaks is when you alias the datastore. I, on the other hand, want BODI to push down one query, since both datastores are readable by each user…
As of 11.7, datastore aliases break pushdown of a query accessing two datastores using the same tns name.
In 12.1 we use the “effective user” to decide if we pushdown or not.
So no matter what the table owner is according to the repo, after the alias got applied we check if the owners are the same and if they are, we pushdown, otherwise we don’t.
Not too happy with that constraint, because maybe the users are different but I have set up the grants so that one user can see the other user’s tables. I have asked development to ignore the user but honour the dataflow flag “use database links”: if that is set, we implicitly push down; otherwise we don’t.