I have a problem using the data transfer transform. I added a DT to pushdown a join operation to the DWH database.
Transfer type = Table; i also declared the table_name as a table in the same datastore as the other source table.
Printscreens in the attachements…
However, when i start the job it shows:
5252 1 DFCOMM 31-03-2010 15:45:26 The Automatic Data_Transfer transform <q_sel_aanlevering_q_sel_aanlev> has been resolved to a transfer object
5252 1 DFCOMM 31-03-2010 15:45:26 <DT__1234_5370_1_1(DSA_CH_ANZ.DWCHANZ)>.
5252 1 DFCOMM 31-03-2010 15:45:26 Starting sub data flow <DF_SCDT_DKN_VERZ_INSCHR_MANZ_transfer_1> on job server host , port <3500>. Distribution level
5252 1 DFCOMM 31-03-2010 15:45:26 .
It isn’t automatic! The transfer_type= table.
Following thing that happens:
It takes the first random DS with the “enable automatic data transfer” enabled" (and raises an error because the dbuser has insufficient priviliges)
The datastores ofcourse shouldnt have this option enabled… but the real question is: why is BODS choosing the automatic mode??
Things i tried:
rebuilding the DT transform
Creating the DT tabel manually and disabling the recreate option.
Didnt help…
I have received a reply to the call i logged to SAP:
After some investigation work to reproduce the issue, I found this is actually not a problem. The message <#5252 1 DFCOMM 31-03-2010 15:45:26 The Automatic Data_Transfer transform <q_sel_aanlevering_q_sel_aanlev> has been resolved to a <Table> transfer object# > is actually referring to
a tempory data tranfer transform named [b]#q_sel_aanlevering_q_sel_aanlev #[/b] (not the one designed in the dataflow with name #Data Transfer#), [b]this
internal data transfer is added to optimize the dataflow to utilize the
pushdown operation[/b]. So the problem here is to create a datastore intended for automatic data transfer with proper DB privilege. In my environment, after setup all necessary datastores, I can run this job successfully.
The supportdesk states BODS automatically adds data transfer objects to DF’s when it ‘thinks’ it is faster. Sow… In the presented example we would have 2 data transfers (DT):
The defined DT
The internal/ automatic created DT
This is totally new for me. I haven’t ever seen BODS does a trick like this. Can somebody confirm this is indeed default BODS behaviour?
If sow… why?? In this case it isn’t even faster because only extra I/O is generated. Can i disable it?
Imagine you have one dataflow like yours and you want to split it at the point where you added the DataTransfer manually. As your flow has two paths coming from a split and merged later, the first dataflow needs to stage the data of the second path somewhere. We add a second DataTransfer automatically in such a case.
Can you prevent that? No, if there should be two dataflows and the split should happen at that location, it would be logically impossible.
Hi Werner, i understand there has to be a point where the splitted dataflow has to store its data. However, i don’t understand why bods internally creates a data_transfer and decides to store it in a table in a datastore which is on a whole different database (it seems to pick one randomly). It would make more sense if it used a table in the same datastore as one of tjhe joining tables
Questions:
Can you explain when the data_transfer is automatically used and how it decides to do a file or table DT type?
Am i right if i say the previous bodi versions never (automatically) used a table as a data_transfer?
1a) Automatically only in above case a DT in one path but not in the other.
1b) How the automatic mode works, so which datastore to use or files etc I have no answer for.
No, that was the feature design right from the start, I rememver all the meetings with development about that.
Let me make a suggestion: You add a second Data_Transfer for the second path with the settings of your choice. That would give you back full control. Agreed?
The support for Data Services has been very impressive over the years. I have worked with Microsoft DTS/SSIS and Informatica PowerCenter … I could only dream of an interactive platform with the product managers or the development team to discuss possible bugs etc!!
But at Microsoft, the product managers are too busy to create a new set of shiny power points for the next TechEd session on how wonderful SSIS is…
Werner, as you suggested moving the data_transfer back worked! (Actually this is a better place for the DT because now the two flows can pushdown their sql.)
After moving the DT i think i found a different bug. Printscreens attached. Error messsage:
21203 1 DBS-054003 27-04-2010 15:58:55 > failed. The database error message is <SQL submitted to Oracle Server resulted in error <ORA-00972: identifier is too long
Bods aliasses the DT but generates an alias which is too long for Oracle (>30 characters). Here is the part of the sql which is generated (and fails):
FROM "DWAPP"."DKN_DT_VERZ_INSCHR_TEMP" "Data_Transfer_q_sel_aanleverin_1"
, "DWAPP"."DKN_VERZ_INSCHR" "DKN_VERZ_INSCHR_1"
Data_Transfer_q_sel_aanleverin_1 is too long. The workaround is easy: rename the prefacing transform (in this case) to a shorter name. “DATA_TRANSFER”_ has to be max 30 chars (in case of an Oracle RDBMS).
The bug found with the identifier to long:
"For the identifier too long issue, this is something we know about. The
problem is already fixed in the 11.7.3.x code line, the same bug will also be fixed in 12.2.3.0 in November. ADAPT number is ADAPT01387906.
"
The other thing where the jobserver chooses a more or less random DS for the automatic data transfer – issue is being adressed by development. Is it not a bug, so perhaps it is improved in a next release.
It is years ago and all we have is the 11.7.0.0 code. We take that and create a new branch for 12.0.0.0. A year later when 12.0.0.0 is released, there have been fixpacks for the 11.7 release in the meantime, so we are in 11.7.2.0 for example. At multiple points in time before the testing really starts with the new release, all 11.7.2.0 code is rebased with the 12.0.0.0 code to make sure the same issues fixed in 11.7.2 are fixed in 12.0 as well. To some degree this is a manual process as one source code file might have been changed for 11.7.2 and for 12.0.
From then on, both branches are completely separate, everything you do in 11.7.3 is not automatically merged with 12.0. Therefore we either fix issues only in the new version or we have to modify both code lines. There is an higher probability something is overlooked then.
That’s disappointing. Bad codebranching and configuration management. In general i am displeased with the software quality of the latest BODS versions. It used to be only the BOE/BOXI guys with the I-cant-deliver-because-of-bugs issues… Past 1-2 years i am joining their club. Bods 3.1 isn’t supported anymore but it is much more stable compared to the latest versions.
@Johannes, next week i’ll do some testing if the issues are resolved (again) in Bods 4.2. I already noticed there is a new bug regarding the datatransfer. We can’t change /set the name for a datatransfertable.
Probably only in the BODS 4.0 SP3 stack at the end. Patch 3 does not have the problem, patch 5 does. Otherwise the forum would be full with complaints…
Do you happen to remember the ADAPT for the automatic data transfer behaviour? The ADAPT for the alias too long for Oracle is listed in this topic.