BusinessObjects Board

SQL does it faster than BODS

Folks,

I’m caught in a weird conundrum. I’m attempting to load around 130,000 (correction: 1.2 million) records from a SQL table in one datastore to a staging table in another datastore. It takes almost 4 hours without any optimization.

There are no transforms. It’s a simple push-down operation with a 1:1 mapping.

After plugging in a Data Transfer transform, the run time drops to 1.5 hours. However, the catch is that when I copy the data from the source SQL database to my staging database using a plain T-SQL command, it takes merely a few minutes:

insert into RMA_ENTITY select * from Publix.dbo.RMA_ENTITY

RMA_ENTITY is the table name, with the same schema in the source and staging databases.

Can someone guide me as to why the response from the BODS server is so sluggish, while merely inserting the data through regular SQL is simply electric? Your prompt response will be appreciated.


stolenpyjamas (BOB member since 2013-04-12)

What is the SQL generated by BODS? Is it really fully pushed down with an insert/select?


Johannes Vink :netherlands: (BOB member since 2012-03-20)

Well, the push down looks something like this.

I learnt from an earlier post that there is a shortcoming in the BODS server: if the two databases are on the same server, the push-down operation takes a lot of time. The proposed solution is to create a view. However, even after creating the view it is still struggling. The view name is PUSH_DOWN_VIEW.

Correction to my earlier post. There are actually 1.2 million records and not 130,000.

SELECT "PUSH_DOWN_VIEW"."BUSINESS_TYPE_CODE" , "PUSH_DOWN_VIEW"."COUNTY" , "PUSH_DOWN_VIEW"."NATURE_OF_BUSINESS" , "PUSH_DOWN_VIEW"."SIC_CODE" , "PUSH_DOWN_VIEW"."SIC_CODE_DESC" , "PUSH_DOWN_VIEW"."WC_FILING_NUMBER" , "PUSH_DOWN_VIEW"."ENTITY_ID" , "PUSH_DOWN_VIEW"."ENTITY_TABLE_ID" , "PUSH_DOWN_VIEW"."LAST_NAME" , "PUSH_DOWN_VIEW"."LAST_NAME_SOUNDEX" , "PUSH_DOWN_VIEW"."FIRST_NAME" , "PUSH_DOWN_VIEW"."ALSO_KNOWN_AS" , "PUSH_DOWN_VIEW"."ABBREVIATION" , "PUSH_DOWN_VIEW"."COST_CENTER_CODE" , "PUSH_DOWN_VIEW"."ADDR1" , "PUSH_DOWN_VIEW"."ADDR2" , "PUSH_DOWN_VIEW"."CITY" , "PUSH_DOWN_VIEW"."COUNTRY_CODE" , "PUSH_DOWN_VIEW"."STATE_ID" , "PUSH_DOWN_VIEW"."ZIP_CODE" , "PUSH_DOWN_VIEW"."PARENT_EID" , "PUSH_DOWN_VIEW"."TAX_ID" , "PUSH_DOWN_VIEW"."CONTACT" , "PUSH_DOWN_VIEW"."EMAIL_TYPE_CODE" , "PUSH_DOWN_VIEW"."EMAIL_ADDRESS" , "PUSH_DOWN_VIEW"."SEX_CODE" , "PUSH_DOWN_VIEW"."BIRTH_DATE" , "PUSH_DOWN_VIEW"."PHONE1" , "PUSH_DOWN_VIEW"."PHONE2" , "PUSH_DOWN_VIEW"."FAX_NUMBER" , "PUSH_DOWN_VIEW"."DTTM_RCD_ADDED" , "PUSH_DOWN_VIEW"."DTTM_RCD_LAST_UPD" , "PUSH_DOWN_VIEW"."UPDATED_BY_USER" , "PUSH_DOWN_VIEW"."ADDED_BY_USER" , "PUSH_DOWN_VIEW"."DELETED_FLAG" , "PUSH_DOWN_VIEW"."SEC_DEPT_EID" , "PUSH_DOWN_VIEW"."TRIGGER_DATE_FIELD" , "PUSH_DOWN_VIEW"."EFF_START_DATE" , "PUSH_DOWN_VIEW"."EFF_END_DATE" , "PUSH_DOWN_VIEW"."PARENT_1099_EID" , "PUSH_DOWN_VIEW"."REPORT_1099_FLAG" , "PUSH_DOWN_VIEW"."MIDDLE_NAME" , "PUSH_DOWN_VIEW"."TITLE" , "PUSH_DOWN_VIEW"."NAICS_CODE" , "PUSH_DOWN_VIEW"."RM_USER_ID" , "PUSH_DOWN_VIEW"."FREEZE_PAYMENTS" , "PUSH_DOWN_VIEW"."ORGANIZATION_TYPE" , "PUSH_DOWN_VIEW"."NPI_NUMBER" , "PUSH_DOWN_VIEW"."MMSEA_TIN_EDT_FLAG" , "PUSH_DOWN_VIEW"."AUTO_DISCOUNT" , "PUSH_DOWN_VIEW"."DISCOUNT_FLAG" , "PUSH_DOWN_VIEW"."FROI_ACORD_PASSWORD" , "PUSH_DOWN_VIEW"."MMSEA_REPRTER_TEXT" , "PUSH_DOWN_VIEW"."MMSEA_OFE_STE_TEXT" , "PUSH_DOWN_VIEW"."COMMENTS" , "PUSH_DOWN_VIEW"."HTMLCOMMENTS"
FROM "DBO"."PUSH_DOWN_VIEW" "PUSH_DOWN_VIEW"
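
For reference, the view itself is just a straight select over the source table, something along these lines (a sketch; the database and table names are the ones from my first post):

CREATE VIEW dbo.PUSH_DOWN_VIEW
AS
-- same column list / schema as RMA_ENTITY in the source database
SELECT *
FROM Publix.dbo.RMA_ENTITY;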


stolenpyjamas (BOB member since 2013-04-12)

Can you try using a SQL transform and see the outcome? It works well for me.
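
For example, the SQL text inside the transform could just be the cross-database select (a sketch reusing the names from the earlier posts; adjust to your own datastore):

SELECT * FROM Publix.dbo.RMA_ENTITY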


BODSDW (BOB member since 2011-01-19)

Where is your job server located?

Judging from your post, your source and target databases are on the same Microsoft SQL Server instance?

By default, Data Services will treat these as separate databases, on separate servers - so they might as well be on different machines altogether - and as such, there will be NO pushdown SQL at all.

To load the data, all 1.2 million of your records will be routed through the DS Job Server. This is not ideal but that certainly shouldn’t be taking 4 hours… unless your DS Job Server is geographically separated from your database server or is running on a very poorly performing (virtual?) server.

I’ve seen some shocking performance from virtual DS Job Servers only to discover that the host server configuration was flawed and that the DS server hardly had any resources allocated to it. At other times, I’ve seen broken network cards causing very poor throughput on physical DS Job Servers, anti-virus software causing tremendous I/O overhead because the DS directories (especially the logging directories) were not excluded from on-access and on-write scanning, etc.

Once you have verified that your DS Job Server is configured correctly and is performing well - have a look at your DF. Are you using any bulk loading? Try that and see if that makes a difference. If you cannot use bulk loading, increase the number of loaders (assuming your target database can support the additional connections), partition the incoming data and increase the degree of parallelism.

Also ensure that you are not using any custom functions in your query transform between the source and target tables and if you do, ensure that the custom function has parallel execution enabled.

If you want to achieve full SQL push down (MERGE or INSERT INTO) between the two databases, you will have to create a linked server on your SQL Server (essentially linking the SQL Server with itself) - this will enable Data Services to generate SQL across both databases. If your DBA isn’t very happy about doing so, you can also create views in your Staging database that query the largest tables in your source database. Because these views are part of the same datastore as your target staging database, Data Services may also be able to achieve full SQL pushdown.
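
As a rough sketch of both options (the linked server name, provider and data source below are assumptions, and the view simply reuses the table names from earlier in this thread):

-- Option 1: loopback linked server pointing the instance at itself
EXEC sp_addlinkedserver
    @server = N'LOOPBACK_SELF',
    @srvproduct = N'',
    @provider = N'SQLNCLI',
    @datasrc = N'YourServerName';

-- Option 2: a view in the staging database over the large source table
CREATE VIEW dbo.V_RMA_ENTITY
AS
SELECT * FROM Publix.dbo.RMA_ENTITY;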


ErikR :new_zealand: (BOB member since 2007-01-10)

The Job Server and the database server are the same.

I’ve tried bulk loading, increased the DOP and increased the number of loaders as well; however, it still feels like a wild goose chase.

I will try a linked server, but I’m not so sure whether that would do wonders.

I’m going to try creating the staging datastore on another server and see if that has an effect.

There are no transformations, just 1-1 mapping. The Data Transfer transform doesn’t seem very helpful either.


stolenpyjamas (BOB member since 2013-04-12)

What is the complete version of DS?
Are both databases on the same SQL Server instance? If yes, are you able to copy the data using an INSERT INTO … SELECT statement from Query Analyzer? If so, then to rule out an issue with the network or other resources between DS and the DB server, create a simple job with a script, use the sql() function to execute the INSERT INTO … SELECT statement, and check the performance.

In the case of the sql() function, DS simply submits the SQL to the database.
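
A minimal test script would be a single line along these lines (a sketch; the datastore name 'STAGING_DS' is an assumption, the statement itself is the one from the first post):

# submit the whole INSERT INTO ... SELECT to SQL Server through the staging datastore
sql('STAGING_DS', 'insert into RMA_ENTITY select * from Publix.dbo.RMA_ENTITY');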

Even if DS is not pushing down the complete operation, I doubt it would take hours to load 1.2M records.

Do you have columns with the NVARCHAR data type? I remember seeing an issue in early releases of 12.2 where, due to incorrect data type binding when making ODBC calls in DS, the ODBC driver was doing a (codepage) conversion and also causing an index rebuild for every row that was inserted.
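
A quick way to check for such columns (a plain SQL Server catalog query against the target database; table name taken from the thread):

SELECT COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'RMA_ENTITY'
  AND DATA_TYPE IN ('nvarchar', 'nchar', 'ntext');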


manoj_d (BOB member since 2009-01-02)