We have a job that runs weekly on weekends. Two of its dataflows took 4.6 and 7.8 hours to process 94785525 and 16506039 records respectively, although the job normally completes in under 20 minutes.
Can you please advise what the cause might be, and which logs I should look at for a detailed analysis?
In one EIMAdaptiveProcessingServer log, I see some error lines like:
Fatal: A server with type pjs is not found in the CMS BODS01.comp.local:6400 and the cluster @BODS.comp.local:6400 with service DS.RFCService. Such a server may be down or it may be invalidated by the administrator. (FWM 01014)
org.apache.axis2.AxisFault: A server with type pjs is not found in the CMS BODS.comp.local:6400 and cluster @BODS.comp.local:6400 with service DS.RFCService. Such a server may be down or it may be invalidated by the administrator. (FWM 01014)
But these don't explain why the job took so long to complete.
I usually blame the DBA for running memory/cpu/network intensive operations on the weekend.
Seriously, check for that issue FIRST. It’s the easiest thing to troubleshoot. A database backup running at the same time as an ETL job can certainly cause performance problems.
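On SQL Server you don't even have to ask the DBA: msdb records every backup, so you can check directly whether one overlapped the slow run. A minimal sketch, assuming pyodbc, a trusted connection, and a hypothetical server name and job window (replace both with your own):

```python
# Sketch: list backups that overlapped the slow job window.
# Server name and window times are assumptions -- substitute your own.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=BODSDB.comp.local;DATABASE=msdb;Trusted_Connection=yes"
)

sql = """
SELECT database_name, type, backup_start_date, backup_finish_date
FROM msdb.dbo.backupset
WHERE backup_start_date < ?    -- job end
  AND backup_finish_date > ?   -- job start
ORDER BY backup_start_date
"""
# Assumed job window for the slow weekend run.
for row in conn.cursor().execute(sql, "2014-06-08 12:00", "2014-06-08 04:00"):
    # type: D = full, I = differential, L = log backup
    print(row.database_name, row.type,
          row.backup_start_date, row.backup_finish_date)
```

If that returns rows inside the job window, you have your first suspect.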
The database logs (SQL Server 2012) are huge; I will continue to explore them.
There is also an entry I could see in the Event Viewer:
Database access error. Reason: [Microsoft][ODBC SQL Server Driver][SQL Server] Transaction (Process ID 64) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction. (FWB 00090)
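That deadlock is worth chasing on its own: since SQL Server 2008, the built-in system_health Extended Events session records a deadlock graph for every deadlock, so you can retrieve the report for that Process ID 64 collision after the fact. A sketch under the same assumptions as above (hypothetical server name):

```python
# Sketch: read xml_deadlock_report events from the system_health session files.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=BODSDB.comp.local;DATABASE=master;Trusted_Connection=yes"
)

sql = """
SELECT CAST(event_data AS xml).value('(event/@timestamp)[1]', 'datetime2') AS event_time,
       CAST(event_data AS xml).query('(event/data/value/deadlock)[1]') AS deadlock_graph
FROM sys.fn_xe_file_target_read_file('system_health*.xel', NULL, NULL, NULL)
WHERE object_name = 'xml_deadlock_report'
ORDER BY event_time
"""
for row in conn.cursor().execute(sql):
    print(row.event_time)
    # The graph names the sessions, statements, and locks involved.
    print(row.deadlock_graph)
```

The deadlock graph tells you exactly which two statements collided, which is far more useful than the generic FWB 00090 wrapper.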
Often a database looks like it is running slow, when what is really happening is that sessions are waiting for locks to be resolved.
I’ve seen this in Oracle shops where a developer opens SQL Developer, runs a query and lets it sit there (causing a read lock). Then in another session someone tries to truncate the table. The truncate will just sit there waiting for the read lock to be released. That’s the simple description.
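On SQL Server you can watch that kind of blocking live while the job is crawling, via the dynamic management views. A sketch, again assuming pyodbc and a hypothetical server name:

```python
# Sketch: show which sessions are blocked, who is blocking them,
# and what they are running. Execute while the job is slow.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=BODSDB.comp.local;DATABASE=master;Trusted_Connection=yes"
)

sql = """
SELECT r.session_id,
       r.blocking_session_id,   -- the session holding the lock
       r.wait_type,
       r.wait_time,             -- ms spent waiting so far
       t.text AS running_sql
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.blocking_session_id <> 0
"""
for row in conn.cursor().execute(sql):
    print(f"session {row.session_id} blocked by {row.blocking_session_id} "
          f"({row.wait_type}, {row.wait_time} ms): {row.running_sql}")
```

If your dataflows show up here with long wait_time values, the job is not slow, it is stuck behind someone else's lock.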
I have since learned that this error had been happening for quite some time, and there was no database backup activity on the weekends when the job took a long time to complete.
Check with the sysadmin group to see if an O/S backup was being done. Have you looked at the timing of the error? Does it always happen at about the same time?
Late reply, but root cause analysis is nearly impossible with incomplete data. Are you running monitoring (like Nagios) on the DB and DS servers? That would at least give you a chance.