performance tuning in BODS

Hi all,

Please let me know the performance tuning techniques in BODS to reduce execution time.

It will be a great help…


lukkys (BOB member since 2010-05-19)

Hi,

You can refer to the SAP BusinessObjects Data Services Performance Optimization Guide (PDF), which comes along with the tool, or you can download it from SAP Help.

Regards,
Sameer


sameer81 :new_zealand: (BOB member since 2007-09-15)

https://wiki.sdn.sap.com/wiki/display/BOBJ/Identify+a+Bottleneck+in+a+Dataflow


Werner Daehn :de: (BOB member since 2004-12-17)

Hi,

I also have the same problem…
My job is taking too long to execute…
I have set the Run as separate process option to YES in all the queries that do Group By, Join or Select Distinct operations…
But merely setting this option to YES or distributing the dataflows does not seem to reduce the execution time…
Can anyone please help me out…

Thanks…


Shivbaba :us: (BOB member since 2010-06-01)

In my experience, Run as separate process only increased the run time :rotf:


JeroP :us: (BOB member since 2009-11-30)

Hi,

:yesnod:
:frowning:
So can you please tell me what you did to reduce the execution time…

Thanks…


Shivbaba :us: (BOB member since 2010-06-01)

You can only leverage the parallel processing capability if you have the hardware for it. Say you set your DOP to 8 but you don’t even have a quad-core processor to support it; then it’s meaningless to set it in the first place.

  1. As you have a group by, try and see if you can leverage the table partitioning that is defined for the table.
  2. Push down as much as possible to the database (see the sketch below).
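
For point 2, the usual route is to let Data Services collapse the dataflow into a single INSERT … SELECT (check Validation > Display Optimized SQL in Designer). As a rough sketch, you can also hand a simple aggregation to the database yourself from a script object using the sql() script function; the datastore and table names below are made up for illustration:

    # Hypothetical script snippet: the whole GROUP BY runs inside the
    # database, so no rows travel through the job engine at all.
    sql('DS_DWH', 'INSERT INTO TGT_SALES (ORDER_ID, TOTAL_AMOUNT) '
               || 'SELECT ORDER_ID, SUM(AMOUNT) FROM STG_SALES GROUP BY ORDER_ID');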

divyaprashanth :uk: (BOB member since 2010-09-04)

That’s only true if the job server’s processors are your bottleneck. If your job has a lot of parallel WFs and DFs, increasing the DOP will mean that more queries can run in parallel on the source/target databases. E.g. if you have 20 DFs all doing pushed-down INSERT … SELECT queries (which don’t involve any processing within the job engine beyond setting up a session and submitting the query), then DOP > 20 will allow all 20 DFs to run simultaneously.

Obviously you need to make sure the DB servers can support this activity; I have had to reorganise jobs to throttle back the number of simultaneous queries against source databases in order to avoid compromising user response times.


dastocks (BOB member since 2006-12-11)

I still haven’t gotten the outcome of this test from you…


Werner Daehn :de: (BOB member since 2004-12-17)

I can’t let this statement hang out here without it being addressed. As a general statement this is complete BS. Parallel processing and DOP in many instances have very little to do with the number of physical or virtual CPUs available.

Off the top of my head I can think of two examples where parallel processing works great without regard to the number of CPUs available.

  1. Running multiple Dataflows in parallel. Dataflows can have different sources, and if multiple Dataflows are running in parallel against these different sources then it’s a great way to improve the overall execution time of your job. Further, not every Dataflow returns a result set at the same time or even at the same rate. Once again, running them in parallel will allow your job to get more done in a shorter amount of time.

  2. A Dataflow uses a lookup_ext() which does not perform any caching. Without caching, the Dataflow must submit a SELECT statement for every row in the input schema. While this is SLOW, it can be a legitimate approach. Setting the DOP to 10 allows the Dataflow to perform 10 lookup_ext() requests in parallel. I have seen this Dataflow run, and the DOP made a huge difference on a system that was CPU constrained. (For the cached alternative, see the sketch below.)
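
For contrast, here is roughly what the cached version of such a lookup looks like as a mapping expression. This is a simplified sketch from memory (the real lookup_ext() signature takes more optional arguments), and the datastore, table, and column names are made up:

    # PRE_LOAD_CACHE reads the lookup table into memory once, instead of
    # firing one SELECT per input row the way NO_CACHE does. 'MAX' is the
    # return policy when more than one row matches.
    lookup_ext([DS_SRC.DBO.CUSTOMER, 'PRE_LOAD_CACHE', 'MAX'],
               [CUST_NAME], ['none'],
               [CUST_ID, '=', Query.CUST_ID])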

CPUs are fast. Networks and drives are slow. Most of the machines hosting job servers that I have seen at client sites are using very little CPU, something like 20% utilization. I am much more concerned with memory utilization and network throughput than I am with overloading the CPU.


eganjp :us: (BOB member since 2007-09-12)

I was able to reduce the time of a job from 3 days down to 1 hour and 40 minutes using DI. There are many ways to reduce execution time. We have not used a lot of pushdown to the DB, as it made our DB crash. If you have a specific requirement then let me know; we can surely get the time reduced.


AJIT K PANDA (BOB member since 2011-03-09)

Ajit,

Good to hear that you tuned the job from 3 days to an hour. Could you share the tuning methods you followed? Currently we are using the Salesforce Rapid Marts to load Salesforce data into our data warehouse. The job, which is incremental, is taking nearly 14 hours. Is there any way I can reduce it by at least 8 hours?

I tried commit size/array fetch size, but no luck.
Tried cleaning out unwanted dataflows, which helped to some extent.

Please suggest other alternatives.


sushmitha (BOB member since 2007-05-31)

I just started with DI. I’ve been given an ETL that is really messy. It contains 6 main jobs, all of which were taking from 1h30 to 3h00 to execute. Now they run in an average of 10 minutes. All I did was set the right cache option for lookups and table comparison, and extract only the changed data when possible. The trick is, like Werner said, to identify the bottlenecks. If you go into the Management Console, under Batch Job, Performance Monitor, you can see which part of your job is taking long.

EDIT:
Also, in every query there were redundant function calls. Some queries were calling sysdate(). Some were calling it twice; I even saw one that was calling sysdate() 5 times, system_user_name() two times and job_name() two times; that one was taking 5 minutes. I’ve put those 3 calls in 3 variables, and now it takes 20 seconds. A sketch of this is below.
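
In case it helps, here is roughly what that looks like in a script object ahead of the dataflows, assuming hypothetical global variables declared at the job level; the query mappings then reference the variables instead of calling the functions for every row:

    # Evaluate the volatile functions once per job run instead of once
    # (or several times) per row inside the query mappings. Variable
    # names are made up for illustration.
    $G_LOAD_DATE = sysdate();
    $G_USER_NAME = system_user_name();
    $G_JOB_NAME  = job_name();

In the query, the column mapping then becomes $G_LOAD_DATE instead of sysdate().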


Derf :canada: (BOB member since 2011-05-16)

To the various posters on this thread:

You really need to do more homework/preparation – it’s too open-ended a question for us to answer here. Read the links that wdaehn posted above (both the wiki and the PDF).

Then come back here with a specific sample of a dataflow that’s taking a long time (with screenshots or the ATL output), and maybe someone will be able to help at that point.


dnewton :us: (BOB member since 2004-01-30)