Slow SQL 2005 Stored Procedure when using SQL()

Hi, I'm a newbie to DI and still finding my feet, so apologies if this is an obvious one, but it's frustrating me at the moment.

I have a SQL 2005 stored procedure which runs in about 1.5 mins when executed from a Management Studio query window.

I have created a script in DI which calls the SP using

print('starting SQL call');
sql('CIP_Staging', 'EXEC SP_GENERATE_GOLDENRECORD');
print('finished SQL call');

and by looking in the log I can see it takes 3 minutes to run.

I can't see why this is taking so long. Data Integrator and the SQL Server are both on the same box, a 3 GHz quad core with 20 GB RAM. CPU usage never goes above 30% in Task Manager while the SP is executing, and memory use is negligible really.

This is going to cause me issues going forward since the SP is currently only executing against a small data set and it will grow in the near future.

Can anyone suggest why there is a 100% increase in the time taken to execute the SP?

Thanks in advance


mattr (BOB member since 2008-02-15)

You can import a stored procedure into the datastore as well and use it as a "New Function Call" directly in the query.

sql() does a connect and disconnect for each call - for each row in your example.


Werner Daehn :de: (BOB member since 2004-12-17)

Thanks for the prompt reply - I had seen this suggested in other posts, but when I try to search for the SP it doesn't appear. I am going to the Datastores tab, choosing the correct datastore, then right-clicking and choosing Search.

Whether I select Table or Function in the Type dropdown I cannot find the SP in the list.

I also cannot see it when I just do OPEN on the datastore.

Any ideas why this may be?

regs


mattr (BOB member since 2008-02-15)

With Open and Search we don't show procedures. Choose "Import By Name", change the dropdown box from Table to Function, and specify the name of your stored procedure.


Werner Daehn :de: (BOB member since 2004-12-17)

Thanks Werner,

That works fine, but I've just realised there may have been some confusion as to the SP's purpose - it isn't being used as a function, but rather to perform some dynamic SQL on some tables.

So if I understand correctly, importing it as a function won't help me - I don't want to use the SP in a query, but rather as a standalone task at the end of a workflow.

regs

Matt


mattr (BOB member since 2008-02-15)

Yes, I was heading in the wrong direction then. Still, you can call the stored procedure from a script, too, without the need for a sql() function. But the advantage is limited.

But then, what is the difference between executing the procedure from within SQL Server and from DI, except the connect/disconnect obviously?


Werner Daehn :de: (BOB member since 2004-12-17)

How do you do that? I thought all commands sent to the database from a script go through a sql() function.

I think the question being asked is why the SP runs slower when DI calls it. Does DI do anything that would degrade the performance? If the SP is moving data around and is called from DI it won't be reading in a recordset, so network issues can be ignored - in fact the SP does not return a recordset at all [I know Matt and have spoken to him about this problem].

As far as I can tell it must be a resource contention issue of some kind (SQL and DI are on the same machine). If CPU does not go above 30% it's unlikely to be the processor, and with 20 GB of memory that's less likely too (there are no parallel workflows/dataflows in the job). Any suggestions on what to look for (I/O?)


HuwD :uk: (BOB member since 2007-04-19)

The first one I can answer. Once you have imported the function into the datastore, go into the script object, click on the Functions button and search for your datastore…
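
Once you pick it, the call in the script ends up looking something along these lines - I'm guessing at dbo as the owner for your schema, so adjust as needed:

# call the imported procedure directly instead of going through sql()
CIP_Staging.dbo.SP_GENERATE_GOLDENRECORD();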


Werner Daehn :de: (BOB member since 2004-12-17)

We have found that the difference in performance can sometimes be down to transactions. If you make a sql() call that executes a sproc, the SQL that actually gets issued is:


BEGIN TRAN
exec dbo.MySproc
COMMIT

Wrapping a transaction around the stored procedure can slow it down, depending upon what it is doing inside.

The way to confirm this is to try doing the same: Open up Query Analyzer, and do a BEGIN TRAN/COMMIT around your sproc call as I did above. If it’s slower in that scenario, then you’ve found the problem.
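
To put it in terms of Matt's procedure (SET STATISTICS TIME is just one way to capture the timings; substitute your own sproc name):

SET STATISTICS TIME ON

-- baseline: the sproc on its own, no explicit transaction
EXEC dbo.SP_GENERATE_GOLDENRECORD

-- the same call wrapped the way DI issues it
BEGIN TRAN
EXEC dbo.SP_GENERATE_GOLDENRECORD
COMMIT

SET STATISTICS TIME OFF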

We’ve added code at the beginning of some sprocs which discards the transaction that BODI has opened, like:


IF @@TRANCOUNT>0
BEGIN
  COMMIT
END

dnewton :us: (BOB member since 2004-01-30)

Werner - ah, didn’t realise you were referring to having imported it as a (DI) function. That makes perfect sense.

DNewton - I didn’t realise DI stuck a TRAN in a sql call (mind you I didn’t think to profile the DI calls to the database). From looking at the SP I think a transaction around the SP code is very likely to cause the difference in duration. Will have to see what Matt comes back with after some testing.


HuwD :uk: (BOB member since 2007-04-19)

Hi,

Many thanks for all your help - I spent some time this morning looking at the Profiler trace logs to see what was happening, and I think I've uncovered a bit more about the issue, which people may find useful.

I started by adding the "IF @@TRANCOUNT>0" statement to my sproc (thanks for this suggestion DNewton) but unfortunately found it didn't have any real effect - while it did commit a transaction, another was being generated almost immediately.

I then noticed that a "SET IMPLICIT_TRANSACTIONS ON" statement preceded the execution of my sproc and I believe this is the cause of my issue.

By adding a "SET IMPLICIT_TRANSACTIONS OFF" statement to my sproc, the execution time is now comparable to that when executed from Query Analyser.
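
For anyone who hits the same thing, this is roughly where the statement now sits in the sproc (body trimmed, and dbo is our schema owner):

ALTER PROCEDURE dbo.SP_GENERATE_GOLDENRECORD
AS
BEGIN
    -- stop the rest of the proc from opening implicit transactions under the session setting DI applies
    SET IMPLICIT_TRANSACTIONS OFF

    -- ... existing dynamic SQL against the staging tables ...
END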

Thanks again to wdaehn, dnewton and huwd - really appreciated your help with this.


mattr (BOB member since 2008-02-15)

Can we learn anything from that? Is there anything DI should do differently? Obviously we could just disable the commits within the sql() function, but then another group of people might be surprised if they do an insert and nothing happens.


Werner Daehn :de: (BOB member since 2004-12-17)

Personally I would prefer DI not to set implicit transactions on.

We should have the choice to use transactions or not. By default the table loaders do not use transactions, but you can go and enable them if you wish (and you get a nice big transaction order number so you can see at a glance that a loader is using transactions). Likewise, with the sql() command I would expect it to send whatever I put into that command to the database and nothing else.

If I want specific transactions via the sql() command, I'd put a BEGIN TRAN into the SQL I am passing through and handle them within the SQL code :slight_smile:
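
Something like this, borrowing Matt's datastore and procedure names purely as placeholders:

# transaction spelled out in the SQL string itself, nothing added behind the scenes
sql('CIP_Staging', 'BEGIN TRAN; EXEC dbo.SP_GENERATE_GOLDENRECORD; COMMIT;');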


HuwD :uk: (BOB member since 2007-04-19)

Regarding the loaders… actually, I don't think this is quite true. I believe DI is using transactions "everyplace" on the targets. That checkbox "Include in transaction" only controls whether or not you want a group of targets to be included in a larger transaction. Correct?

Whether DI is doing implicit or explicit transactions in that scenario, I don’t know.

Regarding the behavior:

Some background… http://technet.microsoft.com/en-us/library/ms188317.aspx

No self-respecting SQL Server DBA has Implicit Transactions turned on at the database level. :slight_smile: And remember, a transaction is not necessary for your INSERT or UPDATE to take place. You could have implicit transactions turned off, and do this series of steps:

  • Open connection to database
  • Do an INSERT statement
  • Disconnect

and your change will be applied in the database, even though there was no COMMIT anyplace. Transactions only guarantee that work gets done as a particular roll-back-able “set”. Leave out transactions, and the work is still done, and still logged in the database log files, etc. You just have less control over them.
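
To illustrate (the table is made up, and this assumes implicit transactions are off, which is the SQL Server default):

-- autocommit: the statement commits on its own, no BEGIN TRAN or COMMIT anywhere
INSERT INTO dbo.SomeTable (Col1) VALUES (1)

-- with the setting DI issues, the same statement quietly opens a transaction
-- that stays open until you explicitly COMMIT or ROLLBACK (and rolls back if the connection drops first)
SET IMPLICIT_TRANSACTIONS ON
INSERT INTO dbo.SomeTable (Col1) VALUES (1)
COMMIT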

If we had the ability to set this behavior at the datastore level (keeping the current behavior as the default for backwards compatibility), it could be a real boon for performance.


dnewton :us: (BOB member since 2004-01-30)

By the way, if DI is doing a SET IMPLICIT_TRANSACTIONS ON with every connection it makes to SQL Server (including those for regular readers, loaders, and for the DI Web Administrator), that could help explain something. You know how I've been whining for months now that DI is leaving connections open all over the place, where SQL Server indicates there's an "open transaction" in progress when there shouldn't be? Maybe this is the reason.
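
For anyone who wants to look at their own server, this is just how I'd go hunting for those lingering transactions - nothing DI-specific about it:

-- oldest active transaction in the current database
DBCC OPENTRAN

-- sessions sitting on open transactions (sysprocesses is still around in SQL 2005)
SELECT spid, loginame, open_tran, last_batch
FROM master.dbo.sysprocesses
WHERE open_tran > 0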


dnewton :us: (BOB member since 2004-01-30)

Hmm, might have to do some testing with a pre-load command of "SET IMPLICIT_TRANSACTIONS OFF".


HuwD :uk: (BOB member since 2007-04-19)

Hi All,

It was good to see all the replies to the original question, but I have not been able to figure out the bottom line yet. Here's the scenario I am in:

  1. I have a job with a workflow that in turn contains 7 workflows running in parallel. Each of the 7 workflows calls a stored procedure (Oracle 10g) using a DI script. When I run the job, all 7 stored procedure calls execute in parallel and the job takes 30 mins to finish.

  2. Say the job mentioned above has 'A' as one of the SPs. When stored procedure A is called through that job it takes 35 mins to finish. But if I execute this SP directly on the database (without any other SP running in parallel) it takes 3 mins to finish.

Q1: If the issue is with DI at all, what could be the probable solution for this scenario?

Q2: Can running objects in parallel actually increase the execution time of individual objects?

Please share your thoughts,

Thanks in advance,
Rahul


rahul.johari :india: (BOB member since 2006-07-04)

I don’t use Oracle, so can’t comment on the Oracle-specific answer.

Q1 - I assume DI works the same for all databases, so it will be opening a transaction around each procedure it calls. Presumably you are moving some data around in the procedures, which will be slowed down by being inside a transaction. When you execute them directly I assume you are not wrapping them in a transaction… so the increased time is most likely due to the transaction. Change the procedure to do the Oracle equivalent of the MS-SQL SET IMPLICIT_TRANSACTIONS OFF.

Q2 - depending on what you are running in parallel, it can slow down the overall job. If you have several procedures all trying to read data from the same tables you may get blocking issues. You will know better than we do what you are running in parallel and whether you are likely to hit any locking problems.
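
I don't know the Oracle side, but I believe something along these lines against v$session would show whether the parallel sessions are blocking each other while the job runs (an Oracle DBA can correct the details):

-- sessions that are currently blocked, and which session is blocking them (10g and up)
SELECT sid, serial#, blocking_session, event, seconds_in_wait
FROM v$session
WHERE blocking_session IS NOT NULL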


HuwD :uk: (BOB member since 2007-04-19)

I have no idea why an Oracle stored procedure call from a script should have a different execution time than calling it directly via SQL*Plus. Does your SQL*Plus session pick up any session parameters, I don't know, optimizer goals or whatever? Does it use stored outlines? For SQL Server I agree the begin/end transaction could be a problem, but not for Oracle.


Werner Daehn :de: (BOB member since 2004-12-17)

Q2 - absolutely. Your stored procedure is probably consuming some amount of CPU and memory, and if it's doing any disk I/O, then sure, running 7 instances of it in parallel will of course slow things down (especially if the sprocs are all reading from and writing to the same tables).


dnewton :us: (BOB member since 2004-01-30)