BusinessObjects Board

Excel files slow to load

We are migrating to 4.1 from 3.2 data services.

Finding that loading a simple excel file ( 30,000 rows, 5 columns) into a database table is significanlty slower.

From about 4 seconds in 3.2 to 2 minutes in 4.1.

Excel File is an Excel 2003 file.

I have see a few questions in this forum asking a similar question but no outcome.

Any feedback welcome. I am working with SAP on this but no luck yet.


BobbyOrr :canada: (BOB member since 2009-06-16)

Is this an imported job? If so, have you tried recreating it from scratch (including the excel format object)?

  • E

eepjr24 :us: (BOB member since 2005-09-16)

I have re-created everything and no change.

I have determined that it appears to not be the reading of the excel file that is slow but writing to the database.

The database and table I’m loading is the same for 4.1 as it is for 3.2.


BobbyOrr :canada: (BOB member since 2009-06-16)

If you put a map operation after the excel file format and discard everything does it still run slow?

What patch level of 4.1?

Have you tried re-importing the table? Created a new datastore and new table pointing to the same location and replaced the table? What DB are you writing to and is your DS box in the same datacenter, etc as the old 3.2?


eepjr24 :us: (BOB member since 2005-09-16)

I did try the map operation with discarding and that is where I find its not slow. Its only when writing to the database.

Were at 4.1 sp02

The database is Oracle 11g

No need to re-create the table as the same table I’m writing to in 4.1 is the same as in 3.2.

Our database is not on the same server as Data services 3.2 or 4.1 servers.

It just seems that its procesing 100 to 125 records at a time. Been reviewing the Data Services Optimization Guide to find a clue.


BobbyOrr :canada: (BOB member since 2009-06-16)

Did you try reimporting the table? How about recreating the job from scratch? That will fix odd repository behavior sometimes.

Are you doing all inserts or is this an autocorrect load?

  • E

eepjr24 :us: (BOB member since 2005-09-16)

I’m not re-creating the table, that is not necessary. I did recreate the job a couple of times.


BobbyOrr :canada: (BOB member since 2009-06-16)

I am not asking you to recreate the table. Simply re-import it. That will refresh the repository metadata, which may have become corrupt during the import. To do this, go to the datastore, right click the table and click reimport. It will take about 30 seconds perhaps.

  • E

eepjr24 :us: (BOB member since 2005-09-16)

ok. Yes, I did do the re-import but no change.


BobbyOrr :canada: (BOB member since 2009-06-16)

Two more things to try.

  1. Export the repository. Use repository manager and recreate the repository over the top of itself. Then reimport the ATL.
  2. Click Edit on the repository in DS 4 and DS 3.2. Screen shot them both and compare to see if there are any differences (client version, user id, etc).

Beyond those I would be opening a ticket with SAP.

  • E

eepjr24 :us: (BOB member since 2005-09-16)

I have opened up a ticket with SAP but no suggestions help so far. SAP is not always the best source for help so 1 major reason is to try the BOB forum.

I have looked at the repositories between both and they are the same. Its worth me reviewing again though.

Writing from database tables is as fast as expected.

I’m convinced its an issue somewhere in a configuration but no luck so far.


BobbyOrr :canada: (BOB member since 2009-06-16)

I did load a text file which contained the same information as the excel file and it took 4 seconds.

I really don’t think its my database and/or network as SAP is thinking. Why would the database update records based on the text file be much faster than the excel file if the database was a problem? Shouldn’t it be a problem for both types of files. The text file is located in the same directory as the Excel file so I can’t see it being that much slower using the excel file. The text file is 1mb and the excel file is 10mb. The text file is 4 seconds, the excel file 120 seconds.

I think the reading of the excel file is fast based on the map operation test but there is some kind of delay when trying to update the database? The microsoft access Database engigne 2010 ? Anyone have problems with this?


BobbyOrr :canada: (BOB member since 2009-06-16)

Sounds like a problem with the Excel file to me. I thought SAP addressed the Excel file issue early in the 4.x release. Apparently not?


eganjp :us: (BOB member since 2007-09-12)

One thing I did not mention is that when I create the csv file, I read it from excel file and it took 2 minutes to do this. Just as much time as reading the excel file and writing to an oracle table.

To me, this suggests a writing issue. I have passed on my info to SAP but only been a couple of days since last heard from them.

eganjp, it sounds like you know about an excel file issue? Do you know of any known notes or knowledge articles on this? Frankly, I have found none other than others here in the BOB forum complaining of this.

Thanks


BobbyOrr :canada: (BOB member since 2009-06-16)

I only know about the Excel issues that were posted to the forum. I don’t use the SAP Notes site much.


eganjp :us: (BOB member since 2007-09-12)

could this be a Java issue?

doesnt DS use apache + poi to read excel files? … maybe its a java version problem.

just a wild guess :expressionless:


jlynn73 :us: (BOB member since 2009-10-27)

Or a Java resource problem? As in not enough memory is allocated to the Java heap?


eganjp :us: (BOB member since 2007-09-12)

4.0 SP2:
Read .xls files: fast
Read .xlsx: slow

Beats me. Looks like the initialization before receiving the rows from Excel takes a while.


Johannes Vink :netherlands: (BOB member since 2012-03-20)

We unloaded the Trend Micro virus checker from the server.

When we did that, we were able to process the excel file that I’m testing within 5 seconds versus 2 minutes.

So it is an issue with the virus checker. Were trying to isolate what exactly its doing. We have tried to exclude the folder where the files are being read from as well as the SAP Business Objects folder but no luck so far.


BobbyOrr :canada: (BOB member since 2009-06-16)

Finally resolved this.

Once we excluded the C:\ProgramData\SAP BusinessObjects\Data Services directory, the excel files could be read and processed very fast. 3-5 seconds versus 2 minutes. This was the aha moment.

I also excluded the source directory for the files as well as the C:\Program Files (x86)\SAP BusinessObjects\Data Services where we actually installed the Data Services server program. This did not seem to be necessary but I figured it would not hurt to do it.

I hope this can help others as it was a pain in the a$$ to figure out.


BobbyOrr :canada: (BOB member since 2009-06-16)