BusinessObjects Board

Bug loading empty Excel workbook?

I’m running into a bug, or unwanted behaviour, with DS 4.2 SP11 that I haven’t seen before yet.

One of our data sources is an externally maintained and supplied Excel 2003 (XLSX) sheet with multiple workbooks.

The file is landed in our DS environment and we have separate Data Flows to process each workbook in the sheet as separate source into our staging tables.

The workbooks have a complex header structure and we have set our Excel File Formats to skip the first 8 rows and to skip all empty rows.

The workbooks do not always have to contain data - there are valid business rules for a workbook to be empty.

And therein lies the problem, when we have a workbook that doesn’t contain any data, Data Services 4.2 goes into an endless loop and keeps reading hundreds of thousands of (empty) rows and it never completes - we have to abort the job.

The only way to fix this issue for us is to manually examine the sheet and put a 0 in the first column if a workbook is empty (which wouldn’t be a valid entry there anyway) and then we filter this out in our Data Flow by reading all rows except WHERE Column1<>‘0’ .

Needless to say this manual workaround isn’t ideal.

Has anyone seen this problem in DS 4.2 before?


ErikR :new_zealand: (BOB member since 2007-01-10)

yes - I’ve seen this.

It’s the end of file marker in excel. If the sheet is empty EOF is in row 1, as a result, if you skip that row bods never sees it so it carries on loading forever.
The way round it is not to skip rows, instead read all rows and add a gen rownum column, then in a second transform select only where rownum > however many rows you want to exclude.

in excel you can view the file metadata by renaming the file extension from .xlsx to .zip, then extract the contents of the zip file, this will show you where the eof marker is.


wil992 (BOB member since 2011-06-22)