When working with longs, BODS produces thousands of text files

I have a dataflow that writes 500 records out to a file; the datatype is a CLOB, which is a long in BODS.

When I run the dataflow, BODS produces a .txt file per record as well as my target file with the 500 records.

How can I get BODS to stop producing these .txt files? I assume it is a caching-to-disk thing.

Cheers
Badger


Badger (BOB member since 2008-12-17)

Oops, that is weird. So the files don’t get erased once your job is done? Which version of DI? And where do these files get dumped?


ganeshxp :us: (BOB member since 2008-07-17)

The files get created in the directory I have set up for writing my text file to. We are using 12.2.2.1. I’m not sure if they get deleted, because the actual data I want to write out is 3 million records, so I kill the job before it fills up the file system.

I used 500 as an example in my original post.


Badger (BOB member since 2008-12-17)

Oh, I don’t think it is a good idea to write out millions of rows, especially ones that contain a CLOB. Is the data in the CLOB column really that big? Or was it only defined as a CLOB because it might grow that big?

However, I think DI does something special for long columns. I know that if you load LONG columns to a table, DI defaults the commit size to 1, so it seems logical to me that it would do something similar to get the data written to a file.

How are these temp files named?


ganeshxp :us: (BOB member since 2008-07-17)

The files are named with the server name and the source table plus a timestamp.

No, it is not ideal to do this, but it has to be done; there is no choice.

The only other way I am considering is PL/SQL and UTL_FILE, but it would be good to find out why BODS creates these files and how to stop it from happening.
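For comparison, a minimal PL/SQL sketch of that route could look something like the following; SRC_TABLE, its ID and CLOB_COL columns, and the OUT_DIR directory object are assumed names, and rows longer than about 32,000 characters would need the DBMS_LOB.SUBSTR call looped over offsets:

DECLARE
  v_file  UTL_FILE.FILE_TYPE;
  v_chunk VARCHAR2(32767);
BEGIN
  -- OUT_DIR must be an Oracle directory object the DB user can write to (assumed name)
  v_file := UTL_FILE.FOPEN('OUT_DIR', 'clob_extract.txt', 'w', 32767);
  FOR r IN (SELECT id, clob_col FROM src_table) LOOP
    -- take the first 32,000 characters of the CLOB, leaving headroom under the 32,767-byte line limit
    v_chunk := DBMS_LOB.SUBSTR(r.clob_col, 32000, 1);
    UTL_FILE.PUT_LINE(v_file, r.id || ',' || v_chunk);
  END LOOP;
  UTL_FILE.FCLOSE(v_file);
END;
/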


Badger (BOB member since 2008-12-17)

Will this function be of any help to you, so that we avoid creating files?

LONG_TO_VARCHAR(COLUMN_NAME,TARGET_MAX_SIZE,START_POSITION)

Be sure you don’t strip any data from that column. Did you ever identify the max length of data in that CLOB column?
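For example, a Query transform mapping along these lines; the table and column names and the 4000-character cap are only placeholders, so size it to the real max length you find and make the output column a varchar of at least that size (the optional start position from the signature above can be added if you only need a slice):

long_to_varchar(SRC_TABLE.CLOB_COL, 4000)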


ganeshxp :us: (BOB member since 2008-07-17)

Thanks. I have been using that function, but I had so many problems with it related to temp files and the like that I stopped using it.

BODS and CLOBs just do not mix, in my experience.


Badger (BOB member since 2008-12-17)

Processing of long data is done through files instead of loading the data into memory. There is a DSConfig.txt parameter which controls the size of long data that can be processed in memory; by default it is 2048.

Max_Long_Data_In_Memory=2048

In your case you can try doubling this value (4096), but I wouldn’t recommend setting it very high, since the DF or job may run out of memory.
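That is, something like this in DSConfig.txt on the Job Server machine (I am assuming it sits in the [AL_Engine] section; check where your copy actually defines it):

[AL_Engine]
Max_Long_Data_In_Memory=4096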

I need to check the lifetime of these files, whether it runs to the end of the load, the dataflow, or the job. In any case, once your job completes these files should get deleted.


manoj_d (BOB member since 2009-01-02)

The idea is that BLOBs cannot be embedded into a column of a text file. What would a CSV file look like if column three contained image data in JPEG format?
Therefore the dataflow uses a binary writer instead, one where each row becomes one file.


Werner Daehn :de: (BOB member since 2004-12-17)

Thanks, that is interesting. I think the main problem is that it leaves the files where they are during the job; maybe I will have to shell out to the OS in a script and clean them down as I go.

I understand why you would not want to put a BLOB into a text file, but this is the one time we have no choice but to provide a text file (SQL INSERT statements, hence the CLOB being in it).

All I can say is ‘rock and a hard place’. :frowning:


Badger (BOB member since 2008-12-17)

So it really is a CLOB? Then I would try the long_to_varchar conversion function as well.


Werner Daehn :de: (BOB member since 2004-12-17)

I don’t think you have to create a shell script to delete the DI-generated files; if the files are left behind after the job completes, then it’s a bug.

Can you give me some more details about your environment? I can try reproducing the issue and file a bug.

What is the OS on which the Job Server is running?
Which version of DS?


manoj_d (BOB member since 2009-01-02)

Manoj, I tried reproducing it too.

I am not seeing any such files getting created, but my version is 11.7.
Could there be a setting in DSConfig?

My process:
Row Gen --> Query (VARCHAR_TO_LONG) --> TABLE
(I was not able to find a table with a long column :? at my site)

Then TABLE --> QUERY --> Flat File

In this I saw some .dat files get created in the LINK_DIR folder, but they disappeared when the job was done.


ganeshxp :us: (BOB member since 2008-07-17)

Processing long data through files is not available in 11.7.


manoj_d (BOB member since 2009-01-02)

I did write to a file, on 11.7.3.


ganeshxp :us: (BOB member since 2008-07-17)

I don’t mean writing the long data to a file as a target. From 12.x onwards, to handle long data, DS internally writes the data to temp files (depending on the max size you can cache, default 2048), and once the job completes these internally created temp files are deleted.


manoj_d (BOB member since 2009-01-02)

The problem is that BODS does not always clean up after itself, or the file system reaches its limits first.

The table I am manipulating with a CLOB has 3 million rows, so this tries to create 3 million files, which breaks the OS limit on the number of files in a directory.

So the only way for the job to run is to clean the files as I go, so that I do not break the file system limits.
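A rough sketch of that kind of clean-down, as a script step run between batched dataflow executions; the shell path, directory, file pattern, and the $G_CleanupResult global variable are all placeholders for this environment, and the exec() flag (8 here, meaning wait and return the output) should be checked against the reference guide for your release:

# remove the per-row long/CLOB files left behind by the previous batch
# the path and pattern are placeholders for wherever the files land on the job server
$G_CleanupResult = exec('/bin/sh', '-c "rm -f /data/bods_out/*_*.txt"', 8);
print('Cleanup returned: ' || $G_CleanupResult);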


Badger (BOB member since 2008-12-17)