Dynamic File Handling and Dynamic Pivoting

system · April 22, 2010, 10:00am

Hi,

I have a flat file which has unspecified number of columns.
If I have 3 Columns, the next time it maybe 5 Columns.
But the File format should read the data regardless of change in the no. of columns.

For Ex:
DAY 1
EMP COL1 TYPE COL2 TYPE
DAY 2
EMP COL1 TYPE

We also need to pivot the data and populate each column in to a row with the header.

SOURCE

EMP ADDR1 TYPE ADDR2 TYPE
12345 Name1 PDM City1 PDM
23456 Name2 PDM City2 PDM

TARGET

EMP ADDR TYPE ACTADDR
12345 ADDR1 PDM Name1
12345 ADDR2 PDM City1
23456 ADDR1 PDM Name2
23456 ADDR2 PDM City2

The catch here is the based on the number of columns we get in the source file, we have to pivot the appropriate columns as well.
It should be dynamic(Pivoting).
The same way the file handling should also be dynamic.

Reply back if you need any more info.

Kishore.[/b]

kish_etl (BOB member since 2009-05-20)

system · April 22, 2010, 12:02pm

You create a file format with 20 anonymous fields, by setting the error handling correctly you can tell that missing fields are okay.

You split the data to process the first row - the header row - separately from the others.

And in the pivot transform you use two pivot sets, one for the column one for the type. Rule seems to be that those two are always together.

After the pivot you join the row back with the pivoted header to get the appropriate column name and filter out all rows with null values.

Did I get your point?

Werner Daehn (BOB member since 2004-12-17)

system · April 23, 2010, 5:31am

Hi Werner,

This is what I need exactly.
Day1 the file looks like this and the same file will have extra columns on day2.
The need the two scenarios to be handled in the same dataflow.
The way I need to pivot should be dynamic too.

Thanks
Kishore.

Day1:

SOURCE
EMP ADDR1 TYPE ADDR2 TYPE
12345 Name1 PDM City1 PDM
23456 Name2 PDM City2 PDM

TARGET
EMP ADDR ACTADDR
12345 ADDR1 Name1
12345 ADDR2 City1
23456 ADDR1 Name2
23456 ADDR2 City2

Day2

SOURCE
EMP ADDR1 TYPE
12345 Name1 PDM
23456 Name2 PDM

TARGET
EMP ADDR ACTADDR
12345 ADDR1 Name1
23456 ADDR1 Name2

kish_etl (BOB member since 2009-05-20)

system · April 23, 2010, 6:10am

Oh, even better. You need just one pivot set. Please do what I suggested step by step to get familiar with the settings.

Step 1: Create a file format that can handle both source files without errors.

Step 2: Pivot it to get familiar with it

Step 3: Add the header information to each row

Werner Daehn (BOB member since 2004-12-17)

system · April 23, 2010, 11:40am

Thanks a lot Werner

I was able to achieve the result.

Kishore

kish_etl (BOB member since 2009-05-20)

system · April 24, 2010, 6:19am

Excellent! Not an easy task and you made it work? Not bad, not bad at all!

Werner Daehn (BOB member since 2004-12-17)

system · May 10, 2010, 7:17pm

Quick dumb question, how can you setup the error handling to process the row if it does not have the required number of columns?

Thanks
Gary

gopatinc (BOB member since 2006-02-24)

system · May 11, 2010, 5:43am

Validation transform and each column has a not-null rule?

Werner Daehn (BOB member since 2004-12-17)