Only a single column C1 with a string like “hajjcjehjhcehcj”
The target should be the above field names, with the string in C1 split according to the lengths in T1.
Target =>
ABC | DEF   | HIG
h   | ajjcj | ehjhceh
I tried using the Row Generation transform, but it's skipping every alternate character after splitting. I tried another approach: capturing the length field from T1 in a variable and splitting in a while loop, but the output is always a single field.
You can build the ATL using a script. (I use Python)
Load your column lengths into an array in the script, and use that to generate your field mappings/data types.
It is still going to take you some time, but it's going to reduce the chance of error to virtually zero. Being off by one byte in a layout like this is not fun to track down.
If you build a simple source > query > target dataflow with one field mapped for reference, you can export that ATL and use it as a starting point for your script.
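As a rough illustration, here is a minimal sketch of that idea. The layout file name, its column headers (FIELDNAME/START/LENGTH), and the substr() expressions targeting T1.C1 are all assumptions; the generated text would still have to be spliced into the exported ATL by your script.

```python
import csv

# Hypothetical layout file with one row per target field: name, start, length
# (starts are 1-based, as substr() expects).
LAYOUT_FILE = "layout.csv"

def build_mappings(layout_file):
    """Return one substr() mapping expression per target field."""
    mappings = []
    with open(layout_file, newline="") as fh:
        for row in csv.DictReader(fh):
            name = row["FIELDNAME"]
            start = int(row["START"])
            length = int(row["LENGTH"])
            # e.g.  ABC = substr(T1.C1, 1, 1)
            mappings.append(f"{name} = substr(T1.C1, {start}, {length})")
    return mappings

if __name__ == "__main__":
    for line in build_mappings(LAYOUT_FILE):
        print(line)
```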
If the positions/lengths will change … you can use a script to build the in/out mappings for a UDT. Then use a … for n in range(1085): … get + process + set fields.
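For what that loop might look like, here is a standalone sketch of the "get + process + set fields" idea. Inside an actual UDT the string would be read from the input record and each piece written back to its own output field; here that is only simulated in plain Python, and the lengths list and output field names are assumptions.

```python
# Standalone sketch of the get + process + set loop.
# lengths would really come from T2 (roughly 1085 entries); three are shown here.
lengths = [1, 5, 7]

def split_record(full_string, lengths):
    """Slice full_string into consecutive pieces of the given lengths."""
    out = {}
    pos = 0
    for n in range(len(lengths)):
        # In a UDT this would be a per-output-field set instead of a dict entry.
        out["OUT_%04d" % (n + 1)] = full_string[pos:pos + lengths[n]]
        pos += lengths[n]
    return out

print(split_record("hajjcjehjhcehcj", lengths))
# {'OUT_0001': 'h', 'OUT_0002': 'ajjcj', 'OUT_0003': 'ehjhceh'}
```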
If you pivot your data and concatenate it using two unique delimiters, you'll have yourself a two-dimensional array. Then you use a Row Generation transform to act as your iterator.
As long as you’re stuck mapping those output fields, the solution is going to be more work than just doing each substr() mapping individually.
I do not have a delimiter in the record, like a comma. My "delimiter" is the field length, which lives in another table and has to be applied dynamically to split the text. I can capture a field length in a variable using a script, but only one at a time.
You’re not following, which is ok because this is pretty convoluted.
If you pivot your mapping columns and concatenate them all together with delimiters of your choosing, you have essentially created a two-dimensional array.
The row generator transform can then be used as your looping mechanism to grab the specific index of this array using word_ext(row_gen#).
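To make that concrete, here is a small Python sketch of the two-delimiter array. The delimiters, the emulated word_ext() helper, and the sample layout are illustrative assumptions, not the actual dataflow.

```python
# Layout rows (fieldname, start, length) pivoted and concatenated with two
# unique delimiters: ";" between rows, "," between the values of one row.
layout_rows = [("ABC", 1, 1), ("DEF", 2, 5), ("HIG", 7, 7)]
packed = ";".join(",".join(str(v) for v in row) for row in layout_rows)
# packed == "ABC,1,1;DEF,2,5;HIG,7,7"  -> a 2-D array held in one string

def word_ext(text, n, sep):
    """Rough stand-in for the word_ext() function: the nth word (1-based)."""
    return text.split(sep)[n - 1]

# The Row Generation transform supplies row_gen = 1..N; each generated row
# picks one element of the packed array and does its own substring.
source = "hajjcjehjhcehcj"
for row_gen in range(1, len(layout_rows) + 1):
    entry = word_ext(packed, row_gen, ";")
    name = word_ext(entry, 1, ",")
    start = int(word_ext(entry, 2, ","))
    length = int(word_ext(entry, 3, ","))
    print(name, source[start - 1:start - 1 + length])
# ABC h / DEF ajjcj / HIG ehjhceh
```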
As I said … the solution is much more complicated than just doing all the data entry (unless you can script the ATL). My .02.
Did you solve this? I ask because, if I read your initial post correctly, I’ve done what you’re trying to do.
I’ve parsed a long string of characters according to start positions and lengths stored in a table. I did this by cross-joining the two sets of data; you can cross-join by simply joining two tables/queries without specifying a join condition. This duplicates each source record once for each record containing the parsing specifics. In my case, I was parsing my string into seven pieces, and each of my source records contained one instance of the string, so the cross join duplicated each row seven times, once for each row in the table with the start position/length. I then created a new output column for the parsed values and mapped it with a substring function, in which the source string was the full string value and the start position and length came from the values of the cross-joined table. I ended up with seven new rows, each with the proper piece of the full string.
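Here is a small Python sketch of that cross-join idea, using made-up sample data; the table and column names are assumptions, and in the tool this would simply be two sources joined in a Query with no join condition.

```python
from itertools import product

# T1: source records, each holding the full string in C1.
t1 = [{"ID": 1, "C1": "hajjcjehjhcehcj"}]

# T2: parsing specifics - one row per target piece (1-based start, length).
t2 = [
    {"FIELDNAME": "ABC", "START": 1, "LENGTH": 1},
    {"FIELDNAME": "DEF", "START": 2, "LENGTH": 5},
    {"FIELDNAME": "HIG", "START": 7, "LENGTH": 7},
]

# Cross join: every T1 row paired with every T2 row, then substring each pair.
rows = []
for src, spec in product(t1, t2):
    value = src["C1"][spec["START"] - 1 : spec["START"] - 1 + spec["LENGTH"]]
    rows.append({"ID": src["ID"], "FIELDNAME": spec["FIELDNAME"], "VALUE": value})

for r in rows:
    print(r)
# {'ID': 1, 'FIELDNAME': 'ABC', 'VALUE': 'h'} ... one row per T2 entry
```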
I have not done it yet in DS. I managed to do it in the DB with a stored proc, but your solution would be better since I can do it in DI itself.
One question, if I understand correctly: my T1 with the string has 978 records and T2 with start, end and length has 1,085 records. So when I cross join, I will get (each string record) × (number of length records) rows?
Then apply a substring of that length to each record and pivot to generate the columns, correct?
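For reference, that cross join produces one row per (string record, layout record) pair, which works out to roughly the one million rows mentioned below:

```python
# 978 source strings x 1,085 layout rows -> cross-joined row count
print(978 * 1085)  # 1061130 (~1 million)
```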
Wow. I wonder how long it’ll take to pivot 1 million records? I wonder, too, how it will compare with doing it in the stored procedure in the database.
Hey, I have reduced the number of records from 1 million to 19K, which is still high. But I am a little confused by the second part. After the cross join, I am getting output like this -
The final output should be the field names as columns, with the substring values under each field name. As far as I understand, the Reverse Pivot will not take dynamic values. Any suggestions appreciated.
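For what that final pivot step is aiming at, here is a minimal Python sketch that regroups the cross-joined (fieldname, value) rows back into one record per source string. The column names continue the assumptions from the earlier sketch, and it does not address the Reverse Pivot transform's static-column limitation.

```python
from collections import defaultdict

# Cross-joined rows: one (fieldname, value) pair per source record and layout row.
rows = [
    {"ID": 1, "FIELDNAME": "ABC", "VALUE": "h"},
    {"ID": 1, "FIELDNAME": "DEF", "VALUE": "ajjcj"},
    {"ID": 1, "FIELDNAME": "HIG", "VALUE": "ehjhceh"},
]

# Regroup into one record per source ID, with field names as columns.
pivoted = defaultdict(dict)
for r in rows:
    pivoted[r["ID"]][r["FIELDNAME"]] = r["VALUE"]

for rec_id, record in pivoted.items():
    print(rec_id, record)
# 1 {'ABC': 'h', 'DEF': 'ajjcj', 'HIG': 'ehjhceh'}
```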