I would like to be able to write a bit of code in Data Integrator so I can merge rows together. I got this working in MS Access but do not know enough about Data Integrator and VB/VBA. Below is the code that I used in an access Module, can I replicate this in Data Integrator…Option
Compare Database
Option Explicit
Public Function FixTable() As Boolean
On Error Resume Next
Dim db As DAO.Database, rst As DAO.Recordset, sSQL As String
Dim strColumn1 As String, strColumn2 As String
Set db = CurrentDb()
sSQL = "SELECT Column1, Column2 FROM tblOriginal " & “ORDER BY Column1, Column2 ASC”
Set rst = db.OpenRecordset(sSQL, dbOpenSnapshot)
If Not rst.BOF And Not rst.EOF Then
rst.MoveFirst
strColumn1 = rst!Column1
strColumn2 = rst!Column2
rst.MoveNext
Do Until rst.EOF
If strColumn1 = rst!Column1 Then
strColumn2 = strColumn2 & “" & rst!Column2
Else
sSQL = "INSERT INTO tblCopy (Column1, Column2) " _
& “VALUES(’” & strColumn1 & “’,’” & "” & strColumn2 & “_” & “’)”
db.Execute sSQL
strColumn1 = rst!Column1
strColumn2 = rst!Column2
End If
rst.MoveNext
Loop
’ Insert Last Record
sSQL = "INSERT INTO tblCopy (Column1, Column2) " _
& “VALUES(’” & strColumn1 & “’,’” & “" & strColumn2 & "” & “’)”
db.Execute sSQL
End If
From looking at the code, I think what you are syaing by “merge” is “collapsing multiple rows into a single row”. Right?
Check out the pivot/reverse pivot transforms in DI.
There are also other tricks you can do which involve converting the input rows to an XML stream and then changing the delimiter to make it look like a single space-delimited row. Search this forum, I believe it’s been asked before.
Generally, though, we use a stored procedure in the database to do this if the records per set are a variable amount.
With the pivot transform you have to hard code the values for column 2 , there are over 900 distinct values in column 2 and this could increase which means Pivot transform is out of the question. I did look into this but with my lack of experience the outcome on testing this appears to create a a new column per value from my column 2.
The VB code is exactly what is required and I need to identify if this could be scripted in DI (which I doubt) or whether or not I can use an alternative. I forgot to mention the data sits on an oracle 10g database. The testing I did with MS Access to get the code right.
I will look into XML but alas I fear this could take too long , I need the code fired up each morning to handle approx 5 millions rows, MS Access test took 8 minutes to handle 2.5 million rows.
I managed to do the VBA on a Desktop Intelligence report using DAO but from workstation to server the estimated time to complete was 4 hours. Far to long for a busy organisation such as hours , lol.
Example
From this
column1 , column2
1 , 1
1 , 2
1 , 3
2 , 1
2 , 2
3 , 1
Actually this is more of an aggregation than a merging. As far as I know, there are no custom aggregation functions in BODS. However, depending on your Database you can do it quite easaly on the DB, see for example the LISTAGG feature of Oracle 11gR2.
A good term to start googling would be “String aggregation [insert your database here]”
However, be aware that there is no guarantee of ordering when using any form of aggregation on the database (LISTAGG is an exception where you can actually specify the ordering within the group).
So in general your result might actually look like