I am relatively new to BODI and am using version 11.5.
I have done several DFs with Query and Table Comparison without issue. All of the Table Comparisons have ‘Detect deleted rows’ checked so that deleted rows are deleted in the target table.
However, a recent DF fails to insert all rows into an empty table when ‘Detect deleted rows’ is checked. Debug shows that the rows have an op-code of ‘I’, but the rows do not exist in the table upon job completion. If I uncheck ‘Detect deleted rows’, all rows load to the table. The Table Comparison is followed by Key_Generation, if that matters at all. About 50% of the rows are inserted if ‘Detect deleted rows’ is checked, and there is no pattern as to why certain rows are not inserted.
The Table Comparison has a primary key of three columns, and there are three other columns designated for column comparison.
If an initial load of a table requires that ‘Detect deleted rows’ be unchecked, that is fine, but I would like an explanation as to why, since, logically to me, deleted records should not be detected during an initial load of a table.
Thanks for the link, but I do not think it applies to my scenario.
My DF is simple: Query, Table Compare, Key Generation, Target.
I am using the same table for the Table Compare, Key Generation, and final Target steps. I am not doing any kind of Map Operation or history preserving. I have no special column value updates to do after the Table Compare. There are no duplicate records going into the Table Compare, and I have no fancy primary key or column compares within it. All my DFs are like this, and all work for the initial load except the latest one. If I uncheck ‘Detect deleted rows’, everything initially loads (inserts) fine. Then if I check ‘Detect deleted rows’ and run the data flow again, after updating, adding, and deleting rows in the source table, all updates work just fine. I only seem to have trouble with the DF if ‘Detect deleted rows’ is checked on the initial load of the table.
I can certainly change the flow to be more complex, like the one suggested in the link, but it seems like a lot of extra work and processing time just to handle what the Table Comparison is supposed to do, and appears to do successfully, most of the time.
I was hoping for a logical reason for the table comparison’s odd behavior, but perhaps it is just a bug or glitch that randomly occurs.
I will gladly adjust the DF, if someone provides a reason why such changes are necessary.
Still, I fail to understand why Detect Deletes has the effect you are describing, but given your settings I believe we are on the right track.
I suspect the three columns you defined as the input primary key are actually not unique. So I would add another template table right before the Table Comparison and then execute:
select re_id, cd, freq_cd, count(*) from template_table group by re_id, cd, freq_cd having count(*) > 1
If these three columns really are unique, the above query should return no rows. But if I am right, and you haven’t set ‘Detect duplicates’ in the TC transform, this, together with the row-by-row comparison setting, can result in unstable results.
Detect Deletes works by taking the complete target table data MINUS the source data at the end. What’s left by this operation are the rows that exist in the target but no longer in the source, hence they had to have been deleted in the source. But without a unique PK…
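For illustration only, here is a minimal hand-written SQL sketch of that set-difference idea, using the column names from the duplicate check above and placeholder table names (target_table, source_data); this is not the SQL DI actually generates internally:

    -- rows whose key still exists in the target but no longer in the source
    -- are the ones Detect Deleted Rows would flag for deletion
    select re_id, cd, freq_cd from target_table
    except
    select re_id, cd, freq_cd from source_data;

If that key is not unique on either side, the difference is no longer well defined, which is why undetected duplicates can make the outcome unstable.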
The query returned no rows… the three fields I have for the primary key are a unique combination. I was 99% sure of it, but checked anyway, for you and myself. I am now 100% sure.
But maybe this additional information I have found may shed some light: the target table has a key of cust_sctn_defn_id, and this column is serial in the DB with a sequence. The sequence value for this column is always 1 no matter how I load the table, and I understand why: DI is not really using the sequence on the column when inserting rows, and that is fine for now. However, when I check ‘Detect deleted rows’, the starting sequence value used for the inserted rows is always 1201 and the number of rows inserted is 388. When I uncheck the option, the starting sequence value is 1 and 1588 rows are inserted. 1588 records is correct.
Any clue where the starting sequence value of 1200 may be coming from when ‘Detect deleted rows’ is checked? This is the first table that has had a sequence associated with the primary key column. Perhaps that has something to do with it? The target table is an EnterpriseDB table (PostgreSQL), if that sheds any more light.
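For reference, the sequence position can be checked directly in PostgreSQL/EnterpriseDB; the table and sequence names below are placeholders rather than my actual ones:

    -- confirm which sequence backs the serial column
    select column_default
    from information_schema.columns
    where table_name = 'my_target_table' and column_name = 'cust_sctn_defn_id';

    -- current position of that sequence
    select last_value, is_called from cust_sctn_defn_id_seq;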
Well, eventually subsequent loads on the table failed to insert all new rows, and the sequence value was jumping around on those that were inserted. I definitely suspect the sequence on the table is the issue. So I am posting this reply to help some other poor soul who runs into the same issue.
I finally used the suggestion given earlier, of splitting the inserts from the updates and deletes with a Map Operation. After the MO for the inserts, I used a Query to obtain the next value from the sequence that is on the table. I created a function in the database to obtain the next sequence value, but I had to use a SQL select to call the function. I could import the function just fine, but I could not call it from a mapping on an output schema field, nor could I call it from the output schema using ‘New function call’. Both produced the same error upon execution: parameters of 2 do not match requested parameters of 0 (something like that).
Calling the function directly is probably faster, but I could not get it to work at this time.
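For anyone wanting to do the same, the database-side piece looks roughly like this; the function, sequence, and datastore names are illustrative placeholders, not the exact ones from my job:

    -- helper function in the database (PostgreSQL/EnterpriseDB syntax)
    create or replace function next_cust_sctn_defn_id()
    returns bigint as $$
        select nextval('cust_sctn_defn_id_seq');
    $$ language sql;

    -- the Query transform then issues a plain SQL select against it,
    -- e.g. via DI's sql() function: sql('My_Datastore', 'select next_cust_sctn_defn_id()')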
I noticed that when you use a Map Operation (MO) after a Table Comparison (TC) with the “Detect deleted rows” option, the “delete” operation code is correct after the MO.
But once you add a Query transform after the MO, your “deleted” rows become “normal” rows… that may explain an “insert” error at the target…
My aim was to update rows in the target that have been deleted in the source (i.e. closing a validity period). I used a specific DF branch to manage only deleted records and added another MO (normal -> update) just before the target.
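The net effect of that delete branch is roughly the following, written here as plain SQL with placeholder table and column names just to show the intent (BODS generates its own statements):

    -- rows that disappeared from the source are not deleted;
    -- their validity period is closed instead
    update target_table t
    set    valid_to = current_date
    where  not exists (select 1 from source_data s where s.key_col = t.key_col);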
This solution is similar to the one described in topic 171576: https://bobj-board.org/t/171576
I guess the delete operation is very special because you get the data from the target (comparison table) instead of from the source.
I use BODS 4.0.
Thanks for the reply which reminded me to post a possible final solution to this problem.
I still had intermittent problems with the incorrect deletion of rows. All the rows would be inserted correctly, but then rows would subsequently be deleted for no reason. The number of deleted rows was always an exact multiple of 1000 (2000, 3000, 4000, etc.), leaving an odd number of rows remaining in the table. And I was having this issue on different tables at different times. This led me to believe it was a commit issue of some sort, since the deletions were always in exact multiples of 1000.

So I divided the logic into two different types of data flows. One does the Table Comparison just for adds/updates. The second does the Table Comparison just for deletes, and it immediately follows the data flow for adds/updates. When I went through all my data flows to make this change, I found that some were already doing this type of flow due to the logic involved; the flows I ended up changing were the ones where tables were intermittently having thousands of rows deleted. I had tried everything else and could not get a 100% dependable solution. These two flows do take longer, but it seems to be working correctly and consistently at this writing. I will post again if this was not the final solution.