I am sourcing directly from the SAP system tables via an R/3 Dataflow.
I don't know much about the SAP system. Until now we have been doing a full load from the source system every time, but the volume of data is increasing, and in fact it is going to increase exponentially due to various business reasons.
Please suggest how to make an incremental load in this case, because records in my SAP source system are bound to get deleted/updated.
Please help me with ideas on incremental patterns that can be followed for a load which is expected to run as often as every half hour.
Use a global variable (set to the last successful refresh date of the job) in the Query transform's WHERE tab in the R/3 Dataflow. When the job is executed, it picks up all the records inserted/updated after the previous successful refresh.
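A sketch of that pattern in Python, just to make the logic concrete (the column name AEDAT and the variable name $G_LAST_REFRESH are illustrative assumptions; in Data Services the WHERE tab would hold the comparison directly):

```python
from datetime import datetime

def build_where_clause(last_refresh: datetime) -> str:
    """Build the filter string for the Query transform's WHERE tab.
    SAP change-date fields (AEDAT used here as an example) are CHAR(8)
    strings in YYYYMMDD form, so the variable is formatted to match."""
    return f"AEDAT >= '{last_refresh.strftime('%Y%m%d')}'"

def incremental_rows(rows, last_refresh: datetime):
    """Simulate the filter: keep only rows changed on/after the cutoff."""
    cutoff = last_refresh.strftime("%Y%m%d")
    return [r for r in rows if r["AEDAT"] >= cutoff]
```

Note this only catches inserts and updates; deletes never appear in the filtered set, which is the usual weak spot of timestamp-based CDC.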
Thanks for your idea, and please bear with me, as I have little knowledge of the SAP source system.
But I heard that the DATE and TIME columns are in VARCHAR format in my SAP source. So when I source directly from the SAP R/3 DF, is it still possible to filter on them?
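On the VARCHAR question: SAP stores dates and times as plain character fields (DATS as 'YYYYMMDD', TIMS as 'HHMMSS'). Because 'YYYYMMDD' sorts chronologically, a plain string comparison in the WHERE tab still works; if you need a real timestamp downstream, the conversion is simple. A sketch, assuming well-formed DATS/TIMS values:

```python
from datetime import datetime

def sap_to_datetime(dats: str, tims: str) -> datetime:
    """Combine a SAP DATS value ('YYYYMMDD') and TIMS value ('HHMMSS'),
    both plain character strings, into a proper datetime."""
    return datetime.strptime(dats + tims, "%Y%m%d%H%M%S")
```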
If anyone here knows the SAP system: can we write views on the SAP side?
Basically, my aim is to do CDC on the source. Is the SAP system resistant to CDC approaches? Please share your ideas here.
Do you have access to your SAP admin group, or any SAP Basis admins?
Because you can't just go into the SAP database and create a few views or something like that. SAP just doesn't work that way.
The answer is: yes, you can. Alternatively, can't the SAP team schedule a predefined ABAP program for you, so you just pick up the results in a CSV file? That isn't all that different from BODI executing an R/3 DF, which calls an ABAP function, which produces a CSV-like result set.
How much data are we talking about? A few GB? A TB? If the data set also includes a filtering field (date, time, sequence number, etc.) that you would use for CDC anyway, you could just bulk load the complete data set into a staging table and then select the smaller subset of new records for downstream processing. Or you could use grep and other tools to build your delta at file level (which can be faster than loading everything into a database first) and process the resulting delta file instead.
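The file-level delta idea can be sketched like this (same spirit as `grep -F -x -v -f previous.csv current.csv`; file names and layout are assumptions for illustration):

```python
def file_delta(previous_path: str, current_path: str):
    """Return lines present in the current extract but missing from the
    previous one. Only detects new/changed rows, not deletes, and holds
    the previous extract's lines in memory, so it suits files that fit
    in RAM."""
    with open(previous_path) as f:
        previous = set(f.read().splitlines())
    with open(current_path) as f:
        return [line for line in f.read().splitlines() if line not in previous]
```

Rows that were deleted from the source would need the opposite comparison (previous minus current) if you have to propagate deletes.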
It's my experience with (E)DW projects that you rarely get things set up the way you would like, so creativity is usually called for.
It really depends on the data you are reading. For some tables there is a timestamp you can trust to find the rows touched recently. Other tables might have log tables where you can see who changed what and when. And then there are tables where you have no option at all for getting the changes at table level.
Bypassing the SAP level by using database mechanisms like CDC, triggers, or replication logs is an option, but one I would avoid as much as possible: too many issues around upgrades of the SAP ERP version, table types, and security.
What we are trying at the moment is to utilize the delta capabilities built into the ERP system exclusively for BW. So DataServices will look like a BW system to the source ERP and get the delta the same way, using Extractors. As said, we are still trying this, and it is not yet clear how complete that feature will be by the end of the year. The aim is to cover ALL important cases.
Do you have an update on your effort to get the delta the same way as the Extractors? I'm using ABAP Data Flows to load deltas from ECC tables, and my challenge is, as you said, that not all tables have a timestamp to indicate changes. Is there a better way to detect changes if a timestamp is not available?
Here are some of the tables I’m loading:
LIKP, LIPS, LTAK, LTAP, MAKT, MARA, MARM, MKPF, MSEG, VBAK, VBAP, VBPA, VBRK, VBRP, VBUK, VTTK, VTTP, KNA1