Actually, to begin with, my DF is a migrated one, and a huge one at that.
→ I identified the Query Transform that caused this Error
→ Hence, I created a Persistent Cache Table from the output of the Query Transform just before the one that causes the error
→ I then built a test DF with that Persistent Cache Table as source, the problem-causing Query Transform, and a Persistent Cache Table as target
==> As soon as the Job reaches the PROCEED state, I get the Call Stack error
Next I went as far as splitting that Query Transform into 8 different Query Transforms (annoying for me, but I did it anyhow just to make it work).
Now it is getting loaded successfully.
Is something creeping in because of the migration? Any suggestions, please?
To be honest, it sounds like your DF became corrupted during migration, because it is not normal for DI/DS to crash when using more than one lookup_ext function. (Like Werner, I’ve used plenty of these in a single Query transform without any problems.)
I’ve had a similar problem once where an imported DF would give the weirdest of errors for no real reason. After spending 3 days trying to fix it (and, like you, finding myself wrapping more and more tape around the DF to seal the leaks)… I just decided to scrap the DF and re-write it from scratch. In the end, it looked 100% the same and it worked just fine. (I had also designed the imported DF myself, so that was of some help, of course.)
If you have a DF that’s huge and obviously defective (assuming that you only have this problem with THIS particular DF), I would seriously consider re-doing that DF. Personally, I always try to break DFs up into smaller components so I’m not stuck working with those huge monstrosities. So perhaps this could be an opportunity for you to take a fresh look at that DF’s design and come up with something that works better as well?
Oh, that is too sad that it took you 3 long days to find the solution.
Exactly what you say is correct, I guess. The DF is too huge and something is getting messed up during migration.
Even in many other DFs I noticed that lookups are used extensively.
I have already started re-building the DF. I just had a fear that I might miss or change something accidentally during the re-build. Had I designed the DF myself, it would not take much of my time to re-build; after all, I am migrating what someone else has done, so I felt it was risky, that's it.
I will come back with the result after the re-build.
UPDATE:
I re-built the Job as a fresh one and tried again. No use.
The same problem pops up at exactly the same place.
Hmmm, I will go for the workaround of spreading those lookups across a series of Query transforms.
I have had the exact same issue you are facing. I had multiple lookups in one data flow and, after a point, the job would complain it was out of memory. That's probably the same error you are receiving. Eventually, I had to split it into multiple data flows like you did, sending the output to a template table at each step and incrementally building the output.
There are multiple possible reasons for this, I think.
What type of cache are you using in the LKP? If you use the NO_CACHE option for every lkp function call, I am sure you won’t end up with the memory issue, but your job will take forever to finish. If your machine is using memory for other processes, having several LKP calls with a caching option could result in memory issues (see the sketch after this list).
Size of the LKP table. If your LKP tables are huge, again caching all that data might be an issue. One suggestion would be to create smaller LKP tables (filter only what you need from the original table) and try using those.
Create an index on the LKP fields so access is faster.
You could try RUN_AS_A_SEPARATE_PROCESS if you have multiple processors (though when I tried this, the job would fail).
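To make those options concrete, here is a minimal sketch of what a lookup_ext call looks like when the Designer writes it out, assuming a hypothetical datastore DS_LOOKUP, table DIM_CUSTOMER, and columns CUST_SK / CUST_ID (the exact generated text can differ slightly between versions):

    # The 2nd element of the first list is the cache spec:
    # 'NO_CACHE', 'PRE_LOAD_CACHE' or 'DEMAND_LOAD_CACHE'
    lookup_ext([DS_LOOKUP.DBO.DIM_CUSTOMER, 'PRE_LOAD_CACHE', 'MAX'],
               [CUST_SK],
               [NULL],
               [CUST_ID, '=', Query.CUST_ID])
    SET ("run_as_separate_process"='no')

Switching 'PRE_LOAD_CACHE' to 'NO_CACHE' trades memory for run time, and setting "run_as_separate_process"='yes' pushes that lookup into its own al_engine process, which is the option that failed for me when I tried it.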
OK. Why don’t you try eliminating that single LKP with more than 10,000 rows, keeping all the others in the same DF, and test?
Also, to catch the culprit, one suggestion is to debug incrementally: have one LKP and run the DF, then add the second one to the first and run the DF, and so on… This way you can probably spot the one causing the problem.
Do you have access to the Job Server host? While running your job, can you take a look in Task Manager (assuming you’re running a Windows server) to see how much memory the AL_ENGINE process is consuming?
Under Windows, Data Services is still a 32-bit process that cannot address more than 2 GB of memory in a single process. With 11.5.x, the job would simply crash and burn when reaching the 2 GB limit, but with 11.7.x and up, it should page to disk (very slow, but at least it keeps running).
As you are using 12.x, I wouldn't have thought that memory use would be an issue?
When populating my fact tables, I used our dimension tables in lookup functions (to retrieve the correct surrogate keys), and these dimension tables could contain millions of records. I never had this problem before - mind you, I was using the lookup_seq function without caching.
I have found that carriage returns in the function-call areas of lookup_ext cause some problems with parsing at run time. I haven’t tested it enough to put my finger on the exact limitation, but you could try removing them; the code is less readable but seems to run without problems (see the example below).
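For example (placeholder names again, and just a sketch), an expression that the editor has wrapped over several lines inside the condition area can be collapsed onto a single line:

    # Whole condition kept on one line, with no carriage returns inside the call:
    lookup_ext([DS_LOOKUP.DBO.DIM_PRODUCT, 'PRE_LOAD_CACHE', 'MAX'], [PROD_SK], [NULL], [PROD_CODE, '=', rtrim_blanks(upper(Query.PROD_CODE))])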
I did the same thing and rebuilt the DF from scratch, but like any good coder I made things readable, and I replicated the problem as well. I was trying to figure out what was different when I noticed that the failing one had a very complex formula in the function area of one of the joins.