We have a dataflow that sometimes (but not always) gets this error in production during our nightly DW load:
DFC-250014: |Dataflow df_Stage_Stage_MW_OPTOUT_LKUP
Job or data flow <df_Stage_Stage_MW_OPTOUT_LKUP> did not receive registration requests from its children within <30> seconds.
This dataflow has several data_transfer steps, which means the dataflow gets broken up into several sub-data-flows at runtime.
This dataflow runs fine when testing it by itself in the dev/test environment.
Here’s my theory: In production in our nightly job, we run several dataflows in parallel at any given time. I think this problem could be an interaction between the sub-dataflows and parallelism. The parallelism is capped at 4 (only 4 al_engines can be running at any given time) in our environment.
Each dataflow (or sub-data-flow?) gets its own AL_ENGINE.exe process on the server, so Dataflow A might call sub-dataflow B. Sub-dataflow B might try to run but be unable to, because it is waiting for a free AL_ENGINE slot and those are capped at 4. Other dataflows C, D, and E already running in parallel might make it wait for a long time… And then you get this timeout error.
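To make the theory concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch of the scenario described above, not of how the job server actually schedules engines: the FIFO queue, the idea that the parent keeps its slot while waiting, and the run times in the example are all assumptions made up for illustration.

```python
import heapq


def child_wait_seconds(slot_cap, busy_remaining, queued_ahead=()):
    """Estimate how long a child sub-data-flow waits for a free engine slot.

    slot_cap       -- max concurrent engine processes (4 in the post above)
    busy_remaining -- seconds until each running process frees its slot;
                      use float("inf") for a parent that holds its slot
                      while it waits for the child (an assumption)
    queued_ahead   -- run times of requests already queued ahead of the
                      child (FIFO order is an assumption)
    """
    if len(busy_remaining) > slot_cap:
        raise ValueError("more running processes than slots")
    # Unused slots are free right now (time 0).
    free_at = list(busy_remaining) + [0.0] * (slot_cap - len(busy_remaining))
    heapq.heapify(free_at)
    # Each queued request takes the next slot to free up and holds it.
    for run_time in queued_ahead:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + run_time)
    # The child starts when the next slot frees up.
    return heapq.heappop(free_at)


# Parent A holds one slot; C, D and E (hypothetical run times) hold the rest.
wait = child_wait_seconds(4, [float("inf"), 45.0, 120.0, 200.0])
print(f"child waits {wait:.0f}s, exceeds 30s timeout: {wait > 30}")
```

If every slot is held by a parent waiting on its own child, the function returns infinity, which is the deadlock-like case where no timeout value would help.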
Unless DI is specifically designed to give priority in the processing queue to sub-data flows belonging to a parent data flow that is already running…
I don’t think this problem is specific to data_transfer - it can occur when using other features that create sub-data flows, and also when distribution level is set to ‘dataflow’. Our job servers are configured to allow 16 concurrent job engine processes and, running a job standalone in the dev. environment, there is no way I am running that number of processes.
There’s a DSConfig option to increase the time the parent will wait for the child to register. We’ve upped it from 30 to 300 seconds; maybe that will make a difference.
Curious, did this help? We get that error very infrequently but have always been able to handle it in the past by running that job at a time of lower server volume.
Hi: We are having the same problem you had with the data transfers. Judging by the posting date, a few months have passed since you changed your DSConfig file, so did changing the timeout to 300 seconds help? Did it affect anything else? We are ready to change ours to 100 seconds but want to make sure it is not going to affect anything else in DI.
Very old thread, but for those that are still seeing timeout issues…
Here is the error I see in the log:
It isn’t clear, but I don’t think this is actually a database issue even though the first line indicates that. The second line indicates there is a child process that it has lost communication with. I think this is the issue, but I could be wrong since the Dataflow normally executes in less than 30 seconds. The child process issue is (in my experience) often seen when you have two independent processes in a Dataflow (two Query transforms feeding into a common third Query transform). The Dataflow is waiting on one of them and somehow it goes off into La-La land.
I could find nothing in errorlog.txt that correlates to this job and there are no dump files (DSConfig.txt has the setting TURN_DUMP_ON=BOTH so dump files should get created I believe).
The property that is referenced earlier in this thread is DFRegistrationTimeoutInSeconds. Mine is currently set to 300 and I’m still seeing timeout issues. For the Dataflow in the error 300 should be more than enough but I’m going to increase it to 600 just for giggles.
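For anyone who wants to check what their server is currently set to before changing it, a small Python sketch along these lines can read the value. The property name comes from this thread; the [AL_Engine] section name, the sample file contents, and the helper name are assumptions on my part, so verify them against your own DSConfig.txt.

```python
import configparser

# Hypothetical excerpt of a DSConfig.txt; the section name is an assumption.
SAMPLE = """\
[AL_Engine]
DFRegistrationTimeoutInSeconds=300
"""


def read_registration_timeout(config_text, section="AL_Engine", default=30):
    """Return the registration timeout in seconds, or `default` if unset.

    The default of 30 matches the value shown in the DFC-250014 message.
    """
    parser = configparser.ConfigParser(
        strict=False,          # tolerate duplicate sections or keys
        interpolation=None,    # treat '%' in values literally
        allow_no_value=True,   # tolerate bare keys without '='
    )
    parser.read_string(config_text)
    return parser.getint(
        section, "DFRegistrationTimeoutInSeconds", fallback=default
    )


print(read_registration_timeout(SAMPLE))
```

To run it against a real file, read the file's text and pass it in place of SAMPLE.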
This is on DS 12.1.1.4 using an Oracle Source/Target with a SQL Server repository. I can stay logged in to my SQL Server database for days without executing a query and then execute one and get no timeout error. So I don’t think the SQL Server database is the issue.