I am getting an Access Violation error, and sometimes a Virtual Memory error too.
For the Access Violation error, I went through various pages, and the advice was either to avoid using the Windows cache or to change the DF.
I have no rights to change the DF, so the only option left is to deal with the memory side.
I am unclear about how to switch off this Windows cache memory usage.
Please clarify this.
In my production system, lines like the ones below do not appear. Does that mean it does not use the memory cache?
(12.0) 06-29-09 13:59:33 (28320:20352) DATAFLOW: Cache statistics determined that data flow <DF_CHECK> uses <0> caches with a total size of <0> bytes. This is less than (or equal to) the virtual memory <1608515584> bytes available for caches. Statistics is switching the cache type to IN MEMORY.
(12.0) 06-29-09 13:59:33 (28320:20352) DATAFLOW: Data flow <DF_CHECK> using IN MEMORY Cache.
It goes straight to the "DF starts" and then "DF completed successfully" messages only.
In short, you have the job property "use collected statistics for optimization" turned on.
DI has two caching methods: in-memory and pageable cache.
With in-memory, DI allocates as much OS memory as it wants. This method is simple, but if the amount of data is large, the system starts to swap or simply runs out of memory.
With pageable cache turned on, we use the Oracle Berkeley database, an in-memory database that can overflow data to disk. You set the maximum amount of memory you allow it to use, and if more is required, the data overflows to disk.
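To make the overflow idea concrete, here is a minimal sketch in Python of a cache that spills to disk once a memory budget is exceeded. It only illustrates the concept; it is not the Berkeley DB engine DS actually uses, and the class name and budget are invented:

```python
# A minimal sketch of the "pageable cache" idea: keep rows in memory up to a
# configured budget, then spill further rows to a file on disk. This is only
# an illustration; DS actually uses the Oracle Berkeley DB for this.
import os
import sys
import shelve
import tempfile

class PageableCache:
    def __init__(self, max_in_memory_bytes):
        self.max_bytes = max_in_memory_bytes
        self.used_bytes = 0
        self.memory = {}                                  # fast, in-memory part
        path = os.path.join(tempfile.gettempdir(), "pageable_cache_demo")
        self.disk = shelve.open(path)                     # disk-backed overflow

    def put(self, key, row):
        size = sys.getsizeof(row)
        if self.used_bytes + size <= self.max_bytes:
            self.memory[key] = row                        # still within budget
            self.used_bytes += size
        else:
            self.disk[key] = row                          # overflow to disk

    def get(self, key):
        if key in self.memory:
            return self.memory[key]
        return self.disk.get(key)

# With a tiny 1 KB budget, most of the 1000 rows end up on disk instead of
# inflating the process memory.
cache = PageableCache(max_in_memory_bytes=1024)
for i in range(1000):
    cache.put(str(i), ("row", i))
print(cache.get("999"))
cache.disk.close()
```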
The big question now is which method to use; in the worst case the same dataflow might require different settings, as one run deals with large amounts of data and another does not. So we implemented a feature to collect memory statistics in the repository during execution, and the next time the dataflow is executed the engine can automatically decide which method to pick.
Your message now says that memory statistics are available - you have executed the job with the execution property "collect statistics for optimization" - and this dataflow requires 0 bytes of memory. Secondly, you execute the job with the property "use collected statistics for optimization" turned on, and therefore the dataflow is automatically set to the in-memory cache method - it is very unlikely that a dataflow with nothing to cache runs out of cache memory.
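For what it's worth, the decision the trace message describes can be read as a simple comparison. The following is only a reconstruction from the log text, not the actual optimizer code:

```python
# A rough sketch (assumed logic, reconstructed from the trace message above,
# not the actual DS optimizer code) of the cache-type decision: if the cache
# sizes collected on a previous run fit into the virtual memory available for
# caches, the engine switches the dataflow to the in-memory method.
def choose_cache_type(collected_cache_bytes, virtual_memory_for_caches):
    if collected_cache_bytes <= virtual_memory_for_caches:
        return "IN MEMORY"   # everything is expected to fit in RAM
    return "PAGEABLE"        # expected to be too big, keep the disk-backed cache

# Numbers from the trace log: 0 bytes of collected cache statistics vs
# 1,608,515,584 bytes available, hence the switch to IN MEMORY.
print(choose_cache_type(0, 1_608_515_584))   # -> IN MEMORY
```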
But I would like to know if this is newly implemented in Data Services, because I can find none of those checkboxes in DI version 11.0.1. Correct me if I am wrong.
If that is the case, how did DI 11.0.1 deal with this?
I tried this out in DEV 12.0. The message I attached is from 12.0!!!
Thanks Werner.
In spite of this improved memory usage method, my job does not give an error in 11.0.1 (production) but does give an error in 12.0.
Does this mean the Job Server machine is running out of memory and I need to increase the virtual memory? Because sometimes I am getting the Virtual Memory error as well.
I would rather check the dataflow for improvements. Caching large volumes of data is not good performance-wise anyhow. So post a screenshot of your dataflow and describe it a bit; then we can suggest optimizations.
To be frank, I have no rights to change the design!!! That's the sad part for me.
Anyhow, I will put the same question to you.
If my production server can run the same job successfully on DI version 11.0.1 (where there is no concept of in-memory cache or pageable cache),
why is my DS 12.0 unable to run it (in spite of the improved performance of the application)?
I am asking this because, if the problem is purely the server's memory, then my first action should be to ask for a memory increase and not to look into optimizing the DF (they will say it's not my job).
Assuming the amount of data is the same in both environments, most likely the execution plan chosen by the DI optimizer is different due to the different versions and their capabilities.
It’s unlikely adding memory to the server will fix the problem… it’s probably that the dataflow is set to memory-only (and not pageable) and you’re exceeding the 2 GB limit for a 32-bit process.
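As a rough sanity check on that limit (just back-of-the-envelope arithmetic, assuming a 2 GiB address space for the 32-bit al_engine process; the variable names are mine):

```python
# 32-bit process address space vs the cache memory reported in the trace log.
process_limit = 2 * 1024**3            # 2 GiB ceiling for a 32-bit process
available_for_caches = 1_608_515_584   # bytes, taken from the trace log above

print(f"{available_for_caches / 1024**3:.2f} GiB of {process_limit / 1024**3:.0f} GiB")
# -> about 1.50 GiB of 2 GiB; a dataflow forced to the in-memory cache that
#    needs more than this hits the address-space ceiling and can crash.
```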
When the job is run, be sure “use collected statistics” is un-checked.
If you don’t have the power to change the job… what is your role here? Why are you debugging it? Can you find someone who does own the job/dataflow, and convince them they need to update the design?
The message to your employer should be: it’s unrealistic to expect that a huge jump from 11.0 to 12.x would involve no design changes.
Haha, I understand very well, but what can I do in this case? That is my access level here.
Anyhow, I can suggest these solutions to them.
In the meantime, I wanted to highlight that the Access Violation error pops up for the DF that is executed using the pageable cache, not in-memory!!!
So I thought of changing the DFs to use in-memory directly and running them.
Cache statistics determined that data flow <DF_CHECK> uses <0> caches with a total size of <0> bytes. This is less than (or equal to) the virtual memory <1608515584> bytes available for caches. Statistics is switching the cache type to IN MEMORY.
This means that even though the dataflow might be set to pageable, DI thinks, incorrectly, that the dataflow uses so little memory that it can switch and force an in-memory-only process. So you need to run the dataflow/job with “collect statistics” ON, but “use collected statistics” OFF. That will get fresh stats for DI to use in future runs. Or always just run it with “use collected stats” OFF and never worry about it again.
Hey, that's fine. When I began this thread, I was unaware of this concept!!!
I tried all of these, but no luck.
But I did learn that with the pageable cache, AL_ENGINE eats much less memory than with in-memory. That's great.
To give an outline of my DF:
There are 2 source tables coming from 2 different Oracle DBs!!
Before the join with the 2nd table, the 1st table goes through a Query transform. In that Query transform, they do lookups against a table that lies in another DB (a 3rd DB), and for all these lookups they use the cache spec PRE_LOAD_CACHE. When they join this Query transform's output with the 2nd table, the lookup there uses the cache spec NO_CACHE.
Can you help me with these inputs?
To be frank, I don’t know what goes on here functionally.
With that number of rows it does not matter; that’s processed within seconds. The problem has to be somewhere else: table 3, the cached lookups, the row-by-row lookup performance.
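To show what those two cache specs mean in practice, here is a simplified sketch (not the actual DS lookup implementation; the table, column, and function names are invented):

```python
# PRE_LOAD_CACHE vs NO_CACHE, illustrated against a small in-memory SQLite
# table standing in for the translate table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE translate (code TEXT PRIMARY KEY, descr TEXT)")
conn.executemany("INSERT INTO translate VALUES (?, ?)",
                 [(str(i), f"desc-{i}") for i in range(1000)])

# PRE_LOAD_CACHE: read the whole lookup table once, then every dataflow row
# resolves against the in-memory copy. Fast per row, but the entire table
# (1M+ rows for the view in question) must fit in the engine's memory.
preloaded = dict(conn.execute("SELECT code, descr FROM translate"))
def lookup_preload(code):
    return preloaded.get(code)

# NO_CACHE: every incoming row issues its own query against the source
# database. No memory cost, but a 1M-row view behind a slow query turns this
# row-by-row round trip into the bottleneck.
def lookup_no_cache(code):
    row = conn.execute("SELECT descr FROM translate WHERE code = ?", (code,)).fetchone()
    return row[0] if row else None

print(lookup_preload("42"), lookup_no_cache("42"))
```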
As you said, there are many lookup calls, and lookups are done against around 3 or 4 DS connections.
But there is one lookup where I have a doubt.
The translate table involved has 1 million+ records.
Moreover, it is not a table but a VIEW. When I tried "View Data" on it from DS, it takes approximately 3 to 5 minutes to load. It sounds like the problem is arising at this point.
So I changed it from NO_CACHE to PRE_LOAD_CACHE here. Still no use.
Anyway, I am helpless at this juncture. Please pour in your ideas.
I will not be able to post the DF here; it is way too huge.
Anyhow, please help me understand 2 things:
→ For DS 12.0, you mentioned the Oracle Berkeley database. When it is used, which memory does it take: the system memory of the machine the Job Server runs on, or something else?
→ For a 32-bit application, I understand that a maximum of 2 GB of memory can be used. But if the system's virtual memory is being used by other applications, how can we say 2 GB is available to AL_ENGINE? It won't be, right?