Explanation of job server memory settings

Hi,
I’m a long-time DBA who has now picked up BODS support as part of my role. I’m trying to understand how the memory parameters:

PAGEABLE_CACHE_BUFFER_POOL_SIZE_IN_MB=0
MAX_64BIT_PROCESS_VM_IN_MB=4096

relate to the memory size/limit of the job server process. As far as I can make out, the cache buffer size is set to some default value since it’s 0 (can I find out what this actually is - is it some auto-tuned value?).

The second max parameter would seem to limit the memory to 4GB for a single process.

I’m not sure why there are 2 parameters - they would seem to do a similar thing?

Also - is the pageable cache size a ‘shared’ limit between all the processes or is that per process also?

And one final question:

DatabaseMaxArraySize=5000

That limits the number of rows fetched in one pass. Can this safely be increased above this value? In our case BODS and the DB run on the same server, so network latency isn’t an issue - we just want the data processed as quickly as possible.

Thanks,
Rich


richardharrison (BOB member since 2013-08-19)

Lots of questions there and I can’t answer all of them.

Pageable cache is NOT a shared resource; it is allocated per Dataflow. Run too many Jobs/Dataflows at once and you’ll run out of memory.

DatabaseMaxArraySize can be increased, but it isn’t always a good thing to do. You can exhaust all the memory by using too high a value - at least you can on a 32-bit O/S; it’s harder to do on 64-bit. DS takes a pessimistic view on buffers: a 4K varchar column takes up 4K whether the actual data value is 1 byte or 4K bytes.
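
To put rough numbers on that pessimistic sizing (the row shape below is made up, not from any DS documentation), the worst-case fetch buffer is simply the array size multiplied by the declared row width:

# Back-of-the-envelope sketch - the column widths are hypothetical examples.
array_size = 5000                        # DatabaseMaxArraySize
declared_col_bytes = [4096, 200, 8, 8]   # e.g. a varchar(4096) plus a few small columns
worst_case_bytes = array_size * sum(declared_col_bytes)
print(f"~{worst_case_bytes / (1024 * 1024):.0f} MB per fetch buffer")  # roughly 21 MB for this row shape

Bump the array size to 50,000 and that same row shape needs roughly ten times the memory, which is how a seemingly harmless change can exhaust a 32-bit process.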

Use the setting that gives you the best performance yet still allows the job to “play well with others”.

And finally, be careful tweaking the DSConfig.txt settings. I rarely have to change any of the settings. There’s more bang for the buck in writing good ETL than there is in tweaking DSConfig.txt.


eganjp :us: (BOB member since 2007-09-12)

Hi Jim,
Thanks for the reply. The developers are (I hope) trying to push most things down to the database anyway, but in the cases where the job server has to do the processing - because of the functions used or the complexity of the transform - we’re keen to have as much of that as possible happen in memory.

We have a huge server (0.5TB of RAM), so we want BODI to make the best use of that.

From what you said, pageable cache is not a system-wide setting but a single-process setting - which brings me back to the original question: what is the difference between these two parameters?

PAGEABLE_CACHE_BUFFER_POOL_SIZE_IN_MB=0
MAX_64BIT_PROCESS_VM_IN_MB=4096

They seem to be doing the same thing?

But essentially what you are saying is ‘suck it and see’ - if you increase the memory and things all start crashing then it’s too big. It’s basically trial and error?

Thanks,
Rich


richardharrison (BOB member since 2013-08-19)

I think this means use the default:
PAGEABLE_CACHE_BUFFER_POOL_SIZE_IN_MB=0

You can specify a value, and the Dataflow will only use that much physical memory for pageable cache; everything beyond that goes to disk. So if you set this to 2GB, the Dataflow will gobble up 2GB of RAM and then, if it needs more, it will go to disk.
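
As a toy illustration of that cap-then-spill behaviour (plain Python, nothing to do with how the DS engine is actually implemented - the 2048 figure is just the 2GB example above expressed in MB):

import pickle, tempfile

class PageableCacheSketch:
    """Keep rows in RAM up to a byte cap, then page the overflow out to disk."""
    def __init__(self, cap_bytes):
        self.cap_bytes = cap_bytes
        self.used = 0
        self.in_memory = []                      # rows held in RAM, up to the cap
        self.spill = tempfile.TemporaryFile()    # overflow lives on disk

    def add(self, row):
        size = len(pickle.dumps(row))
        if self.used + size <= self.cap_bytes:
            self.in_memory.append(row)
            self.used += size
        else:
            pickle.dump(row, self.spill)         # past the cap: write to disk instead

# e.g. a 2GB cap, the same figure as PAGEABLE_CACHE_BUFFER_POOL_SIZE_IN_MB=2048
cache = PageableCacheSketch(cap_bytes=2048 * 1024 * 1024)
cache.add(("CUST1", "Smith", 100.0))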

0.5TB of RAM, eh? I’m thinking how I would change my approach to coding with that much memory… Nope, I wouldn’t change much. A Dataflow will only address about 8GB of RAM on a 64-bit O/S. So while that’s about six times what a 32-bit O/S will address, it still won’t make a dent in 0.5TB of RAM.


eganjp :us: (BOB member since 2007-09-12)

The one change I always make is to switch all the DUMP file settings to FALSE (or OFF). This saves me from finding that my disks have filled up with dumps - they are usually around 4GB each and I’ve never found a use for them.
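
If you inherit a job server and aren’t sure whether this has already bitten you, a quick scan for oversized dump files is an easy check. Note that the directory and the .dmp extension below are assumptions for a typical Windows install - point it at wherever your job server actually writes its logs:

import os

log_dir = r"C:\Program Files (x86)\SAP BusinessObjects\Data Services\log"   # assumed install path - adjust for your environment
threshold = 1 * 1024 ** 3                                                   # flag anything over 1GB

for root, _dirs, files in os.walk(log_dir):
    for name in files:
        if not name.lower().endswith(".dmp"):      # assumed dump file extension
            continue
        path = os.path.join(root, name)
        size = os.path.getsize(path)
        if size > threshold:
            print(f"{size / 1024 ** 3:5.1f} GB  {path}")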


dastocks (BOB member since 2006-12-11)

The one thing I would do a bit differently with that kind of RAM is more preload cache lookups with less looping. Obviously it depends on the particular job/situation; that much RAM won’t help you much in a typical copy with minor transforms, but it will be a boon if you end up having to do target-based CDC. You may find that preload cache lookups with custom SQL perform better than a database pushdown if the lookups are on unindexed or poorly indexed fields in the database, especially when you apply simple business knowledge to them.
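
A toy sketch of the idea (plain Python with an in-memory SQLite table standing in for the lookup table - this isn’t DS’s lookup_ext, just an illustration of why one preload beats a round trip per row):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_customer (natural_key TEXT, surrogate_id INTEGER)")
conn.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                 [(f"CUST{i}", i) for i in range(100_000)])

# Preload the lookup once, trimmed by a simple business rule (the custom SQL);
# after this every incoming row is a dictionary hit, not a query against an
# unindexed column.
preload = dict(conn.execute(
    "SELECT natural_key, surrogate_id FROM dim_customer WHERE surrogate_id >= 1000"))

for key in ["CUST1500", "CUST99999", "CUST_UNKNOWN"]:
    print(key, preload.get(key))   # in-memory lookup, no per-row round trip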

  • E

eepjr24 :us: (BOB member since 2005-09-16)