BusinessObjects Board

AL_HISTORY, AL_HISTORY_INFO and AL_STATISTICS table cleanup

Greetings!

Disclaimer: I am a Database Administrator so please forgive me if I use the wrong terminology.
We have been searching this forum and the web for the last couple of days trying to find a recent answer;
I apologize if I am asking a question that has already been asked…

About six months ago we upgraded from Data Integrator (32bit) to Data Services 4 (64bit).
Migrated might be a better term because we did a clean parallel installation on new
hardware and on a new database server.

We have eight looping workflows that process incoming transactions all day,
one for each subject area in our marts. In the past these jobs were stopped
each night for about two hours and then restarted. A little over a month ago we
switched those jobs to restarting only once a week.

We noticed that our Data Services schema in Oracle was extending quite often,
especially in the last month, during which it grew by over 50GB.
Analyzing the tables in the schema revealed that we had added 250 million rows
to the AL_STATISTICS table in 30 days. We had our retention set to 90 days
at the time. When we found this, AL_STATISTICS contained 614 million rows;
6 million of those rows were older than 90 days, and there were orphaned records
with no corresponding records in the AL_HISTORY table.
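
For anyone who runs into the same thing, the growth was easy to spot from the data dictionary; something like this (with DSREPO standing in for the repository schema owner) shows the biggest offenders:

-- Ten largest segments in the repository schema; DSREPO is a placeholder.
SELECT *
FROM  (SELECT segment_name,
              segment_type,
              ROUND(bytes / 1024 / 1024 / 1024, 2) AS size_gb
       FROM   dba_segments
       WHERE  owner = 'DSREPO'
       ORDER  BY bytes DESC)
WHERE  ROWNUM <= 10;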

Last night we took Data Services offline and removed all but the last 30 days’
worth of valid records in those two tables. We also set the retention down to
30 days in the GUI.

A check this morning revealed that we are adding over 10k rows to the AL_STATISTICS
table every few seconds.

We are trying to get an idea of what may have changed in our migration from
Data Integrator to Data Services… we suspect that we must have something
configured differently on the server as far as logging levels go, since most of
the actual workflows have not changed significantly between versions.

We are trying to find out:

  1. When does Data Services perform its maintenance and cleanup process?
  2. If it is at the end of the job and the job fails because of a hard error,
    does the maintenance and cleanup still happen?
  3. What is the information in AL_STATISTICS used for and how important is it?
    We see references to others truncating the log tables, but this seems extreme,
    and it would be hard to do with multiple workflows writing to the table
    at the same time…
  4. Would you have any tips for determining WHAT is causing so many entries in the
    AL_STATISTICS so we can dial the logging back a little?

Obviously all of those inserts are creating a lot of redo and archive logs, as does the
act of deleting those records unless we take extra steps. That in turn means that our
backup sets are substantially larger than they need to be, so we would really like
to figure out a solution.

Thank you

Jim Scheitel
Database Administrator
Knife River Corporation


jscheitel (BOB member since 2012-06-19)

I’ll give this my best shot, but some of this is likely internal SAP knowledge.

  1. When does Data Services perform its maintenance and cleanup process?

I believe there is a (hidden) job that does this. I’m not sure what the frequency is, but it is definitely more often than once a day.

  2. If it is at the end of the job and the job fails because of a hard error,
    does the maintenance and cleanup still happen?

The cleanup is NOT part of each production job execution. See below for information about orphaned jobs.

  3. What is the information in AL_STATISTICS used for and how important is it? We see references to others truncating the log tables, but this seems extreme, and it would be hard to do with multiple workflows writing to the table at the same time…

The AL_STATISTICS table is of little use on a day-to-day basis. As a consultant, I make extensive use of it when I first go on site to get a quick look at how bad (or good) things are. When properly queried (it isn’t easy), I can find a lot of actionable items.
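
As a starting point, something like this shows which executions are producing the most statistics rows. The join uses the OBJECT_KEY relationship between AL_HISTORY and AL_STATISTICS; the SERVICE and START_TIME column names are assumptions on my part, so verify them against your repository first:

-- Statistics rows per job execution, worst offenders first. SERVICE and
-- START_TIME are assumed column names; check your repository schema.
SELECT h.object_key,
       h.service AS job_name,
       h.start_time,
       COUNT(*)  AS stat_rows
FROM   al_history h
JOIN   al_statistics s ON s.object_key = h.object_key
GROUP  BY h.object_key, h.service, h.start_time
ORDER  BY stat_rows DESC;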

In my opinion, you are free to truncate the table on a periodic basis. This is one of the very few reasons to use SQL Server for repositories: I can set the recovery model in SQL Server so that the logs don’t have to be maintained. Yes, that’s bad for disaster recovery, but it’s not like you’re adding code to a production repository every 5 minutes.
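
On SQL Server that is a one-liner; DS_REPO below is a placeholder for the repository database name:

-- Trade point-in-time recovery for a transaction log that truncates
-- itself at checkpoints. DS_REPO is a placeholder database name.
ALTER DATABASE DS_REPO SET RECOVERY SIMPLE;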

  4. Would you have any tips for determining WHAT is causing so many entries in the AL_STATISTICS so we can dial the logging back a little?

I’m not aware of any way to stop or reduce the logging. The orphaned statistics are a problem. They are orphaned because the job suffered a catastrophic failure and abnormally ended (it died a horrible death). The status of the job often still shows as “Running”, so the cleanup process leaves it alone, and the job has the potential to live forever in the history/statistics tables.
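
If you want to hunt for those immortal executions, a probe like this is one place to start. The END_TIME and STATUS columns are assumptions on my part, so describe the table and look at the distinct STATUS values before trusting it:

-- Executions that started over a day ago but were never closed out.
-- END_TIME, START_TIME and STATUS are assumed column names.
SELECT object_key, service, start_time, status
FROM   al_history
WHERE  end_time IS NULL
AND    start_time < SYSDATE - 1;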


eganjp :us: (BOB member since 2007-09-12)

Thank you Jim, those are really valuable insights. We are following up with support, and in the meantime we have been discussing strategies for a coordinated truncate of that table.

Best Regards
James


jscheitel (BOB member since 2012-06-19)

The data in AL_STATISTICS is for Dataflow execution only. It doesn’t have anything in there but Dataflow “stuff”. If you are seeing problems in 4.0 that weren’t there in 3.x, then I would begin to think that maybe Data Services is tracking additional measures. I wouldn’t expect the change to your restart frequency to have made a drastic difference. It’s hard to say for sure since I’m troubleshooting this from a couple of states away.

I (sort of) hate to admit this to a DBA, but there could be a design flaw in the ETL jobs that are looping. If a Dataflow is part of the loop and it executes whether or not there is data to work on, then statistics are (most likely) still gathered for each execution. I would look into putting the Dataflow into a Conditional and only running it when there is actually something to work on.
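
The check itself is cheap. In a DS script feeding the Conditional you might call the sql() function with a query like this, where STG_INCOMING is a placeholder for whatever table the loop polls:

-- Returns 0 or 1; ROWNUM = 1 stops the scan at the first matching row.
-- STG_INCOMING is a placeholder for the table the looping job polls.
SELECT COUNT(*)
FROM   stg_incoming
WHERE  ROWNUM = 1;

When the count comes back zero, the Conditional skips the Dataflow and the idle iteration never writes a statistics row.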

I’m a DBA when I have to be, but most of my clients want me for my ETL. Here are some random thoughts I had:

  1. As far as DDL goes, there isn’t much you can do. Rows are most likely added to AL_STATISTICS with regular INSERT statements, so changing the table to NOLOGGING will have no benefit.
  2. You could drop the indexes on AL_STATISTICS, and that would reduce the impact on the archive log. As near as I can tell, the Job Server only inserts into that table (during job execution), so the indexes are only needed when you query it ad hoc or when the cleanup process runs. Making changes to a vendor’s DDL is usually a bad idea, but I have tweaked the repository indexes in the past. AL_STAT_INDX2 is a good candidate; AL_STAT_INDX1 not so much. (See the sketches after this list.)
  3. If you are managing the size of the AL_STATISTICS table with frequent truncates, then the indexes on the table won’t matter as much. If the cleanup job finds very few rows, a full table scan isn’t that costly.
  4. Setting AL_STATISTICS to COMPRESS won’t do you any good unless you’re using advanced compression; standard compression only works with direct-path loads.
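
Hypothetical Oracle sketches of options 2 and 4, to be tried on a copy of the repository first (and keep the original CREATE INDEX DDL so the index can be restored):

-- Option 2: drop the less useful index (name as delivered by SAP).
DROP INDEX al_stat_indx2;

-- Option 4: OLTP compression works for conventional INSERTs, unlike basic
-- compression, which only helps direct-path loads. Needs the Advanced
-- Compression option on 11g.
ALTER TABLE al_statistics COMPRESS FOR OLTP;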

eganjp :us: (BOB member since 2007-09-12)

Hi all, I was also looking into this table for some other reasons.

1526245 - Which tables in the local repository store job execution statistics? - Data Services / Data Integrator

Symptom
Where are job statistics stored?
Which tables keep a history of job execution data?
Can AL_HISTORY, AL_HISTORY_INFO, or AL_STATISTICS be safely truncated?

Environment
Data Services
Data Integrator
Local Repository

Resolution
Warning: Always back up the local repository before truncating a table.

There are three tables in which job execution statistics are stored:

AL_HISTORY
AL_HISTORY_INFO
AL_STATISTICS
AL_HISTORY maintains the execution statistics of a job at the job level along with Job Server information.

AL_HISTORY_INFO maintains log file information.

AL_STATISTICS maintains the execution statistics of a job at the dataflow and transform level.

All three tables are related via primary and foreign keys, even though these constraints are not defined in the database schema: OBJECT_KEY in AL_HISTORY is the primary key, and the OBJECT_KEY columns in AL_HISTORY_INFO and AL_STATISTICS are foreign keys referencing it.

There are times when users wish to truncate one or more of these tables (the table is very large, very old data is still in it, etc.). Truncation should only be done after creating a backup of the local repository and examining the data to ensure that only irrelevant data will be deleted.
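
Building on the OBJECT_KEY relationship above, a hypothetical pre-truncation check could count the rows that are actually candidates for removal (END_TIME in AL_HISTORY is an assumed column name):

-- Orphaned statistics rows: no parent row left in AL_HISTORY.
SELECT COUNT(*) AS orphan_rows
FROM   al_statistics s
WHERE  NOT EXISTS (SELECT 1
                   FROM   al_history h
                   WHERE  h.object_key = s.object_key);

-- Statistics rows tied to executions older than 90 days.
SELECT COUNT(*) AS old_rows
FROM   al_statistics s
JOIN   al_history h ON h.object_key = s.object_key
WHERE  h.end_time < SYSDATE - 90;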

1577251 - Running Data Quality job that exports reports is timing out - Data Services

Symptom

Timeout Error
Slow report creation
Export reports during job execution
Error: RUN-248009: Call to web service <Export_Report> has timed out. Currently service client call time out is set to 300000. You might need to increase this time out appropriately.

Environment
Data Services

Resolution
If a timeout error is encountered, or it is noticed that reports are running appreciably slower, the DBA should update the database statistics on the repository database, primarily on the AL_STATISTICS, ADDRINFOCODEDATA, and ADDRSTATUSCODEDATA tables. In Oracle, this is called “gathering statistics.”

If an Oracle database is used for the local repository, here are some additional options that the DBA can modify to try to correct the issue (other databases may have similar options):

• Increase the Processes value to 400 or greater.
• Increase the PGA Memory to 194MB or greater.
• Set the Cursor_Sharing value to Force.

I do not know what those Oracle settings do… I am not a real DBA, but maybe it is useful?
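
For the real DBAs reading along: I believe the note’s suggestions translate to something like this in SQL*Plus, but please verify, since this is outside my comfort zone (DSREPO is a placeholder for the repository schema owner):

-- Gather optimizer statistics on the table the note calls out.
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => 'DSREPO', tabname => 'AL_STATISTICS');

-- PROCESSES is a static parameter: change the spfile, then restart.
ALTER SYSTEM SET processes = 400 SCOPE = SPFILE;

-- These two can be changed on the fly (200M covers the note's 194MB).
ALTER SYSTEM SET pga_aggregate_target = 200M;
ALTER SYSTEM SET cursor_sharing = FORCE;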

We are going to keep 14 days of history in AL_STATISTICS; anything older than that is deleted.


Johannes Vink :netherlands: (BOB member since 2012-03-20)

The AL_STATISTICS table can grow quickly if you are running Dataflows in a loop or running jobs frequently.

If none of your reports use these run-time statistics, you can disable statistics collection by changing the following DSConfig.txt entries.

Set the following to FALSE to completely disable statistics collection (by default it is TRUE):
Enable_Statistics=TRUE

Set the following to TRUE (leaving Enable_Statistics at TRUE) if you want to collect statistics at the Dataflow level but not at the transform level:
Disable_Transform_Statistics=FALSE


manoj_d (BOB member since 2009-01-02)

Can you provide more details on what statistics will be gathered with the following setting?

Enable_Statistics=TRUE
Disable_Transform_Statistics=TRUE

Is it as simple as this: anything between a source and a target object won’t have statistics, but the source and target themselves will?

If Enable_Statistics=FALSE, does that mean that the system function table_attribute() won’t produce the correct answer when using a parameter such as ‘Number_Of_Inserts’?


eganjp :us: (BOB member since 2007-09-12)

Hi Jim Scheitel,

Were you able to resolve the issue? We are running into a similar issue where the AL_STATISTICS table is growing; we are adding 10-15 million records per day to this table, and we are not sure how to project its size or find an option to avoid logging each process step.

Can you please post the solution?

Thanks


DI.net (BOB member since 2007-07-26)

Have you tried turning off the audit option when running jobs? I don’t know for sure if that will work, but I suspect that it might.


eganjp :us: (BOB member since 2007-09-12)

Here are two things you can do:

  1. In the job properties, set the Monitor sample rate to the maximum instead of the default of 1000. 64000 definitely works; you may be able to go higher on later versions of DS. This considerably reduces the rate at which AL_STATISTICS grows. This should be an installation standard applied to all jobs before they move to the production environment.

  2. In a previous life, where we had CDC jobs running in a continuous loop, we had to truncate AL_STATISTICS in a script at the top of the loop; otherwise the job slowly shuddered to a halt as the table filled up. If you are doing this in a production job, you should wrap the script in a try-catch, because the TRUNCATE might occasionally fail if another job is writing to the table at the same time and has it locked. You could also write a stand-alone production housekeeping job to do this, or do it via a scheduled job outside Data Services (see the sketch below).
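
For the outside-Data-Services route, a hypothetical Oracle scheduler job could do the housekeeping. END_TIME in AL_HISTORY is an assumed column name and 14 days is just an example retention; run it as the repository owner:

-- Nightly purge of statistics rows for executions older than 14 days.
-- END_TIME is an assumed AL_HISTORY column name; verify before use.
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'PURGE_AL_STATISTICS',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN
                          DELETE FROM al_statistics
                          WHERE object_key IN (SELECT object_key
                                               FROM   al_history
                                               WHERE  end_time < SYSDATE - 14);
                          COMMIT;
                        END;',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY;BYHOUR=2',
    enabled         => TRUE);
END;
/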


dastocks (BOB member since 2006-12-11)