BusinessObjects Board

Queries for performance monitoring - database execution time

Hello,

As part of the SLA that we provide to our client we’ve set up a few queries on WEBI to monitor performance. We use a Perl script that logs in as a test user to WEBI, reads an enterprise document, refreshes it, and then logs out. Of course I can get the end-to-end execution time of each phase. What I do not know is how much time is spent on the database tier. Do you think that there is a way to measure this?

I do have auditing activated. Until recently I used log files, but last week, at the client’s request, I started logging to the database instead. There is some information available in the audit log, but the fact that the audit tables are updated rather erratically makes things a bit more difficult. Most of the events in the log (OBJ_A_EVENT_LOG) have `-1` as the duration. The events that do show some information are 18, 19, and 25:

  • Query 1:
    • 18 = 6945 (Compute Result)
    • 19 = 6951 (Refresh Document)
  • Query 2:
    • 18 = 46746 (Compute Result)
    • 25 = 46752 (Refresh Document After a Prompt)
  • Query 3:
    • 18 = 48619 (Compute Result)
    • 25 = 48659 (Refresh Document After a Prompt)

I assume that the second duration includes the first one, given how close they are. Can I assume that the time spent on the database is approximately

refresh time measured by my script - duration of event 19/25

or are there other hidden costs? (The size of the document does affect the result, since the resulting HTML needs to be retrieved, but the effect is minimal given that I run my script on the host where the HTTP server is located.)
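For reference, I pull those durations with something like the query below (I’m guessing at the column names, such as EVENT_TYPE_ID and DURATION, so treat this as a sketch and adjust it to whatever your audit schema actually uses):

```sql
-- Sketch: pull the durations of the refresh-related audit events.
-- Column names are assumptions; only the table name OBJ_A_EVENT_LOG
-- and the event type IDs come from our setup.
SELECT event_type_id,
       start_timestamp,
       duration
FROM   obj_a_event_log
WHERE  event_type_id IN (18, 19, 25)  -- Compute Result / Refresh Document (After a Prompt)
AND    duration <> -1                 -- -1 means the duration was not captured
ORDER  BY start_timestamp;
```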

The thing is that I have to retrieve the audit log refresh times asynchronously, whenever WEBI decides to update the audit tables… What I’d prefer is a more direct solution, if I can find one.

I was thinking of creating a trigger on the database that would measure the duration directly. It could even be used to turn on extended SQL trace, which I could then analyze with tkprof (yes, it’s an Oracle DB :wink:) to get more detailed information when there’s a serious issue. However, I don’t know whether a trigger has access to the text of the SQL query. The user that executes the query is available, but since all WEBI queries are executed as the same Oracle user, that doesn’t help much.
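As a rough, untested sketch of what I have in mind (the schema name WEBI_APP is just a placeholder for whatever user WEBI actually connects as):

```sql
-- Sketch only: AFTER LOGON trigger that switches on extended SQL trace
-- for the shared schema that WEBI connects as (WEBI_APP is a placeholder).
-- There is no SQL text to inspect at logon time; the statements end up
-- in the trace file, which can then be run through tkprof.
CREATE OR REPLACE TRIGGER trg_webi_trace
AFTER LOGON ON DATABASE
BEGIN
  IF SYS_CONTEXT('USERENV', 'SESSION_USER') = 'WEBI_APP' THEN
    EXECUTE IMMEDIATE 'ALTER SESSION SET timed_statistics = TRUE';
    EXECUTE IMMEDIATE
      'ALTER SESSION SET events ''10046 trace name context forever, level 8''';
  END IF;
END;
/
```

The drawback remains the one above: since every WEBI query arrives as the same Oracle user, the trigger can’t tell my monitoring refreshes apart from everything else that runs.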

Thanks in advance for any help!

Best regards,

Frank Olsen
Steria


folsen (BOB member since 2003-09-30)

Frank,

I’ve always done this kind of monitoring off the audit logs stored in a database, since it’s the easiest, but it really depends on what your SLA contains. If you really care about login time, you won’t get this from the audit logs, since the duration is -1 (not captured), as you already know. However, if your SLA is really just concerned with report run times, then you may get what you need from the logs.

The audit logs are flushed to the database approximately every 5 minutes by default, I believe. It’s going to be pretty tough to do real-time performance monitoring given that delay. I think you can force the flush to occur more often, but there will always be some delay. Be careful you don’t slow the system down by forcing the cache to flush too often – you don’t want your monitoring to cause performance issues! We haven’t noticed much of a performance impact just leaving the interval at the default.

Have you seen the information in the WebIntelligence System Administrator’s Guide, Chapter 9, Tracing System Activity, specifically the “Monitored Events” section? It is very detailed and pretty accurate. However, be warned that there are differences in the detail captured for WebI vs. Full-Client reports run through WebI; specifically, you may not get the duration and the number of lines returned for both “application” types in events 19 and 25.
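For the SLA reporting itself, once the data is in the database you can aggregate the refresh events over whatever period you report on. A rough, untested sketch (the column names are assumptions, so check them against your audit schema):

```sql
-- Sketch: average and worst refresh duration per day from the audit log.
-- EVENT_TYPE_ID, DURATION and START_TIMESTAMP are assumed column names.
SELECT TRUNC(start_timestamp)   AS run_day,
       COUNT(*)                 AS refreshes,
       ROUND(AVG(duration))     AS avg_duration,
       MAX(duration)            AS worst_duration
FROM   obj_a_event_log
WHERE  event_type_id IN (19, 25)   -- Refresh Document (After a Prompt)
AND    duration <> -1
GROUP  BY TRUNC(start_timestamp)
ORDER  BY run_day;
```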

One more thing – be careful with the event IDs (the primary key between the event table and the event detail table). Every time you install a new version of WebI, the event ID counter gets reset back to 1. You either want to reset the counter to what it was before the upgrade, or leave it at 1 and delete all the existing audit logs, to avoid duplicate-key problems in your queries. This applies to both the Windows and Unix versions of WebI, but the counter is stored in a different place on each platform.
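A quick sanity check after an upgrade is to look for IDs that now occur more than once, something like this (untested, and EVENT_ID is an assumed column name):

```sql
-- Sketch: spot event IDs that occur more than once after an upgrade
-- has reset the counter. EVENT_ID is an assumed column name.
SELECT event_id,
       COUNT(*) AS occurrences
FROM   obj_a_event_log
GROUP  BY event_id
HAVING COUNT(*) > 1
ORDER  BY occurrences DESC;
```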

By the way, another approach might be to write your own app in either JSP or ASP that uses the SDK to log in, get the corporate document list, open a refresh-on-open report with prompts, run a report, etc., and log the times for each action yourself. You could do this in 100-200 lines of code, using WIJSP/WIASP as a starting point. This would be a more detailed version of your Perl script. The idea would be to run the app, say, every 10 minutes to measure performance and then email/page someone if a problem is detected. If you have access to a monitoring tool that can monitor web apps, you could probably accomplish the same thing with the tool’s scripting capabilities.

Good luck,
-Craig


craigpeters :us: (BOB member since 2003-05-09)

Hello Craig,

I’m not sure that just looking at the audit logs would be an option, because the client wants a result at the end of the month that gives an indication of how the performance of the system is evolving. Choosing a known set of three reports that we refresh every half hour seems the best way to get a consistent picture, since we know exactly how long those reports take to refresh when there’s no other activity. I’m not sure that the interpretation of the audit logs would be as clear and simple. Do you calculate an average over all events, or only over certain events (like refreshes)?

Admittedly, the reports chosen by the client for this performance monitoring may run just fine even though there is a performance problem for other reports… Until now, whenever we’ve had a serious performance issue, our three reports have degraded as well, but we may have missed out on other issues.

One advantage of our approach is that, as a side effect, we can use the same scripts to send an alert if performance degrades too much, if the script can’t log in to WEBI, if a refresh fails, and so on.

In the end, we implemented what the client asked for, but we could of course suggest a different way of doing things. Clearly, putting an artificial load on the system consumes resources that could be used for other things. Maybe a combined approach would be best?

I do know about the audit information, and I know that I get a result for 19 and 25 at least for our three reports. For other documents I haven’t checked. Are there other significant, time-consuming events to monitor?

Concerning the SDK approach, I don’t think that we’ve got a licence, but I have asked the client to check anyway. I think there is a separate licence for the SDK, is that right?

Anyway, I’m not sure that I could get much more information than from my Perl scripts. Just refreshing one report takes something like 15-20 HTTP requests on WEBI. In the script I decompose the execution time into login time, time to read the report from cache, time to refresh the report, and time to log out again. Can I get any other information using the SDK? What is really missing is to know how much time is spent in WEBI and how much is spent in Oracle. Refresh time - duration of 19/25?

As an aside: has anyone got a set of scripts / SQL queries for retrieving information from the audit database? We’ve got the audit universe and the canned reports, but I don’t think that the client will pay for the Audit Kit. Personally I’d prefer retrieving this information directly from the database.

Thanks again for your help!

Best regards,

Frank Olsen
Steria


folsen (BOB member since 2003-09-30)

You don’t have to pay for the ‘audit kit’ to use the audit universe. I have a slightly upgraded audit universe and a report or two in the downloads area, located here.


Cindy Clayton :us: (BOB member since 2002-06-11)

Hello Cindy,

Thanks for the scripts! Sure, I know that you can use the audit universe; it’s just that there are only three pre-defined reports. Then again, maybe the audit kit does a lot more than you could do yourself with some effort. I haven’t checked it out in any detail.

It turns out that we do have a licence for the SDK after all (silly me for not verifying), but I don’t know whether I can get any more information than I already get from my Perl scripts at the HTTP level. What would be interesting is if the SDK provides more detailed exception information than what is returned as HTML. Last week (the first Monday of the month, as it happens :wink:) my scripts often couldn’t even read the cached report, never mind refresh it. Maybe the SDK would give me more options for catching and interpreting the exceptions? If not, I’ll have to turn on traces at the start of next month to try to find out what is going on.

If I do use the SDK, is there an API that lets me read and refresh an existing report in the document list, which is what I do at the HTTP level?

Returning to the time spent in the Oracle DB, I see two possibilities. One is to write a logon trigger, but it will have to fire for all connections to the DB, given that all WEBI queries are executed as a single user. The other is to use the Oracle audit trail feature to log each query that matches my report refreshes. In the latter case, I’d still need to scan the log after running my script; hopefully Oracle writes this information synchronously.
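For the audit trail idea, I was thinking of something along these lines (sketch only; it assumes session-level auditing is written to the database, and WEBI_APP is a placeholder for the shared schema):

```sql
-- Sketch: use Oracle session auditing to get per-connection timings and I/O
-- for the shared schema WEBI connects as (WEBI_APP is a placeholder).
-- Requires the AUDIT_TRAIL initialisation parameter to be set to DB.
AUDIT SESSION BY webi_app;

-- Later, after a monitoring run:
SELECT username,
       timestamp      AS logon_time,
       logoff_time,
       session_cpu,   -- CPU used by the session
       logoff_lread,  -- logical reads at logoff
       logoff_pread   -- physical reads at logoff
FROM   dba_audit_trail
WHERE  username = 'WEBI_APP'
ORDER  BY timestamp DESC;
```

The catch is that the logoff statistics are only filled in when the connection is closed, so if WEBI keeps its connections open this won’t break things down per refresh.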

Best regards,

Frank Olsen
Steria


folsen (BOB member since 2003-09-30)

Frank,

We may be able to help you with this. We specialize in BO monitoring, both full client and thin client.

We have a module that allows you to link an Oracle Session to a Business Objects Session (both full and thin client).

This allows you to track “database costs” for reports.

If you are interested we can probably help.


ClaireB :de: (BOB member since 2002-08-09)

Hello Klaus,

That sounds interesting: what is the name of the product (Enterprise Manager, …)? (I’ll have a look, but I can’t promise to convince anyone to spend money :wink:)

Best regards,

Frank Olsen


folsen (BOB member since 2003-09-30)

Hi Frank,

Yes, Enterprise Manager is the tool.
I know your point of view, but it may help you manage the SLA better, and that can be very useful.

We also work with your colleagues in Switzerland.


ClaireB :de: (BOB member since 2002-08-09)

Folsen,

I doubt you’ll get significantly better information from the SDK. In fact, if something isn’t working, the SDK layer will act the same way and just stop at whatever SDK call was last made in a lot of cases. Think of the SDK-layer exceptions as being at a bit of a higher level than what regular WebI displays (or doesn’t display!).

I don’t see a big difference between what you’re doing and what I proposed in terms of capturing detail. The SDK route might let you get some more details, but it’s probably not worth the effort to rebuild what you’ve already done. [For example, on a huge report it could probably help you to see how long the report ran in Oracle vs. how long it took to generate hundreds of pages of HTML to display those results.] I’d stick with the Perl scripts for a while and see how it goes.

Related, I’m assuming you’re doing some kind of server monitoring on the WebI, repository, and data DBMS servers? This will also help you understand performance issues.

-Craig


craigpeters :us: (BOB member since 2003-05-09)

Craig,

We have many customers who want to cross-charge the database costs to departments based on their usage. Today this is only possible if you know how long a query has been running on the database and what kind of resources have been used (memory, I/O, …).

This can only be achieved by matching BO reports to specific database sessions.

You won’t know why WEBI fails, is unstable, displays white pages, and so on.
For that you have to work on the WEBI servers and use or build monitoring processes around them that will detect the errors and deal with them.


ClaireB :de: (BOB member since 2002-08-09)

Hi Klaus,

If you don’t mind, I’d be interested in knowing who the colleagues in Switzerland are, so that we can exchange some information.

Best regards,

Frank Olsen


folsen (BOB member since 2003-09-30)

Hi Craig,

Until now, given that I thought we didn’t have the licence, I hadn’t checked out what you can do with the SDK. If it won’t allow me to detect errors more precisely, it is of less interest from my point of view. However, what you say about seeing how long the report ran in Oracle does interest me: that was really my main question at the start of this thread. Do you have a reference to the API calls needed to show the database costs?

Sure we do. What we notice is that the Oracle server is much more heavily loaded than the two WEBI servers, in terms of both memory and CPU utilization. Maybe the sizing wasn’t done properly at the start.

But… the worry is that the graphs of system resources do not show anything very different during the first Monday of the month. In spite of this, my Perl scripts, as I said earlier, frequently don’t even manage to read the cached document on the server. (I don’t know much about the end users’ experience, since they don’t report to me directly but to the super-users, and sometimes the super-users don’t say anything until a few weeks later, and then only vaguely: “there were some user complaints, at some point”.) I do suspect that some WEBI parameters are wrong.

What I’ll do is activate the WEBI traces before the next “first Monday of the month”. I’ll also set aside all the logs created by my scripts.

(Apart from the first Monday of the month, the only other problem is a Report Viewer Error from time to time. I’ll finish reading through the previous threads on that problem before asking anything.)

Best regards,

Frank Olsen


folsen (BOB member since 2003-09-30)

Hi Craig,

I’m starting to look through the SDK documentation. So far, I haven’t seen any examples that look like what I’d like to do… Do you have (links to) any existing code that I could look at? I’m not asking for a ready-made solution, just an idea of the WIBean classes and methods that would be needed. In particular, what class would give me access to an existing report?

Have any of you skipped the use of JSPs and instead called the WIBean API directly? Any potential issues?

You mentioned that you could access information about the time spent on the database during a refresh. Which class gives access to this information?


folsen (BOB member since 2003-09-30)