Random hanging of webI after clicking on a report

I have been browsing these forums for the last few days to try to solve a problem that has been occurring on both our development and production servers ever since we launched our management reporting tool. It seems that many other people have the same problem as us, as described in the following topics:

The problem is that our servers randomly hang, sometimes during the day and more often overnight. Logging on is fine and a user can browse the report list as normal, but when a user tries to open a report the screen goes white and eventually we get a timeout error. This is happening most mornings and once or twice in the daytime during the week (it has got worse since the servers were upgraded to SP3). As a temporary solution we have a process that restarts the production server every morning after the backups have run. This has greatly reduced the morning hangs, but the server still goes down during the day 2-3 times a week.

Software:
We are running Windows 2k server SP3, BO 5.1 & webI 2.7

Business Objects and webI are installed on separate servers.

Hardware:
We have 3 NICs on our server (which seems to be a sticking point in some forum discussions), 8 processors, 4 GB RAM and plenty of HDD space.

I have already tried:

Adding OSAGENT_ADDR = “127.0.0.1” to the system variables

Adding OSAGENT_ADDR = “[local IP]” to the system variables

Adjusting the ‘application protection’ parameters in IIS

Adjusting the virtual memory settings (but can’t go higher than 4GB anyway)

We have noticed that when the server goes down we have extra DLLHOST.exe processes, and killing them can sometimes recover the webI reporting tool. The inetinfo.exe process also occasionally needs to be killed before we can kill the DLLHOST processes.

We would really appreciate any help and suggestions anyone can give us as we are going a bit mad :crazy: . Business Objects have no resolution for this yet, so we really are struggling to fix the issue. Our service level to our customers often takes a hit when the software goes down.

Also, if you have the same problem as this then leave a message on this forum saying so, and then maybe the BO support team may raise this issue as more of a priority.

Thanks

Stuart Charles


scharl16 (BOB member since 2003-05-14)

You say BO 5.1, I say W2K SP3 was only supported from 5.1.6 onwards. Not saying that’s your problem, but if you’ve got a dev environment it’s something you can check. Any chance of ripping 2 NICs out to see if they have anything to do with your problem?


Nick Daniels :uk: (BOB member since 2002-08-15)

Have you tried the resolution 13350 from the Business Objects knowledge base? We have a similarly configured server and this resolution did the trick.

Hope that helps,
Gary


Gary Andrusiek :canada: (BOB member since 2003-04-22)

Thanks for the tips guys. Great stuff cheers.

We are running BO 5.1.4 and so an upgrade is something we will definitely push through - that would explain at least why webI became more unstable after SP3 was installed on our server (it went from crashing 1-2 times a fortnight pre SP3 to every morning now).

Unfortunately our server is located in a centralised data centre and so removing the extra NICs would be a little more difficult. It is something we may consider in the future however if all else fails.

Gary - I have tried searching for resolution 13350 in the BOKB but I can’t seem to find a way to go straight to a specific resolution ID, and the search doesn’t pick anything up for “13350”. I don’t know whether I’m missing something obvious - can you post the link or the thread title?

Thanks again!

Stuart


scharl16 (BOB member since 2003-05-14)

Click on the Advanced KB Search link. In the resulting screen you can specify an ID#. I found this


Nick Daniels :uk: (BOB member since 2002-08-15)

I was looking in the knowledge exchange! :expressionless:

Found your entry and have altered the registry settings. Will wait and see if it crashes overnight. Fingers crossed!

I must admit, I haven’t noticed multiple WIStorage managers in task manager but it’s worth a go.

Is the problem similar to the one you encountered then Nick?

Thanks

Stuart


scharl16 (BOB member since 2003-05-14)

No, I was just trying to clarify how you can search for a resolution number on the tech support site - we don’t have this problem.


Nick Daniels :uk: (BOB member since 2002-08-15)

I am also having slow performance problems with Webi. Document lists are taking more than 30 seconds to show up…

Any ideas?

I am using BO 5.1.6 on Windows 2000 SP3 with the Oracle 9.2 client. One cluster manager and two nodes (one webI and one BCA).


JaiGupta (BOB member since 2002-09-12)

It’s not really a performance issue we have, as when the server hasn’t crashed we get the reports up in a few seconds.

Unfortunately webI was down again this morning :reallymad:

I suppose we will just have to wait until I can convince my manager and system administrators that it is worth upgrading the software. They are not convinced that an upgrade will fix our problem though, as Business Objects have no idea what is wrong with our setup, and a fix without a diagnosis is very unlikely.

One other thing though, we can fix the problem without restarting the server by killing a few processes in process explorer (a freeware tool from http://www.sysinternals.com/). If we kill inetinfo.exe and the two dllhost.exe processes the server recovers and we are ready to go again. I have no idea what these processes actually do but it may give someone a clue as to why our servers keep crashing.

Thanks

Stuart

:hb:


scharl16 (BOB member since 2003-05-14)

Hello!
Did you ever get this working? I strongly suspect the SP3 upgrade and your multiple NICs as the issue - you can prove this to yourself this way if you want…

Go to a node (eventually each one) and do this… Drop to a DOS prompt and run OSFIND > results.txt

Edit results.txt - do you see the cluster manager referenced more than once, maybe by a different name / IP? If so, the OSAGENT_ADDR variable is the right approach, but that variable should point to the IP of the correct card on your manager box - not your localhost IP. That fixed us up - reboot the whole mess and run the above test again.
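For anyone who wants to sanity-check the variable on a box before rebooting, here is a rough Python sketch. It only looks at the value itself: it reads OSAGENT_ADDR from the environment, strips any quotes that were typed in by accident, and reports whether the value is a loopback address. The function name `check_osagent_addr` is made up for this example and is not part of any Business Objects tooling, and the script cannot tell you which card is the "correct" one.

```python
import ipaddress
import os

# Straight and curly quotes that sometimes get pasted into the variable's value.
QUOTES = "\"'\u201c\u201d"


def check_osagent_addr(value):
    """Return a short verdict on an OSAGENT_ADDR value (hypothetical helper)."""
    if value is None or not value.strip():
        return "not set"
    cleaned = value.strip().strip(QUOTES).strip()
    try:
        addr = ipaddress.ip_address(cleaned)
    except ValueError:
        return "not an IP address: " + cleaned
    if addr.is_loopback:
        return "loopback (" + cleaned + "): point it at the real NIC instead"
    return "ok: " + cleaned


if __name__ == "__main__":
    print(check_osagent_addr(os.environ.get("OSAGENT_ADDR")))
```

Run it from a fresh command prompt so it picks up the current system variables rather than a stale copy.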

If that does NOT work, you have to manually reset the bindings on the nodes - not the easiest thing in the world, but I’ve done it (so it can’t be too hard).

Go to the Settings, Network / Dial Up connections screen, and then select Advanced, Advanced Settings - you can set the bindings there - put the desired card at the top of that list. Do all the boxes and reboot the cluster.

Good luck!!
Brent


bdouglas :switzerland: (BOB member since 2002-08-29)

One other thing to look at - I remember an issue with IIS - try changing the protection level you’ve got set - I think the default is Medium (Pooled), try changing that to Low (IIS Process) …

Also, check for drwatson errors on your manager box* - we had some issue with Storage Manager and categories - if someone selected a personal category or added one (I think?), that would cause Storage Manager to abort and restart - bad from then on out.

What service pack of Webi, 2.7 what?

Good Luck,
Brent

* There should be a drwatson.log file on your C: drive somewhere… look through it for webI processes that may be failing

bdouglas :switzerland: (BOB member since 2002-08-29)

What do you mean by this?

I am facing severe problems with BO 5.1.6 on my webI server.
Initially I had 1 CM and two nodes. In that configuration I had problems. So I disabled both the nodes and then tested the cluster manager on its own, and to my surprise that is also not working properly.

I am not able to send any full client document from corporate documents to other users, and document lists are getting slower. And all kinds of other problems.

I am using BO 5.1.6 on Win2k SP3 with the Oracle 9.2.0.1 client.

Any ideas?


JaiGupta (BOB member since 2002-09-12)

Regarding slow loading of the corporate documents list - how many documents are in the list? You could also check the corporate.txt file in case it has become corrupted.

Ann.


Rummers :uk: (BOB member since 2002-09-02)

Hi Brent thanks for your tips.

Because we have a relatively small user base we don’t actually use nodes. We tried the OSFIND command anyway but the cluster manager wasn’t mentioned in the exported text file as you would expect.

However, I do think it is something to do with our network configuration as you mentioned…

I said we have 3 NICs on the server:

We have one 10/100 ethernet link to the Ford corporate LAN. This is what users log onto the machine through.

We have one gigabit (1000 Mbit/s) fibre optic link that retrieves the actual data from our SQL/analysis server box. This card has an internal IP address.

We also have a NIC that is used solely for the overnight backup process. We have eliminated this as a possible cause of the hanging.

I have changed the OSAGENT_ADDR variable to the internal IP of our gigabit card and moved the LAN card to the top of the binding list (replacing the fibre optic card), more in hope than anything else!

We have tried playing with the security settings before on advice from BO support, but this did not resolve our problem.

I will let you know how I get on on Monday.

Thanks

Stuart


scharl16 (BOB member since 2003-05-14)

I just ran osfind on a single box cluster we have, and the first two lines are
one agent found at 111848
host rpc1848.com

one oad in your domain
host rpc1848

If you’re not seeing that, you’re right to focus in on the NICs…

I think I would try with the OSAGENT_ADDR pointing at the same address that the users would come in thru, and then I would try without that variable set - I don’t think you should need it if you don’t have nodes.

I would look at the website in IIS - force the website to stick to the one IP address, not first available. I’d also look at setting the protection level for WI to the lowest level (IIS process vs Pooled).

And then look for drwatson errors - search for .log files that have been updated recently - that may steer you towards something more specific.
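If hunting through the drive by hand is a pain, something along these lines would list the recently touched .log files for you. It is only a sketch: the function name `recent_logs`, the two-day window and the C:\ starting point are all assumptions for illustration, so adjust them to wherever your logs actually live.

```python
import os
import time


def recent_logs(root, days=2, now=None):
    """Return .log files under root modified in the last `days` days, newest first."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    hits = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if not name.lower().endswith(".log"):
                continue
            path = os.path.join(folder, name)
            try:
                mtime = os.path.getmtime(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime >= cutoff:
                hits.append((mtime, path))
    return [path for _mtime, path in sorted(hits, reverse=True)]


if __name__ == "__main__":
    # Assumed starting point; walking a whole drive can take a while.
    for path in recent_logs("C:\\"):
        print(path)
```

Anything with "drwatson" in the name near the top of that list is worth opening first.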

Keep posting your findings, I’d love to help with this. Good luck!

Brent


bdouglas :switzerland: (BOB member since 2002-08-29)

Server is still hanging :cry:

I ran the OSFIND program and I did get the text you mentioned in the log file, although I ran it when the server was up and running as it should.

I have checked the Dr Watson error logs but they have not been updated recently.

I have now altered the protection level and I will see what happens to the server overnight, although I think someone may have tried altering this setting before on the advice of BO.

I don’t suppose you have any more ideas, Brent? Has anyone else encountered this problem and found a resolution?

Thanks

Stuart


scharl16 (BOB member since 2003-05-14)

Which text? Did you see more than 1 reference to any single box? Any box referenced on the wrong subnet?
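Both questions can be checked roughly with a few lines of Python over the saved results.txt. This sketch makes no assumption about the exact layout of the OSFIND output: it just pulls out anything that looks like an IPv4 address, counts how often each one appears, and lists the ones outside the subnet you expect. The name `summarise_addresses` and the subnet used in the example are invented for illustration. It will not catch boxes that are listed by host name only, and it may pick up dotted version numbers that happen to look like addresses.

```python
import ipaddress
import re
from collections import Counter

# Four dot-separated groups of 1-3 digits; validated properly further down.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def summarise_addresses(text, expected_subnet):
    """Count each IPv4 address in text and list those outside expected_subnet."""
    net = ipaddress.ip_network(expected_subnet)
    counts = Counter()
    for match in IPV4.findall(text):
        try:
            counts[ipaddress.ip_address(match)] += 1
        except ValueError:
            continue  # e.g. 999.1.1.1 looks like an address but is not one
    outside = [str(addr) for addr in sorted(counts) if addr not in net]
    return {str(addr): n for addr, n in counts.items()}, outside
```

To use it, read the file and pass in your own subnet, for example `summarise_addresses(open("results.txt").read(), "10.0.0.0/24")`. An address with a count above 1, or anything in the second list, is where to start looking.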

If you want to send me the output from that file, I’d take a look at it - I’m a little curious - bd150002@ncr.com

Let me know what you see after a day or so with the protection level changed - that was surprisingly effective on one issue we had, but I’m not sure if this is what you’re describing.

Good luck,
Brent


bdouglas :switzerland: (BOB member since 2002-08-29)

Hi Brent

I was going to send you the text file today but on inspection of the system this morning, webI was up and running!!! :rotf:

Maybe the protection level does have an impact on our system - it was a long time since we tried altering it and the problem has changed slightly since then.

The box has occasionally been found running in the morning in the last few months so I can’t say for sure whether the problem is solved or not.

Will let you know tomorrow - Since SP3 was installed a couple of months ago the webI box has not ever been running two mornings in a row so fingers crossed!

thanks

Stuart


scharl16 (BOB member since 2003-05-14)

Well, after a week of testing, the server has been much more stable although the problem hasn’t gone away completely as it went down once this week on the Wednesday.

I have also now changed the webI protection levels to low.

I am keeping a log of how often it crashes and I will update this forum with the results in a couple of weeks.

Thanks to Brent and everyone else who has helped me with this!

Stuart


scharl16 (BOB member since 2003-05-14)

Hi Stuart,

that sounds fairly promising, no? What are the symptoms when it goes down, do you see any processes running unabated, do you get errors accessing documents (wi0506), etc?

Keep us posted as to what you see and when it occurs - Good Luck!

Brent


bdouglas :switzerland: (BOB member since 2002-08-29)