BO Cluster - Active CMS(s) found without cluster link ...

Hey,
I have run into some strange problems during a two-node cluster deployment in BO XI R2.

During failover testing we came across a random error in the event log from the CMS on the first server, Server1. It states:

Active CMS(s) found without cluster link. Attempting to reconnect with Server 2.

There is another error, stranger than this one. It says:

Cluster connection with Server2 has been broken on notifier side for reason: (Notification resulted in exception IDL:omg.org/CORBA/COMM_FAILURE:1.0, minor code 1330577418). Check network connections to CMS machines, and test responsiveness of system database.

Have any of you come across such cases?

Thanks.
FP


farrukh.pasha (BOB member since 2008-04-03)
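
The second message above asks you to check network connections to the CMS machines. As a rough first pass, a minimal Python sketch like the one below can time a plain TCP connect from one node to the other. The function name `probe` is made up for illustration, and the port is an assumption: 6400 is the usual default CMS port, but confirm the actual value in the Central Configuration Manager. A successful connect only shows the port is reachable; it does not prove that the CORBA link between the CMSs is healthy.

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Return the TCP connect time in seconds, or None if the connect fails.

    Name-resolution failures, refused connections and timeouts are all
    subclasses of OSError, so they all end up as None.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None
```

Run from Server1, something like `probe("Server2", 6400)` (and the reverse from Server2) at regular intervals would show whether the link drops or slows down around the times the errors are logged.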

Do you have the CMS on both servers configured to access the cluster by name (@ClusterName)? To determine this, go to the Central Configuration Manager on each server. Double-click on the CMS and go to the Configuration tab. In the middle of the page, there should be a line that reads “CMS belongs to cluster X” where X is the name of the cluster. If the name is not the same on both servers, then the CMSs are probably not pointing to the same database.

When clustering, all of the servers in the cluster MUST use the same physical CMS database and MUST have their Input and Output File Repository Servers pointing to the same physical location using the same UNC file path. All of the other BO servers installed on the physical server must have the cluster name in the “CMS Name” field on their Configuration tabs.

-Dell


hilfy (BOB member since 2007-04-16)
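
The checklist in the post above (same cluster name, same CMS database, same File Repository Server paths on every node) can be compared mechanically once the values have been copied by hand from the Central Configuration Manager on each server. This is only a sketch: the function name, the setting keys and the sample values are all made up for illustration, and nothing here reads the real BO configuration. It also assumes that case and surrounding whitespace do not matter when comparing values.

```python
def find_mismatches(nodes):
    """Return {setting: {node: value}} for every setting that differs.

    `nodes` maps a node name to a dict of settings noted down by hand.
    Values are compared case-insensitively with whitespace stripped
    (an assumption; tighten this if case matters in your environment).
    """
    all_settings = set()
    for values in nodes.values():
        all_settings.update(values)

    mismatches = {}
    for setting in sorted(all_settings):
        seen = {node: values.get(setting) for node, values in nodes.items()}
        normalized = {
            None if v is None else str(v).strip().lower() for v in seen.values()
        }
        if len(normalized) > 1:
            mismatches[setting] = seen
    return mismatches


# Hypothetical values typed in from the Configuration tab of each server.
example = {
    "Server1": {"cluster": "@BOCluster", "input_frs": r"\\fileserver\bo\Input"},
    "Server2": {"cluster": "@BOCluster", "input_frs": r"D:\bo\Input"},
}
print(find_mismatches(example))
```

In this made-up example only `input_frs` is reported, because Server2 uses a local path instead of the shared UNC path.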

Hi farrukh.pasha,

We have the same issue, and I see the same message in the Event Viewer. Did you resolve the problem, and if so, can you please share how you fixed it?

Thanks,
Raj


jar80 (BOB member since 2005-10-06)

Let me throw in a vote for a real fix to this. We have been suffering with this error for more than a year now, in our production environment only. We are running BO XI R2, SP 4, FP 3. Our CMS database is Oracle 10g, and I am getting confirmation on whether it is Oracle RAC or not. The Oracle DB is dedicated and on clustered HP-UX servers; all of the BO servers are Windows 2003. The web interface is Java through WebLogic.

We have been working with BO Support Engineers on and off the whole time, and nothing really improves in a sustainable fashion. In many cases the cluster remains functional and recovers by itself, but on reviewing our very active scheduled WebI report jobs we see lots of errors that seem to loosely correspond to this instability.

I should add that the errors come and go, but they seem to be related directly to usage, or rather activity, on the system: weekends have far fewer errors. Users also sometimes notice the instability in the errors they receive, and work around it.

We have reduced from 4 clustered CMSs to just 2 physical servers sitting next to each other on the same subnet, and the number of errors has gone down, but I would say only from about 500 per week to 300. So it is still unacceptable.

If anyone has any idea how this error can be solved or brought to reasonable levels please do share. THANKS!


dajabon (BOB member since 2003-09-09)
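
One way to test the impression that the errors follow system activity is to tally the error timestamps by day of the week. The sketch below assumes the timestamps have already been pulled out of the event log by some other means and written as ISO 8601 strings; the function name and the sample timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def errors_by_weekday(timestamps):
    """Count error timestamps (ISO 8601 strings) per day of the week."""
    counts = Counter(
        DAYS[datetime.fromisoformat(ts).weekday()] for ts in timestamps
    )
    return dict(counts)


# Made-up sample timestamps, not real log data.
sample = ["2008-04-03T09:15:00", "2008-04-03T14:40:00", "2008-04-05T02:05:00"]
print(errors_by_weekday(sample))
```

If weekday counts are consistently much higher than weekend counts, that supports the link with load; the same tally by hour would show whether the errors cluster around the busiest scheduling windows.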