
Scenario#42 – CUC not working after adding Secondary Server

I came across an interesting scenario a few days back where a customer added a CUC Secondary server back to the cluster, but it was not working as it should.

The customer had two CUC servers in the cluster in a load-balanced configuration, with half of the ports on unity-02 (Sub) and the other half on unity-01 (Pub). The secondary had failed because of a hardware fault and they had to rebuild it from scratch. The problem they were facing was that Unity Connection was not taking any calls: the first 80 ports were on unity-02, and for some reason that server was not happy.

I checked the replication status first and found this:

admin:utils dbreplication runtimestate

DB and Replication Services: ALL RUNNING

Cluster Replication State: Replication status command started at: 2012-10-18-11-44
Replication status command COMPLETED 1 tables checked out of 425
Processing Table: typedberrors with 982 records
No Errors or Mismatches found.

Use 'file view activelog cm/trace/dbl/sdi/ReplicationStatus.2012_10_23_11_44_00.out' to see the details

DB Version: ccm7_1_2_20000_2
Number of replicated tables: 425

Cluster Detailed View from PUB (2 Servers):

                        PING          REPLICATION  REPL.  DBver&  REPL.  REPLICATION SETUP
SERVER-NAME IP ADDRESS  (msec)  RPC?  STATUS       QUEUE  TABLES  LOOP?  (RTMT) & details
----------- ----------  ------  ----  -----------  -----  ------  -----  -----------------
UNITY-01    x.x.0.35    0.049   Yes   Connected    0      match   N/A    (2) PUB Setup Completed
unity-02    x.x.0.36    0.254   Yes   Off-Line     N/A    0       N/A    (4) Setup Completed

admin:file view activelog cm/trace/dbl/sdi/ReplicationStatus.2012_10_23_11_44_00.out

SERVER                 ID STATE    STATUS     QUEUE  CONNECTION CHANGED
-----------------------------------------------------------------------
g_unity_01_ccm7_1_2_20000_2    2 Active   Local           0

-------------------------------------------------
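The Secondary shows replication setup state (4) and a status of Off-Line, which is not good! The Publisher, by comparison, is at state (2), the healthy value.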
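To spot this kind of problem quickly, the server rows of the Cluster Detailed View can be checked with a few lines of Python. This is only a rough sketch: `unhealthy_nodes` and `STATE_MEANING` are hypothetical names, not part of any Cisco tooling, and the state descriptions are my paraphrase of Cisco's RTMT replication state values, so verify them against the documentation for your version. It assumes the output of `utils dbreplication runtimestate` has been saved as text.

```python
import re

# Paraphrased meanings of the RTMT replication setup states (assumption:
# verify against the Cisco documentation for your release).
STATE_MEANING = {
    0: "replication not started",
    1: "replicates created, but counts are incorrect",
    2: "replication is good",
    3: "replication is bad (mismatched tables)",
    4: "replication setup did not succeed",
}

# One server row: name, IP, ping, RPC flag, status, then "(N)" further along.
# The header and separator lines do not match because they have no numeric
# ping column and no single digit in parentheses.
ROW = re.compile(
    r"^(?P<server>\S+)\s+(?P<ip>\S+)\s+(?P<ping>[\d.]+)\s+(?P<rpc>\S+)\s+"
    r"(?P<status>\S+)\s+.*\((?P<state>\d)\)"
)

def unhealthy_nodes(output):
    """Return (server, status, state) for every row whose state is not 2."""
    bad = []
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match and match.group("state") != "2":
            bad.append(
                (match.group("server"), match.group("status"), int(match.group("state")))
            )
    return bad
```

Run against the two rows shown above, it flags only unity-02, with status Off-Line and state 4.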

I then found the Secondary server complaining about some CDR records. A quick search on Cisco's site turned up a well-known defect: Bug ID CSCta15666, for CUC 7.1.2.20000-140.

- - -

In CDR Define logs (file list activelog /cm/trace/dbl/*)
We got exception in Cdr define
Ignoreable exception occurred will continue. Value:92

In CDR output broadcast logs (file list activelog /cm/trace/dbl/*)
Error 17 while doing cdr check, will cdr delete
Time taken to do cdr check[1.92180991173]

Exception from cdr delete e.value [37] e.msg[Error executing [su -c 'ulimit -c 0;cdr delete server g_nhbl_vo_cl1fs02_ccm7_1_2_10000_16' - informix] returned [9472]]
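If you have pulled the dbl traces off the server, a small script can tell you whether the same messages appear in them. This is a minimal sketch, assuming the log content is already available as a string. `find_signatures` is a hypothetical helper, and the three search strings are simply copied from the excerpts above. A hit only means the log resembles the bug description; it does not confirm the defect.

```python
# Message fragments taken from the log excerpts quoted in this post.
SIGNATURES = (
    "We got exception in Cdr define",
    "Error 17 while doing cdr check",
    "Exception from cdr delete",
)

def find_signatures(log_text):
    """Map each signature found in log_text to the line numbers where it occurs."""
    hits = {}
    for number, line in enumerate(log_text.splitlines(), start=1):
        for signature in SIGNATURES:
            if signature in line:
                hits.setdefault(signature, []).append(number)
    return hits
```

An empty result means none of the three messages were present in the text you passed in.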

The following steps fixed it:
- utils dbreplication stop all on the publisher
- utils dbreplication dropadmindb on both servers
- utils dbreplication forcedatasyncsub on the subscriber
- utils dbreplication reset all
- reboot the subscriber

After this, database replication was working again.

