To discover potential issues with an HA Cluster early, all nodes in the Cluster should be monitored on a regular basis using the following techniques:
Scraping the Broker log files for WARN or ERROR entries and for operational log entries such as:
MST-1007 : Store Passivated. This can indicate that the Master virtual host has gone down.
MST-1006 : Recovery Complete. This can indicate that a former Replica virtual host is up and has become the Master.
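The log check described above can be sketched as a small filter. This is only an illustration, not part of the Broker: the class name BrokerLogScanner and the default path log/qpid.log are hypothetical, and the pattern assumes that the level names WARN and ERROR appear as whole words in each log line.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class BrokerLogScanner {
    // Matches the WARN/ERROR levels as whole words, or the two HA
    // operational log IDs MST-1006 and MST-1007.
    private static final Pattern INTERESTING =
            Pattern.compile("\\b(WARN|ERROR)\\b|\\bMST-100[67]\\b");

    /** Returns the lines that a monitoring job would want to alert on. */
    public static List<String> findAlerts(List<String> logLines) {
        List<String> alerts = new ArrayList<>();
        for (String line : logLines) {
            if (INTERESTING.matcher(line).find()) {
                alerts.add(line);
            }
        }
        return alerts;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical default path; pass the real Broker log file as the first argument.
        Path log = Paths.get(args.length > 0 ? args[0] : "log/qpid.log");
        for (String alert : findAlerts(Files.readAllLines(log))) {
            System.out.println(alert);
        }
    }
}
```

A scheduled job could run this against each node's log file and forward any printed lines to an alerting system.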
Monitoring disk space usage and system load using system tools.
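Where operating system tools are not convenient, the same two figures can be read from a small Java program running on the node. The sketch below uses only standard JDK calls; the class name HostHealthCheck is hypothetical, and the path argument is assumed to point at the partition that holds the message store.

```java
import java.io.File;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class HostHealthCheck {
    /** Fraction of the partition holding 'path' that is still usable, in [0, 1]. */
    public static double usableFraction(File path) {
        long total = path.getTotalSpace();
        // getTotalSpace() returns 0 when the path does not name a partition.
        return total == 0 ? 0.0 : (double) path.getUsableSpace() / total;
    }

    /** System load average for the last minute; negative if the platform does not report it. */
    public static double systemLoad() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        return os.getSystemLoadAverage();
    }

    public static void main(String[] args) {
        // Assumption: the first argument is the store directory; defaults to the working directory.
        File storePath = new File(args.length > 0 ? args[0] : ".");
        System.out.printf("usable disk fraction: %.2f%n", usableFraction(storePath));
        System.out.printf("system load average: %.2f%n", systemLoad());
    }
}
```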
Berkeley HA node status using the DbPing utility.
Example 13.4. Using the DbPing utility for monitoring HA nodes.
java -jar je-5.0.73.jar DbPing -groupName TestClusterGroup -nodeName Node-5001 -nodeHost localhost:5001 -socketTimeout 10000
Current state of node: Node-5001 from group: TestClusterGroup
Current state: MASTER
Current master: Node-5001
Current JE version: 5.0.73
Current log version: 8
Current transaction end (abort or commit) VLSN: 165
Current master transaction end (abort or commit) VLSN: 0
Current active feeders on node: 0
Current system load average: 0.35
In the example above, the DbPing utility requested the status of the Cluster node named Node-5001 from the replication group TestClusterGroup, running on host localhost:5001. The state of the node was written to standard output.
Using Qpid broker JMX interfaces.
The MBean BDBHAMessageStore can be used to request the following node information:
NodeState indicates whether the node is a Master or Replica.
Durability reports the replication durability.
DesignatedPrimary indicates whether the Master node is the designated primary.
GroupName reports the replication group name.
NodeName reports the node name.
NodeHostPort reports the node host and port.
HelperHostPort reports the helper host and port.
AllNodesInGroup lists all nodes in the replication group, including their names, hosts and ports.
For more details about the BDBHAMessageStore MBean, please refer to the section Qpid JMX API for HA.
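As an illustration of the JMX technique, the following sketch reads the NodeState attribute from every BDBHAMessageStore MBean it can find. Several details are assumptions rather than documented facts: the object name pattern org.apache.qpid:type=BDBHAMessageStore,*, the default service URL with port 8999, and the class name HaNodeStatus. Authentication is omitted; a secured Broker would need credentials passed in the connector environment. Verify the object names and port against your own Broker before relying on this.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class HaNodeStatus {
    // Assumed object name pattern; check it against the names your Broker registers.
    static final String PATTERN = "org.apache.qpid:type=BDBHAMessageStore,*";

    /** Maps each matching MBean name to the string form of its NodeState attribute. */
    public static Map<String, String> readNodeStates(MBeanServerConnection conn) throws Exception {
        Map<String, String> states = new LinkedHashMap<>();
        for (ObjectName name : conn.queryNames(new ObjectName(PATTERN), null)) {
            states.put(name.toString(), String.valueOf(conn.getAttribute(name, "NodeState")));
        }
        return states;
    }

    public static void main(String[] args) throws Exception {
        // Assumed default URL; pass the real JMX service URL as the first argument.
        JMXServiceURL url = new JMXServiceURL(args.length > 0
                ? args[0]
                : "service:jmx:rmi:///jndi/rmi://localhost:8999/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            Map<String, String> states = readNodeStates(connector.getMBeanServerConnection());
            for (Map.Entry<String, String> entry : states.entrySet()) {
                System.out.println(entry.getKey() + " NodeState=" + entry.getValue());
            }
        }
    }
}
```

The other attributes listed above, such as GroupName or AllNodesInGroup, could be read the same way by passing their names to getAttribute.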