13.9. Monitoring cluster

To discover potential issues with the HA Cluster early, all nodes in the Cluster should be monitored on a regular basis using the following techniques:

  • Scraping Broker log files for WARN or ERROR entries and for operational log entries such as:

    • MST-1007 : Store Passivated. This can indicate that the Master virtual host has gone down.

    • MST-1006 : Recovery Complete. This can indicate that a former Replica virtual host is up and has become the Master.
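
    The log check described above can be sketched as a small filter. This is a minimal illustration, not part of the Broker: the class name BrokerLogScan, the method findAlerts and the log file path passed on the command line are assumptions, and the exact layout of a Broker log line depends on the logging configuration in use.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class BrokerLogScan {
    // Matches the WARN/ERROR levels or the two HA operational log codes named above.
    private static final Pattern ALERT =
            Pattern.compile("\\b(WARN|ERROR)\\b|\\bMST-100[67]\\b");

    /** Returns the log lines that contain WARN, ERROR, MST-1006 or MST-1007. */
    public static List<String> findAlerts(List<String> logLines) {
        List<String> alerts = new ArrayList<>();
        for (String line : logLines) {
            if (ALERT.matcher(line).find()) {
                alerts.add(line);
            }
        }
        return alerts;
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to a Broker log file (the location is an assumption).
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (String alert : findAlerts(lines)) {
            System.out.println(alert);
        }
    }
}
```

    Run periodically against each node's log, a filter like this could feed an alerting system. A production setup would more likely tail the file than re-read it on every run.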

  • Monitoring disk space usage and system load using system tools.

  • Checking Berkeley DB HA node status using the DbPing utility.

    Example 13.4. Using DbPing utility for monitoring HA nodes.

    java -jar je-5.0.97.jar DbPing -groupName TestClusterGroup -nodeName Node-5001 -nodeHost localhost:5001 -socketTimeout 10000
    Current state of node: Node-5001 from group: TestClusterGroup
      Current state: MASTER
      Current master: Node-5001
      Current JE version: 5.0.97
      Current log version: 8
      Current transaction end (abort or commit) VLSN: 165
      Current master transaction end (abort or commit) VLSN: 0
      Current active feeders on node: 0
      Current system load average: 0.35

    In the example above, the DbPing utility requested the status of the Cluster node named Node-5001 in replication group TestClusterGroup, running on host localhost:5001. The state of the node was written to standard output.
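
    For automated monitoring, the DbPing output can be parsed rather than read by eye. The sketch below assumes the output layout shown in Example 13.4 (other JE versions may print different fields). The class name DbPingOutput and its methods are illustrative only and are not part of JE or the Broker.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DbPingOutput {
    /** Parses "Current <name>: <value>" lines into a map, splitting at the first colon. */
    public static Map<String, String> parse(List<String> outputLines) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String raw : outputLines) {
            String line = raw.trim();
            int colon = line.indexOf(':');
            if (line.startsWith("Current ") && colon > 0) {
                fields.put(line.substring(0, colon).trim(),
                           line.substring(colon + 1).trim());
            }
        }
        return fields;
    }

    /** True when the node reports itself as MASTER or REPLICA. */
    public static boolean isActive(Map<String, String> fields) {
        String state = fields.get("Current state");
        return "MASTER".equals(state) || "REPLICA".equals(state);
    }

    public static void main(String[] args) throws Exception {
        // Reads DbPing output from standard input, e.g. piped from the command above.
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        Map<String, String> fields = parse(lines);
        System.out.println("Node state: " + fields.get("Current state"));
        // A non-zero exit code lets a monitoring script detect an inactive node.
        System.exit(isActive(fields) ? 0 : 1);
    }
}
```

    The exit code makes the check usable from a cron job or a monitoring agent: zero when the node is a Master or Replica, non-zero otherwise (including when DbPing printed no state at all).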

  • Using the Qpid Broker JMX interfaces.

    The BDBHAMessageStore MBean can be used to request the following node information:

    • NodeState: indicates whether the node is a Master or a Replica.

    • Durability: the replication durability.

    • DesignatedPrimary: indicates whether the Master node is the designated primary.

    • GroupName: the replication group name.

    • NodeName: the node name.

    • NodeHostPort: the node host and port.

    • HelperHostPort: the helper host and port.

    • AllNodesInGroup: lists all nodes in the replication group, including their names, hosts and ports.

    For more details about the BDBHAMessageStore MBean, please refer to the section Qpid JMX API for HA.
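
    As a rough sketch of how the attributes listed above might be read from a monitoring script, the following uses the standard javax.management client API. The class name HaNodeJmxStatus is hypothetical. The JMX service URL, the MBean ObjectName and the optional credentials are taken from the command line because this section does not state them; they are assumptions to be replaced with the values appropriate for the Broker being monitored.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class HaNodeJmxStatus {
    // Attribute names as listed in this section.
    public static final String[] HA_ATTRIBUTES = {
        "NodeState", "Durability", "DesignatedPrimary", "GroupName",
        "NodeName", "NodeHostPort", "HelperHostPort", "AllNodesInGroup"
    };

    /** Reads the named attributes; attributes the MBean does not expose are omitted. */
    public static Map<String, Object> readAttributes(
            MBeanServerConnection connection, ObjectName mbean, String... names)
            throws Exception {
        Map<String, Object> values = new LinkedHashMap<>();
        for (Attribute attribute : connection.getAttributes(mbean, names).asList()) {
            values.put(attribute.getName(), attribute.getValue());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // args[0]: JMX service URL of the Broker (assumption, supplied by the operator)
        // args[1]: ObjectName of the BDBHAMessageStore MBean (assumption, see the JMX API section)
        // args[2], args[3]: optional user name and password
        Map<String, Object> environment = new HashMap<>();
        if (args.length >= 4) {
            environment.put(JMXConnector.CREDENTIALS, new String[] {args[2], args[3]});
        }
        JMXServiceURL url = new JMXServiceURL(args[0]);
        try (JMXConnector connector = JMXConnectorFactory.connect(url, environment)) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            Map<String, Object> status =
                    readAttributes(connection, new ObjectName(args[1]), HA_ATTRIBUTES);
            for (Map.Entry<String, Object> entry : status.entrySet()) {
                System.out.println(entry.getKey() + " = " + entry.getValue());
            }
        }
    }
}
```

    Reading all attributes in one getAttributes call keeps the number of remote round trips low, which matters when many nodes are polled at short intervals.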