
This section first describes the behaviour of the group in its default configuration. It then describes the controls available to override that behaviour: controls that affect the durability of transactions and the consistency of data between the master and replicas, and so allow trade-offs to be made between performance and reliability.

Let's first look at the behaviour of a group in its default configuration.

In the default configuration, for any messaging work to be done, there must be at least a quorum of nodes present. For example, in a three-node group, there must be at least two nodes available.

When a messaging client sends a transaction, it is assured that, by the time control returns to its application after the commit call, the following is true:

If the master were to fail immediately after the transaction was committed, the transaction would be held by at least quorum minus one replicas. For example, in a group of three, we would be assured that at least one replica held the transaction.
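As a minimal sketch of such a transacted client, assuming a JMS client over AMQP (the connection URL, credentials, and queue name below are placeholders, and the factory class is that of the Qpid JMS client):

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import org.apache.qpid.jms.JmsConnectionFactory;

    public class TransactedSend {
        public static void main(String[] args) throws Exception {
            // Placeholder URL and credentials: point at any node in the group.
            ConnectionFactory factory = new JmsConnectionFactory("amqp://node1:5672");
            Connection connection = factory.createConnection("guest", "guest");
            try {
                connection.start();
                Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
                Queue queue = session.createQueue("orders"); // hypothetical queue name
                MessageProducer producer = session.createProducer(queue);
                producer.send(session.createTextMessage("order-123"));
                // commit() returns only after the master has written the transaction
                // to its storage and at least quorum-1 replicas have acknowledged it.
                // If quorum is lost while the transaction is in flight, commit()
                // throws and the transaction must be retried.
                session.commit();
            } finally {
                connection.close();
            }
        }
    }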

In the event of a master failure, if a quorum of nodes remains, those nodes hold an election. The nodes elect as master the node with the most recent transaction; if two or more nodes have the most recent transaction, the group makes an arbitrary choice. If a quorum of nodes does not remain, the nodes cannot elect a new master and will wait until nodes rejoin. You will see later that manual controls are available that allow service to be restored from fewer than quorum nodes and that influence which node is elected in the event of a tie.

Whenever a group has fewer than quorum nodes present, the virtualhost will be unavailable and messaging connections will be refused. If quorum disappears at the very moment a messaging client sends a transaction, that transaction will fail.

You will have noticed the difference between the synchronization policies applied to the master and to the replicas: the master synchronously writes the transaction to storage, whereas the replicas send their acknowledgement back before the data is written to disk. This is an example of a trade-off between durability and performance. We will see later how to control this trade-off.
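For illustration only, assuming the group is backed by a Berkeley DB JE replicated store, the default behaviour described above corresponds to a JE durability setting along the following lines. This sketch is shown only to make the trade-off concrete; it is not an attribute you set on the virtualhost:

    import com.sleepycat.je.Durability;

    public class DefaultGroupDurability {
        // SYNC: the master fsyncs the transaction to disk before commit returns.
        // NO_SYNC: replicas acknowledge before their copy reaches disk.
        // SIMPLE_MAJORITY: a quorum of nodes must acknowledge the commit.
        static final Durability DEFAULT = new Durability(
                Durability.SyncPolicy.SYNC,
                Durability.SyncPolicy.NO_SYNC,
                Durability.ReplicaAckPolicy.SIMPLE_MAJORITY);
    }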

Node priority can be used to influence the behaviour of the election algorithm. It is useful where you want to favour some nodes over others: for instance, when you wish to favour nodes located in a particular data centre over those at a remote site.

A new master is elected from among the nodes with the most current set of log files. When there is a tie, priority is used as a tie-breaker to select amongst these nodes.

The node priority is set as an integer value. A priority of zero ensures that a node cannot be elected master, even if it has the most current set of log files.

For convenience, the Web Management Console uses user-friendly names for the priority integer values in the range 0 to 3 inclusive; the name Never corresponds to priority zero.

Node priority is configured as the attribute priority on the virtualhost node. It can be changed at runtime and takes effect immediately.
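As a sketch of how such an attribute change might be made at runtime over an HTTP management API (the base URL, port, credentials, and node name node1 below are all assumptions to be adapted to your installation):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.Base64;

    public class NodeAttributeUpdater {
        // Hypothetical management endpoint; adjust host, port and path to your broker.
        private static final String BASE = "http://node1:8080/api/latest/virtualhostnode/";

        static void setNodeAttribute(String nodeName, String jsonAttributes) throws Exception {
            String auth = Base64.getEncoder().encodeToString("admin:admin".getBytes());
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(BASE + nodeName))
                    .header("Content-Type", "application/json")
                    .header("Authorization", "Basic " + auth)
                    .POST(HttpRequest.BodyPublishers.ofString(jsonAttributes))
                    .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        }

        public static void main(String[] args) throws Exception {
            // Raise node1's priority so it is favoured in elections (0 = Never).
            setNodeAttribute("node1", "{\"priority\": 2}");
        }
    }

The same hypothetical setNodeAttribute helper is reused in the sketches for the two attributes described next.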

Important

Use of the Never priority can lead to transaction loss. For example, consider a group of three where Replica-2 is marked Never and Replica-1 is running behind for some reason (perhaps a full GC). If a transaction arrives and is acknowledged only by the Master and Replica-2, the transaction succeeds. If the Master were to fail at that moment, the remaining nodes would elect Replica-1 (Replica-2 being ineligible), even though Replica-2 holds the most recent transaction, and the transaction would be lost.

Transaction loss is reported by message HA-1014.

The Required Number of Nodes control governs the minimum number of nodes needed to complete a transaction and to elect a new master. By default, it is set to Default (which signifies quorum).

It is possible to reduce the required minimum number of nodes. The rationale for doing this is normally to temporarily restore service from fewer than quorum nodes following an extraordinary failure.

For example, consider a group of three. If one node were to fail, quorum would still remain and the system would continue to work without any intervention. If the failed node were the master, a new master would be elected.

What if a further node were to fail? Quorum would no longer remain, and the remaining node would simply wait; it cannot elect itself master. What if we wanted to restore service from just this one node?

In this case, Required Number of Nodes can be reduced to 1 on the remaining node, allowing that node to elect itself master and service to be restored from the singleton. The required minimum number of nodes is configured as the attribute quorumOverride on the virtualhost node. It can be changed at runtime and takes effect immediately.
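Reusing the hypothetical setNodeAttribute helper from the earlier sketch, restoring service from the single remaining node might look like this (the assumption that 0 denotes the Default value should be verified against your broker):

    // Temporarily allow the lone survivor to elect itself master.
    setNodeAttribute("node1", "{\"quorumOverride\": 1}");
    // Once the failed nodes are restored, revert to the Default value
    // (assumed here to be represented by 0).
    setNodeAttribute("node1", "{\"quorumOverride\": 0}");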

Important

This attribute must be used cautiously: careless use will lead to lost transactions and can lead to split-brain in the event of a network partition. If it is used to temporarily restore service from fewer than quorum nodes, it is imperative to revert it to the Default value as soon as the failed nodes are restored.

Transaction loss is reported by message HA-1014.

The allow to operate solo flag applies only to groups containing exactly two nodes.

In a group of two, if a node were to fail then, in the default configuration, work would cease as quorum would no longer exist. A single node cannot elect itself master.

The allow to operate solo flag allows a node in a two-node group to elect itself master and to operate solo. It is configured as the attribute designatedPrimary on the virtualhost node. It can be changed at runtime and takes effect immediately.

For example, consider a group of two where the master fails. Service will be interrupted as the remaining node cannot elect itself master. To allow it to become master, apply the allow to operate solo flag to it. It will elect itself master and work can continue, albeit from one node.
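Again reusing the hypothetical setNodeAttribute helper from the earlier sketch, applying the flag to the surviving node might look like this:

    // Allow the remaining node of the two-node group to operate solo.
    setNodeAttribute("node1", "{\"designatedPrimary\": true}");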

Important

It is imperative that the allow to operate solo flag is never set on both nodes at once. To do so would mean that, in the event of a network partition, a split-brain would occur.

Transaction loss is reported by message HA-1014.