1.13. Active-passive Messaging Clusters (Preview)

1.13.1. Overview

This release provides a preview of a new module for High Availability (HA). The new module is not yet complete or ready for production use; it is being made available so that users can experiment with the new approach and provide feedback early in the development process. Feedback should go to user@qpid.apache.org.

The old cluster module takes an active-active approach, i.e. all the brokers in a cluster are able to handle client requests simultaneously. The new HA module takes an active-passive, hot-standby approach.

In an active-passive cluster, only one broker, known as the primary, is active and serving clients at a time. The other brokers are standing by as backups. Changes on the primary are immediately replicated to all the backups so they are always up-to-date or "hot". If the primary fails, one of the backups is promoted to be the new primary. Clients fail-over to the new primary automatically. If there are multiple backups, the backups also fail-over to become backups of the new primary.

The new approach depends on an external cluster resource manager to detect failure of the primary and choose the new primary. The first supported resource manager will be rgmanager, but it will be possible to add integration with other resource managers in the future. The preview version is not integrated with any resource manager; you can use the qpid-ha tool to simulate the actions of a resource manager or do your own integration.

1.13.1.1. Why the new approach?

The new active-passive approach has several advantages compared to the existing active-active cluster module.
  • It does not depend directly on openais or corosync, and it does not use multicast, which simplifies deployment.
  • It is more portable: in environments that don't support corosync, it can be integrated with a resource manager available in that environment.
  • Replication to a disaster recovery site can be handled as simply another node in the cluster; it does not require a separate replication mechanism.
  • It can take advantage of features provided by the resource manager, for example virtual IP addresses.
  • Improved performance and scalability due to better use of multiple CPUs.

1.13.1.2. Limitations

There are a number of known limitations in the current preview implementation. These will be fixed in the production version.

  • Transactional changes to queue state are not replicated atomically. If the primary crashes during a transaction, it is possible that the backup could contain only part of the changes introduced by a transaction.
  • During a fail-over one backup is promoted to primary and any other backups switch to the new primary. Messages sent to the new primary before all the backups have switched could be lost if the new primary itself fails before all the backups have switched.
  • When used with a persistent store: if the entire cluster fails, there are no tools to help identify the most recent store.
  • Acknowledgments are confirmed to clients before the message has been dequeued from replicas or indeed from the local store if that is asynchronous.
  • A persistent broker must have its store erased before joining an existing cluster. In the production version a persistent broker will be able to load its store and avoid downloading messages that are in the store from the primary.
  • Configuration changes (creating or deleting queues, exchanges and bindings) are replicated asynchronously. Management tools used to make changes will consider the change complete when it is complete on the primary; it may not yet be replicated to all the backups.
  • Deletions made immediately after a failure (before all the backups are ready) may be lost on a backup. Queues, exchanges or bindings that were deleted on the primary could re-appear if that backup is promoted to primary on a subsequent failure.
  • Better control is needed over which queues/exchanges are replicated and which are not.
  • There are some known issues affecting performance, both the throughput of replication and the time taken for backups to fail-over. Performance will improve in the production version.
  • Federated links from the primary will be lost on fail-over; they will not be re-connected on the new primary. Federation links to the primary can fail over.
  • Only plain FIFO queues can be replicated. LVQ and ring queues are not yet supported.

1.13.2. Configuring the Brokers

The broker must load the ha module; it is loaded by default when you start a broker. The following broker options are available for the HA module.

Table 1.18. Options for High Availability Messaging Cluster

--ha-cluster yes|no
    Set to "yes" to have the broker join a cluster.

--ha-brokers URL
    URL used by brokers to connect to each other. The URL lists the addresses of all the brokers in the cluster [a] in the following form:

		url = ["amqp:"][ user ["/" password] "@" ] addr ("," addr)*
		addr = tcp_addr / rdma_addr / ssl_addr / ...
		tcp_addr = ["tcp:"] host [":" port]
		rdma_addr = "rdma:" host [":" port]
		ssl_addr = "ssl:" host [":" port]

--ha-public-brokers URL
    URL used by clients to connect to the brokers, in the same format as --ha-brokers above. Use this option if you want client traffic on a different network from broker replication traffic. If this option is not set, clients will use the same URL as brokers.

--ha-username USER
--ha-password PASS
--ha-mechanism MECH
    Brokers use USER, PASS and MECH to authenticate when connecting to each other.

[a] If the resource manager supports virtual IP addresses then the URL can contain just the single virtual IP.


To configure a cluster you must set at least the ha-cluster and ha-brokers options.
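For example, a three-node cluster could be started by running the broker on each host with the same HA options. Here node1, node2 and node3 are example host names; substitute your own:

```shell
# Run on each of node1, node2 and node3.
# --ha-cluster makes the broker join the cluster; --ha-brokers lists
# the addresses the brokers use to connect to each other.
qpidd --ha-cluster yes --ha-brokers "node1,node2,node3"
```

Each broker starts as a backup in discovery mode until one of them is promoted to primary.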

1.13.3. Creating replicated queues and exchanges

To create a replicated queue or exchange, pass the argument qpid.replicate when creating the queue or exchange. It should have one of the following three values:

  • all: Replicate the queue or exchange, messages and bindings.
  • configuration: Replicate the existence of the queue or exchange and bindings but don't replicate messages.
  • none: Don't replicate, this is the default.

Bindings are automatically replicated if the queue and exchange being bound both have a replication argument of all or configuration; they are not replicated otherwise. You can create replicated queues and exchanges with the qpid-config management tool like this:
      qpid-config add queue myqueue --replicate all
    
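An exchange can be marked for replication in the same way. For example, assuming the --replicate option applies to exchanges as it does to queues (myexchange is an example name):

```shell
# Create a fanout exchange whose existence and bindings are replicated,
# but not its messages.
qpid-config add exchange fanout myexchange --replicate configuration
```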
To create replicated queues and exchanges via the client API, add a node entry to the address like this:
      "myqueue;{create:always,node:{x-declare:{arguments:{'qpid.replicate':all}}}}"
    

1.13.4. Client Fail-over

Clients can only connect to the single primary broker. All other brokers in the cluster are backups, and they automatically reject any attempt by a client to connect.

Clients are configured with the addresses of all of the brokers in the cluster. [1] When the client tries to connect initially, it will try all of its addresses until it successfully connects to the primary. If the primary fails, clients will try to re-connect to all the known brokers until they find the new primary.

Suppose your cluster has 3 nodes: node1, node2 and node3 all using the default AMQP port.

With the C++ client, you specify all the cluster addresses in a single comma-separated URL and enable automatic reconnect, for example:

	qpid::messaging::Connection c("node1,node2,node3", "{reconnect:true}");
      

With the python client, you specify reconnect=True and a list of host:port addresses as reconnect_urls when calling establish or open:

	connection = qpid.messaging.Connection.establish("node1", reconnect=True, reconnect_urls=["node1", "node2", "node3"])
      

1.13.5. Broker fail-over

Broker fail-over is managed by a cluster resource manager. The initial preview version of HA is not integrated with a resource manager; the production version will be integrated with rgmanager and may be integrated with other resource managers in the future.

The resource manager is responsible for ensuring that exactly one broker is acting as primary at all times. It selects the initial primary broker when the cluster is started, detects failure of the primary, and chooses the backup to promote as the new primary.

You can simulate the actions of a resource manager, or do your own integration with a resource manager, using the qpid-ha tool. The command

	qpid-ha promote -b host:port
      

will promote the broker listening on host:port to be the primary. You should only promote a broker to primary when there is no other primary in the cluster. The brokers will not detect multiple primaries; they rely on the resource manager to do that.

A clustered broker always starts initially in discovery mode. It uses the addresses configured in the ha-brokers configuration option and tries to connect to each in turn until it finds the primary. The resource manager is responsible for choosing one of the backups to promote as the initial primary.

If the primary fails, all the backups are disconnected and return to discovery mode. The resource manager chooses one to promote as the new primary. The other backups will eventually discover the new primary and reconnect.
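Putting this together, a fail-over can be exercised by hand with qpid-ha. A rough sequence, assuming brokers running on the example hosts node1, node2 and node3:

```shell
# All three brokers start as backups in discovery mode;
# promote one of them to be the initial primary.
qpid-ha promote -b node1:5672

# ... node1 fails; the remaining backups return to discovery mode ...

# Promote a new primary. node3 will eventually discover node2
# as the new primary and reconnect as its backup.
qpid-ha promote -b node2:5672
```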

1.13.6. Broker Administration

You can connect to a backup broker with the administrative tool qpid-ha. You can also connect with the tools qpid-config, qpid-route and qpid-stat if you pass the flag --ha-admin on the command line. If you do connect to a backup you should not modify any of the replicated queues, as this will disrupt the replication and may result in message loss.
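For example, to list the queues on a backup broker, something like the following might be used (backup-host:5672 is an example address, and the exact qpid-stat invocation may vary between releases):

```shell
# --ha-admin allows the tool to connect to a backup broker,
# which would otherwise reject the connection.
qpid-stat -q --ha-admin backup-host:5672
```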



[1] If the resource manager supports virtual IP addresses then the clients can be configured with a single virtual IP address.