RabbitMQ Cluster - Are queues on more than one node? And how to create failover

By default, a queue in a RabbitMQ cluster lives on only one of the nodes, along with all of its content. This can be changed by using mirrored queues. Either way, every queue behaves as if it were on all nodes: actions such as handling consumers or routing messages can be performed through any node, and that node forwards the action to the node where the queue actually is. This is fully transparent to users. Exchanges and bindings, on the other hand, exist on all nodes. Let us take a look at mirrored and non-mirrored queues.
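To make the "one home node, reachable through any node" idea concrete, here is a small toy model in Python. It is only a sketch and does not talk to RabbitMQ: the `Node` and `Cluster` classes, the node names and the `orders` queue are made up for illustration, and the model assumes a new queue is placed on the node the client is connected to.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node in a toy cluster (hypothetical, not a RabbitMQ API)."""
    name: str
    queues: dict = field(default_factory=dict)  # queues physically stored on this node


class Cluster:
    def __init__(self, node_names):
        self.nodes = {name: Node(name) for name in node_names}
        self.queue_home = {}  # queue name -> name of the node that holds it

    def declare_queue(self, via_node, queue):
        # Assumption of this model: the queue lives on the node it was declared through.
        if queue not in self.queue_home:
            self.queue_home[queue] = via_node
            self.nodes[via_node].queues[queue] = []

    def publish(self, via_node, queue, message):
        # Whichever node the client uses, the message ends up on the home node.
        home = self.queue_home[queue]
        self.nodes[home].queues[queue].append(message)
        return home

    def consume(self, via_node, queue):
        # Consuming through any node is forwarded to the home node as well.
        home = self.queue_home[queue]
        return self.nodes[home].queues[queue].pop(0)


cluster = Cluster(["node-a", "node-b", "node-c"])
cluster.declare_queue("node-a", "orders")
stored_on = cluster.publish("node-b", "orders", "order-1")  # published via node-b
received = cluster.consume("node-c", "orders")              # consumed via node-c
print(stored_on, received)
```

The client never needs to know that `orders` lives on `node-a`; only that node stores the content.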

Mirrored queues

You can mirror queues in RabbitMQ. A mirrored queue consists of a master and one or several mirrors. All messages are sent to the master and to the mirrors. If the master node goes down, one of the mirrors takes over as the new master, so the messages it already holds are not lost.

You may think that you are now missing a mirror. However, RabbitMQ will make one of the remaining nodes a mirror of the new master, so you always have the same number of mirrors, unless there are no more nodes left to turn into mirrors.
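The promotion and replacement described above can be sketched as a toy simulation. This is not how RabbitMQ is implemented; `MirroredQueue`, its methods and the node names are hypothetical, and the model simply promotes the first mirror in its list.

```python
class MirroredQueue:
    """Toy model of a master with a fixed target number of mirrors."""

    def __init__(self, nodes, mirror_count):
        self.available = list(nodes)      # nodes that are currently up
        self.mirror_count = mirror_count  # how many mirrors we would like to have
        self.master = self.available[0]
        self.mirrors = []
        self._recruit_mirrors()

    def _recruit_mirrors(self):
        # Turn spare nodes into mirrors until the target is reached
        # or there are no more nodes to use.
        for node in self.available:
            if len(self.mirrors) >= self.mirror_count:
                break
            if node != self.master and node not in self.mirrors:
                self.mirrors.append(node)

    def node_down(self, node):
        self.available.remove(node)
        if node == self.master:
            if not self.mirrors:
                raise RuntimeError("no mirror left to promote")
            self.master = self.mirrors.pop(0)  # a mirror takes over as master
        elif node in self.mirrors:
            self.mirrors.remove(node)
        self._recruit_mirrors()                # replace the missing mirror if possible


queue = MirroredQueue(["node-a", "node-b", "node-c", "node-d"], mirror_count=2)
queue.node_down("node-a")
after_first = (queue.master, list(queue.mirrors))
queue.node_down("node-b")
after_second = (queue.master, list(queue.mirrors))
print(after_first, after_second)
```

After the first failure the queue still has two mirrors. After the second it has only one, because there are no more nodes to make mirrors of.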

Having mirrors creates high availability but not load balancing, since it is always the master node that services consumers. When the master has handled a message, the mirrors automatically delete it as well. However, all nodes (even nodes that are not mirrors) can still route requests to the master, so any node can be used to reach the master, not just the mirrors.

Mirrors are created and configured using policies. In a policy you can set how many mirrors you want. More details on how to set this up can be found here.
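As a rough illustration of such a policy, the sketch below builds a `rabbitmqctl set_policy` command for classic mirrored queues using the `ha-mode` and `ha-params` keys. It assumes a RabbitMQ version that still supports classic queue mirroring. The helper name `mirror_policy_command`, the policy name and the `^ha\.` pattern are made up for this example. Note that with `"ha-mode": "exactly"` the count in `ha-params` includes the master, so two mirrors means a value of 3. The script only prints the command; it does not run it.

```python
import json
import shlex


def mirror_policy_command(policy_name, queue_pattern, mirrors):
    """Build the argument list for a mirroring policy (hypothetical helper)."""
    definition = {
        "ha-mode": "exactly",
        "ha-params": mirrors + 1,  # total replicas: the master plus its mirrors
    }
    return [
        "rabbitmqctl", "set_policy",
        policy_name,             # name of the policy
        queue_pattern,           # regex matched against queue names
        json.dumps(definition),  # policy definition as JSON
    ]


# Queues whose names start with "ha." get one master and two mirrors.
argv = mirror_policy_command("ha-two-mirrors", "^ha\\.", mirrors=2)
command_line = shlex.join(argv)  # shell-quoted form, for copy and paste
print(command_line)
```

Passing `argv` to `subprocess.run` on a machine with `rabbitmqctl` installed would apply the policy; printing it first makes it easy to review.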

So in this scenario you have a queue on multiple nodes which serves as failover.

Non-mirrored queues

What happens to a non-mirrored queue when its node goes down depends on whether the queue is durable or not. If the queue is not durable, it is simply deleted and can be redeclared on another node.

If the queue is durable, it becomes unavailable instead. All actions on this queue will cause the following error: "operation queue.declare caused a channel exception notfound". This renders the queue useless until the node is back online. Therefore, queue mirroring is advisable if you want high availability.

Durable queues that are not mirrored in a cluster setup have a single point of failure: the node that hosts them.
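The cases from this post can be summed up in a few lines of Python. This is just a lookup of the outcomes described above, not something RabbitMQ exposes; the function name and the returned strings are my own, and the mirrored case assumes at least one mirror sits on a node that is still up.

```python
def queue_state_after_node_failure(durable, mirrored):
    """Outcome for a queue when the node holding it goes down (toy summary)."""
    if mirrored:
        # Assumption: a mirror on a surviving node is promoted to master.
        return "available"
    if durable:
        # The queue still exists but cannot be used until its node is back.
        return "unavailable"
    # A non-durable, non-mirrored queue is deleted and can be redeclared elsewhere.
    return "deleted"


outcomes = {
    (durable, mirrored): queue_state_after_node_failure(durable, mirrored)
    for durable in (True, False)
    for mirrored in (True, False)
}
for (durable, mirrored), state in outcomes.items():
    print(f"durable={durable} mirrored={mirrored} -> {state}")
```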

Wrapping it up

Did I miss something, or is anything unclear? Let me know in the comments :)