Qdisk

From Alteeve Wiki


If you have a cluster of 2 to 16 nodes, you can use a quorum disk. This is a small partition on a shared storage device that the cluster can use to make much better decisions about which nodes should have quorum when a split in the network happens.

Unfortunately, qdisk does not work well on DRBD.

How It Works

The way a qdisk works, at its most basic, is to have one or more votes in quorum. Generally, but not necessarily always, the qdisk device has one vote less than the total number of nodes (N-1).

Two Scenarios

  • In a two node cluster, the qdisk would have one vote.
  • In a seven node cluster, the qdisk would have six votes.

Imagine these two scenarios: first without qdisk, then revisited to see how qdisk helps.

  • First Scenario; A two node cluster, which we will implement here.

If the network connection on the totem ring(s) breaks, you will enter into a dangerous state called a "split-brain". Normally, this can't happen because quorum can only be held by one side at a time. In a two_node cluster though, this is allowed.

Without a qdisk, either node could potentially start the cluster resources. This is a disastrous possibility and it is avoided by a fence duel. Both nodes will try to fence the other at the same time, but only the fastest one wins. The idea behind this is that one will always live because the other will die before it can get its fence call out. In theory, this works fine. In practice though, there are cases where fence calls can be "queued", which can, in fact, allow both nodes to die. This defeats the whole "high availability" thing, now doesn't it? Also, this possibility is why the two_node option is the only exception to the quorum rules.

How It Helps

So how does a qdisk help?

Two ways!

First;

The biggest way it helps is by getting away from the two_node exception. With the qdisk partition, you are back up to three votes, so there will never be a 50/50 split. If either node retains access to the quorum disk while the other loses access, then right there things are decided. The one with the disk has 2 votes, so it wins quorum and will fence the other. Meanwhile, the other will only have 1 vote, so it will lose quorum, withdraw from the cluster and not try to fence the other node.
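The vote counting above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how the cluster software is implemented; the names `node_votes` and `has_quorum` are made up for this example, and it assumes each node carries one vote.

```python
# Hypothetical model of the two-node example: each node has one vote and
# the qdisk contributes one more, for three expected votes in total.
NODE_VOTES = 1
QDISK_VOTES = 1
EXPECTED_VOTES = 2 * NODE_VOTES + QDISK_VOTES  # 3


def votes_needed(total_votes):
    """Votes required for quorum: (total / 2) + 1, rounded down."""
    return total_votes // 2 + 1


def node_votes(sees_qdisk):
    """Votes a single node holds after losing contact with its peer."""
    return NODE_VOTES + (QDISK_VOTES if sees_qdisk else 0)


def has_quorum(sees_qdisk):
    return node_votes(sees_qdisk) >= votes_needed(EXPECTED_VOTES)


# One node still reaches the quorum disk, the other does not.
for name, sees_qdisk in (("node1", True), ("node2", False)):
    verdict = "keeps quorum" if has_quorum(sees_qdisk) else "loses quorum"
    print(f"{name}: {node_votes(sees_qdisk)} vote(s), {verdict}")
```

With three expected votes, two are needed for quorum, so only the node that still sees the disk is allowed to carry on.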

Second;

You can use heuristics with qdisk to have a more intelligent partition recovery mechanism. For example, let's look again at the scenario where the link(s) between the two nodes hosting the totem ring is cut. This time though, let's assume that the storage network link is still up, so both nodes have access to the qdisk partition. How would the qdisk act as a tie breaker?

One way is to have a heuristics test that checks to see if one of the nodes has access to a particular router. With this heuristics test, if only one node had access to that router, the qdisk would give its vote to that node and ensure that the "healthiest" node survived. Pretty cool, eh?

  • Second Scenario; A seven node cluster with six dead members.

Admittedly, this is an extreme scenario, but it serves to illustrate the point well. Remember how we said that the general rule is that the qdisk has N-1 votes?

With our seven node cluster, on its own, there would be a total of 7 votes, so normally quorum would require 4 nodes be alive (((7/2)+1) = (3.5+1) = 4.5, rounded down is 4). With the death of the fourth node, all cluster services would fail. We understand now why this would be the case, but what if the nodes are, for example, serving up websites? In this case, 3 nodes are still sufficient to do the job. Heck, even 1 node is better than nothing. With the rules of quorum though, it just wouldn't happen.
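If you want to play with the numbers, here is a small Python sketch of the quorum formula used above. It assumes one vote per node and no qdisk; the function name `votes_needed` is ours, chosen for illustration.

```python
def votes_needed(total_votes):
    """Votes required for quorum: (total / 2) + 1, rounded down."""
    return total_votes // 2 + 1


TOTAL_VOTES = 7  # seven nodes, one vote each, no qdisk
needed = votes_needed(TOTAL_VOTES)
print(f"Votes needed for quorum: {needed}")

# Walk down from a full cluster to a single survivor.
for alive in range(TOTAL_VOTES, 0, -1):
    state = "quorate" if alive >= needed else "inquorate"
    print(f"{alive} node(s) alive: {state}")
```

The integer division does the "rounded down" step for us, so the cluster is reported as inquorate as soon as only three nodes remain.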

Let's now look at how the qdisk can help.

By giving the qdisk partition 6 votes, you raise the cluster's total expected votes from 7 to 13. With this new count, the number of votes needed for quorum is 7 (((13/2)+1) = (6.5+1) = 7.5, rounded down is 7).

So looking back at the scenario where we've lost four of our seven nodes: the surviving nodes have 3 votes, but they can talk to the qdisk, which provides another 6 votes, for a total of 9. With that, quorum is achieved and the three nodes are allowed to form a cluster and continue to provide services. Even if you lose all but one node, you are still in business because the one surviving node, which is still able to talk to the qdisk and thus win its 6 votes, has a total of 7 and thus has quorum!
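The same arithmetic with the qdisk added can be sketched as follows. Again, this is a toy model under the assumptions in the text (one vote per node, qdisk holding N-1 votes, survivors able to reach the qdisk), and the helper names are invented for the example.

```python
NODES = 7
QDISK_VOTES = NODES - 1               # the general N-1 rule: 6 votes
EXPECTED_VOTES = NODES + QDISK_VOTES  # 13


def votes_needed(total_votes):
    """Votes required for quorum: (total / 2) + 1, rounded down."""
    return total_votes // 2 + 1


def surviving_votes(nodes_alive, sees_qdisk=True):
    """Votes held by the surviving nodes, plus the qdisk if they can reach it."""
    return nodes_alive + (QDISK_VOTES if sees_qdisk else 0)


needed = votes_needed(EXPECTED_VOTES)
for alive in (3, 1):
    votes = surviving_votes(alive)
    state = "quorate" if votes >= needed else "inquorate"
    print(f"{alive} node(s) + qdisk = {votes} votes, need {needed}: {state}")
```

Note that the survivors only stay quorate while they can reach the qdisk; call `surviving_votes(3, sees_qdisk=False)` and the three nodes are back to 3 votes, well short of 7.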

There is another benefit. As we mentioned in the first scenario, we can add heuristics to the qdisk. Imagine that, rather than having six nodes die, they instead partition off because of a break in the network. Without qdisk, the six nodes would easily win quorum, fence the one other node and then reform the cluster. What if, though, the one lone node was the only one with access to a critical route to the Internet? The six nodes would be useless in a web-server environment. With the heuristics provided by qdisk, that one useful node would get the qdisk's 6 votes and win quorum over the other six nodes!
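Here is a rough Python sketch of that partition. It simplifies heavily: it assumes the qdisk's votes go to whichever side passes the heuristic test, and it stands in a plain True/False for the result of that test. The names `partition_votes` and `partitions` are made up for the example and do not correspond to anything in the cluster software.

```python
NODES = 7
QDISK_VOTES = NODES - 1               # 6 votes
EXPECTED_VOTES = NODES + QDISK_VOTES  # 13


def votes_needed(total_votes):
    """Votes required for quorum: (total / 2) + 1, rounded down."""
    return total_votes // 2 + 1


def partition_votes(node_count, passes_heuristic):
    # Assumption: only a partition that passes the heuristic test
    # (for example, it can still reach the Internet) gets the qdisk votes.
    return node_count + (QDISK_VOTES if passes_heuristic else 0)


# (number of nodes, passes heuristic) for each side of the network break
partitions = {"lone node": (1, True), "other six": (6, False)}

needed = votes_needed(EXPECTED_VOTES)
results = {}
for name, (count, passes) in partitions.items():
    votes = partition_votes(count, passes)
    results[name] = votes >= needed
    state = "wins quorum" if results[name] else "loses quorum"
    print(f"{name}: {votes} votes, need {needed}: {state}")
```

The six nodes on their own hold 6 votes, one short of the 7 needed, while the lone node with the qdisk's 6 votes holds exactly 7.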

A little qdisk goes a long way.

Further Reading

Red Hat's Rob Kenna has a fantastic article on qdisk. It is somewhat dated, but is still very much worth reading.

Quorum disks are limited to clusters with no more than 16 nodes. This limitation is due to inherent storage latency with larger clusters.

You can learn more about quorum disk below:

 

© Alteeve's Niche! Inc. 1997-2024   Anvil! "Intelligent Availability®" Platform
legal stuff: All info is provided "As-Is". Do not use anything here unless you are willing and able to take responsibility for your own actions.