Failover of servers

By server failover, we mean here the creation of clusters (simple or complex) that make applications highly available.

Simple clusters can be created with the help of UCARP. UCARP implements CARP, a patent-free alternative to VRRP. The purpose of this small application is to exchange multicast advertisements with its peer(s) to detect whether one of them is missing. One of the nodes running UCARP is elected master; if it disappears, a slave detects this and promotes itself to master. UCARP implements only the multicast exchange (a heartbeat protocol), the master-slave management (promotion, demotion, ...) and two actions, run when the node becomes master (UP) or slave (DOWN). It is up to the administrator to write the scripts run by the UP and DOWN actions. Any consistency checks, fencing, monitoring, ... must be implemented in these scripts and are not, strictly speaking, part of UCARP itself.
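The promotion and demotion logic described above can be sketched as a small state machine. This is only an illustration of the heartbeat idea, not UCARP's actual code or the CARP wire protocol: the class name FailoverNode, the dead_after timeout and the on_up/on_down callbacks are all hypothetical names chosen for this example, and no network traffic is involved.

```python
from typing import Callable


def _noop() -> None:
    """Default action: do nothing."""


class FailoverNode:
    """Toy model of one node in a master/backup pair (hypothetical, not UCARP code)."""

    def __init__(
        self,
        dead_after: float = 3.0,            # assumed timeout in seconds, for illustration
        on_up: Callable[[], None] = _noop,    # stands in for the administrator's UP script
        on_down: Callable[[], None] = _noop,  # stands in for the administrator's DOWN script
    ) -> None:
        self.dead_after = dead_after
        self.on_up = on_up
        self.on_down = on_down
        self.role = "backup"
        self.last_heard = 0.0

    def heard_master(self, now: float) -> None:
        """Record an advertisement received from a master at time `now`.

        Simplification: a master that hears another master always yields.
        A real implementation compares priorities before deciding who yields.
        """
        self.last_heard = now
        if self.role == "master":
            self.role = "backup"
            self.on_down()

    def tick(self, now: float) -> None:
        """Periodic check: promote ourselves if the master has been silent too long."""
        if self.role == "backup" and now - self.last_heard > self.dead_after:
            self.role = "master"
            self.on_up()


if __name__ == "__main__":
    events = []
    node = FailoverNode(
        dead_after=3.0,
        on_up=lambda: events.append("UP"),
        on_down=lambda: events.append("DOWN"),
    )
    node.heard_master(now=0.0)  # the master is alive
    node.tick(now=2.0)          # still inside the timeout window: stays backup
    node.tick(now=4.0)          # master silent for too long: promote, run UP
    node.heard_master(now=5.0)  # another master shows up: demote, run DOWN
    print(events)               # prints ['UP', 'DOWN']
```

Note that nothing in this model checks whether the managed service is healthy: as with UCARP, that responsibility stays in the UP and DOWN actions.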

Complex clusters can be created with the help of Pacemaker+Corosync. This is a full cluster framework implementing the OCF (Open Cluster Framework) standard. Out of the box, it ships with scripts (resource agents) to manage the most common applications and resources you may need. It provides all the components needed to build a cluster, so getting a cluster up and running is just a matter of configuration. On the other hand, because it can do a lot, it is rather complex to configure.
Some features of Pacemaker (non-exhaustive list):
  • fencing and STONITH support
  • multiple node configuration (more than 2)
  • multi-state resources: resources that can run on more than one node in different states. Practical example: DRBD replication, where a process must be started on two nodes, one as Primary and the other as Secondary
  • cloning of resources, allowing a single defined resource to run on more than one node with the same parameters
  • ordering and colocation of resources: creating a dependency tree of resources, for instance to start them in a given order on the same host
  • preferred location of resources in the cluster (to make a resource run on a given node whenever that node is online)
  • (...)
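To illustrate what ordering a dependency tree of resources means, here is a minimal sketch using a topological sort. It is not Pacemaker's configuration syntax or its scheduler: the resource names (drbd, filesystem, virtual_ip, database) and the functions start_order and stop_order are hypothetical, and colocation (same host) is not modelled. It assumes Python 3.9 or later for the graphlib module.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def start_order(requires: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every resource starts after the ones it requires.

    `requires` maps a resource name to the set of resources it depends on.
    Raises graphlib.CycleError if the dependencies form a loop.
    """
    return list(TopologicalSorter(requires).static_order())


def stop_order(requires: Dict[str, Set[str]]) -> List[str]:
    """Stopping is done in the reverse order of starting."""
    return list(reversed(start_order(requires)))


if __name__ == "__main__":
    # Hypothetical example: a database on a DRBD-backed filesystem,
    # reachable through a virtual IP address.
    requires = {
        "filesystem": {"drbd"},
        "database": {"filesystem", "virtual_ip"},
    }
    print("start:", start_order(requires))
    print("stop: ", stop_order(requires))
```

Resources with no dependency between them (drbd and virtual_ip here) may come out in either relative order; only the declared constraints are guaranteed.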
Pacemaker is helpful if you need to create a complex chain of resources, with interdependencies and various states. UCARP is suited to very simple situations, where the resource to manage can easily be started, stopped and checked.