What is Fault Tolerance in VMware

Posted on 9:48 PM by Bharathvn

What is VMware Fault Tolerance?
VMware Fault Tolerance is a feature that allows a new level of guest redundancy. Information regarding this feature can be found in Understanding VMware Fault Tolerance (1010601).
How do I turn it on?
The feature is enabled on a per virtual machine basis. Instructions for enabling Fault Tolerance can be found in the Turning on Fault Tolerance for Virtual Machines section of the vSphere Availability Guide.
What happens when I turn on Fault Tolerance?
In very general terms, a second virtual machine is created to work in tandem with the virtual machine you have enabled Fault Tolerance on. This virtual machine resides on a different host in the cluster, and runs in virtual lockstep with the primary virtual machine. When a failure is detected, the second virtual machine takes the place of the first one with the least possible interruption of service. More specific information about how this is achieved can be found in the Protecting Mission-Critical Workloads with VMware Fault Tolerance whitepaper.
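To make the lockstep idea concrete, here is a minimal toy sketch in Python. It is not VMware's implementation. It uses generic record-and-replay as a stand-in: a deterministic "machine" fed the same inputs in the same order ends up in the same state. The names ToyVM, run_in_lockstep, and the log list are hypothetical and exist only for this illustration.

```python
import random


class ToyVM:
    """A deterministic toy machine: its state depends only on the inputs it is fed."""

    def __init__(self):
        self.state = 0

    def step(self, value):
        # Deterministic transition: the same inputs in the same order
        # always produce the same state.
        self.state = (self.state * 31 + value) % 1_000_003


def run_in_lockstep(num_events, seed=0):
    """Feed inputs to a primary, log them, and replay the log on a secondary."""
    rng = random.Random(seed)  # stands in for unpredictable external input
    primary, secondary = ToyVM(), ToyVM()
    log = []                   # hypothetical logging channel between the two hosts

    for _ in range(num_events):
        value = rng.randrange(256)  # only the primary sees the real input
        log.append(value)           # the input is recorded for the secondary
        primary.step(value)

    for value in log:               # the secondary replays the recorded inputs
        secondary.step(value)

    return primary.state, secondary.state
```

Calling run_in_lockstep(100) returns two equal states, which is the property that lets a secondary take over without losing anything. The real mechanism is described in the whitepaper mentioned above.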
Why can't I turn Fault Tolerance on?
VMware Fault Tolerance can be enabled on any virtual machine that resides in a cluster that meets the necessary requirements. If you have difficulty enabling Fault Tolerance for a specific virtual machine, see The Turn on Fault Tolerance option is disabled (1010631).
How do I turn Fault Tolerance off?
Instructions for disabling Fault Tolerance can be found in Disabling or Turning Off VMware FT (1008026).
How do I tell if my environment is ready for Fault Tolerance?
The VMware SiteSurvey Tool is used to check your environment for compliance with VMware Fault Tolerance. It can be downloaded at http://www.vmware.com/download/shared_utilities.html.
Where do I find the product's website?
VMware has a website for the Fault Tolerance product at http://www.vmware.com/products/fault-tolerance/.
What happens during a failure?
When the host running the primary virtual machine fails, a transparent failover occurs to the corresponding secondary virtual machine. During this failover, there is no data loss or noticeable service interruption. In addition, VMware HA automatically restores redundancy by starting a new secondary virtual machine on another host. Similarly, if the host running the secondary virtual machine fails, VMware HA starts a new secondary virtual machine on a different host. In either case, end users see no noticeable outage.
What is the logging time delay between the Primary and Secondary Fault Tolerance virtual machines?
The actual delay depends on the network latency between the Primary and Secondary. vLockstep executes the same instructions on both, but because they run on different hosts, the Secondary can lag slightly behind the Primary, typically by less than 1 ms, with no loss of state. Fault Tolerance includes synchronization to keep the Primary and Secondary consistent.
In a cluster with more than 3 hosts, can you tell Fault Tolerance where to put the Fault Tolerance virtual machine, or does it choose on its own?
You can place the original (Primary) virtual machine yourself. You have full control, using DRS or VMotion, to assign it to any node. The placement of the Secondary, when it is created, is automatic and based on the available hosts. Once the Secondary has been created and placed, you can VMotion it to your preferred host.
What happens if the host containing the primary virtual machine comes back online (after a node failure)?
This node is put back in the pool of available hosts. There is no attempt to start or migrate the primary to that host.
Is the failover from the primary virtual machine to the secondary virtual machine dynamic or does Fault Tolerance restart a virtual machine?
The failover from primary to secondary virtual machine is dynamic, with the secondary continuing execution from the exact point where the primary left off. It happens automatically with no data loss, no downtime, and little delay. Clients see no interruption. After the dynamic failover, the secondary virtual machine becomes the new primary virtual machine, and a new secondary virtual machine is spawned automatically.
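The failover behaviour described in the answers above can be summarised as a small state model. The Python sketch below is only an illustration, not a VMware API: the FtPair class, its method names, and the host names in the usage note are hypothetical, and the rule for choosing a new secondary host (first online host in alphabetical order, other than the primary's host) is an assumption made to keep the example deterministic.

```python
class FtPair:
    """Toy model of a primary/secondary pair spread across cluster hosts."""

    def __init__(self, hosts, primary_host):
        self.available = set(hosts)  # hosts currently online
        self.primary = primary_host
        self.secondary = self._pick_host(exclude=primary_host)
        self.events = []             # stand-in for a failover event log

    def _pick_host(self, exclude):
        # Assumption: take the alphabetically first online host other than
        # `exclude`. Returns None when no other host is online.
        candidates = sorted(self.available - {exclude})
        return candidates[0] if candidates else None

    def host_failed(self, host):
        self.available.discard(host)
        if host == self.primary:
            # The secondary takes over and becomes the new primary...
            self.primary = self.secondary
            self.events.append(f"failover: primary now on {self.primary}")
            # ...and a new secondary is started on another host.
            self.secondary = self._pick_host(exclude=self.primary)
            self.events.append(f"new secondary on {self.secondary}")
        elif host == self.secondary:
            # Only redundancy was lost, so just start a new secondary.
            self.secondary = self._pick_host(exclude=self.primary)
            self.events.append(f"new secondary on {self.secondary}")

    def host_recovered(self, host):
        # The host rejoins the pool; the primary is not moved back to it.
        self.available.add(host)
```

For example, with hosts esx1, esx2, and esx3 and the primary on esx1, calling host_failed("esx1") promotes the secondary on esx2 to primary and starts a new secondary on esx3. A later host_recovered("esx1") changes neither placement.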
Where are Fault Tolerance failover events logged?
All failover events are logged by vCenter.
I encountered an error message that I can't find in the knowledge base. Where else should I check?
The vSphere Availability Guide contains a list of known errors in the Fault Tolerance Error Messages section.