replay.supported = "true"
replay.allowFT = "true"
replay.allowBTOnly = "true"
During the beta of vSphere 5, I did enable vFT, but only on an offline virtual machine to conserve compute resources. Today there was a question on the beta community about configuring vFT for vSphere 5, and I wanted to quickly validate that the configurations still hold true. I ran into an interesting error when trying to enable vFT: the power-on process for the secondary virtual machine failed with an error.
This was not an error I had seen before in vSphere 4 and looking at the vmkernel and vmware.log files, I noticed the following:
2011-07-31T17:31:39.314Z| vcpu-0| [vob.vmotion.stream.keepalive.read.fail] vMotion migration [ac1e0050:1312133702562144] failed to read stream keepalive: Connection closed by remote host, possibly due to timeout
2011-07-31T17:31:39.314Z| vcpu-0| [msg.checkpoint.precopyfailure] Migration to host <> failed with error Connection closed by remote host, possibly due to timeout (0xbad003f).
2011-07-31T17:31:39.324Z| vcpu-0| Migrate: secondary failure during migration: error Connection closed by remote host, possibly due to timeout.
I tried changing the advanced option on the vESX(i) host to increase the vMotion timeout but continued to hit the same error. I decided to look more closely at the first error message, "failed to read stream keepalive", and found an advanced ESX(i) setting called /Migrate/VMotionStreamDisable. This advanced option has been available since ESX(i) 4.x.
I decided to disable vMotion Stream and, to my surprise, FT was then able to power on the secondary virtual machine without running into that error.
Note: You may or may not run into this error message and the configuration may not be necessary. If you enable vFT on an offline VM, you should not have any issues as long as you meet the minimum Fault Tolerance requirements.
You can configure the advanced ESXi option using either esxcli or the legacy esxcfg-advcfg command. Since the option is named VMotionStreamDisable, a value of 1 is what disables vMotion Stream:
- esxcli system settings advanced set -o /Migrate/VMotionStreamDisable -i 1
- esxcfg-advcfg -s 1 /Migrate/VMotionStreamDisable
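If you want to confirm the change took effect, a small wrapper along these lines may help. This is only a sketch: `run_cmd`, `set_stream_disable`, and the `DRY_RUN` variable are hypothetical names made up for illustration, and the value to set is passed in as an argument rather than hard-coded. It shows the option before and after the change using the `list` sub-command of the same esxcli namespace.

```shell
#!/bin/sh
# Sketch only: show, change, and re-show /Migrate/VMotionStreamDisable.
# run_cmd, set_stream_disable and DRY_RUN are illustrative names, not part of ESXi.
OPTION="/Migrate/VMotionStreamDisable"

# Print the command when DRY_RUN=1, otherwise execute it.
run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: set_stream_disable <integer value>
set_stream_disable() {
    value="$1"
    # Show the current value before changing anything.
    run_cmd esxcli system settings advanced list -o "$OPTION"
    # Apply the new integer value.
    run_cmd esxcli system settings advanced set -o "$OPTION" -i "$value"
    # Show the option again to verify the change.
    run_cmd esxcli system settings advanced list -o "$OPTION"
}
```

On the vESX(i) host you would source the script and call, for example, `set_stream_disable 1`, on the assumption that a value of 1 is what turns vMotion Stream off. Setting `DRY_RUN=1` first only prints the commands, which is a harmless way to check them before touching the host.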
It is important to understand that even though one can set up vESX(i) hosts to test and play with some of the advanced functionality such as vMotion and FT, the actual behavior is unpredictable, as these configurations are unsupported by VMware. This is of course also a great feature for home labs and for studying for VMware certifications such as VCP and VCAP-DCA, but that should be the extent of leveraging these unsupported configurations.




We were trying to set up FT across physically separate sites for disaster recovery. However, since FT only allows one secondary VM, we are unable to have a config in which we get a local failover and, if a site is lost, a failover to the secondary site.
We thought that if we could run an FT VM within an FT VM, we might get close to what we need. A parent VM failure could be handled with a local failover, and if both the primary FT VM and its nested FT VM fail, the nested FT VM's secondary running on the secondary site would continue processing.
What do you think?
If you're looking for site failover, you may want to look into something like SRM. I'm not quite sure I follow how a nested FT VM would even work for you, nor would I rely on it or recommend it. This article was meant to show how you could enable it for testing and seeing it in action, specifically when studying for certifications.