
How to Decrease iSCSI Login Timeout on ESXi 5?

10/14/2011 by William Lam 6 Comments

VMware recently identified an issue where the iSCSI boot process may take longer than expected on ESXi 5. This can occur when the iSCSI targets are unavailable while the ESXi host is booting up, and the additional retry code that was added in ESXi 5 can then delay host startup. It has been noted that the number of retry iterations is hard-coded in the iSCSI stack to nine and cannot be modified.

UPDATE1: VMware just released the patch for iSCSI delay in ESXi 5 Express Patch 01 - kb.vmware.com/kb/2007108

Here is what you would see in /var/log/syslog.log after the ESXi host boots up:

Being the curious person that I am, I decided to see if this hard-coded value can actually be modified or tweaked. I believe I have found a way to decrease the delay significantly, but this has only been tested in a limited lab environment. If you are potentially impacted by this iSCSI boot delay and would like to test this unsupported configuration change, I would be interested to hear whether it in fact reduces the delay.

*** DISCLAIMER ***

This is not supported by VMware. Please test this in a staging/development environment before pushing it out to any critical systems.
*** DISCLAIMER ***

My initial thought was to check out the iscsid.conf configuration file, but I noticed that VMware does not make use of this default configuration file (it is not automatically backed up by ESXi); instead it uses a small sqlite database file located at /etc/vmware/vmkiscsid/vmkiscsid.db.

Since vmkiscsid.db is actively backed up by ESXi, any changes will persist through a system reboot and do not require any special tricks.

UPDATE: Thanks to Andy Banta's comment below, you can view the iSCSI sqlite database by using the unsupported --dump-db and -x flags in vmkiscsid.

To dump the entire database, you can use the following command:

vmkiscsid --dump-db

To execute a specific SQL query such as viewing a particular database table, you can use the following command:

vmkiscsid -x "select * from discovery"

Again, these flags are unsupported by VMware, so use them at your own risk, but they do save you from copying the iSCSI database to another host for edits.

To view the contents of the sqlite file, I scp'ed the file off to a remote host that has a sqlite3 client, such as the VMware vMA appliance. After a bit of digging and some trial and error, I found that the parameter discovery.ltime in the discovery table alters the time it takes for the iSCSI boot process to complete when the iSCSI targets are unavailable during boot. Before you make any changes, make a backup of your vmkiscsid.db file in case you need the original.
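
The backup and copy could look something like the following (a minimal sketch; the ESXi host name is a placeholder and only the vmkiscsid.db path comes from above):

# on the ESXi 5 host, keep an untouched copy of the iSCSI database
cp /etc/vmware/vmkiscsid/vmkiscsid.db /etc/vmware/vmkiscsid/vmkiscsid.db.bak

# copy the database to a host with a sqlite3 client, such as the vMA appliance
scp root@esxi01.primp-industries.com:/etc/vmware/vmkiscsid/vmkiscsid.db .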

To view the sqlite file, use the following command:

sqlite3 vmkiscsid.db
.mode line
select * from discovery;

The default value is 27, and what I have found is that as long as it is not set to this particular value, the retry loop is either not executed or exits almost immediately after retrying for only a few seconds.

To change the value, I used the following SQL command:

update discovery set 'discovery.ltime'=1; 

To verify the value, you can use the following SQL command:

.mode line
select * from discovery;

Once you have confirmed the change, type .quit to exit and then upload the modified vmkiscsid.db file back to your ESXi 5 host.

Next, to ensure the changes are saved immediately to the backup bootbank, run /sbin/auto-backup.sh, which forces an ESXi configuration backup.
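
Putting the last two steps together, something along these lines should do it (again a sketch; the ESXi host name is a placeholder):

# from the remote host, push the modified database back to the ESXi 5 host
scp vmkiscsid.db root@esxi01.primp-industries.com:/etc/vmware/vmkiscsid/vmkiscsid.db

# on the ESXi 5 host, force the configuration backup to the bootbanks
/sbin/auto-backup.sh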

At this point you can test: disconnect your iSCSI target from the network and reboot your ESXi 5 host; it should hopefully take less time to get through the iSCSI boot process.

As you can see from this screenshot of the ESXi syslog.log, it took only 15 seconds to retry and then continue through the boot process.

In my test environment, I set up a vESXi host with a software iSCSI initiator bound to three VMkernel interfaces and connected to ten iSCSI targets on a Nexenta VSA. I disconnected the network adapter on the iSCSI target, modified discovery.ltime on the ESXi host, and checked the logs to see how long it took to get past the retry code.

Here is a table of the results:

discovery.ltime    iSCSI Bootup Delay
1                  15 sec
3                  15 sec
6                  15 sec
12                 15 sec
24                 15 sec
26                 15 sec
27                 7 min
28                 15 sec
81                 15 sec

As you can see, only the value of 27 causes the extremely long delay (7 minutes in my environment); all other values behave pretty much the same (roughly 15 seconds). VMware did mention that the number of iterations is hard-coded to 9, and 27 divided by 9 is 3. I tried using other values that were multiples of three, but I was not able to find any correlation to the delay other than none of them taking as long as the value 27. I also initially tested with 5 iSCSI targets and doubled it to 10, but the number of targets did not seem to be a factor in the overall delay.

I did experiment with other configuration parameters, such as node.session.initial_login_retry_max, but they did not change the amount of time or the number of iterations of the iSCSI retry code. Ultimately, I believe that because the retry iterations are hard-coded, modifying discovery.ltime either bypasses the retry code or reduces the number of retries altogether. I am not an iscsid expert, so it is possible that other parameter changes could also decrease the wait time.

UPDATE: Please take a look at Andy Banta's comment below regarding the significance of the value 27 and the official definitions of the discovery.ltime and node.session.initial_login_retry_max parameters. Even though the behavior from my testing seems to reduce the iSCSI boot delay, there is an official fix coming from VMware.


I would be interested to see if this hack holds true for environments with multiple network portals or a greater number of iSCSI targets. If you are interested and would like to test this theory, please leave a comment describing your environment.


Filed Under: Uncategorized Tagged With: esxi5, iscsi, vSphere 5

How to Create a vCenter Alarm to Monitor for root Logins

10/12/2011 by William Lam 6 Comments

Another interesting question on the VMTN forums this week: a user was looking for a way to trigger a vCenter alarm when someone logs in to an ESX(i) host using the root account. By default there are several dozen pre-defined vCenter alarms that you can adjust or modify to your needs, but they do not cover every single condition/event that can be triggered via an alarm. This is where the power of the vSphere API comes in. If you browse through the available event types, you will find a session-related category called SessionEvent, and within that category of events you will see UserLoginSessionEvent.

Now that we have identified the particular event we are interested in, we simply create a new custom alarm that monitors for this event and ensures that the "userName" property matches "root", the user we want to alarm on. I wrote a vSphere SDK for Perl script called monitorUserLoginAlarm.pl that can be used to create an alarm for any particular user login.

The script requires only two parameters: alarmname (the name of the vCenter alarm) and user (the username to alarm on). Here is a sample run for monitoring root user logins on an ESX(i) host:
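
A hypothetical invocation might look like the following; the --server, --username and --password connection options are the standard vSphere SDK for Perl options, and the exact flag names for the script's own parameters are assumptions on my part:

# create an alarm named "Root Login" that triggers whenever root logs in
./monitorUserLoginAlarm.pl --server vcenter.primp-industries.com --username administrator --alarmname "Root Login" --user root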

The alarm will be created at the vCenter Server level and you should see the new alarm after executing the script.

Note: The alarm action is currently just to alert within vCenter; if you would like it to perform other operations, such as sending an email or an SNMP trap, you can edit the alarm after it has been created by the script.

Next it is time to test out the new alarm. If you go to the "Alarms" tab, click on "Triggered Alarms" and log in to one of the managed ESX(i) hosts using the vSphere Client with the root account, you should see the new alarm trigger immediately.

If we view the "Tasks/Events" tab for more details, we can confirm the login event and that it was from someone using the root account.

As you can see, even though this particular event is not available as a default selection, you can still use the vSphere API to create a custom alarm that monitors for it.

I do not know the original intent behind monitoring root logins, but if there is a fear of the root account being used, the easiest way to prevent this is to enable Lockdown Mode for your ESXi host through vCenter.


Filed Under: Uncategorized Tagged With: alarm, api, root, vsphere sdk for perl

How to Generate VM Remote Console URL for vSphere 5.x Web Client

10/11/2011 by William Lam 64 Comments

There was a question last week on the VMTN community forums about generating a shortcut URL to a virtual machine's remote console in the new vSphere 5 Web Client. Those of you who have used vCenter Web Access may recall the option to generate a desktop shortcut to a particular virtual machine's remote console, including the ability to obfuscate the generated URL that could then be provided to your users.

With the updated vSphere 5 Web Client, there is no option to generate the remote console URL, but there is a link that you can manually copy and provide to your users. This of course is not ideal, but after a bit of tinkering, I was able to figure out how to generate the remote console URL for any virtual machine in the new vSphere 5.x Web Client.

I also created a vSphere SDK for Perl script a while back called generateVMRemoteConsoleURL.pl, which helps users automate the URL generation for vSphere 4.x environments; it has now been updated to support vSphere 5.

Here is an example of what the URL looks like for vSphere 5.0:

https://reflex.primp-industries.com:9443/vsphere-client/vmrc/vmrc.jsp?vm=EE26E7F6-591B-4256-BD7A-402E5AC9E0A8:VirtualMachine:vm-1506

Here is an example of what the URL looks like for vSphere 5.1 & 5.5:

https://reflex.primp-industries.com:9443/vsphere-client/vmrc/vmrc.jsp?vm=urn:vmomi:VirtualMachine:vm-1506:EE26E7F6-591B-4256-BD7A-402E5AC9E0A8

There are basically three important components to the URL, which are assembled in the example following this list:

  • Hostname of the vCenter Server - reflex.primp-industries.com
  • The vCenter instanceUUID, which is used to uniquely identify a vCenter Server - EE26E7F6-591B-4256-BD7A-402E5AC9E0A8
  • The MoRef ID of the virtual machine - vm-1506
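
As a rough illustration, the vSphere 5.0 style URL could be stitched together from those three pieces like this (a sketch only, using the example values from the URL above):

# example values taken from the vSphere 5.0 URL above
VCENTER=reflex.primp-industries.com
INSTANCE_UUID=EE26E7F6-591B-4256-BD7A-402E5AC9E0A8
VM_MOREF=vm-1506

# print the vSphere 5.0 style remote console URL
echo "https://${VCENTER}:9443/vsphere-client/vmrc/vmrc.jsp?vm=${INSTANCE_UUID}:VirtualMachine:${VM_MOREF}"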

Since the Web Client does not support URL customization, you only need to provide the name of a virtual machine to the generateVMRemoteConsoleURL.pl script and the URL will be generated for you.

Here is an example execution of the script:
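
The invocation would be something along these lines (the --vmname flag is an assumption on my part; the connection options are the standard vSphere SDK for Perl ones):

# generate the remote console URL for the virtual machine named MyVM
./generateVMRemoteConsoleURL.pl --server reflex.primp-industries.com --username administrator --vmname MyVM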

Now you can take the URL output from the script and enter it into a supported web browser. You will then be asked to authenticate before it allows you to access the remote console of a particular virtual machine.

You can now provide your users with direct links to specific virtual machines.


Filed Under: VMRC, vSphere, vSphere Web Client Tagged With: remote console, vmrc, vSphere 5, web client

How to Automate the disabling of the VAAI UNMAP primitive in ESXi 5

10/01/2011 by William Lam 5 Comments

I just saw an interesting article on Jason Boche's blog about VMware's recall of the new VAAI UNMAP primitive in vSphere 5. VMware released a new KB article, KB2007427, documenting the details along with a recommendation to disable the VAAI UNMAP primitive for now (a patch will be released in the future to automatically disable it until the issue is resolved).

One thing that caught my eye in the VMware KB article is that to disable the VAAI UNMAP primitive, you need to manually log in to ESXi 5 Tech Support Mode (SSH) to run the local version of the esxcli command. This is trivial if you have several hosts, but it can be time consuming if you have several hundred hosts to manage. Even though "VMFS3.EnableBlockDelete" is a hidden parameter that cannot be seen using any of the supported utilities, you can enable and disable it using the remote version of esxcli, which is part of vMA 5 or vCLI 5.

Here is an example of the command when connecting directly to an ESXi 5 host:

esxcli --server himalaya.primp-industries.com --username root system settings advanced set --int-value 1 --option /VMFS3/EnableBlockDelete

Here is an example of the command when connecting to vCenter Server:

esxcli --server reflex.primp-industries.com --vihost himalaya.primp-industries.com --username administrator system settings advanced set --int-value 1 --option /VMFS3/EnableBlockDelete

As you can see, you can easily wrap this in a simple for-loop to disable the VAAI UNMAP primitive across all of your hosts. So here is a script to help with exactly that, called vaaiUNMAP.sh.
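
The core of such a loop is roughly the following (a simplified sketch of the idea, not the actual vaaiUNMAP.sh; the host list file and vCenter name are placeholders):

# hosts.txt contains one ESXi 5 host name per line
for ESXI_HOST in $(cat hosts.txt); do
  # 0 disables the VAAI UNMAP primitive on each host via vCenter
  esxcli --server reflex.primp-industries.com --vihost ${ESXI_HOST} --username administrator system settings advanced set --int-value 0 --option /VMFS3/EnableBlockDelete
done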

The script accepts 4 parameters:

  1. A list of ESXi 5 hosts to enable or disable VAAI UNMAP primitive
  2. Name of the vCenter Server managing the ESXi 5 hosts
  3. vCenter auth file which contains the username/password
  4. 0 to disable or 1 to enable VAAI UNMAP primitive

The auth file is just a file that contains the following:

VI_USERNAME=administrator
VI_PASSWORD=y0mysuperdupersecurepassword

Here is an example of disabling the VAAI UNMAP primitive on 3 ESXi hosts being managed by a vCenter Server:
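
Based on the four parameters described above, a hypothetical run could look like this (the host list and auth file names are placeholders):

# disable (0) the VAAI UNMAP primitive on the hosts listed in esxi5hosts
./vaaiUNMAP.sh esxi5hosts reflex.primp-industries.com vcenter-auth 0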

To help extract all ESXi 5 hosts from your vCenter Server, you can use the following vSphere SDK for Perl script, getESXi5Hosts.pl.

Here is an example of running the script and saving the output to a file:
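
For instance (the connection options are the standard vSphere SDK for Perl ones and the output file name is arbitrary):

# save the list of ESXi 5 hosts to a file for use with vaaiUNMAP.sh
./getESXi5Hosts.pl --server reflex.primp-industries.com --username administrator > esxi5hosts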

One of the unfortunate things about VMFS3.EnableBlockDelete is that it is a hidden parameter, along with several others, so it will not automatically be displayed when using local or remote ESXCLI. Thanks to Craig Risinger, you can still get the information using remote ESXCLI by specifying the --option parameter, which is great because you do not need to log in to the ESXi Shell to retrieve the information.

esxcli --server himalaya.primp-industries.com --username root system settings advanced list --option /VMFS3/EnableBlockDelete

I would also recommend adding an entry into your ESXi 5 kickstart to automatically disable VAAI UNMAP by default until a fix is released.

%firstboot --interpreter=busybox

#disable VAAI UNMAP primitive
esxcli system settings advanced set --int-value 0 --option /VMFS3/EnableBlockDelete


Filed Under: Uncategorized Tagged With: esxcli, esxi5, unmap, vaai, vSphere 5

How to Install VMware VSA with Running VMs

09/26/2011 by William Lam 1 Comment

For those of you who want to quickly test out the new VMware VSA (vSphere Storage Appliance), you will notice that you cannot just throw in a few ESXi 5 hosts that already have virtual machines on them. If you try to proceed with the VSA installation, you will see an error message about the presence of virtual machines, whether they are running or not.

This can make it difficult to evaluate or test the new VSA if you do not have additional hosts that can easily be re-deployed as vanilla ESXi 5 installations. While working on the previous article, How to Install VMware VSA in Nested ESXi 5 Host Using the GUI, I decided to test the behavior of a few other configuration variables found in the dev.properties file for the VSA Manager. It turns out that you can actually disable the host audit check, which includes the validation of running virtual machines, by changing the host.audit variable from "true" to "false" using the same trick documented here. You will need to restart the VSA Manager and then the vCenter Server service for the change to take effect.
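
The change itself is a one-line edit, roughly as shown below (the exact location of dev.properties under the VSA Manager installation is covered in the article linked above and is not reproduced here):

# dev.properties for the VSA Manager
# disable the host audit check, which includes the running-VM validation
host.audit=false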

**** DISCLAIMER: This is not supported by VMware and there may be specific checks that are now bypassed by disabling the host.audit parameter. Please use at your own risk and test before deploying on actual systems ****

One interesting observation made while testing this in a nested ESXi configuration is that even though there is a message warning the user that any data found on the local VMFS volumes will be deleted, I did not see any process kicked off to do so. This does not mean that was not the original intention, but there was no reformatting of the local VMFS and no removal or powering off of the running virtual machines. While testing both a "supported" and a "ghetto" installation of the VSA, I found that several advanced settings were updated as part of the VSA installation; you should see the same if you look in the vmkernel.log of one of the ESXi 5 hosts:

2011-09-23T17:36:33.030Z cpu0:3475)Config: 346: "HostLocalSwapDirEnabled" = 0, Old Value: 0, (Status: 0x0)
2011-09-23T17:38:00.971Z cpu0:3258)Config: 346: "HeartbeatPanicTimeout" = 60, Old Value: 900, (Status: 0x0)
2011-09-23T17:38:07.069Z cpu1:2851)Config: 346: "EnableSVAVMFS" = 1, Old Value: 0, (Status: 0x0)
2011-09-23T17:38:07.090Z cpu1:2851)Config: 346: "VmkStressEnable" = 0, Old Value: 1, (Status: 0x0)
2011-09-23T17:44:22.163Z cpu1:3477)Config: 346: "SIOControlFlag2" = 1, Old Value: 0, (Status: 0x0)

One setting that sparked my curiosity is EnableSVAVMFS, which is a hidden setting on the ESXi host, but one can view it using vsish. Per the limited documentation found in vsish, this parameter enables some sort of optimization for the local VMFS volume.
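
If you want to poke at it yourself from the ESXi Shell, you can browse the advanced configuration tree with vsish; the exact node path for EnableSVAVMFS is not something I have confirmed, so treat the commented command as a pattern rather than a recipe:

# list the advanced configuration groups, then drill down to find EnableSVAVMFS
vsish -e ls /config

# once located, the value can be read along these lines (group name is an assumption)
# vsish -e get /config/<group>/intOpts/EnableSVAVMFS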

Thanks to @VMwareStorage (Cormac Hogan, VMware Technical Marketing for Storage) for the quick answer to my question on twitter, it looks like this parameter does the following:

"Forces linear allocation of VMDKs on local VMFS for VSA. Improves mirroring performance across VSAs apparently" 

There was nothing in the vmkernel.log to indicate that the local VMFS was reformatted or that files had to be deleted to support the VSA installation. I can understand why VMware wanted a vanilla installation with no running VMs: it simplifies the installation process. Another reason I can think of is that any initial storage consumption offsets the amount of "available" storage that can be set up in the VSA cluster. The amount of available storage per host must be equal across the two- or three-node cluster to ensure there is sufficient space for replication. Just understand that if you have running virtual machines on one or more ESXi nodes, the node with the smallest amount of free physical storage is what the rest of the VSA nodes will be sized to.

You may also find yourself in a chicken-and-egg problem if the VSA installation fails and reverts its changes, which includes putting the ESXi hosts into maintenance mode. Entering maintenance mode will fail on the node that is running the vCenter Server and VSA Manager, which is another reason you would want to run the management system outside of the VSA cluster.

Without further ado, I recorded a quick 6-minute video demonstrating the installation of the new VMware VSA on ESXi 5 hosts that have running virtual machines, including the vCenter Server and VSA Manager running on one of the nodes (the video is awesome when you bump up the audio):

Installing VMware VSA with Running VMs from lamw on Vimeo.

Not only is this not supported, it is also NOT a best practice to run the vCenter Server and VSA Manager within the VSA cluster, because you may potentially have issues with replication if vCenter and the VSA Manager go down. In my testing, I found that I could take down vCenter and the VSA Manager while the NFS volumes continued to function and the cluster continued to churn away. Any virtual machines running on the VSA volumes will automatically be restarted by vSphere HA. Once the VSA Manager has recovered, it will automatically ensure the volumes are synchronized and re-protect the VSA cluster.

Note: It is important to understand that even though you can install the VMware VSA with running virtual machines using the hack above, the requirement of a vanilla ESXi 5 configuration is still 100% mandatory. You MUST still have only a single vSwitch (vSwitch0) with only a single vmnic (vmnic0) connected to the vSwitch and only the two default portgroups, "VM Network" and "Management Network"; there is no workaround for this requirement. If you have a host on which you plan to run VMs prior to the VSA installation, make sure they are on the "VM Network" portgroup, as additional portgroups are not supported prior to the installation of the VSA.

I am hoping that some of these requirements are relaxed in a future release of the VMware VSA, and possibly that a version will work with the vCVA (vCenter Virtual Appliance). For now, if you have limited hardware or would like to use existing ESXi 5 hosts with running virtual machines (they need to be configured like a vanilla installation of ESXi 5), you can run everything on either a two- or three-node cluster; just be aware of the caveats.

For more in-depth information and details about the new VSA, please check out the VMware Storage Blog - vSphere Storage Appliance Links and be sure to follow Cormac Hogan on twitter at @VMwareStorage


Filed Under: Uncategorized Tagged With: esxi5, evc, nested, vsa, vSphere 5

How to Query VM Disk Format in vSphere 5

09/25/2011 by William Lam 5 Comments

Prior to vSphere 5, it was not trivial to identify the particular disk format of a given virtual machine's disk. Using the vSphere Client, you would see a virtual machine's disk displayed as either thin or thick. The problem is that the "thick" format can be either:

  • zeroedthick - A thick disk has all space allocated at creation time and the space is zeroed on demand as the space is used
  • eagerzeroedthick - An eager zeroed thick disk has all space allocated and wiped clean of any previous contents on the physical media at creation time. Such disks may take longer time during creation compared to other disk formats.

Users could not distinguish the exact type using the vSphere Client or the vSphere 4 APIs. With the release of vSphere 4, VMware did introduce a new property in the vSphere 4 API called eagerlyScrub, which was supposed to help identify whether a virtual disk was allocated as eagerzeroedthick. Unfortunately, there may have been a bug with the property, as it never gets modified whether a disk is created as zeroedthick or eagerzeroedthick.

The only method I was aware of to truly figure out the disk format was to manually parse the virtual machine's vmware.log file to identify the disk type, which I wrote a script for in 2009.

During the vSphere 5 beta, I noticed the vSphere Client UI now properly displays all three virtual machine disk formats: zeroedthick (displayed as flat), thin and eagerzeroedthick (displayed as thick).

Seeing that VMware now displays the three different formats, I wanted to see if it was possible to extract this using the vSphere 5 APIs without having to rely on the hack of reading the vmware.log files. It turns out that the eagerlyScrub property now functions properly when a VMDK is provisioned as, or has been inflated/converted to, the eagerzeroedthick format. I wrote a simple vSphere SDK for Perl script called getVMDiskFormat.pl which allows you to extract the disk formats of all virtual machines when connecting to either vCenter or directly to an ESX(i) host.

The script allows for two types of output: console (printed directly to the console) or csv (creates a .csv file).

If you select csv output, by default it will be stored in a file called "vmDiskFormat.csv". You also have the option of specifying the filename by using the --filename flag and providing a name of your choosing.
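
A hypothetical run producing a csv report might look like this (the --output flag name is an assumption on my part; --filename is the flag mentioned above and the connection options are the standard vSphere SDK for Perl ones):

# dump the disk format of every virtual machine managed by vCenter into a csv file
./getVMDiskFormat.pl --server reflex.primp-industries.com --username administrator --output csv --filename vmDiskFormat.csv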

You can then load the csv file into excel and easily sort through the various disk format types.

All this is already included in the latest version of the VMware vSphere Health Check Report 5.0 if you want a centralized report that includes virtual machine disk format.


Filed Under: Uncategorized Tagged With: api, eagerzeroedthick, esxi 5, thin, vmdk, vSphere 5, vsphere sdk for perl, zeroedthick

