
Monday, March 28, 2011

How to automatically add ESX(i) host to vCenter in Kickstart

While recently updating my Automating Active Directory Domain Join in ESX(i) Kickstart article, I was reminded of an old blog post by Justin Guidroz, who first identified a way to add an ESXi host to vCenter using Python and the vSphere MOB. The approach was very neat but was not 100% automated, as it required some user interaction with the vSphere MOB to identify certain API properties before one could script it within a kickstart installation.

I decided to revisit this problem, as it was something I had investigated a while back. There are numerous ways of getting something like this to work in your environment, but it all boils down to your constraints, naming convention and provisioning process. If you have a well-defined environment with a good naming structure and can easily identify which vCenter a given ESX(i) host should be managed by, then this can be integrated into your existing kickstart with minor tweaks. This script was tested on vCenter 4.1 Update 1 with ESXi 4.1 and 4.1 Update 1.

There are a few steps that are necessary before we get started, plus a recommended one for those who have security concerns about this solution.

Step 1 - You will need to extract some information from the vCenter server that you would like your ESX(i) hosts to join. You will need to generate an inventory path to the vCenter cluster, which takes the form [datacenter-name]/host/[cluster-name]. The script uses this path to automatically locate the managed object ID of your vCenter cluster, which is required as part of the host add process. This was a manual process in Justin's original solution.
In this example, I have a datacenter called "Primp-Skunkworks" and a cluster under that datacenter called "Primp-Skunkworks-Cluster", so the inventory path looks like the following:
"Primp-Skunkworks/host/Primp-Skunkworks-Cluster"
You will need this value to populate a variable in the script, which is described a little later.
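
As a rough sketch of how that value is put together, the path can be assembled as shown here. The helper name build_inventory_path and its folders parameter are my own illustration and are not part of the script; the snippet is written for a current Python 3 interpreter on a workstation rather than the interpreter shipped with ESXi.

```python
def build_inventory_path(datacenter, cluster, folders=()):
    """Return '<datacenter>/host/[<folder>/...]<cluster>'.

    Hypothetical helper for illustration only. 'host' is the fixed
    folder name that sits between the datacenter and its clusters.
    """
    parts = [datacenter, "host"] + list(folders) + [cluster]
    return "/".join(parts)


# Values taken from the example in this post.
cluster = build_inventory_path("Primp-Skunkworks", "Primp-Skunkworks-Cluster")
print(cluster)
```

If there are folders between the datacenter and the cluster, they would go after "host", for example by passing folders=["Primp-Folder"].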

Step 2 - As you may have guessed, to add an ESX(i) host to vCenter you will need to connect to the vCenter server using an account that has permission to add a host. It is recommended that you do not use or expose any administrative accounts for this, as the credentials are stored unencrypted within the script. A workaround is to create a service account, either a local account or an Active Directory account, with only the permission to add an ESX(i) host to a vCenter cluster. Create a new role (in this example I call it "JoinvCenter") and grant it just the Host->Inventory->Add host to cluster privilege.
Once you have created the role, assign it to the service account user either globally in vCenter, if you want to add hosts to multiple clusters, or on a specific datacenter/cluster.

Now that we have the prerequisites satisfied, we need to populate a few variables within the script, which will be used in the %post section of your ESX(i) kickstart configuration file.

This variable defines the name of your vCenter server, please provide the FQDN:

This variable defines the vCenter cluster path which was generated earlier:

These variables define the service account credentials used to add an ESX(i) host to vCenter. You will need to encode the selected password with the following command, which requires access to a system with a Python interpreter:

python -c "import base64; print base64.b64encode('MySuperDuperSecretPasswordYo')"

Note: This does not encrypt your password; it only obfuscates it slightly so that you are not storing the password in plain text. Base64 is an encoding, not a hash, so anyone with access to the encoded string can trivially decode it.
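
To make that point concrete, here is a small sketch of the round trip. The function names encode_password and decode_password are made up for this illustration, and the snippet assumes a current Python 3 interpreter, whereas the one-liner above uses the older Python 2 print syntax.

```python
import base64


def encode_password(plain):
    """Base64-encode a password (obfuscation only, not encryption)."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def decode_password(encoded):
    """Reverse encode_password; no key or secret is needed."""
    return base64.b64decode(encoded).decode("utf-8")


encoded = encode_password("MySuperDuperSecretPasswordYo")
# Anyone holding the encoded string can recover the original password.
print(decode_password(encoded))
```

This is why the dedicated, minimally privileged service account from Step 2 matters more than the encoding itself.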

These variables define the ESX(i) root credentials, which are required as part of the vCenter add process. If you do not want to store these in plain text, you will also need to encode them using the command in the previous section:

We are now done and ready to move on to the actual script, which will be included in your kickstart configuration. As a sanity check, you can run this script manually on an existing ESX(i) host to ensure that the process works before testing it in kickstart. You should also ensure this is the very last script to execute, as I ran into a race condition with the default 999.* scripts that automatically update the root password. To make this the very last script, set --level to something like 9999 in your %firstboot stanza.


To aid in troubleshooting, the script also writes details to syslog; on ESX(i) these are stored in /var/log/messages, and you can search for the string "GHETTO-JOIN-VC". If everything is successful, after the %firstboot section has completed you should see the ESX(i) host join vCenter and the following in the logs.
Tips: You should only see "Success" messages; if you see any "Failed" messages, something went wrong. If you are still running into issues, make sure your ESX(i) host has its hostname configured with an FQDN. If the join fails, whether due to the hostname or the credentials, you should also see an error on your vCenter server. You can also redirect the output of the script to a local VMFS volume for troubleshooting afterwards.
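
If you collect the log file from a host, a quick check along these lines can count the tagged messages. This is only a sketch: summarize_join_log is a hypothetical helper, the sample lines are made up, and the real message text written by the script will differ.

```python
TAG = "GHETTO-JOIN-VC"


def summarize_join_log(lines):
    """Count tagged log lines that mention Success or Failed."""
    counts = {"Success": 0, "Failed": 0}
    for line in lines:
        if TAG not in line:
            continue  # ignore everything not written by the join script
        for word in counts:
            if word in line:
                counts[word] += 1
    return counts


# Made-up sample lines, used only to exercise the function.
sample = [
    "Mar 28 10:00:01 esx01 root: GHETTO-JOIN-VC Success: step one",
    "Mar 28 10:00:02 esx01 vmkernel: unrelated message",
    "Mar 28 10:00:03 esx01 root: GHETTO-JOIN-VC Failed: step two",
]
print(summarize_join_log(sample))
```

Any non-zero "Failed" count is the cue to look at the vCenter tasks and events for the matching error.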

Depending on your provisioning process and how you determine which ESX(i) host should join which vCenter/cluster, you can easily add logic to the main kickstart configuration file to determine this automatically, or extract it from a configuration file, and dynamically update the joinvCenter.py script prior to execution.
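
One possible shape for that logic, assuming a naming convention where the hostname starts with a site prefix followed by a hyphen. The prefixes, FQDNs, inventory paths and the pick_vcenter helper below are all invented for illustration; substitute whatever rule fits your environment.

```python
# Hypothetical mapping: hostname prefix -> (vCenter FQDN, inventory path).
SITE_MAP = {
    "sjc": ("vcenter-sjc.example.com", "SJC-DC/host/SJC-Cluster"),
    "nyc": ("vcenter-nyc.example.com", "NYC-DC/host/NYC-Cluster"),
}


def pick_vcenter(hostname, site_map=SITE_MAP):
    """Return (vcenter_server, cluster) for a host such as 'sjc-esx01...'."""
    prefix = hostname.split("-", 1)[0].lower()
    if prefix not in site_map:
        raise KeyError("no vCenter mapping for host %r" % hostname)
    return site_map[prefix]


vcenter_server, cluster = pick_vcenter("sjc-esx01.example.com")
print(vcenter_server, cluster)
```

Failing loudly on an unknown prefix is deliberate here, so that a host with an unexpected name does not quietly join the wrong vCenter.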

I would like to thank Justin Guidroz and VMTN user klich for their contributions on the python snippets that were used in the script. 

FYI - I am sure the Python code could be cleaner, but I will leave that as an exercise for those more adept at Python. My python-fu is not very strong ;)

UPDATE (03/29/2011): Updated the IP Address extraction to use gethostbyname and added proper logout logic after joining vCenter.

28 comments:

  1. Funny that you post this today. Over the weekend I did some automation around PXE installation with kickstart and was thinking that connecting to VC was the one missing piece.

    I really like the idea of a bare bones user account. Instead of the somewhat hacky python script I was thinking about using 'wget' to for example download ruby and rbvmomi to do this 'the proper way'.
    Alternatively I was thinking of just posting an event on a RabbitMQ server using a small python script where the ESX announces its IP. A proper (and more powerful) rbvmomi script could then pick the rest up from there.

  2. @Christian,

    You may be able to get ruby/rbvmomi running on classic ESX, but with ESXi, it probably won't work in the Busybox Console. At a minimum you'll probably need a statically linked ruby binary to encapsulate all its dependencies.

    The latter solution is probably the best option, and it's actually the one VMware uses with its Auto Deploy appliance, which does exactly that after pre-provision. The host "calls back" to Auto Deploy, and from there the system uses the APIs to join it to a specific vCenter and apply a host profile. ESXi has netcat starting with 4.1, which you can use to build a dumb client that periodically calls back to a server, and then perform advanced operations using any available SDK, whether that is VI Java, Perl, PowerCLI, etc., on any platform (Windows/Linux).

    The possibilities are pretty much endless

  3. There's an easier way to get the IP:

    import socket
    myip=socket.gethostbyname(socket.gethostname())

  4. @mcowger,

    Thanks Matt, I've updated the script with your suggestion

  5. One thing I've found is that you can use urllib2.HTTPCookieProcessor() in your opener to handle cookies instead of having to capture it after the first page and pass it with each subsequent request.

    opener = urllib2.build_opener(authhandler,urllib2.HTTPCookieProcessor())

  6. @Justin,

    HTTPCookieProcessor() depends on cookielib, which is not a Python module available on ESXi.

  7. Hi there, got this configured and running on ESXi 4.1.0 build 348481.

    All looks good, logs success etc but never appears in the vcenter console. Any ideas?
    Thanks Ed

  8. @Rucking,

    Can you manually run the script from the ESXi host and see if you get any errors? While debugging this, I've seen that though the successful message is thrown, it may have had an error in processing the request. Also, did you see any failed tasks in your vCenter server about this host trying to join?

  9. Yep, running it manually logs success with no errors. Not seeing any events in vCenter at all; was expecting to but see nothing. Checked the cluster variable, vCenter variables, authentication, etc. by changing them to be deliberately wrong, and changing any of these causes an error. So my 'good' config is definitely good. It's obviously jumping out somewhere but it's doing it silently. Any more pointers for debugging?

  10. Found the errors - The tags on this line are case sensitive in our environment so hostname /hostName don't match, same with sslThumbprint. Sorting/matching these cases fixed the issues. Not sure why our environment is case sensitive and yours isn't but there you go. Cheers

    # Code to create ConnectHostSpec
    xml = '%hostname%sha%user%pass1'

  11. This comment has been removed by the author.

  12. I had the same issue as Rucking, but changing the line to match the case got it working.
    William, you're a legend!
    xml = '%hostname%sha%user%pass1'

  13. Hello William,

    Amazing post! I'm trying to run this script manually from an ESXi 4.1u1 host to test it out and make sure it's working before putting it into a kickstart script. I keep getting "Failed to retrieve MOB data" error message from the error log. When I go to the MOB listed from a browser https://VCENTER/mob/?moid=SearchIndex&method=findByInventoryPath and use the AD account created and assigned the permissions from above I can access the URL just fine. I'm not sure where to go from here.

    I've worked with your Active Directory connection script and that one works swimmingly.

    Thanks,

    Jim

  14. @Jim

    When you encoded your password, did you see any weird characters? It's possible you have characters that need to be escaped using "\". One easy way to validate this is to use single quotes instead of double quotes around the vc_encodedpassword.

  15. @William

    That worked :) I'm able to get the script to run through without an error now. The only issue now is that it doesn't actually join vCenter; I get a silent failure. I see the AD account I created log into the vCenter server and then log out, but the ESXi host never joins my cluster. I've verified the cluster name is correct as DataCentername/host/Clustername. Do you have any more advice? Thanks again

  16. @Jim,

    Check that the account has the correct permission and that you can manually add the ESXi host to the particular cluster. Since this is credentials-related, it's outside of the script.

  17. @William,

    Thanks for the advice yet again! There are no issues with adding the ESXi host to the cluster under the AD account I specified. I logged into the vSphere Client with the account in question and added the ESXi host directly into the cluster. It was added without any issues.

  18. Hey guys, I am in the middle of my Bachelor thesis and I am writing about virtualization. As part of that, I want to show some automation. So I scripted a little bit with PXE boot, a web server, ... and finally got the installation of an ESXi host working.

    vmaccepteula
    install url http://10.23.136.145/installmedia/esxi-iso
    rootpw VMware208
    clearpart --overwritevmfs --firstdisk=local
    autopart --firstdisk=local --overwritevmfs

    #DHCP
    #network --bootproto=dhcp --device=vmnic6
    #static
    network --bootproto=static --device=vmnic6 --ip=10.23.136.208 --gateway=10.23.136.129 --netmask=255.255.255.128 --hostname=esx208 --nameserver="10.23.40.242,10.23.40.243"
    keyboard German
    reboot

    So now I want to add the host to my vCenter, so I edited my script like this:

    vmaccepteula
    install url http://10.23.136.145/installmedia/esxi-iso
    rootpw VMware208
    clearpart --overwritevmfs --firstdisk=local
    autopart --firstdisk=local --overwritevmfs

    #DHCP
    #network --bootproto=dhcp --device=vmnic6
    #static
    network --bootproto=static --device=vmnic6 --ip=10.23.136.208 --gateway=10.23.136.129 --netmask=255.255.255.128 --hostname=esx208 --nameserver="10.23.40.242,10.23.40.243"
    keyboard German
    reboot

    %firstboot --unsupported --interpreter=busybox

    #enable TechSupportModes
    vim-cmd hostsvc/enable_remote_tsm
    vim-cmd hostsvc/start_remote_tsm
    vim-cmd hostsvc/enable_local_tsm
    vim-cmd hostsvc/start_local_tsm
    vim-cmd hostsvc/net/refresh

    #Add Host to Vcenter

    import re,os,urllib,urllib2
    url = "https://10.23.136.144/mob/?moid=&method=addHost"
    username = "administrator"
    password = "administratorpassword"
    passman = urllib2.HTTPPasswordMgrWithDefaultRealm()
    passman.add_password(None,url,username,password)
    authhandler = urllib2.HTTPBasicAuthHandler(passman)
    opener = urllib2.build_opener(authhandler)
    urllib2.install_opener(opener)
    cmd = "openssl x509 -sha1 -in /etc/vmware/ssl/rui.crt -noout -fingerprint"
    tmp = os.popen(cmd)
    #tmp_sha1 = tmp.readline()
    tmp.close()
    s1 = re.split('=',tmp_sha1)
    s2 = s1[1]
    s3 = re.split('\n', s2)
    sha1 = s3[0]

    xml = '10.23.136.208rootVMware2081'
    xml = xml.replace(sha1)

    params = {'spec':xml,'asConnected':'1','resourcePool':'','license':''}
    e_params = urllib.urlencode(params)
    req = urllib2.Request(url,e_params)
    page = urllib2.urlopen(req).read()

    reboot

    I have to say that for this test I disabled the SSL function.
    ...
    The installation runs, but the password is now the default and not VMware208, and the server is also not in my vCenter.
    What went wrong??? There were no errors!

    Could you please help me?

  19. Great site! I am having issues running this script. It's failing with the error:

    Failed to find cluster "ESX 4 - Devlab/Host/Test-Cluster-01"!

    If i browse to the MOB FindByInventoryPath screen I can search for the datacenter "ESX 4 - Devlab" and I get a result, but if I add /Host or /Host/Test-Cluster-01 it does not return a result.

    If I use the direct access method in Justin's script I can browse to the unique ID as well (https://devvcenter02/mob/?moid=domain%2dc11994).

  20. Great write up, I'm hitting an issue where I get the following returned.

    Method Invocation: InvalidRequest

    There are no errors with the script it logs in and gets the correct moid etc. I've tried doing it manually through the website as well but get the same response. Do you have any ideas or tips on where the underlying issue may be? I've modified the role as well.

    Thanks,

  21. @vunusual,

    Are you using any folders between your datacenter and cluster? It should work if your inventory is like the example above.

  22. @chris,

    Hm, it almost sounds like a permission issue if you can't perform the operation using the MOB. Are you able to join the host to the very same cluster using the vSphere Client connecting to vCenter?

  23. Awesome script, William!
    Unfortunately I’m getting “AttributeError” and hope you can guide me in the right direction. BTW, in our production vCenter there are a couple of folders between Datacenter and Cluster. Will that be a problem?

    “AttributeError”
    /tmp # python /tmp/joinvCenter.py
    Traceback (most recent call last):
    File "/tmp/joinvCenter.py", line 61, in <module>
    nonce = reg.search(page_content).group(1)
    AttributeError: 'NoneType' object has no attribute 'group'

  24. @chris,@Nasir

    If you have folders between your Datacenter and Clusters, then you just need to append them to the path properly.

    Here is one example and what the path string would look like:

    Datacenter: Primp-Skunkworks
    Folder: Primp-Folder
    Cluster: Primp-Skunkworks-Cluster

    Inventory Path:
    Primp-Skunkworks/host/Primp-Folder/Primp-Skunkworks-Cluster

    Notice, the "Folders" name still goes after the "host" key.

    I would recommend using the vSphere MOB - http://www.virtuallyghetto.com/2010/07/whats-new-in-vsphere-41-mob.html to help you find the inventory path

  25. I had the same issue with matching case in the xml. All lowercase in the opening tag wouldn't work. Changing the opening tag to match the closing tag did. I had to change these three tags to match:





    I also figured out how to add the host at the root level rather than in a cluster by changing these three lines:

    clusterMoRef = re.search('domain-c[0-9]*',page)
    url = "https://" + vcenter_server + "/mob/?moid=" + clusterMoRef.group() + "&method=addHost"
    params = {'vmware-session-nonce':nonce,'spec':xml,'asConnected':'1','resourcePool':'','license':''}

    To these:

    clusterMoRef = re.search('group-h[0-9]*',page)
    url = "https://" + vcenter_server + "/mob/?moid=" + clusterMoRef.group() + "&method=addStandaloneHost"
    params = {'vmware-session-nonce':nonce,'spec':xml,'compResSpec':'','addConnected':'1','license':''}

    And then changing the 'cluster=' variable like so:

    cluster = "Primp-Skunkworks/host"

    Hope it helps save others some time!

    Thanks,
    -Loren

  26. Lol, the xml tags show up invisible in my post...let's try again:

    hostName<>/hostName
    sslThumbprint<>/sslThumbprint
    userName<>/userName

  27. Having the exact same problem as Nasir.
    I am using ESXi 4.1.0 build 348481
    Any progress on the solution?
    Thanks,
    -Erki

    /tmp # python joinvCenter.py
    Traceback (most recent call last):
    File "joinvCenter.py", line 62, in <module>
    nonce = reg.search(page_content).group(1)
    AttributeError: 'NoneType' object has no attribute 'group'

  28. I'm having the same problem as Nasir and Erki. I'm using a special ESXi installation from HP (https://h20392.www2.hp.com/portal/swdepot/displayProductInfo.do?productNumber=HPVM06) because we are using the BL460c G7, which includes network cards that are not in the default VMware ISO. Has anyone got an idea how to get this to work?
