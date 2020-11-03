I am super excited to finally be able to share what I think is a really cool ESXi-Arm solution, which has been an evolution of this and this. This solution also incorporates a number of automation techniques I have shared over the years when it comes to ESXi scripted installation aka Kickstart, so it was really neat to see all those things get pulled into a single solution. Lastly, I also want to give huge thanks to Cyprien Laplace, who threw the initial challenge my way after I had shared how to perform an ESXi-Arm scripted installation without using an SD Card.

ESXi (x86) can be deployed using either a stateful or stateless installation. In the latter case, ESXi is booted over the network using the vSphere Auto Deploy feature in vCenter Server, which does not require any local media for ESXi. Once the host attaches itself to vCenter Server, Auto Deploy leverages vSphere Host Profiles and its rules engine to determine which configurations or profiles should be applied to ensure the ESXi hosts are configured per their desired state. Here is a quick video overview of how Auto Deploy and Host Profiles work.

Fundamentally, vSphere Auto Deploy and Host Profiles can also work with ESXi-Arm but today, the x86 vCenter Server would require some code modification for this to actually work.

OK, so am I teasing you with something that does not exist? Nope, but I just wanted to help set the context 🙂

The solution that I have created boots ESXi-Arm over the network in a "stateless" manner, so there is no need for an SD Card or USB device plugged into the Raspberry Pi (rPI). In addition to the ESXi-Arm files, it also includes a custom payload which runs to retrieve additional configurations, which can automatically join the host to a desired vCenter Server as well as apply further customizations to an ESXi-Arm host. As you can see, this solution behaves similarly to vSphere Auto Deploy and Host Profiles but does not use either of these vSphere features and works with the ESXi-Arm Fling right now.

Technically speaking, these techniques can also be applied to x86 ESXi but I will leave that to the reader for further exploration.

Here is a quick video demonstrating my ESXi-Arm Stateless solution booting one of my Raspberry Pi 4 systems:

Below are the instructions on how to set this up and although they are a bit lengthy, it is well worth the effort!

Step 1 - Download and install the Raspberry Pi Imager Tool for your desktop OS. This is needed as we need to install Raspberry Pi (rPI) OS onto our SD Card so that we can change the default rPI boot order to 0xf241. The digits are read from right to left, so the rPI attempts booting in the following order: SD Card, USB and then Network Boot. If no bootable devices are found, the sequence is repeated all over again. With the SD Card removed and no bootable USB device attached, this order allows us to perform our stateless boot, and for those that prefer to have a stateful installation on USB, you can install ESXi-Arm via Kickstart as outlined in my previous article.
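To make the 0xf241 value a little less magical, here is a small sketch that decodes a boot order into the sequence the bootloader walks through. It assumes, as I understand the Raspberry Pi bootloader documentation, that the digits are consumed from right to left and that 1, 2, 4 and f mean SD Card, Network, USB and Restart. The decode_boot_order function is a hypothetical helper written for this article and is not part of any rPI tool.

```shell
#!/bin/sh
# Decode a Raspberry Pi 4 BOOT_ORDER value into the sequence the bootloader tries.
# Digits are consumed right to left; only the modes used in this article are named.
decode_boot_order() {
    digits=${1#0x}                      # strip the 0x prefix
    sequence=""
    while [ -n "$digits" ]; do
        last=${digits#"${digits%?}"}    # take the last hex digit
        digits=${digits%?}              # and drop it from the remainder
        case "$last" in
            1) mode="SD Card" ;;
            2) mode="Network" ;;
            4) mode="USB" ;;
            f|F) mode="Restart" ;;
            *) mode="other($last)" ;;   # modes not covered in this sketch
        esac
        sequence="${sequence:+$sequence -> }$mode"
    done
    printf '%s\n' "$sequence"
}

decode_boot_order 0xf241
```

Running it with 0xf241 prints the SD Card, USB, Network, Restart sequence, which is why the rPI falls through to network boot once the SD Card has been removed.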

Step 2 - Power up the rPI with the SD Card containing the rPI OS image plugged in. Once rPI OS boots up, open a terminal and run the following commands to apply the latest EEPROM and update the boot order, which can only be changed from the command-line.

PI_EEPROM_VERSION=pieeprom-2020-09-03

wget https://github.com/raspberrypi/rpi-eeprom/raw/master/firmware/beta/${PI_EEPROM_VERSION}.bin

sudo rpi-eeprom-config ${PI_EEPROM_VERSION}.bin > bootconf.txt

sed -i 's/BOOT_ORDER=.*/BOOT_ORDER=0xf241/g' bootconf.txt

sudo rpi-eeprom-config --out ${PI_EEPROM_VERSION}-netboot.bin --config bootconf.txt ${PI_EEPROM_VERSION}.bin

sudo rpi-eeprom-update -d -f ./${PI_EEPROM_VERSION}-netboot.bin



Reboot for the changes to go into effect. At this point, you can now shut down the rPI and remove the SD Card from the system.

Step 3 - Follow Steps 1-7 from this blog post in setting up dnsmasq for our PXE infrastructure. I really like dnsmasq as it integrates with your existing DHCP environment and is fairly easy to set up. From here on out, I will refer to this system as our PXE Server and in my example, the IP Address of this system is 192.168.30.176.
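If you would like to see what the sed substitution does before touching a real EEPROM image, here is a minimal dry run against a sample bootconf.txt. The sample contents are purely illustrative and are not a real EEPROM configuration dump; the only thing that matters is the BOOT_ORDER line.

```shell
#!/bin/sh
# Dry run of the BOOT_ORDER substitution against a sample config.
# The sample contents below are illustrative, not a real EEPROM dump.
workdir=$(mktemp -d)
cat > "$workdir/bootconf.txt" <<'EOF'
[all]
BOOT_UART=0
BOOT_ORDER=0x1
EOF

# Same expression as the sed command above, written to a new file for comparison
sed 's/BOOT_ORDER=.*/BOOT_ORDER=0xf241/g' "$workdir/bootconf.txt" > "$workdir/bootconf-netboot.txt"

new_order=$(grep '^BOOT_ORDER=' "$workdir/bootconf-netboot.txt")
echo "$new_order"
```

Only the BOOT_ORDER line changes; every other setting in the file is passed through untouched.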

To perform a stateless boot of ESXi-Arm, you just need to remove all the default boot options in the kernelopt line in the ESXi-Arm efi/boot/boot.cfg configuration file. To make our solution a bit more dynamic, we are going to leverage a few custom kernel boot options which we will define and which will get passed into our custom script. The three options are:

configServer - IP Address of your PXE Server which also runs web server hosting the configuration files for customization

joinVC - Specifies whether to automatically join ESXi-Arm host to vCenter Server

runExtraConfig - Specifies whether to apply additional post-deployment configurations
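To give a feel for how a script could consume these three options, here is a minimal sketch that pulls individual values out of a string of key=value pairs. The get_boot_option function and the hard-coded KERNEL_OPTS string are assumptions for illustration only; how the actual payload reads the boot options on an ESXi-Arm host is not shown here.

```shell
#!/bin/sh
# Hypothetical helper: extract the value of one key from a string of
# space-separated key=value boot options. Prints nothing if the key is absent.
get_boot_option() {
    key=$1
    options=$2
    for pair in $options; do
        case "$pair" in
            "$key"=*) printf '%s\n' "${pair#*=}"; return 0 ;;
        esac
    done
    return 1
}

# Sample option string matching the three options described above
KERNEL_OPTS="configServer=192.168.30.176 joinVC=true runExtraConfig=true"

CONFIG_SERVER=$(get_boot_option configServer "$KERNEL_OPTS")
JOIN_VC=$(get_boot_option joinVC "$KERNEL_OPTS")
RUN_EXTRA=$(get_boot_option runExtraConfig "$KERNEL_OPTS")

echo "configServer=${CONFIG_SERVER} joinVC=${JOIN_VC} runExtraConfig=${RUN_EXTRA}"
```

Keeping the parsing in one small function means an unknown or missing option simply comes back empty, which a script can then treat as "false".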

Note: When ESXi-Arm boots up in stateless mode, the default root password is empty. This is also reflected when ESXi-Arm host is added to vCenter Server. It should be possible to change the password as part of the post-deployment configuration but the default behavior is to have an empty password.

Step 4 - Replace the kernelopt line in /srv/tftp/esxi-arm/efi/boot/boot.cfg with the example below, where the IP Address will be the PXE server that will be hosting our configuration files. We also need to append the modules line with our custom payload called extra.tgz which actually does all the magic.

bootstate=0
title=Loading ESXi installer
timeout=5
prefix=esxi-arm
kernel=b.b00
kernelopt=configServer=192.168.30.176 joinVC=true runExtraConfig=true
modules=jumpstrt.gz --- useropts.gz --- features.gz --- k.b00 --- procfs.b00 --- vmx.v00 --- vim.v00 --- tpm.v00 --- sb.v00 --- s.v00 --- ena.v00 --- bnxtnet.v00 --- bnxtroce.v00 --- brcmfcoe.v00 --- brcmnvme.v00 --- elxiscsi.v00 --- elxnet.v00 --- i40en.v00 --- i40iwn.v00 --- iavmd.v00 --- igbn.v00 --- iser.v00 --- ixgben.v00 --- lpfc.v00 --- lpnic.v00 --- lsi_mr3.v00 --- lsi_msgp.v00 --- lsi_msgp.v01 --- lsi_msgp.v02 --- mtip32xx.v00 --- ne1000.v00 --- nenic.v00 --- nfnic.v00 --- nhpsa.v00 --- nmlx4_co.v00 --- nmlx4_en.v00 --- nmlx4_rd.v00 --- nmlx5_co.v00 --- nmlx5_rd.v00 --- ntg3.v00 --- nvme_pci.v00 --- nvmerdma.v00 --- nvmxnet3.v00 --- nvmxnet3.v01 --- pvscsi.v00 --- qcnic.v00 --- qedentv.v00 --- qedrntv.v00 --- qfle3.v00 --- qfle3f.v00 --- qfle3i.v00 --- qflge.v00 --- rste.v00 --- sfvmk.v00 --- smartpqi.v00 --- vmkata.v00 --- vmkfcoe.v00 --- vmkusb.v00 --- vmw_ahci.v00 --- elx_esx_.v00 --- btldr.v00 --- esx_dvfi.v00 --- esx_ui.v00 --- esxupdt.v00 --- tpmesxup.v00 --- weaselin.v00 --- loadesx.v00 --- lsuv2_hp.v00 --- lsuv2_in.v00 --- lsuv2_ls.v00 --- lsuv2_nv.v00 --- lsuv2_oe.v00 --- lsuv2_oe.v01 --- lsuv2_oe.v02 --- lsuv2_sm.v00 --- native_m.v00 --- qlnative.v00 --- vmware_e.v00 --- vsan.v00 --- vsanheal.v00 --- vsanmgmt.v00 --- tools.t00 --- imgdb.tgz --- imgpayld.tgz --- extra.tgz
build=7.0.0-1.0.40886095
updated=0
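If you would rather script the boot.cfg change than edit it by hand, here is a rough sketch using awk. It works on a shortened stand-in for boot.cfg in a scratch directory, and the default kernelopt value in the sample is just a placeholder, so treat it as a starting point rather than something to run blindly against /srv/tftp/esxi-arm/efi/boot/boot.cfg.

```shell
#!/bin/sh
# Sketch: apply the Step 4 changes to a boot.cfg with awk.
# The sample file below is a shortened stand-in for the real boot.cfg.
CONFIG_SERVER=192.168.30.176
workdir=$(mktemp -d)
cat > "$workdir/boot.cfg" <<'EOF'
bootstate=0
title=Loading ESXi installer
timeout=5
prefix=esxi-arm
kernel=b.b00
kernelopt=placeholderDefaultOption
modules=jumpstrt.gz --- useropts.gz --- imgdb.tgz --- imgpayld.tgz
build=7.0.0-1.0.40886095
updated=0
EOF

awk -v opts="configServer=${CONFIG_SERVER} joinVC=true runExtraConfig=true" '
    # replace the default boot options with our custom ones
    /^kernelopt=/ { print "kernelopt=" opts; next }
    # append the payload to the modules line, but only once
    /^modules=/ && !/extra\.tgz/ { print $0 " --- extra.tgz"; next }
    { print }
' "$workdir/boot.cfg" > "$workdir/boot.cfg.new"

grep -E '^(kernelopt|modules)=' "$workdir/boot.cfg.new"
```

Writing to boot.cfg.new rather than editing in place leaves the original around for comparison, and the extra.tgz check means re-running the script does not append the payload twice.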

Step 5 - Download (or create) the extra.tgz to your PXE Server and copy that to the /srv/tftp/esxi-arm directory.

Step 6 - Create /var/www/html/esxi-arm-config.json file which contains the vCenter Server configuration for the ESXi-Arm host to automatically join along with the matching NTP server as this is required. For security purposes, you should consider creating a non-administrator account which only has permissions to add ESXi-Arm hosts to a specific vSphere Cluster. If you do not want your ESXi-Arm host to automatically be joined to vCenter Server, simply set the joinVC boot option to false.

{
    "vcenter_server": "192.168.30.200",
    "vcenter_user": "*protected email*",
    "vcenter_pass": "VMware1!",
    "vcenter_datacenter": "Arm-Datacenter",
    "vcenter_cluster": "Arm-Cluster",
    "ntp_server": "pool.ntp.org"
}
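Since a missing key in this file only shows up once the rPI has already booted, a quick pre-flight check on the PXE Server can save a power cycle. Below is a minimal sketch; check_config_keys is a hypothetical helper, the user value in the sample is a placeholder, and the check is a plain grep for the key names rather than a real JSON validation.

```shell
#!/bin/sh
# Hypothetical pre-flight check: confirm every expected key appears in the config file.
# This is a plain-text check with grep, not a full JSON validation.
check_config_keys() {
    file=$1
    missing=""
    for key in vcenter_server vcenter_user vcenter_pass vcenter_datacenter vcenter_cluster ntp_server; do
        grep -q "\"$key\"[[:space:]]*:" "$file" || missing="$missing $key"
    done
    if [ -n "$missing" ]; then
        echo "missing keys:$missing"
        return 1
    fi
    echo "all keys present"
}

# Demo against a sample file; the user value is a placeholder
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
    "vcenter_server": "192.168.30.200",
    "vcenter_user": "user@example.com",
    "vcenter_pass": "VMware1!",
    "vcenter_datacenter": "Arm-Datacenter",
    "vcenter_cluster": "Arm-Cluster",
    "ntp_server": "pool.ntp.org"
}
EOF

check_config_keys "$sample"
```

On the PXE Server you would point it at the real file instead, for example check_config_keys /var/www/html/esxi-arm-config.json.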

Step 7 - Create /var/www/html/esxi-arm-extra-config.sh file and set it to be executable. This is basically a shell script that contains ESXi-Arm shell commands that will be executed for additional host configurations. If you do not have additional configurations, you can disable this by simply setting the runExtraConfig boot option to false.

Below is a very basic example which simply suppresses the warnings found on the vSphere UI. For shared storage such as configuring NFS/iSCSI, it is recommended that you place those settings here so that all ESXi-Arm hosts will have the same configurations.

#!/bin/sh

# Suppress UI Warnings
esxcli system settings advanced set -o /UserVars/SuppressShellWarning -i 1
esxcli system settings advanced set -o /UserVars/SuppressCoredumpWarning -i 1
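As an example of the shared storage case mentioned above, here is a sketch that generates an extra-config script which also mounts an NFS datastore using esxcli storage nfs add. The NFS_HOST, NFS_SHARE and NFS_NAME values are placeholders I made up for illustration, and the script is written to a scratch location unless OUT is set, so nothing under /var/www/html is touched by default.

```shell
#!/bin/sh
# Sketch: generate an extra-config script that also mounts an NFS datastore.
# NFS_HOST, NFS_SHARE and NFS_NAME are placeholders; OUT defaults to a scratch directory.
OUT=${OUT:-$(mktemp -d)/esxi-arm-extra-config.sh}
NFS_HOST=192.168.30.10
NFS_SHARE=/volume1/arm-datastore
NFS_NAME=arm-nfs

cat > "$OUT" <<EOF
#!/bin/sh

# Suppress UI Warnings
esxcli system settings advanced set -o /UserVars/SuppressShellWarning -i 1
esxcli system settings advanced set -o /UserVars/SuppressCoredumpWarning -i 1

# Mount shared NFS datastore (placeholder values)
esxcli storage nfs add -H ${NFS_HOST} -s ${NFS_SHARE} -v ${NFS_NAME}
EOF

chmod +x "$OUT"
# Syntax check only; the esxcli commands themselves run on the ESXi-Arm host
sh -n "$OUT" && echo "syntax OK: $OUT"
```

Once you are happy with the generated file, copy it to /var/www/html/esxi-arm-extra-config.sh on the PXE Server so that every ESXi-Arm host picks up the same datastore on boot.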

Step 8 - Download the latest official Raspberry Pi Firmware and extract the contents to your local desktop; you should have a folder called firmware-master. This corresponds to the microcode necessary to initialize the Raspberry Pi. Then download the latest community Raspberry Pi 4 UEFI firmware and extract the contents to your desktop; you should have a folder called RPi4_UEFI_Firmware_v1.20. This is the firmware necessary to boot ESXi-Arm.

Step 9 - Delete all files matching kernel*.img within the firmware-master/boot directory and then copy the entire content of the "boot" directory into a new folder called uefi (create it first if it does not already exist).

rm ~/Desktop/firmware-master/boot/kernel*.img

cp -rf ~/Desktop/firmware-master/boot/* uefi

Step 10 - Copy all files within the RPi4_UEFI_Firmware_v1.20 directory into the same uefi directory

cp -rf ~/Desktop/RPi4_UEFI_Firmware_v1.20/* uefi

Note: For 4GB Pi 4 only, edit the config.txt file in the uefi directory and append gpu_mem=16.

Step 11 - Zip up the contents of the "uefi" folder and not the folder itself. On a Mac, this can be done by changing into the folder and running the following command from within the folder itself, which creates uefi.zip in the parent directory:

zip -r ../uefi.zip *

Step 12 - SCP the uefi.zip file to our PXE Server and place it under /srv/tftproot.

Step 13 - Run the following commands to create our UEFI directory and unzip the contents of the uefi.zip file into it:

mkdir /srv/tftproot/rpi-uefi-1.20/

cd /srv/tftproot/rpi-uefi-1.20/

unzip ../uefi.zip

Step 14 - We need to obtain the serial number of our rPI as it expects the UEFI files to be placed in a directory with that ID. You can easily do this by just powering on the rPI and the serial will be displayed under the "board:" line as shown in the screenshot. In my example below, it is 49a6ff15.



Log in to the Kickstart server and create a symlink from the rPI serial to our UEFI files, which are stored in /srv/tftproot/rpi-uefi-1.20/, by running the following command from within /srv/tftproot:

ln -s /srv/tftproot/rpi-uefi-1.20/ 49a6ff15
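For those with more than one rPI, the symlink step can be wrapped in a small function so that every serial points at the same UEFI directory. The sketch below uses absolute paths so it does not depend on the current directory, and it runs against a scratch directory for the demo. The link_pi_serial name is hypothetical, and on the real server the first argument would be /srv/tftproot.

```shell
#!/bin/sh
# Hypothetical helper: point <tftp_root>/<serial> at the shared UEFI directory.
link_pi_serial() {
    tftp_root=$1
    uefi_dir=$2
    serial=$3
    # -f overwrites an existing link, -n stops ln from descending into it
    ln -sfn "$tftp_root/$uefi_dir" "$tftp_root/$serial"
}

# Demo in a scratch directory; on the real server tftp_root would be /srv/tftproot
demo_root=$(mktemp -d)
mkdir "$demo_root/rpi-uefi-1.20"
link_pi_serial "$demo_root" rpi-uefi-1.20 49a6ff15
link_pi_serial "$demo_root" rpi-uefi-1.20 49a6ff15   # safe to repeat
readlink "$demo_root/49a6ff15"
```

Because the function can be run repeatedly without creating a nested link, it is easy to drop into a loop over a list of serials.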

Step 15 - Finally, enable and start both dnsmasq and apache2 services by running the following commands:

systemctl enable dnsmasq

systemctl start dnsmasq

systemctl enable apache2

systemctl start apache2

You are now ready to power up your rPI and see the stateless magic happen! Not only is this an easy way to deploy ESXi-Arm, especially with the 180-day evaluation period, but it is also a super simple way to try out newer versions of the ESXi-Arm Fling without much hassle, especially for those that have more than one device.

Troubleshooting

The default extra.tgz payload has been configured to log directly to the ESXi Console during boot up and also to /var/log/syslog on the ESXi-Arm host. You can simply grep for the keyword "STATELESS-DEBUG" to see what is happening.

Below is a log snippet for an initial deployment:

[STATELESS-DEBUG] Enabling and Starting SSH

[STATELESS-DEBUG] Enabling and Starting ESXi-Arm Shell

[STATELESS-DEBUG] Enabling httpClient on ESXi-Arm Firewall

[STATELESS-DEBUG] Processing ESXi-Arm Boot Options

[STATELESS-DEBUG] Downloading ESXi-Arm Configuration File

[STATELESS-DEBUG] Configuring NTP

[STATELESS-DEBUG] Downloading ESXi-Arm Extra Configuration Script

[STATELESS-DEBUG] Running esxi-arm-extra-config.sh

[STATELESS-DEBUG] Running join-vcenter.py

[STATELESS-DEBUG] jsonConfigData={vcenter_pass: VMware1!, ntp_server: pool.ntp.org, vcenter_server: 192.168.30.200, vcenter_datacenter: Arm-Datacenter, vcenter_cluster: Arm-Cluster, vcenter_user: *protected email*}

[STATELESS-DEBUG] Creating AddHost Spec

[STATELESS-DEBUG] hostAddSpec={vmFolder: null, port: 443, userName: root, sslThumbprint: 25:B3:EC:4C:D1:68:E3:4B:29:2F:AC:CF:BB:E0:2A:F2:7D:F1:2F:23, vimAccountName: null, lockdownMode: null, dynamicType: null, dynamicProperty: [], hostName: 192.168.30.91, managementIp: null, hostGateway: null, force: true, password: , vimAccountPassword: null}

[STATELESS-DEBUG] Joining vCenter Server

Upon a reboot or power cycle, one thing I needed to consider was that the previous ESXi-Arm host which was added to vCenter Server is now in a disconnected state, and this would cause re-connecting to fail since the ESXi-Arm host IP/Hostname has been seen before. This is automatically handled by checking to see if the ESXi-Arm IP exists in vCenter and if so, removing that entry prior to re-adding. You will know that the ESXi-Arm host has gone through a reboot by the additional log entry of "Removing previous ESXi-Arm instance X" where X is the IP Address.

Below is a log snippet of a reboot or power cycle:

[STATELESS-DEBUG] Enabling and Starting SSH

[STATELESS-DEBUG] Enabling and Starting ESXi-Arm Shell

[STATELESS-DEBUG] Enabling httpClient on ESXi-Arm Firewall

[STATELESS-DEBUG] Processing ESXi-Arm Boot Options

[STATELESS-DEBUG] Downloading ESXi-Arm Configuration File

[STATELESS-DEBUG] Configuring NTP

[STATELESS-DEBUG] Downloading ESXi-Arm Extra Configuration Script

[STATELESS-DEBUG] Running esxi-arm-extra-config.sh

[STATELESS-DEBUG] Running join-vcenter.py

[STATELESS-DEBUG] jsonConfigData={vcenter_datacenter: Arm-Datacenter, vcenter_cluster: Arm-Cluster, vcenter_pass: VMware1!, vcenter_user: *protected email*, ntp_server: pool.ntp.org, vcenter_server: 192.168.30.200}

[STATELESS-DEBUG] Removing previous ESXi-Arm instance 192.168.30.91

[STATELESS-DEBUG] hostAddSpec={lockdownMode: null, userName: root, vimAccountName: null, hostName: 192.168.30.91, port: 443, hostGateway: null, vmFolder: null, force: true, dynamicType: null, sslThumbprint: E5:14:E0:A4:9F:AE:D8:4F:57:DF:01:5D:BD:B2:C0:A6:4F:5E:FC:A9, managementIp: null, password: , vimAccountPassword: null, dynamicProperty: []}

[STATELESS-DEBUG] Joining vCenter Server
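To tell the two cases apart without reading through the whole log, the grep can be wrapped in a couple of small helpers. This is only a sketch: stateless_log and rejoined_ip are hypothetical names, and the demo runs against a few sample lines copied from the snippets above rather than a real log file.

```shell
#!/bin/sh
# Hypothetical helpers for reading the STATELESS-DEBUG entries from a log file.
stateless_log() {
    grep -F '[STATELESS-DEBUG]' "$1"
}

# Prints the IP of the removed instance if the log shows a re-join after a reboot
rejoined_ip() {
    stateless_log "$1" | sed -n 's/.*Removing previous ESXi-Arm instance \(.*\)$/\1/p'
}

# Demo against a few sample lines taken from the snippets above
sample=$(mktemp)
cat > "$sample" <<'EOF'
unrelated line
[STATELESS-DEBUG] Running join-vcenter.py
[STATELESS-DEBUG] Removing previous ESXi-Arm instance 192.168.30.91
[STATELESS-DEBUG] Joining vCenter Server
EOF

stateless_log "$sample"
rejoined_ip "$sample"
```

An empty result from rejoined_ip means the log is from an initial deployment, while an IP Address means the host was removed and re-added after a reboot.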