As some of you may have heard, there is a known issue with NFS-based datastores (including VSA NFS datastores) after upgrading to vSphere 5.5 Update 1. The issue causes NFS datastores to disconnect and go into an APD (All Paths Down) state. VMware is aware of the problem, and you can follow KB 2076392 for the latest updates.

While going through my Twitter stream this morning, I noticed an interesting question from fellow blogger and friend Jase McCarty, who asked the following:

[Image: vsphere55u1-nfs-apd-alarm-2]

I was quite surprised to hear that there were no vCenter Alarms being triggered for this issue. I decided to take a look at the KB to better understand the symptoms and see if there was anything I could do to help. From what I can tell, the only way to identify this particular problem is by looking at the logs, and the KB includes an example of what you would see.

Once I took a look at the logs, I knew there were at least two ways to get alerts. One option would be to leverage vCenter Log Insight and create a query based on the particular string, but not every customer is using Log Insight and it does require a bit of setup. The second, more obvious option for me would be to key off of the VMkernel VOBs that are being generated, an approach I have written about in the past for detecting duplicate IP addresses for ESXi and the VSAN component threshold count.

Here are the steps to create the vCenter Alarm:

Step 1 - Create a new vCenter Alarm and give it a name. Select "Hosts" for "Monitor" and "Specific event occurring ..." for "Monitor for".

[Image: vsphere55u1-nfs-apd-alarm-0]

Step 2 - For the Trigger, add the following VOB entries (just copy/paste them in):

  • esx.problem.storage.apd.start
  • esx.problem.vmfs.nfs.server.disconnect
  • esx.problem.storage.apd.timeout

Note: The alarm will activate if ANY of the VOBs are seen, since the triggers are combined with an OR statement. It would have been nice to be able to group these together to generate the alarm.

[Image: vsphere55u1-nfs-apd-alarm-1]

Once the alarm has been created, you will at least have a way to get notified if you are potentially affected by this problem. I would still highly recommend you subscribe to KB 2076392 for all the latest updates.
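
If you would rather script the alarm than click through the UI, here is a rough sketch of what that could look like with pyVmomi. Treat it as an illustration under a few assumptions, not a tested recipe: the alarm name and description are made up, the "red" status is my choice, and I am assuming the VOBs surface as `EventEx` events with the VOB string as the event type ID. The helper `would_trigger` is hypothetical and only mirrors the OR behavior described in the note above.

```python
# VOB event type IDs listed in Step 2 of this post.
NFS_APD_VOB_IDS = (
    "esx.problem.storage.apd.start",
    "esx.problem.vmfs.nfs.server.disconnect",
    "esx.problem.storage.apd.timeout",
)


def would_trigger(event_type_id, vob_ids=NFS_APD_VOB_IDS):
    """Mirror the alarm's OR semantics: any single matching VOB is enough."""
    return event_type_id in vob_ids


def build_alarm_spec(name="ESXi 5.5u1 NFS APD"):
    """Build an alarm spec with one event expression per VOB, OR-ed together.

    The default name is a placeholder; pick whatever suits your environment.
    """
    # Imported lazily so would_trigger() works without pyVmomi installed.
    from pyVmomi import vim

    expressions = [
        vim.alarm.EventAlarmExpression(
            eventType=vim.event.EventEx,   # assumption: VOBs arrive as EventEx
            eventTypeId=vob_id,
            objectType=vim.HostSystem,     # "Hosts" in Step 1
            status="red",                  # assumption: alert status is a choice
        )
        for vob_id in NFS_APD_VOB_IDS
    ]
    return vim.alarm.AlarmSpec(
        name=name,
        description="Fires when any of the NFS APD related VOBs is seen",
        enabled=True,
        expression=vim.alarm.OrAlarmExpression(expression=expressions),
        setting=vim.alarm.AlarmSetting(toleranceRange=0, reportingFrequency=0),
    )
```

With an already connected session, the spec would then be handed to the alarm manager, for example `content.alarmManager.CreateAlarm(entity=content.rootFolder, spec=build_alarm_spec())`, where `content` is whatever you retrieved from your own service instance.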

9 thoughts on “How to create vCenter Alarm to alert on ESXi 5.5u1 NFS APD issue?”

  1. Is there a way the alarm triggers are reported in the FAT client v/s web Client?
    I have the screenshots, not sure if I can attach to the comment.

  2. The alarms are nice, but I’ve noticed two things about them: 1) they never go from red to green after being tripped, and 2) there’s no information about the datastore that tripped the alarm.

    Yes, the instructions above indicate that there are limitations in the way the alarm trigger works (the “or vs and” factor), but it’s sort of weird to see these alarms tripped after upgrading to 5.5U2 _and_ removing NFS stores from the cluster…

    • Jim,

      1) I forget offhand if you could create an alarm that will send an alert but not stay red. For most cases, admins would want to see it and then ACK, else you never know when an alarm was fired off unless you were watching it.

      2) You’re right, this is an area we could improve in. I would guess that if you were using the API, you could pull more information about the object that tripped the alarm. I thought this was possible within the Events view when an alarm tripped, but I haven’t tested it myself.

      Thanks for the comment!