Okay, maybe I am being a bit dramatic 😉 However, this is what crosses my mind every time I hear someone trying to query for something in the VCDB (vCenter Server Database) instead of using the vSphere API. Every so often I see a customer request asking for a specific SQL Query to find something in the VCDB. It just baffles me on why someone would want to go through such pain, not to mention this is not supported nor a recommended best practice.

Disclaimer: No kittens were harmed during the writing of this article

There are many disadvantages by going directly to the VCDB versus going through the vSphere API:

  • There are no guarantees on backwards compatibility of the database schema or database views from release to release
  • The data is in a very raw form, there is no abstraction which is usually provided by an API to make consumption easier
  • Risk of crafting queries that could negatively impact the performance of vCenter Server
  • Not officially supported by VMware, unless directed by VMware GSS
  • Pulling hair out while trying to understand the internal relationships of various objects, especially with a complex system like vCenter Server

Although the vSphere data is eventually persisted in the VCDB, the main purpose of the vSphere API, like many other software APIs is to provide a well defined interface for interacting with the underlying platform. I think there are still many vSphere/System Administrators who fear the word "API". Instead of being afraid of APIs, we should all embrace it and start to better understand how they work as it is just a way of interacting with the underlying system.

I can speak from my own personal experience, the first task for me out of college and into the real world was to look into building an application impact assessment report when an vSphere HA event occurred. We knew which Virtual Machines were restarted, but we did not have a good idea of what applications were affected. I had to correlate the impacted Virtual Machines with a list of applications and application owner from our homegrown CMDB which ran on MS SQL Server. At the time, I did not know any better and I thought best way of getting this information was to go directly to the VCDB (at the time this was Virtual Center 2.5). I found myself creating super complex SQL queries and trying to reverse engineer the relationships between all the different tables to get basic information from vCenter Server. If I knew what I know now back then, I would have gone straight to learning the VMware APIs, as I could have gotten everything I needed in just a couple of lines of code. After completing the project among few others which I leverage the VCDB directly, I realized that I needed to learn this VMware API. This early lesson has greatly helped me in my career to further automate IT/VMware infrastructure and help us move towards this new world of a Software-Defined Datacenter.

The diagram below provides an overview of some of the ways the vSphere API can be consumed by customers. An example is the vSphere Web Client and the legacy vSphere C# Client, this is a UI interface that customers can use to interact with the vSphere platform which in turns uses the vSphere API. There are also a variety of CLIs and scripting/programming languages that are built for easily extracting information and performing operations against vSphere that could be as simple as one-liner. The vSphere API can also be consumed by 3rd party companies to further abstract and provide new and interesting ways of interacting with vSphere, a great example of this is the CloudPhysics Card Builder platform that makes creating custom vSphere reports a breeze.

Note: The diagram above is just an example of some of the SDK/CLI/UIs that VMware provides to our customers. This is not meant as an exhaustive list and you can find all SDKs/CLIs for vSphere here.

Hopefully with this information, customers will better understand the benefits of using the vSphere API versus going directly to the VCDB.

14 thoughts on “A kitten dies, every time you query the VCDB

  1. William, I feel bad as we’re an org that make use of our SQL DBA’s to gather monthly reporting data from VCDB. I would like to change but have little knowledge when it comes to using the API. I can get thing going with PowerCli and get great assistance from the PowerCli Community.
    Is there such a support community when it comes to say “pyvmomi” or is that purely up to an individual with some python skills?
    Thanks for the great work.

  2. PROS of direct sql query of VCDB:
    -Getting information from Vcenter is a LOT slower than getting directly from VCDB
    -Getting it with PowerCLI can take hours even with optimized query

    • When you say “slower” is this through the use of PowerCLI? The reason I ask is often times customers use PowerCLI to perform reporting and don’t realize their queries are not optimized and even in the case that they are, it may not always be the most efficient as PowerCLI provides an abstraction against the vSphere API to provide ease of use. As with anything, you will always be giving something up for ease of use. You can get quite efficient when leveraging the API directly and I’ve seen customers including internal solutions that can query several thousand objects and get real time updates without issues. It’s all about how it’s done, so though PowerCLI may “seem slower” compared to the VCDB, it’s not the only way. The point of this article is that for any data queries, you should be leveraging the API as that is the “contract” VMware provides for a consistent/documented way of getting the data.

  3. Suppose a person doesn’t know how to access the vCenter API (where is this information?) or just wants to return a small amount of data from CLI. For example I want to automate system monitoring and don’t want to have to manually open a GUI to get the information I need.


  4. Thank you. At first glance, it doesn’t make much sense to me, but I will see if I can get the information I need from one of those links on the page.

    • Hi Lewis,

      That’s a typo on my end, I think I was trying to make the boxes “even” and I sort of assumed vSphere SDK for .NET under “PowerCLI” which though it does use, are two separate things. I’ll append a note so others are aware but .NET SDK is healthy and has a bright future. Sorry about that

  5. No kitten dies accessing VCDB without the VM SDKs – they died, because the data model is not fully documented.

    • It’s apparent you didn’t actually read the article or else you would not be making that statement. If you prefer to do things the painful way, more power to you 🙂

  6. Thanks, this article could not have come at a better time. I was just having conversation with my colleague about this because he is working with our DBAs to build a reporting application by quering VCDB directly. He wants monthly report going to managers showing how many resources their VMs are consuming (based on owner identified in custom annotation field of VM and looking up their manager in CMDB). Can this type of report be built using Cloud Physics or another tool?

Thanks for the comment!

This site uses Akismet to reduce spam. Learn how your comment data is processed.