Fixing Invalid Virtual Machines in VMware


Sometimes virtual machines in your VMware environment show up as invalid. The machine is in fact still running at this point, but you are unable to manage it. This can happen for a few reasons, but in my experience the most common is that the ESX host is unable to access the storage.

I have seen this caused by high latency when accessing an NFS datastore, by leaving an offline datastore mounted for an extended period, and by a SAN controller failover event.

This condition happens when the ESX host cannot access the .vmx file (the VMware configuration file) on the datastore, so VMware effectively forgets the configuration of the machine. The process that runs the virtual machine is still running on the host, so your VM does not crash. You still need to recover, though, since HA, DRS, and similar features no longer work while the VM is in this state.

If this is becoming a common occurrence in your environment, you will need to do additional troubleshooting to determine why it is happening. But to get out of the mess you are in right now, you need to re-register the VM. One way to do that is to shut down the virtual machine, remove it from inventory, browse the datastore, and then re-add it to inventory.

But my preferred method, which does not involve shutting down your VM, is the following:
1. Log in to vCenter and locate the VM in your inventory
2. Click on the VM, select Summary, and note the host the VM is running on
3. Enable SSH on that host (steps 4 and 5 walk through this)
4. Go into the Hosts and Clusters view in vCenter
5. Select the host, then:

* Click the Configuration tab
* Click on Security Profile
* Click on SSH
* Click Options
* Click Start

6. SSH into the host (if you have not created any other users, you will need to use your root credentials to connect) and type the following command:
vim-cmd vmsvc/getallvms | grep invalid
The above will list all the VMs which are currently invalid
7. Note the VM ID number for the VM you wish to repair
8. Type the following command, replacing <vm-id> with the VM ID from the previous step:
vim-cmd vmsvc/reload <vm-id>
For example:
~ # vim-cmd vmsvc/getallvms | grep invalid
Skipping invalid VM '34'
~ # vim-cmd vmsvc/reload 34

9. Wait 60 seconds. The VM will re-register and should no longer display as invalid.
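
If you have several invalid VMs at once, steps 6 through 8 can be rolled into a small script. Treat this as a rough sketch rather than a tested tool: it assumes the host prints the "Skipping invalid VM '<id>'" message in exactly the form shown above, and the function names and the VIM_CMD variable are my own inventions for illustration, not part of vim-cmd.

```shell
#!/bin/sh
# Sketch: reload every VM that the host reports as invalid.
# Assumption: invalid VMs appear as lines like  Skipping invalid VM '34'

# Hypothetical override so the loop can be dry-run (e.g. VIM_CMD=echo).
VIM_CMD="${VIM_CMD:-vim-cmd}"

# Read getallvms output on stdin, print one invalid VM ID per line.
extract_invalid_ids() {
    sed -n "s/^Skipping invalid VM '\([0-9][0-9]*\)'.*/\1/p"
}

# Read VM IDs on stdin and run "vmsvc/reload" for each one.
reload_ids() {
    while read -r id; do
        echo "Reloading VM $id"
        # </dev/null keeps the command from consuming the remaining IDs
        "$VIM_CMD" vmsvc/reload "$id" < /dev/null
    done
}

# On the host, the pieces would be chained like this:
#   vim-cmd vmsvc/getallvms 2>&1 | extract_invalid_ids | reload_ids
```

Setting VIM_CMD=echo first prints the reload commands instead of running them, which is a cheap way to check which IDs would be touched.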

If you perform the steps above but have trouble at step 7 finding the VM ID that you need to run this against, you can take out the grep command and just run vim-cmd vmsvc/getallvms. This will list the IDs of all the VMs on this ESX host. You might see the name of the VM in question, or you might see one VM ID where it shows an error while retrieving the information. When you see that, it should give you an indication that you have found the VM you are trying to fix.
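
On a host with a lot of VMs, the full list can be long, so a small filter helps when you know part of the VM's name. This is a sketch under one assumption: that the header row of the getallvms output starts with "Vmid". The find_vm name is hypothetical, just for this example.

```shell
#!/bin/sh
# Sketch: keep the header row plus any row containing a name fragment.
# Assumption: the getallvms header line begins with "Vmid".
find_vm() {
    awk -v name="$1" '/^Vmid/ || index($0, name) > 0'
}

# On the host:
#   vim-cmd vmsvc/getallvms | find_vm web01
```

The first column of any matching row is the VM ID to pass to vim-cmd vmsvc/reload.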