- Preface
- Core services may not be running
- Two different host types
- No Shared volume
- Not enough quota
- AppArmor in place
- The source compute host swapped
- No valid host found
- Further reading
Preface
Recently I faced an interesting question during an interview: one of the virtual machines is in an error state, how will you troubleshoot it? Well, that's a general question, and everyone has run into it at least once. But the real question is where to start troubleshooting. Before you start, you need to understand which core issue caused the error status. Here I try to explain a few scenarios that put a guest VM into an error state.
Core services may not be running
nova service-list
neutron agent-list
rabbitmqctl cluster_status
cinder service-list
systemctl status keystone.service
Every service should report as up, or give you a smiley :-). If not, check the reason for the failure.
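To spot a failed service quickly, you can filter the table output of the commands above. This is only a minimal sketch: the helper name find_down_rows is my own, and it assumes that nova and cinder print "down" in the State column and that neutron prints "xxx" for a dead agent.

```shell
# Hypothetical helper: read a service table on stdin and print only the rows
# whose cell is exactly "down" (nova, cinder) or "xxx" (neutron dead agent).
find_down_rows() {
  # "|| true" keeps the exit status at 0 when every service is healthy
  grep -E '\|[[:space:]]*(down|xxx)[[:space:]]*\|' || true
}
```

You would use it like `nova service-list | find_down_rows`. Empty output means nothing is reported as down.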
Two different host types
When you live migrate a guest VM from one host to another host with different computing capabilities, for example fewer CPU cores, the VM may end up in an error state.
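One way to compare two hosts is to look at their CPU flags. The sketch below is an assumption on my side, not something nova does for you: missing_cpu_flags is a hypothetical helper that takes the flags of the source host and of the destination host as two space-separated strings (for example taken from the flags line of /proc/cpuinfo on each host) and prints the flags the destination lacks.

```shell
# Hypothetical helper: print every CPU flag present on the source host ($1)
# but missing on the destination host ($2). Both are space-separated lists.
missing_cpu_flags() {
  for flag in $1; do
    case " $2 " in
      *" $flag "*) ;;                 # destination has this flag too
      *) printf '%s\n' "$flag" ;;     # destination lacks it
    esac
  done
}
```

For example, `missing_cpu_flags "sse4_2 avx avx2" "sse4_2 avx"` prints avx2, a hint that the destination host may not be able to run the guest's CPU model.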
No Shared volume
Unless you use block migration, live migration is possible only when shared storage is available to both the source and the destination host. If there is no shared volume available, the virtual machine will fall into an error state.
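A rough way to check this on a compute host is to look at the filesystem type behind the instances directory. The sketch below is illustrative only: fs_kind is a hypothetical helper, and its list of shared filesystem types is my assumption and certainly not complete.

```shell
# Hypothetical helper: classify a filesystem type name as "shared" or "local".
# The list of shared types is an assumption; extend it for your environment.
fs_kind() {
  case "$1" in
    nfs*|ceph|fuse.glusterfs|glusterfs|gfs2|ocfs2) echo shared ;;
    *) echo local ;;
  esac
}
```

On a host with GNU coreutils you could feed it with `fs_kind "$(df --output=fstype /var/lib/nova/instances | tail -n 1)"`, assuming /var/lib/nova/instances is your instances path. Run it on both the source and the destination host.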
Not enough quota
A user is spinning up virtual machines, but their resource quota is used up, for example for neutron ports. The virtual machine will end up in an error state.
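The check itself is simple arithmetic: compare what is already in use against the limit. Here is a small sketch with a hypothetical helper quota_state. It assumes you already looked up the two numbers yourself (for example the number of ports in the project and the port limit from the quota output), and that a limit of -1 means unlimited.

```shell
# Hypothetical helper: $1 = resources in use, $2 = quota limit.
# Prints "exhausted" when no headroom is left, otherwise "ok".
# A negative limit (-1) is treated as unlimited.
quota_state() {
  used=$1
  limit=$2
  if [ "$limit" -ge 0 ] && [ "$used" -ge "$limit" ]; then
    echo exhausted
  else
    echo ok
  fi
}
```

For example, `quota_state 50 50` prints exhausted, while `quota_state 10 50` and `quota_state 999 -1` both print ok.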
AppArmor in place
AppArmor (Application Armor) is a Linux kernel security module that allows the system administrator to restrict programs' capabilities with per-program profiles. If you didn't implement a proper profile for libvirt, the virtual machine may end up in an error state during live migration.
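AppArmor writes its denials to the kernel log, so that is a good place to look. The following is a minimal sketch: libvirt_denials is a hypothetical helper that reads log lines on stdin and keeps only the denials whose profile name contains libvirt. How your libvirt profiles are actually named is an assumption here.

```shell
# Hypothetical helper: keep only AppArmor denials for libvirt-related profiles.
libvirt_denials() {
  # "|| true" keeps the exit status at 0 when there are no denials
  grep 'apparmor="DENIED"' | grep 'profile="[^"]*libvirt' || true
}
```

You could run it as `dmesg | libvirt_denials` on the source and the destination compute host right after a failed migration.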
The source compute host swapped
The source compute host where the guest VM resides is already swapping, and the guest VM has started using swap. If you try to live migrate the instance to a different host in this situation, your virtual machine may end up in an error state.
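Before migrating, you can check how much swap the source host is using. This sketch reads the standard SwapTotal and SwapFree lines in the /proc/meminfo format. The helper name swap_used_kb is hypothetical.

```shell
# Hypothetical helper: read /proc/meminfo-style lines on stdin and print the
# amount of swap in use, in kB (SwapTotal minus SwapFree).
swap_used_kb() {
  awk '/^SwapTotal:/ {t = $2} /^SwapFree:/ {f = $2} END {print t - f}'
}
```

Run it as `swap_used_kb < /proc/meminfo` on the source compute host. A value well above zero suggests the host is under memory pressure.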
No valid host found
Sometimes you have enough resources in your compute inventory, but you may be missing a metadata entry in your aggregate configuration. In this case, nova-scheduler looks for an aggregate whose metadata matches the flavor properties and cannot find one. Nova then marks the virtual machine as being in an error state with the message No valid host was found.
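To find the missing entry, you can compare the flavor properties with the aggregate metadata. The sketch below assumes you have already written both into plain files with one key=value pair per line, and that keys are spelled the same way in both. The helper name missing_specs and the file layout are my own assumptions, not nova output formats.

```shell
# Hypothetical helper:
#   $1 = file with flavor properties, one key=value per line
#   $2 = file with aggregate metadata, one key=value per line
# Prints every flavor property that has no exact match in the aggregate metadata.
missing_specs() {
  # -F fixed strings, -x whole-line match, -v invert, -f patterns from file
  grep -Fxv -f "$2" "$1" || true
}
```

If the flavor file contains ssd=true and gpu=true while the aggregate file only contains ssd=true, the helper prints gpu=true, which is the metadata entry the scheduler could not find.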