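1. Symptom

In production, a filesystem repair on a compute node caused data loss in the image files of some instances. The affected instances could no longer start. The nova log on the corresponding compute node shows the following error: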

nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2560, in power_on
nova-compute: self._hard_reboot(context, instance, network_info, block_device_info)
nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2449, in _hard_reboot
nova-compute: vifs_already_plugged=True)
nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5191, in _create_domain_and_network
nova-compute: destroy_disks_on_failure)
nova-compute: File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
nova-compute: self.force_reraise()
nova-compute: File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
nova-compute: six.reraise(self.type_, self.value, self.tb)
nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5163, in _create_domain_and_network
nova-compute: post_xml_callback=post_xml_callback)
nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5081, in _create_domain
nova-compute: guest.launch(pause=pause)
nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 145, in launch
nova-compute: self._encoded_xml, errors='ignore')
nova-compute: File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
nova-compute: self.force_reraise()
nova-compute: File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
nova-compute: six.reraise(self.type_, self.value, self.tb)
nova-compute: File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 140, in launch
nova-compute: return self._domain.createWithFlags(flags)
nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit
nova-compute: result = proxy_call(self._autowrap, f, *args, **kwargs)
nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call
nova-compute: rv = execute(f, *args, **kwargs)
nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute
nova-compute: six.reraise(c, e, tb)
nova-compute: File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker
nova-compute: rv = meth(*args, **kwargs)
nova-compute: File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1065, in createWithFlags
nova-compute: if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
nova-compute: libvirtError: internal error: process exited while connecting to monitor: 2020-03-16T01:44:43.128499Z
qemu-kvm: -drive file=/os_instance/3dc75704-f729-4c33-865b-313f0e8a8df8/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none:
qcow2: Image is corrupt; cannot be opened read/write
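
When several instances are affected, it can help to pull the image paths out of the log rather than reading each traceback. The following is a minimal sketch, assuming the error text has the same shape as the message above (possibly wrapped across lines); the name `find_corrupt_images` is hypothetical and not part of nova or QEMU.

```python
import re

# Matches the "-drive file=<path>,format=qcow2" fragment that qemu-kvm prints
# just before the "Image is corrupt" message. The log may wrap the message
# across lines, so whitespace (including a newline) is allowed in between.
_CORRUPT_RE = re.compile(
    r"-drive file=(?P<path>[^,\s]+),format=qcow2[^\n]*:\s*"
    r"qcow2: Image is corrupt"
)


def find_corrupt_images(log_text):
    """Return the unique image paths reported as corrupt, in order of appearance."""
    seen = []
    for match in _CORRUPT_RE.finditer(log_text):
        path = match.group("path")
        if path not in seen:
            seen.append(path)
    return seen
```

The paths returned this way are the files to pass to qemu-img check.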

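2. Fix

Change into the directory that contains the instance's disk and run qemu-img check disk to check the consistency of the image data. In this case the check reported many errors. Run qemu-img check -r all disk to repair the disk image, then restart the instance.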
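
The steps above can be sketched as a small script. This is an illustration under stated assumptions, not a definitive procedure: it assumes the instance is shut off so nothing else has the image open, that qemu-img is on the PATH, and that there is enough free space for a backup copy. The helper names `check_command` and `repair_image` and the `.bak` suffix are hypothetical.

```python
import shutil
import subprocess


def check_command(image, repair=None):
    """Build the qemu-img check argv; repair is None, "leaks" or "all"."""
    if repair not in (None, "leaks", "all"):
        raise ValueError("repair must be None, 'leaks' or 'all'")
    cmd = ["qemu-img", "check"]
    if repair:
        cmd += ["-r", repair]
    cmd.append(image)
    return cmd


def repair_image(image, backup_suffix=".bak"):
    """Check an image; if it is corrupted or leaking, back it up and repair it.

    Returns the exit code of the last qemu-img invocation.
    """
    rc = subprocess.call(check_command(image))
    # Exit codes 2 (corrupted) and 3 (leaked clusters) are the repairable ones;
    # anything else (clean, internal error, unsupported format) is returned as-is.
    if rc not in (2, 3):
        return rc
    shutil.copy2(image, image + backup_suffix)  # keep a copy before modifying
    return subprocess.call(check_command(image, repair="all"))
```

For the instance in the log above, this would be called as `repair_image("/os_instance/3dc75704-f729-4c33-865b-313f0e8a8df8/disk")`. A return value of 0 means the image is consistent after the repair.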

3. The qemu-img check command in detail

qemu-img check [-f fmt]  [--output=ofmt]  [-r [leaks | all]]  filename

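Performs a consistency check on a disk image file and looks for errors in it. At the time of writing, the check supported only the "qcow2", "qed", and "vdi" formats; later QEMU releases add more, so consult the man page of the installed version. qcow2 is the image format introduced in QEMU 0.8.3 and is the most widely used format today. qed (QEMU enhanced disk) is an enhanced disk format added in QEMU 0.14 to avoid some drawbacks of qcow2 and to improve performance, but it was not yet mature at the time of writing. vdi (Virtual Disk Image) is the storage format used by Oracle's VirtualBox.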

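The -f fmt option specifies the image format; if it is omitted, qemu-img detects the format automatically. The --output=ofmt option selects the output format, either human or json. filename is the name of the disk image file, including its path.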
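
For scripted checks, the JSON form of the report is easier to consume than the human-readable one. The following is a minimal sketch, assuming the report was produced with `qemu-img check --output=json` and uses the `corruptions`, `leaks`, and `check-errors` fields; the function name `summarize_check` is hypothetical.

```python
import json


def summarize_check(report_json):
    """Summarize the JSON text printed by `qemu-img check --output=json`."""
    report = json.loads(report_json)
    # Counters may be absent from the report, so default each one to 0.
    return {
        "corruptions": report.get("corruptions", 0),
        "leaks": report.get("leaks", 0),
        "check_errors": report.get("check-errors", 0),
    }
```

A non-zero `corruptions` count corresponds to the "Image is corrupt" situation described above, while a report with only `leaks` indicates wasted space rather than damaged data.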

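If -r is specified, qemu-img tries to repair any inconsistencies found during the check. Back up the disk file before running qemu-img check -r. -r leaks repairs only cluster leaks, whereas -r all fixes all kinds of errors, at a higher risk of choosing the wrong fix or hiding corruption that has already occurred.

The command returns an exit code, and each value indicates a different result. With -r, the code describes the state of the image after the repair attempt: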

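0 Check completed, the image is (now) consistent

1 Check not completed because of internal errors

2 Check completed, the image is corrupted

3 Check completed, the image has leaked clusters but is not corrupted

63 Checks are not supported by the image format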
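
When qemu-img check is called from a script, the exit codes listed above can be turned into readable messages. This is a minimal sketch; the names `CHECK_EXIT_CODES` and `describe_exit_code` are hypothetical, and the messages simply restate the list above.

```python
# Meaning of each qemu-img check exit code, as listed above.
CHECK_EXIT_CODES = {
    0: "check completed, image is (now) consistent",
    1: "check not completed because of internal errors",
    2: "check completed, image is corrupted",
    3: "check completed, image has leaked clusters but is not corrupted",
    63: "checks are not supported by the image format",
}


def describe_exit_code(code):
    """Return a readable description for a qemu-img check exit code."""
    return CHECK_EXIT_CODES.get(code, "unknown exit code %d" % code)
```

For example, `describe_exit_code(subprocess.call(["qemu-img", "check", "disk"]))` (with `subprocess` imported) would describe the result of checking a file named disk in the current directory.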

05-28 01:37