Question
Is it possible for a process inside a VM guest to use the VMX (AMD-V, VT-x) CPU instructions and have them processed by the outer VMM instead of directly by the CPU?
Edit: Assume that the outer VMM uses VMX itself to manage its virtual guest machine (i.e. it runs in Ring -1).
If it is possible, are there any VMM implementations (VMware, Parallels, KVM, ...) that support emulating/intercepting VMX calls?
Recommended answer
Neither Intel's VT-x nor AMD's AMD-V supports fully recursive virtualization in hardware, where the CPU would keep a hierarchy of nested virtualized environments in the same fashion as a call/ret pair.
A logical processor supports only two modes of operation: the host mode (called VMX root mode in Intel's terminology and hypervisor mode in AMD's) and the guest mode (called as such in AMD's manuals and VMX non-root mode in Intel's).
This implies a flattened hierarchy where the CPU treats every virtualized environment the same: it is unaware of how many levels deep the hierarchy of VMs is.
An attempt to use the virtualization instructions themselves inside a guest will yield control to the monitor (VMM).
However, some support for accelerating frequently used virtualization instructions has appeared recently, making nested VMs possible.
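From inside a Linux guest, one quick way to see whether the outer VMM exposes the virtualization extensions at all is to look for the vmx (Intel) or svm (AMD) flag in /proc/cpuinfo. The sketch below only parses text in that format; the sample string is a made-up excerpt, and the helper name is hypothetical.

```python
def virtualization_flags(cpuinfo_text):
    """Return the hardware-virtualization flags ('vmx', 'svm') found in the text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            found |= {"vmx", "svm"} & set(value.split())
    return found


# Made-up excerpt; on a real Linux guest use open("/proc/cpuinfo").read() instead.
sample = "processor\t: 0\nflags\t\t: fpu vme de pse vmx ept\n"
print(virtualization_flags(sample))
```

An empty result inside a guest suggests the outer VMM does not expose nested virtualization to it.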
I'll try to analyse the issues that must be faced to implement nested virtualization.
I'm not dealing with the whole thing: I'm considering only the base case and leaving out everything that deals with the virtualization of the hardware, a part that is itself as problematic as the virtualization of the software.
Note
I'm not an expert on virtualization technology and have no experience with it at all - corrections are welcome.
The purpose of this answer is to convince the reader that nested virtualization is conceptually possible and to outline the problems to face.
A logical processor enters VMX operation by executing vmxon - as soon as the mode is entered, the processor is in root mode.
Root mode is the mode of the VMM; from it the VMM can launch, resume and handle the VMs.
The VMM then sets the current VMCS (VM Control Structure) with vmptrld - the VMCS contains all the metadata necessary to virtualise a guest.
The VMCS is read and written not with direct memory accesses but with the vmread and vmwrite instructions.
Finally, the VMM executes vmlaunch to start executing the guest.
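The ordering of these steps can be illustrated with a toy model. This is only a sketch: the class, its method names and the field name guest_rip are illustrative stand-ins, a dict plays the role of the VMCS region, and nothing here touches real hardware.

```python
class ToyLogicalProcessor:
    """Toy model of the VMX instruction ordering; not real hardware behaviour."""

    def __init__(self):
        self.mode = "no-vmx"        # "no-vmx", "root" or "non-root"
        self.current_vmcs = None    # the VMCS selected by vmptrld

    def _require_root(self):
        if self.mode != "root":
            raise RuntimeError("instruction needs VMX root mode")

    def _require_vmcs(self):
        self._require_root()
        if self.current_vmcs is None:
            raise RuntimeError("no current VMCS")

    def vmxon(self):
        self.mode = "root"          # VMX operation starts in root mode

    def vmptrld(self, vmcs):
        self._require_root()
        self.current_vmcs = vmcs

    def vmwrite(self, field, value):
        self._require_vmcs()
        self.current_vmcs[field] = value

    def vmread(self, field):
        self._require_vmcs()
        return self.current_vmcs[field]

    def vmlaunch(self):
        self._require_vmcs()
        self.mode = "non-root"      # the guest is now running


cpu = ToyLogicalProcessor()
cpu.vmxon()
cpu.vmptrld({})                     # a dict stands in for the VMCS region
cpu.vmwrite("guest_rip", 0x1000)    # illustrative field name
cpu.vmlaunch()
print(cpu.mode)
```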
Now the logical processor is executing in a virtualized environment.
Suppose the guest is itself a VMM, and let's call it the non-root VMM - it needs to repeat the steps above.
But Intel is clear in its manuals (Manual 3 - Chapter 25.1.2): executing vmxon in VMX non-root operation causes a VM exit.
The root VMM resumes from the instruction after its last vmlaunch, and can inspect the VMCS for the reason of the exit and take appropriate action.
I'm not a seasoned VMM writer, so I'm not sure what exactly the root VMM has to do to emulate this instruction. Executing a vmxon in VMX root mode will fail, and doing a vmxoff followed by a vmxon with the VM region given by the non-root VMM seems a security vulnerability (or a lead to one), so I believe that all the root VMM has to do is record that the guest is now in "VMX root mode".
The quotes are necessary here: this mode exists only in software. When the root VMM hands control back to the non-root VMM, the CPU will be in VMX non-root mode.
After that, the non-root VMM will attempt to use vmptrld to set the current VMCS.
vmptrld will induce a VM exit and the root VMM is in control once again. If the CPU doesn't support VMCS shadowing, the root VMM has to record that the pointer given by the non-root VMM is now the current VMCS. If the CPU does support VMCS shadowing, the root VMM sets the VMCS link pointer field of its own VMCS (the one used to virtualise the non-root VMM) to the VMCS given by the non-root VMM.
One way or another, the root VMM knows which virtualised VMCS is active.
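As a rough illustration of that bookkeeping, here is a minimal sketch of how a root VMM might emulate the vmxon and vmptrld exits purely in software. All names are hypothetical, and the error handling is simplified: a real VMM would inject a fault into the guest rather than raise an exception.

```python
class NestedGuestState:
    """What a hypothetical root VMM remembers about a guest that is itself a VMM."""

    def __init__(self):
        self.in_virtual_vmx_root = False   # set once the guest has executed vmxon
        self.current_guest_vmcs = None     # address the guest passed to vmptrld


def handle_vmx_exit(state, instruction, operand=None):
    """Emulate the VM exit caused by a VMX instruction executed by the guest."""
    if instruction == "vmxon":
        # No vmxon is executed on the CPU: the guest's "root mode" is a software flag.
        state.in_virtual_vmx_root = True
    elif instruction == "vmptrld":
        if not state.in_virtual_vmx_root:
            # Simplification: a real VMM would inject a fault into the guest.
            raise RuntimeError("guest used vmptrld before vmxon")
        # Without VMCS shadowing the pointer is only recorded; with shadowing a
        # VMM would also store it in the VMCS link pointer field of its own VMCS.
        state.current_guest_vmcs = operand
    else:
        raise NotImplementedError(instruction)


state = NestedGuestState()
handle_vmx_exit(state, "vmxon")
handle_vmx_exit(state, "vmptrld", operand=0x5000)
print(state.in_virtual_vmx_root, hex(state.current_guest_vmcs))
```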
vmread and vmwrite executed by the non-root VMM may or may not cause a VM exit.
If VMCS shadowing is active, the CPU won't do a VM exit and will instead access the VMCS pointed to by the VMCS link pointer in the active VMCS (called the shadow VMCS).
This speeds up the virtualization of nested VMs.
If VMCS shadowing is not active, the CPU will do a VM exit and the root VMM has to emulate the read/write.
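The two paths can be contrasted in a small model. This is a sketch under simplifying assumptions: dicts stand in for the shadow VMCS and for the root VMM's software copy, and the function and parameter names are made up for illustration.

```python
def guest_vmread(field, shadowing_active, shadow_vmcs, emulate_read):
    """Model a vmread executed by the non-root VMM.

    Returns a pair (value, caused_vm_exit).
    """
    if shadowing_active:
        # The CPU reads the shadow VMCS directly; the root VMM is not involved.
        return shadow_vmcs[field], False
    # No shadowing: VM exit, and the root VMM emulates the read in software.
    return emulate_read(field), True


shadow = {"guest_rip": 0x2000}          # stands in for the shadow VMCS
software_copy = {"guest_rip": 0x2000}   # what the root VMM tracks by itself

fast = guest_vmread("guest_rip", True, shadow, software_copy.__getitem__)
slow = guest_vmread("guest_rip", False, shadow, software_copy.__getitem__)
print(fast, slow)
```

Both paths return the same value; the difference is only whether the root VMM had to be entered to produce it.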
Finally, the non-root VMM will launch its VM - this is a nested VM.
vmlaunch will trigger a VM exit.
The root VMM has to do a few things:
- Save its VMCS somewhere.
- Merge the current VMCS and the non-root VMM's VMCS. Since the VMCS controls, for example, which events cause a VM exit, the merged one must be the union of the two in this regard.
- Load the merged VMCS as the CPU's current one.
- Do a vmlaunch/vmresume.
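The merge step can be sketched as follows. This is a toy model, assuming a VMCS can be represented as a dict with a hypothetical exit_on set of exiting events; real VMCS fields are far more numerous and are not merged this uniformly.

```python
def merge_vmcs(root_vmcs, guest_vmm_vmcs):
    """Build the VMCS used to run the nested VM (toy model with dict fields)."""
    merged = dict(guest_vmm_vmcs)   # guest state comes from the non-root VMM
    # An event must cause an exit if either VMM asked for it: union of the sets.
    merged["exit_on"] = root_vmcs["exit_on"] | guest_vmm_vmcs["exit_on"]
    return merged


root_vmcs = {"exit_on": {"cpuid", "io"}, "guest_rip": 0x1000}
guest_vmm_vmcs = {"exit_on": {"hlt"}, "guest_rip": 0x7000}

saved_root_vmcs = dict(root_vmcs)                 # step 1: save
merged = merge_vmcs(root_vmcs, guest_vmm_vmcs)    # step 2: merge
current_vmcs = merged                             # step 3: load as current
print(sorted(merged["exit_on"]))                  # step 4 would be vmlaunch/vmresume
```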
Now the CPU is executing the nested VM (a VVM - Virtual VM?).
What happens when a sensitive instruction or an event causes a VM exit?
From the processor's point of view, there are only two levels of virtualization: VMX root mode and VMX non-root mode.
Since the guest is in VMX non-root mode, control is transferred back to the VMX root mode code - i.e. the root VMM.
The root VMM must now work out whether the event is from its VM or from its VM's VM.
This can be done by tracking the use of vmlaunch/vmresume and checking the bits in the VMCS.
If the VM exit is directed to the non-root VMM, the root VMM has to load its original VMCS, set the VMCS link pointer in it to the non-root VMM's VMCS if needed, update the non-root VMM's VMCS status bits and do a vmresume.
If the VM exit is directed to the root VMM itself, it handles it like any other VM exit.
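The routing decision described above can be sketched as a small function. It is a simplification under stated assumptions: the exit reason is a plain string, the non-root VMM's wishes are a hypothetical set of reasons, and the root VMM is assumed to already know whether the nested VM was the one running.

```python
def route_exit(reason, nested_vm_running, guest_vmm_exit_on):
    """Decide which VMM handles a VM exit: 'root' or 'non-root' (toy model)."""
    if nested_vm_running and reason in guest_vmm_exit_on:
        # The non-root VMM asked for this exit: reflect it back to that VMM.
        return "non-root"
    # Either the exit came from the root VMM's own VM, or only the root VMM
    # wanted it; it is handled like any other VM exit.
    return "root"


guest_vmm_exit_on = {"hlt"}
decisions = [
    route_exit("hlt", True, guest_vmm_exit_on),
    route_exit("cpuid", True, guest_vmm_exit_on),
    route_exit("hlt", False, guest_vmm_exit_on),
]
print(decisions)
```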
What if we want to create a VM inside the nested VM? A kind of Virtual Virtual VM (VVVM).
Two things to note:
- The root VMM is still the one invoked on every VM exit.
Even if the VVVM is three levels deep, the non-root-non-root VMM is neither the first nor the only manager used to virtualise it.
From a security point of view, the root VMM is the weak link.
- The hardware doesn't really support arbitrarily deep nesting.
A VMM may not need too much effort to go from supporting one level of nesting to n levels (again, I'm not seasoned here), but special support as outlined above is still needed.
It is not as easy as launching the VM and letting the CPU take care of everything else.
AMD-V
There is no root vs non-root mode in AMD-V; the CPU starts executing a VM with vmrun, which takes a pointer to a VMCB (VM Control Block) that serves the same purpose as Intel's VMCS.
After a vmrun the CPU is in guest mode.
The VMCB is cached, but it is accessed with ordinary memory accesses.
The vmload/vmsave instructions explicitly load into the cache, and save from it, the VMCB fields subject to caching.
This interface is simpler than Intel's but just as powerful, even when it comes to nested virtualization.
Assume we are inside a VM and the code executes a vmrun - thus we are virtualizing a VMM.
Technically, a VMM can choose whether vmrun will or will not trigger a VM exit.
In practice, however, AMD-V currently requires the former to always be the case, i.e. a vmrun executed in a guest always triggers a VM exit.
Thus the root VMM (I'll use the same terminology as in the Intel case) will gain control and has to emulate the vmrun (since the hardware supports only a single level of virtualisation).
The root VMM can save the current VMCB, merge it with the non-root VMM's VMCB and go ahead with the vmrun, as in the Intel case.
Upon an exit, the root VMM has to determine whether the exit was directed to it or to the non-root VMM; again, this can be done by tracking the vmrun and the control bits in the VMCB.
We have set up a VM inside a VM relatively easily - now what happens upon a VM exit?
The root VMM receives the exit and, if it is directed to the non-root VMM, it has to restore its original VMCB and resume the run (i.e. use vmrun with its original VMCB).
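The whole AMD round trip (emulate vmrun, then pick the VMCB for the next vmrun after an exit) can be sketched in a few lines. This is a toy model: dicts stand in for VMCBs, the intercepts set and the rip field are illustrative names, and a real VMM would also copy exit information into the non-root VMM's VMCB before resuming it.

```python
def emulate_vmrun(root_vmcb, guest_vmm_vmcb):
    """Build the VMCB the root VMM really runs for the nested guest (toy model)."""
    merged = dict(guest_vmm_vmcb)   # guest state comes from the non-root VMM
    # Intercept an event if either VMM asked for it.
    merged["intercepts"] = root_vmcb["intercepts"] | guest_vmm_vmcb["intercepts"]
    return merged


def next_vmcb_after_exit(reason, guest_vmm_vmcb, original_vmcb, merged_vmcb):
    """Choose the VMCB for the next vmrun once the nested guest has exited."""
    if reason in guest_vmm_vmcb["intercepts"]:
        # Exit directed to the non-root VMM: resume it with the original VMCB.
        return original_vmcb
    # Exit wanted only by the root VMM: handle it, then re-enter the nested guest.
    return merged_vmcb


original_vmcb = {"intercepts": {"vmrun", "cpuid"}, "rip": 0x1000}
guest_vmm_vmcb = {"intercepts": {"vmrun", "hlt"}, "rip": 0x7000}

merged_vmcb = emulate_vmrun(original_vmcb, guest_vmm_vmcb)
after_hlt = next_vmcb_after_exit("hlt", guest_vmm_vmcb, original_vmcb, merged_vmcb)
after_cpuid = next_vmcb_after_exit("cpuid", guest_vmm_vmcb, original_vmcb, merged_vmcb)
print(after_hlt is original_vmcb, after_cpuid is merged_vmcb)
```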
AMD-V supports fast virtualisation of the vmsave and vmload instructions by treating their addresses as guest addresses, which are thus subject to the usual nested-paging translation.
As in the Intel case, the virtualization can be nested again as long as the VMM supports that feature.
The critical security warning noted for Intel's case is valid for AMD's as well.
Footnote: the VMCS is accessed with vmread and vmwrite rather than with direct memory accesses due to its implementation-defined format and the fact that the memory area can be used just as a spill area that is not updated in real time.