This chapter describes common sources of overhead in a QNX hypervisor system, and strategies for reducing this overhead and optimizing system performance.
A VM defines its guest's hardware environment. Once its guest is running, however, a VM is at the mercy of the guest's operations, much like hardware is at the mercy of the code it must execute.
The hypervisor must always respond to the needs of guests in its VMs. If a guest is not tuned for optimal performance in a hypervisor VM and for the board on which the hypervisor is running, the hypervisor can't do anything about the situation. For example, if a guest makes requests that generate numerous spurious interrupts (and, therefore, guest exits), the hypervisor can only attempt to handle these interrupts, in the same way that hardware must respond to the requests of an unvirtualized system.
Sources of overhead specific to hypervisor systems include:
A good general rule to keep in mind when tuning your hypervisor system for performance is that, for a guest in a VM, the relative costs of accessing privileged or device registers and of accessing memory are different than for an OS running in an unvirtualized environment.
That is, for an OS running in a VM, the cost of accessing memory is usually comparable to the cost for an OS performing the same action in an unvirtualized environment. However, for a guest OS running in a VM, accessing privileged or device registers requires guest exits, and thus incurs significant additional overhead compared to the same action in an unvirtualized environment.
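The effect of guest exits on a guest's CPU budget can be estimated with simple arithmetic. The following is a minimal sketch only: the function name and both input values are hypothetical placeholders, not measurements from any QNX system. Real figures would have to come from tracing your own system.

```python
def exit_overhead_fraction(exits_per_second, cost_per_exit_us):
    """Fraction of each second of CPU time spent handling guest exits.

    Both arguments are assumptions supplied by the caller; the cost of an
    exit depends on the board, the exit type, and the hypervisor version.
    """
    microseconds_per_second = 1_000_000
    return (exits_per_second * cost_per_exit_us) / microseconds_per_second


# Placeholder values, invented for illustration only.
fraction = exit_overhead_fraction(exits_per_second=20_000, cost_per_exit_us=2.5)
print(f"{fraction:.1%} of one CPU spent on guest exits")
```

Because the result scales linearly with the exit rate, reducing the number of privileged or device register accesses in the guest is usually the more direct tuning lever.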
For instructions on how to get more information about what your hypervisor and its guests are doing, see the Monitoring and Troubleshooting chapter.
There are many sources of overhead in a virtualized system; finding them requires analysis both from the top down and from the bottom up.
For top-down analysis:
Run the system in an unvirtualized environment, and record benchmark information (benchmark N).
Run the same system in a VM, and record the same benchmark information (V for virtual).
Usually benchmark N will show better performance, though the opposite is possible if the VM is able to optimize an inefficient guest OS.
Assuming that benchmark N was better than benchmark V, adjust the virtual environment to isolate the source of the highest additional overhead. If benchmark V was better, you may want to examine your guest OS in an unvirtualized environment before proceeding.
When you have identified the sources of the most significant increases to overhead, you will know where your tuning efforts are likely to yield the greatest benefits.
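Comparing benchmark N with benchmark V can be automated so that the largest relative slowdowns appear first. The following is a minimal sketch, assuming each benchmark is a set of named timings where lower is better. The metric names and all numbers are hypothetical placeholders.

```python
def relative_overhead(native, virtual):
    """Return (metric, fractional slowdown) pairs, largest slowdown first.

    native and virtual map a metric name to a measured time (lower is better).
    A negative slowdown means the virtualized run was faster.
    """
    rows = []
    for name, n in native.items():
        v = virtual[name]
        rows.append((name, (v - n) / n))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Placeholder measurements in milliseconds, invented for illustration only.
benchmark_n = {"boot": 400.0, "disk_io": 50.0, "net_rtt": 2.0}
benchmark_v = {"boot": 440.0, "disk_io": 80.0, "net_rtt": 2.5}

ranking = relative_overhead(benchmark_n, benchmark_v)
for name, slowdown in ranking:
    print(f"{name}: {slowdown:+.0%}")
```

The metric at the top of the ranking is a reasonable first candidate for the adjustments described above.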
For bottom-up analysis:
The hypervisor events include all guest exits. Guest exits are a significant source of overhead in a hypervisor system (see Guest exits and Guest-triggered exits in this chapter).
QNX hypervisors include a hypervisor-enabled microkernel. As with all QNX microkernel systems, the bootloader and startup code pre-configure the SoC, including use of the physical CPUs (e.g., SMP configuration) and memory cache configuration. The hypervisor doesn't modify this configuration.
For more information about how to change the bootloader and startup code, see Building Embedded Systems in the QNX Neutrino user documentation, and the User's Guide for your Board Support Package (BSP).
The hypervisor supports adaptive partitioning (APS). You can use APS to prevent guests from starving other guests (or even the hypervisor) of essential processing capacity. APS can ensure that processes aren't starved of CPU cycles, while also making sure that system resources aren't wasted. It assigns minimum levels of processor time to a group of threads to use if the threads need it.
Thus, in a hypervisor system, you can use APS to ensure that the vCPU threads in a VM hosting a guest running critical applications are guaranteed the physical CPU resources they need, while also allowing vCPU threads in other VMs to use these CPU resources when critical applications don't need them.
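The idea of a guaranteed minimum that can be lent out when unused can be illustrated with a toy model. This is a simplified sketch and not the APS scheduling algorithm: the function, the partition names, and the percentages are hypothetical, and spare time is handed out in dictionary order rather than by any real scheduling policy.

```python
def allocate(budgets, demand):
    """Split 100% of CPU time among partitions.

    budgets: guaranteed minimum share per partition, in percent (sum <= 100).
    demand:  share each partition would use if unconstrained, in percent.
    """
    # Step 1: each partition gets what it asks for, up to its guaranteed budget.
    grant = {p: min(demand[p], budgets[p]) for p in budgets}
    spare = 100 - sum(grant.values())
    # Step 2: hand unused time to partitions that still want more.
    # Simplification: dictionary order; a real scheduler applies its own policy.
    for p in budgets:
        extra = min(demand[p] - grant[p], spare)
        grant[p] += extra
        spare -= extra
    return grant


budgets = {"critical_vm": 60, "other_vm": 40}

# Critical guest mostly idle: the other VM may borrow the unused time.
print(allocate(budgets, {"critical_vm": 10, "other_vm": 100}))

# Critical guest busy: it receives its full guaranteed share.
print(allocate(budgets, {"critical_vm": 100, "other_vm": 100}))
```

In both cases the critical partition never receives less than it needs up to its budget, which is the property the text above describes.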
A QNX Neutrino OS running as a guest can also use APS. However, if you are using APS in a guest, you should remember that the partitioning applies to virtual CPUs (i.e., vCPU threads), and not to physical CPUs. If you don't ensure that your guest gets the physical CPU resources it needs, nothing you do with the vCPUs will matter.
For more information, see the Adaptive Partitioning User's Guide.
Like all systems built with a QNX microkernel, the hypervisor supports symmetric multiprocessing (SMP) and bound multiprocessing (BMP). You can configure threads in the hypervisor host to run on specific physical CPUs. These threads include vCPU threads in your VMs; you can pin vCPU threads to one or several physical CPUs by using the cpu runmask option when you assemble your VMs (see cpu in the VM Configuration Reference chapter).
For more information about using SMP and BMP with the QNX Neutrino microkernel, see Multicore Processing in the QNX SDP System Architecture guide.
You can use a combination of CPU runmasks for your vCPU threads with adaptive partitioning to exert very precise control over vCPU scheduling. Remember, though, that runmasks always take priority: if the adaptive partition allows the hypervisor to allocate additional CPU time to a vCPU thread, the hypervisor can allocate the time only if the runmask permits it. That is, the hypervisor can allocate the time if it is available on a CPU on which the vCPU thread is allowed to run, but can't allocate the time if it is on a CPU that has been masked for the vCPU thread.
For example, if the runmask for vCPU thread 1 allows that thread to run on CPUs 2 and 3, and CPU 2 is fully occupied but time is available on CPU 3, then the hypervisor can allow vCPU thread 1 to run on CPU 3. However, if CPUs 2 and 3 are fully loaded but time is available on CPU 4, the hypervisor won't allow vCPU thread 1 to run there, because the runmask for vCPU thread 1 forbids it.
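The runmask rule in this example can be modeled as a bitmask test. The sketch below is illustrative only: the helper names are hypothetical, it models eligibility and not the hypervisor's actual scheduler, and it assumes a runmask is represented as one bit per physical CPU number.

```python
def cpus_to_mask(cpus):
    """Build a bitmask with one bit set per allowed CPU number."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def pick_cpu(runmask, idle_cpus):
    """Return the lowest-numbered idle CPU the runmask permits, or None."""
    for cpu in sorted(idle_cpus):
        if runmask & (1 << cpu):
            return cpu
    return None


# vCPU thread 1 may run on CPUs 2 and 3 only.
vcpu1_mask = cpus_to_mask([2, 3])

print(pick_cpu(vcpu1_mask, idle_cpus={3}))  # CPU 2 busy, CPU 3 free -> 3
print(pick_cpu(vcpu1_mask, idle_cpus={4}))  # only CPU 4 free -> None
```

The second call returns `None` even though CPU 4 has time available, which mirrors the case where the runmask overrides whatever the adaptive partition would otherwise allow.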
See also Scheduling in the Understanding QNX Virtual Environments chapter.