My IDC server is down again, I reckon if it is possible to host the server in my home and retire the IDC server, then I can reclaim hardwares from the problematic server (it's storage is still decent) and also save CNY 4300 annually. I do have a spare machine that can do this job, it has 32 logic cores and 128GB of RAM. The problem is when I use it as my test server, I also use this machine as a windows gaming platform, which requires a powerful graphic card and windows operating system (my servers are all linux). When the machine be use as a server, it have to be stable, and prevent downtime as much as possible. I also want to use the graphic card on the machine for my GPU and CUDA experiments. The operating system for the experiment platform is Ubuntu and reboot the host server is not an option in this case.
Virtualisation looks like a solution. I was able to use this machine for both computing in Ubuntu Server and gaming platform in Windows with the help of Hyper-V by Microsoft in Windows 8. There will be 3 virtual server in the host, 2 of them need to use the graphic card in the host, because there is only one card available, there will only one of the two VM running in the same time. If the VM want to use hardware on the host PCI-E, it have to be passthrough to the VM. When a hardware was passthroughed, it will be unavailable for host and other VMs. It can be inference that if you want to use PCI passthrough for your graphic card, you also need another card for your host, or you can't see anything after the operating system booted.
Because Hyper-V does not support PCI passthrough, I have to looking for other solutions. As a result to my research, almost all of the solutions that support PCI passthrough virtualization are based on Linux kernel. The first one come into my mind is Xen Server from Citrix. It is free and open source. I have successfully installed the system on my host, installed Windows and passthroughed the graphic. But the problem is the driver from Nvidia cannot get the card working, Windows was able to recongize it's model but there is always a yellow exclamation mark in the device manager. I also gets black screen with no signal all the time then the VM started. I think that might be a compatible problem. Then I turned to VMWare ESXi, unfortunately it's installer cannot load on my machine.
Two of the most authentic solution was passed, I think I have to build my own tools from scratch. Xen Server is a platform faces enterprises, based on Linux, Qemu and KVM. I found some articles to build my own virtualization solution on regular Linux distribution like Xubuntu 16.04 manually. The pro to this way is I can control every delicate details in setting up the VM for my system.
Get a normal VM without PCI passthrough working is easy. But for the machine that required real graphic card, it becomes tricky. First you need to blacklist your device in the kernel pci-stub to avoid it to be used in the host, then it need to be switch to VFIO for VMs. After following the instructions on the web page I mentioned, I get almost the same problem that I encountered in the Xen Server, there is no output in the graphic card. When I use the virtual graphic from Qemu, the driver is not working after installation.
I go through a lot of search and fount this article said that the new card cannot work with default seabios in the system, but OVMF instead. The solution is simple. First install OVMF bios:
sudo apt-get install ovmf
then modify the bios parameter for your VM command with:
After that I was able to see pictures on the screen output form the passwhtoughed graphic card.
How about the performance? I did a quick test in the VM with 3DMark Demo. I got 6582 for graphics and 5947 for CPU (8 cores 16 threads). Looks like near native.
I have also setup a Ubuntu Desktop version use the same Qemu configuration and get CUDA SDK and mxnet examples working. Those two system are switchable by close one and starting another, without interrupting the third virtual machine running as a application server.