I got it working the way I wanted. I think I will write a script for it later on. In case someone is interested:
Note: I used nvidia-inst to install the latest NVIDIA driver on my PC.
Pre-install (AMD + NVIDIA)
sudo nano /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="nowatchdog nvme_load=YES loglevel=3 amd_iommu=on iommu=pt nvidia-drm.modeset=1"
sudo nano /etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:1e07,10de:10f7,10de:1ad6,10de:1ad7
softdep nvidia pre: vfio-pci
sudo nano /etc/dracut.conf.d/10-vfio.conf
force_drivers+=" vfio_pci vfio vfio_iommu_type1 "
sudo grub-mkconfig -o /boot/grub/grub.cfg
sudo dracut-rebuild
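The ID list in vfio.conf does not have to be typed by hand. A minimal sketch, assuming every function of the card shows up with "NVIDIA" in its `lspci -nn` description; the function name `vfio_ids` is made up for illustration and is not part of any tool:

```shell
# Hypothetical helper: reads `lspci -nn` output on stdin and prints an
# "options vfio-pci ids=..." line covering every NVIDIA function found.
vfio_ids() {
    grep -i 'nvidia' |
        # Keep only the [vendor:device] pairs, e.g. [10de:1e07].
        grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' |
        tr -d '[]' |
        # Drop duplicates, keep the original order, join with commas.
        awk '!seen[$0]++ { ids = ids (ids ? "," : "") $0 }
             END { if (ids) print "options vfio-pci ids=" ids }'
}

# Usage: lspci -nn | vfio_ids
```

Use plain `lspci -nn` here, not `-nnk`, otherwise the subsystem IDs end up in the list as well.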
Output after reboot (fresh start)
lspci -nnk | grep -A 3 "NVIDIA"
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti Rev. A] [10de:1e07] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12fa]
Kernel driver in use: vfio-pci
Kernel modules: nouveau, nvidia_drm, nvidia
01:00.1 Audio device [0403]: NVIDIA Corporation TU102 High Definition Audio Controller [10de:10f7] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12fa]
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
01:00.2 USB controller [0c03]: NVIDIA Corporation TU102 USB 3.1 Host Controller [10de:1ad6] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12fa]
Kernel driver in use: vfio-pci
Kernel modules: xhci_pci
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU102 USB Type-C UCSI Controller [10de:1ad7] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12fa]
Kernel driver in use: vfio-pci
Kernel modules: i2c_nvidia_gpu
- VM works as expected
- nvidia-smi does not work, which is expected while the card is bound to vfio-pci:
- “NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.”
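To see at a glance where the card is currently bound, the lspci output can be condensed to one line per function. This is only a sketch; `gpu_drivers` is a made-up name, and it assumes the usual `lspci -nnk` layout where each device line starts with the PCI address:

```shell
# Hypothetical helper: reads `lspci -nnk` output on stdin and prints
# "<address> <driver in use>" for every NVIDIA function.
gpu_drivers() {
    awk '
        # A new device line starts with a PCI address such as 01:00.0.
        /^[0-9a-f:]+\.[0-9a-f] / {
            slot = $1
            is_nv = ($0 ~ /NVIDIA/)
            if (is_nv) { order[++n] = slot; driver[slot] = "(none)" }
            next
        }
        # Remember the driver only for NVIDIA functions.
        is_nv && /Kernel driver in use:/ { driver[slot] = $NF }
        END { for (i = 1; i <= n; i++) print order[i], driver[order[i]] }
    '
}

# Usage: lspci -nnk | gpu_drivers
```

A function without a "Kernel driver in use" line is reported as `(none)`.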
vfio to host
echo "0000:01:00.0" | sudo tee /sys/bus/pci/drivers/vfio-pci/unbind
echo "0000:01:00.1" | sudo tee /sys/bus/pci/drivers/vfio-pci/unbind
echo "0000:01:00.2" | sudo tee /sys/bus/pci/drivers/vfio-pci/unbind
echo "0000:01:00.3" | sudo tee /sys/bus/pci/drivers/vfio-pci/unbind
sudo modprobe nvidia
sudo modprobe nvidia_modeset
sudo modprobe nvidia_drm
sudo modprobe nvidia_uvm
echo "0000:01:00.0" | sudo tee /sys/bus/pci/drivers/nvidia/bind
sudo modprobe snd_hda_intel
echo "0000:01:00.1" | sudo tee /sys/bus/pci/drivers/snd_hda_intel/bind
sudo modprobe xhci_pci
echo "0000:01:00.2" | sudo tee /sys/bus/pci/drivers/xhci_hcd/bind
Serial bus controller (01:00.3): I left this one unbound. It is typically not bound manually like the other functions. If you need it on the host, make sure i2c_nvidia_gpu is loaded.
lspci -nnk | grep -A 3 "NVIDIA"
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti Rev. A] [10de:1e07] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12fa]
Kernel driver in use: nvidia
Kernel modules: nouveau, nvidia_drm, nvidia
01:00.1 Audio device [0403]: NVIDIA Corporation TU102 High Definition Audio Controller [10de:10f7] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12fa]
Kernel driver in use: snd_hda_intel
Kernel modules: snd_hda_intel
01:00.2 USB controller [0c03]: NVIDIA Corporation TU102 USB 3.1 Host Controller [10de:1ad6] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12fa]
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU102 USB Type-C UCSI Controller [10de:1ad7] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12fa]
Kernel modules: i2c_nvidia_gpu
- VM does not work, which is expected while the card is bound to the host drivers
- nvidia-smi and the other host-side tools work as expected
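The "vfio to host" steps could be wrapped into a function roughly like this. It is a sketch, not a tested script: the PCI addresses are the ones from my machine, and `DRY_RUN`, `run`, `sysfs_write` and `vfio_to_host` are names I made up. By default it only prints the commands:

```shell
# Sketch: hand the GPU back to the host (same steps as above).
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

# Run a command, or just print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# Write a PCI address into a sysfs bind/unbind file.
sysfs_write() {
    if [ "$DRY_RUN" = 1 ]; then
        printf 'echo "%s" | sudo tee %s\n' "$1" "$2"
    else
        echo "$1" | sudo tee "$2" > /dev/null
    fi
}

vfio_to_host() {
    local fn mod
    for fn in 0 1 2 3; do
        sysfs_write "0000:01:00.$fn" /sys/bus/pci/drivers/vfio-pci/unbind
    done
    # Load order matters: nvidia first, nvidia_uvm last.
    for mod in nvidia nvidia_modeset nvidia_drm nvidia_uvm; do
        run sudo modprobe "$mod"
    done
    sysfs_write 0000:01:00.0 /sys/bus/pci/drivers/nvidia/bind
    run sudo modprobe snd_hda_intel
    sysfs_write 0000:01:00.1 /sys/bus/pci/drivers/snd_hda_intel/bind
    run sudo modprobe xhci_pci
    sysfs_write 0000:01:00.2 /sys/bus/pci/drivers/xhci_hcd/bind
    # 01:00.3 is left unbound, as in the manual steps.
}
```

Call `vfio_to_host` to review the command list, then set `DRY_RUN=0` before calling it again to actually run the commands.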
host to vfio
echo "0000:01:00.2" | sudo tee /sys/bus/pci/drivers/xhci_hcd/unbind
echo "0000:01:00.1" | sudo tee /sys/bus/pci/drivers/snd_hda_intel/unbind
sudo rmmod nvidia_uvm
sudo rmmod nvidia_drm
sudo rmmod nvidia_modeset
sudo rmmod nvidia
echo "0000:01:00.0" | sudo tee /sys/bus/pci/drivers/vfio-pci/bind
echo "0000:01:00.1" | sudo tee /sys/bus/pci/drivers/vfio-pci/bind
echo "0000:01:00.2" | sudo tee /sys/bus/pci/drivers/vfio-pci/bind
echo "0000:01:00.3" | sudo tee /sys/bus/pci/drivers/vfio-pci/bind
lspci -nnk | grep -A 3 "NVIDIA"
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU102 [GeForce RTX 2080 Ti Rev. A] [10de:1e07] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12fa]
Kernel driver in use: vfio-pci
Kernel modules: nouveau, nvidia_drm, nvidia
01:00.1 Audio device [0403]: NVIDIA Corporation TU102 High Definition Audio Controller [10de:10f7] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12fa]
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
01:00.2 USB controller [0c03]: NVIDIA Corporation TU102 USB 3.1 Host Controller [10de:1ad6] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12fa]
Kernel driver in use: vfio-pci
Kernel modules: xhci_pci
01:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU102 USB Type-C UCSI Controller [10de:1ad7] (rev a1)
Subsystem: NVIDIA Corporation Device [10de:12fa]
Kernel driver in use: vfio-pci
Kernel modules: i2c_nvidia_gpu
It seems virt-manager has to be restarted afterwards (close and reopen it if it was already open).
- VM works as expected
- nvidia-smi does not work, which is expected while the card is bound to vfio-pci:
- “NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.”
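The "host to vfio" direction can be sketched the same way. Again the addresses are the ones from my machine, the names `DRY_RUN`, `run`, `sysfs_write` and `host_to_vfio` are made up, and by default nothing is executed, the commands are only printed:

```shell
# Sketch: give the GPU back to vfio-pci (same steps as above).
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

# Run a command, or just print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# Write a PCI address into a sysfs bind/unbind file.
sysfs_write() {
    if [ "$DRY_RUN" = 1 ]; then
        printf 'echo "%s" | sudo tee %s\n' "$1" "$2"
    else
        echo "$1" | sudo tee "$2" > /dev/null
    fi
}

host_to_vfio() {
    local fn mod
    sysfs_write 0000:01:00.2 /sys/bus/pci/drivers/xhci_hcd/unbind
    sysfs_write 0000:01:00.1 /sys/bus/pci/drivers/snd_hda_intel/unbind
    # Unload order matters: nvidia_uvm first, nvidia last.
    for mod in nvidia_uvm nvidia_drm nvidia_modeset nvidia; do
        run sudo rmmod "$mod"
    done
    for fn in 0 1 2 3; do
        sysfs_write "0000:01:00.$fn" /sys/bus/pci/drivers/vfio-pci/bind
    done
}
```

If one of the rmmod calls fails because a module is still in use, the later bind to vfio-pci will most likely fail too, so a real script should probably stop at that point.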
Notes:
modprobe has to be executed in this order:
sudo modprobe nvidia
sudo modprobe nvidia_modeset
sudo modprobe nvidia_drm
sudo modprobe nvidia_uvm
rmmod has to be executed in this order:
sudo rmmod nvidia_uvm
sudo rmmod nvidia_drm
sudo rmmod nvidia_modeset
sudo rmmod nvidia
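The rmmod order is just the modprobe order reversed, so a script only needs to keep one list. A small sketch of that idea; `NVIDIA_MODULES` and `reverse_words` are made-up names, and the loops only print the commands instead of running them:

```shell
# Load order as listed above; the unload order is derived from it.
NVIDIA_MODULES="nvidia nvidia_modeset nvidia_drm nvidia_uvm"

# Print the given words in reverse order, separated by spaces.
reverse_words() {
    local out="" w
    for w in "$@"; do out="$w${out:+ $out}"; done
    echo "$out"
}

# Word splitting of the unquoted variable is intentional here.
for m in $NVIDIA_MODULES; do echo "sudo modprobe $m"; done
for m in $(reverse_words $NVIDIA_MODULES); do echo "sudo rmmod $m"; done
```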