Proxmox Server vGPU

From SEPT Knowledge Base

This guide assumes NVIDIA vGPU-compatible cards and is intended for VxRail-type systems. See the official Proxmox documentation for consumer card passthrough.

PCI Passthrough

Verifying IOMMU parameters

Verify IOMMU is enabled

iDRAC

  1. Log into the iDRAC
  2. Select BIOS under the Configuration menu in the iDRAC
  3. Select Processor Settings and ensure that Virtualization Technology is enabled
  4. Click Apply at the bottom of the list if changes were made, then reboot
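The same check can be scripted over the iDRAC's racadm interface; a minimal sketch, assuming the standard Dell BIOS attribute names apply to your nodes:

racadm get BIOS.ProcSettings.ProcVirtualization    # expect "ProcVirtualization=Enabled"
racadm set BIOS.ProcSettings.ProcVirtualization Enabled
racadm jobqueue create BIOS.Setup.1-1              # stage the BIOS job; it applies on the next reboot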

LifeCycle Controller

TBC

Verify IOMMU Isolation

Add the VFIO modules so that they are loaded at boot time:

cat << EOF >> /etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
EOF
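Once the system has been rebooted later in this procedure, you can confirm the modules were loaded, for example:

lsmod | grep vfio    # should list vfio, vfio_pci and vfio_iommu_type1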
Update GRUB
  1. Edit the file /etc/default/grub and add intel_iommu=on iommu=pt to the GRUB_CMDLINE_LINUX_DEFAULT line
  2. Run proxmox-boot-tool refresh
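For reference, the edited line in /etc/default/grub should end up looking like the following (any existing options on your system may differ):

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"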

Reboot the system.

For working PCI passthrough, you need a dedicated IOMMU group for all PCI devices you want to assign to a VM.

When executing (replacing {nodename} with the name of your node):

# pvesh get /nodes/{nodename}/hardware/pci --pci-class-blacklist ""

you should get:

┌──────────┬────────┬──────────────┬────────────┬────────┬────────────────────┬…
│ class    │ device │ id           │ iommugroup │ vendor │ device_name        │…
╞══════════╪════════╪══════════════╪════════════╪════════╪════════════════════╪…
│ 0x030200 │ 0x1eb8 │ 0000:3b:00.0 │          3 │ 0x10de │ TU104GL [Tesla T4] │…
├──────────┼────────┼──────────────┼────────────┼────────┼────────────────────┼…
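You can also confirm that the IOMMU was initialised at boot by checking the kernel log, for example:

dmesg | grep -e DMAR -e IOMMU    # on Intel systems, look for "DMAR: IOMMU enabled"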

Activate GPU Passthrough

Blacklisting drivers

The standard NVIDIA and nouveau drivers that ship with Linux need to be blacklisted.

echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf 
echo "blacklist nvidia*" >> /etc/modprobe.d/blacklist.conf

After blacklisting, you will need to reboot.
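If the nouveau module has already been built into your initramfs, the blacklist only takes effect once the initramfs is regenerated; a common extra step before the reboot (assuming the standard Proxmox kernel setup) is:

update-initramfs -u -k all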

Setup Proxmox VE Repositories

Ensure that the relevant repositories are enabled. Most systems we use run the no-subscription model. You can use the Repositories management panel in the Proxmox VE web UI to manage package repositories; see the documentation for details.
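As a point of reference, on a Proxmox VE 8 (Debian Bookworm) host the no-subscription repository entry looks like this (adjust the suite name for your release):

deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription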

Setup DKMS

Because the NVIDIA module is built separately from the kernel, it must be rebuilt for each new kernel; Dynamic Kernel Module Support (DKMS) handles this rebuild automatically on kernel updates.

To set up DKMS, you must install the headers package for the kernel and the DKMS helper package. In a root shell, run

apt update
apt install dkms libc6-dev proxmox-default-headers --no-install-recommends

Installing Host Drivers

NOTE: The drivers can be downloaded using the btechts account on the NVIDIA website or found under the folder \VMware\7 Ent - VXRail\nVidia\ of the software share.

  1. Copy the KVM-based .run file to the host/node
  2. Make the file executable
  3. Install the drivers with the DKMS module enabled
chmod +x NVIDIA-Linux-x86_64-xxx.xxx.xx-vgpu-kvm.run
./NVIDIA-Linux-x86_64-xxx.xxx.xx-vgpu-kvm.run --dkms

After the installer has finished successfully, you will need to reboot your system, either using the web interface or by executing reboot.
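To sanity-check the installation after the reboot, the following should show the vGPU module registered with DKMS and the physical card visible (exact output depends on the driver version):

dkms status    # should list an nvidia module built for the running kernel
nvidia-smi     # should report the Tesla T4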

Enable SR-IOV

  1. Create the folder structure /usr/local/lib/systemd/system/
  2. Create the file /usr/local/lib/systemd/system/nvidia-sriov.service
  3. Add the service definition below to the file
  4. Reload the systemd daemon
  5. Enable and start the SR-IOV service
mkdir -p /usr/local/lib/systemd/system/
cat <<EOF > /usr/local/lib/systemd/system/nvidia-sriov.service
[Unit]
Description=Enable NVIDIA SR-IOV
After=network.target nvidia-vgpud.service nvidia-vgpu-mgr.service
Before=pve-guests.service

[Service]
Type=oneshot
ExecStart=/usr/lib/nvidia/sriov-manage -e ALL

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now nvidia-sriov.service
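Once the service has run, the virtual functions should appear on the PCI bus; a quick check:

lspci -d 10de:    # should now list several virtual functions alongside the physical GPU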

Configure the Bootloader

ZFS File systems

  1. Edit the file /etc/kernel/cmdline and add intel_iommu=on to the end of it
  2. Run the command proxmox-boot-tool refresh
  3. Reboot the machine
  4. Confirm the IOMMU parameter with cat /proc/cmdline
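For reference, a minimal sketch of what /etc/kernel/cmdline might look like afterwards (the root dataset shown here is an example and will differ per host):

root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on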