Why does the NVIDIA Container Toolkit (nvidia-ctk) CDI config need to be manually regenerated at each boot?
For certain GPU-accelerated tasks, such as Docker containers that use the GPU, I need to run the following on every boot:
sudo nvidia-ctk cdi generate --output /var/run/cdi/nvidia.yaml
Without this, GPU-dependent containers refuse to launch and nvidia-smi doesn't work inside containers, but after I run that command, they work normally.
The YAML file disappears at shutdown. That's unsurprising, since /var/run lives on a tmpfs (an in-memory filesystem) that is cleared on every reboot. But in that case, is there any drawback to having the file generated automatically at each boot?
I discovered the command by trial and error, during a period of desperate troubleshooting. I know it's needed to make GPU containers work, but I don't know why and I don't recall where I originally found it (I was copying and pasting blindly from random bug report comments). I know how to create a systemd unit to automate it, but I would like to understand why the issue exists in the first place.
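For context, the systemd unit I have in mind is roughly the following. This is only a sketch: the unit file name (nvidia-cdi-generate.service), the /usr/bin paths, the explicit mkdir step, and the ordering before docker.service are my own assumptions, not something taken from NVIDIA's documentation.

```ini
# /etc/systemd/system/nvidia-cdi-generate.service (hypothetical name)
[Unit]
Description=Generate NVIDIA CDI spec at boot
# Assumption: the spec should exist before Docker starts any GPU containers.
Before=docker.service

[Service]
Type=oneshot
# Assumption: create the output directory first in case the tool does not.
ExecStartPre=/usr/bin/mkdir -p /var/run/cdi
# Same command as above; the binary path may differ per distribution.
ExecStart=/usr/bin/nvidia-ctk cdi generate --output /var/run/cdi/nvidia.yaml

[Install]
WantedBy=multi-user.target
```

I would enable it with `sudo systemctl enable nvidia-cdi-generate.service`. I have not confirmed that this ordering is enough to guarantee the NVIDIA driver is loaded by the time the unit runs.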