Unable to resolve NVIDIA / nvprof ERR_NVGPUCTRPERM with NVreg_RestrictProfilingToAdminUsers=0 [Resolved]

I've just purchased an RTX 2060 and so far everything works well in my environment / setup. However, I am still unable to profile my code --

(nvidia) brandon@b350-gaming-pc:~/projects/nvidia$ nvprof ./example.py 
==29983== NVPROF is profiling process 29983, command: python3 ./example.py
Time: 0.05056905746459961
==29983== Warning: ERR_NVGPUCTRPERM - The user does not have permission to profile on the target device. See the following link for instructions to enable permissions and get more information: https://developer.nvidia.com/ERR_NVGPUCTRPERM 
==29983== Profiling application: python3 ./example.py
==29983== Profiling result:
No kernels were profiled.
No API activities were profiled.
==29983== Warning: Some profiling data are not recorded. Make sure cudaProfilerStop() or cuProfilerStop() is called before application exit to flush profile data.

I understand this was apparently a permissions "bug," so I proceeded to add the following --

(nvidia) brandon@b350-gaming-pc:~/projects/nvidia$ cat /etc/modprobe.d/cuda.conf 
NVreg_RestrictProfilingToAdminUsers=0

However, following a reboot, I get the same message while attempting to profile my code. Moreover,

(nvidia) brandon@b350-gaming-pc:~/projects/nvidia$ sudo update-initramfs -u
[sudo] password for brandon: 
update-initramfs: Generating /boot/initrd.img-4.15.0-55-generic
libkmod: ERROR ../libkmod/libkmod-config.c:656 kmod_config_parse: /etc/modprobe.d/cuda.conf line 1: ignoring bad line starting with 'NVreg_RestrictProfilingToAdminUsers=0'
libkmod: ERROR ../libkmod/libkmod-config.c:656 kmod_config_parse: /etc/modprobe.d/cuda.conf line 1: ignoring bad line starting with 'NVreg_RestrictProfilingToAdminUsers=0'
libkmod: ERROR ../libkmod/libkmod-config.c:656 kmod_config_parse: /etc/modprobe.d/cuda.conf line 1: ignoring bad line starting with 'NVreg_RestrictProfilingToAdminUsers=0'
...

This error message repeats seemingly forever.

Is there something I'm missing here?

Here's some more information about the driver and my environment --

(base) brandon@b350-gaming-pc:~$ nvidia-smi 
Mon Sep  9 11:12:51 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.00    Driver Version: 418.87.00    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2060    On   | 00000000:0A:00.0  On |                  N/A |
|  0%   45C    P8    20W / 170W |   1323MiB /  5903MiB |     38%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2603      G   /usr/lib/firefox/firefox                       3MiB |
|    0      4300      G   /usr/lib/xorg/Xorg                            34MiB |
|    0      4894      G   /usr/bin/gnome-shell                          51MiB |
|    0      5806      G   /usr/lib/xorg/Xorg                           254MiB |
|    0      5920      G   /usr/bin/gnome-shell                         899MiB |
|    0     10378      G   ...quest-channel-token=3880407371781342003    36MiB |
+-----------------------------------------------------------------------------+
(base) brandon@b350-gaming-pc:~$ uname -r
4.15.0-55-generic
(base) brandon@b350-gaming-pc:~$ lsmod | grep -i nvidia
nvidia_uvm            798720  0
nvidia_drm             45056  8
nvidia_modeset       1093632  17 nvidia_drm
nvidia              18194432  718 nvidia_uvm,nvidia_modeset
drm_kms_helper        167936  1 nvidia_drm
drm                   401408  11 drm_kms_helper,nvidia_drm
ipmi_msghandler        53248  2 ipmi_devintf,nvidia
(base) brandon@b350-gaming-pc:~$ which nvprof 
/usr/local/cuda-10.1/bin/nvprof
(base) brandon@b350-gaming-pc:~$ which python
/home/brandon/anaconda3/bin/python

Please let me know if you'd like to see anything else / output from my system.


Question Credit: bd1251252
Asked September 21, 2019
Tags: ubuntu, nvidia
Posted Under: Unix Linux
2 Answers

I think you just left out the options keyword in the /etc/modprobe.d/cuda.conf file. Try this instead:

options nvidia "NVreg_RestrictProfilingToAdminUsers=0"
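
The libkmod "ignoring bad line" error in the question is consistent with this: a line in a modprobe.d file is expected to begin with a command keyword such as options, and a bare NVreg_... assignment does not. As a rough illustration, here is a minimal Python sketch that flags such lines. The keyword set is an assumption taken from modprobe.d(5) and may differ between kmod versions, and find_bad_lines is a hypothetical helper, not part of any NVIDIA or kmod tool.

```python
# Keywords assumed from modprobe.d(5); the exact set may vary with the kmod version.
MODPROBE_COMMANDS = {"alias", "blacklist", "install", "options", "remove", "softdep"}


def find_bad_lines(text):
    """Return (line_number, line) pairs that do not start with a known command."""
    bad = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are accepted
        if line.split()[0] not in MODPROBE_COMMANDS:
            bad.append((number, line))
    return bad


# The file content from the question, and the corrected form from this answer.
broken = "NVreg_RestrictProfilingToAdminUsers=0\n"
fixed = 'options nvidia "NVreg_RestrictProfilingToAdminUsers=0"\n'

print(find_bad_lines(broken))  # the bare assignment is reported
print(find_bad_lines(fixed))   # nothing is reported
```

To check the real file, pass its contents instead, for example find_bad_lines(open("/etc/modprobe.d/cuda.conf").read()).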

credit: ajgringo619
Answered September 21, 2019

If you are not logged in as root, run the following with sudo from your login:

systemctl isolate multi-user  # Stop the window manager.
modprobe -r nvidia_uvm nvidia_drm nvidia_modeset nvidia-vgpu-vfio nvidia
sudo setcap cap_sys_admin+ep
modprobe nvidia NVreg_RestrictProfilingToAdminUsers=0  # Also add this option to /etc/modprobe.d/<.conf>
systemctl isolate graphical

Before setting or unsetting the module parameter, stop the window manager and unload any old modules. After loading the module with the new parameter, make sure you start the window manager again.
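
One way to confirm that the parameter actually took effect after reloading the module is to look at the driver's parameter listing. The sketch below assumes the driver exposes its settings in /proc/driver/nvidia/params as "Name: value" lines and that the relevant entry is called RmProfilingAdminOnly; both are assumptions about the driver, not something shown in the question. The sample text and the helper names parse_params and profiling_restricted are illustrative only.

```python
def parse_params(text):
    """Parse 'Name: value' lines into a dict of strings."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon
            params[key.strip()] = value.strip()
    return params


def profiling_restricted(text):
    """Return True/False for the assumed RmProfilingAdminOnly entry, or None if absent."""
    value = parse_params(text).get("RmProfilingAdminOnly")
    return None if value is None else value != "0"


# Illustrative sample, not real output from the asker's machine.
sample = "ResmanDebugLevel: 4294967295\nRmProfilingAdminOnly: 0\n"
print(profiling_restricted(sample))

# On a real system (path assumed):
#   profiling_restricted(open("/proc/driver/nvidia/params").read())
```

If this reports that profiling is still restricted after a reboot, the options line was probably not picked up when the module was loaded.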

If you still see the error, post the output of the following command, run as the same user that ran the commands above:

$ capsh --print | grep -i "cap_sys_admin"


credit: Kiran Kumar Annam
Answered September 21, 2019