Message boards : GPUs : Another GPU detection problem with AMD - ubuntu 14.04 - SIGSEGV
Author | Message |
---|---|
Joined: 30 May 15 Posts: 265 |
I had noticed that GPU detection stops working after an X or display manager (lightdm) crash. Restarting lightdm restores a screen, but boinc no longer recognizes the GPU. Normally I just restart and all is well; I lock my X session screen rather than log out, and all is well. I thought I'd dig a bit into the problem. After enabling the coproc debug flag and restarting in the normal fashion with: sudo service boinc-client restart I see: Sat 19 Mar 2016 09:35:51 GMT | | [coproc] calInit() returned 1 Sat 19 Mar 2016 09:35:51 GMT | | [coproc] Caught SIGSEGV in OpenCL detection Sat 19 Mar 2016 09:35:51 GMT | | No usable GPUs found and coproc_info.xml shows: <coprocs> <warning>NVIDIA: libcuda.so: cannot open shared object file: No such file or directory</warning> <warning>calInit() returned 1</warning> <warning>Caught SIGSEGV in OpenCL detection</warning> </coprocs> I know (see detect gpus) that boinc forks a copy of itself with the undocumented option "--detect_gpus", and from looking at the source code, part of that copy's job is to generate the coproc_info.xml file. If I run the GPU detection by hand like this: sudo -u boinc boinc --detect_gpus the coproc_info.xml file is created correctly. If I run boinc straight from the command line to see the on-screen output, I get: 19-Mar-2016 10:39:17 [---] GPU detection failed. error code 512 19-Mar-2016 10:39:17 [---] [coproc] read_coproc_info_file() returned error -108 19-Mar-2016 10:39:17 [---] No usable GPUs found It no longer shows a SIGSEGV, but there are new errors. In the meantime I'm just going to restart, but I can recreate the problem, and I suspect it has to do with fglrx. Any thoughts? |
Joined: 29 Aug 05 Posts: 15560 |
BOINC error -108 can mean: 1. BOINC cannot find the file or the directory it is in, because the file or directory is hidden. 2. BOINC finds that the file is open and in use by another process, or BOINC cannot write to the file. 3. The file being written to is still locked by an earlier read or write that terminated abnormally. 4. Permission problems: BOINC is running as a different user than the one that installed it or that has permission to write to files in the directory. The corresponding solutions are: 1. Do not hide files or directories. 2. Exit and restart BOINC. 3. Restart the computer. 4. Run BOINC as the user with full permissions, or adjust the directory/file permissions so that this user can write to them. |
Joined: 20 Nov 12 Posts: 801 |
As long as we can't see the stacktrace, blaming fglrx is fine by me :P It is quite possible you have cause and effect the wrong way round: fglrx may have corrupted itself, that crashed X or lightdm, and the same corruption also broke OpenCL. |
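
The two checks discussed in this thread (reading the warnings out of coproc_info.xml, and ruling out the missing-file and permission causes listed for error -108) could be sketched roughly as below. This is only a hedged illustration, not part of BOINC: the helper names `coproc_warnings` and `file_access_report` are made up for this example, the path `/var/lib/boinc-client/coproc_info.xml` is an assumption based on the usual data directory of Ubuntu's boinc-client package, and the parser assumes the file is well-formed XML like the excerpt in the first post.

```python
import os
import xml.etree.ElementTree as ET


def coproc_warnings(xml_text):
    """Return the text of every <warning> element in a coproc_info.xml document.

    Hypothetical helper; raises xml.etree.ElementTree.ParseError if the
    text is not well-formed XML.
    """
    root = ET.fromstring(xml_text)
    return [(w.text or "").strip() for w in root.iter("warning")]


def file_access_report(path):
    """Report whether the *current* user can see, read and write `path`.

    Hypothetical helper covering the missing-file and permission causes
    only; it cannot detect a file locked by another process.
    """
    parent = os.path.dirname(os.path.abspath(path))
    return {
        "exists": os.path.exists(path),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
        "parent_writable": os.access(parent, os.W_OK),
    }


if __name__ == "__main__":
    # Assumed location; adjust to your own BOINC data directory.
    path = "/var/lib/boinc-client/coproc_info.xml"
    report = file_access_report(path)
    print(report)
    if report["exists"] and report["readable"]:
        with open(path, encoding="utf-8") as fh:
            for warning in coproc_warnings(fh.read()):
                print("warning:", warning)
```

Because the answer depends on which user runs it, the script is most informative when run as the same user the client runs as, for example with `sudo -u boinc python3` followed by the script name.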
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.