Thread 'CUDA switching off my PC'

mike

Joined: 24 Oct 09
Posts: 5
United Kingdom
Message 28288 - Posted: 24 Oct 2009, 11:14:05 UTC
Last modified: 24 Oct 2009, 11:22:34 UTC

Ever since I began using CUDA, my system has suffered from sudden switch-offs and reboots. Occasionally I'll get a bluescreen for a second or so before it shuts down instantly. Cutting down CPU usage has little effect. This ONLY ever occurs when a CUDA task is running, and I only crunch SETI. If I switch CUDA off, I never get a problem.

The event log states that it's a BOINC app error, and there is some evidence that it's a 'memory related problem'.
A search of this forum hasn't found anything quite the same, so can anyone help?


System:
BOINC 6.6.36
XP.
Aging Pentium D 3.4 overclocked to 3655 MHz at 35 degrees nominal during normal crunching. Custom air cooler.

4 gigs of RAM, but it's 32-bit so only sees part of the extra memory.

Coolermaster 700W Silent Pro PSU running

Nvidia GTS 250 card, driver 6.14.11.9107.
I overclock this with VTune for gaming without problems. The CUDA crashes occur even when the card is run in safe mode.

Temperature never gets above 50 so I've ruled that out already.

cheers
Mike

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15626
Netherlands
Message 28290 - Posted: 24 Oct 2009, 11:48:00 UTC - in response to Message 28288.  

Since you're finding it's a Seti app error/problem, best post about it in the Seti CUDA forum.
You may want to check Event Viewer on what it says about the BSODs and post either a link to your erroneous tasks, or post the complete error messages you see (in your tasks list and Event Viewer).

> Pentium D 3.4 overclocked to 3655 MHz at 35 degrees nominal during normal crunching. Custom air cooler.

I doubt you're checking the correct sensor. I think you're checking the motherboard sensor, not the actual CPU sensor. I have a P4 3.0 GHz and even after rigorous cleaning, it runs at 45C normal/55-60C under load. The motherboard sensor shows 26-30C, though.

35C is almost only achievable if you freeze the whole heat sink or run water cooling.
mike
Joined: 24 Oct 09
Posts: 5
United Kingdom
Message 28291 - Posted: 24 Oct 2009, 12:28:51 UTC
Last modified: 24 Oct 2009, 12:49:07 UTC

OK, I'll try that forum too (it seems you post on this issue there anyway, Ageless).

Event Viewer isn't very forthcoming:
The description for Event ID ( 1 ) in Source ( BOINC ) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event: BOINC error: 183, Another instance of BOINC is running.

Erm. There isn't.
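
For anyone sifting through many such entries, the useful part of the event text is the trailing "BOINC error: <code>, <message>" pair. Below is a minimal Python sketch that pulls it out. The function name `parse_boinc_event` and the regular expression are illustrative assumptions, not part of BOINC. As an aside, 183 is also the value of the Windows error code ERROR_ALREADY_EXISTS, although whether BOINC is passing the Windows code through here is an assumption.

```python
import re

# Event text copied from the Event Viewer entry quoted above.
EVENT_TEXT = (
    "The following information is part of the event: "
    "BOINC error: 183, Another instance of BOINC is running."
)

# Hypothetical pattern: "BOINC error: <integer>, <free-text message>".
_PATTERN = re.compile(r"BOINC error:\s*(-?\d+),\s*(.+)$")


def parse_boinc_event(text):
    """Return (error_code, message), or None if no BOINC error is present."""
    match = _PATTERN.search(text.strip())
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()


print(parse_boinc_event(EVENT_TEXT))
```

Grouping entries by the returned code makes it easier to see whether the same error accompanies every shutdown or only shows up now and then.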

I'm unsure what 'tasks list' refers to in your post.

I monitor using Asus's own AiBooster, which gives me separate readings for CPU and System. It can be a little flaky, but the readings correlate with the Probe2 readings, so on balance I trust them, though I'm open to anyone with better knowledge! I have an AC Freezer Pro 7 CPU cooler - maybe you should try one 8^).


24/10/2009 12:47:26 Processor: 2 GenuineIntel Intel(R) Pentium(R) D CPU 3.40GHz [x86 Family 15 Model 6 Stepping 4]
24/10/2009 12:47:26 Processor features: fpu tsc pae nx sse sse2 mmx
24/10/2009 12:47:26 OS: Microsoft Windows XP: Professional x86 Edition, Service Pack 3, (05.01.2600.00)
24/10/2009 12:47:26 Memory: 3.00 GB physical, 4.84 GB virtual
24/10/2009 12:47:26 Disk: 232.88 GB total, 142.94 GB free
24/10/2009 12:47:26 Local time is UTC +1 hours
24/10/2009 12:47:26 CUDA device: GeForce GTS 250 (driver version 19107, compute capability 1.1, 1024MB, est. 84GFLOPS)
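
The "Memory: 3.00 GB physical" line is what you would expect from 32-bit XP with 4 GB fitted: a 32-bit address space covers 4 GiB in total, and part of that range is taken up by devices rather than RAM. Below is a rough Python sketch of the arithmetic, using the figures from this thread. Attributing the whole gap to device address space is an assumption (the 1024MB video card is a plausible contributor), and the exact split varies by machine.

```python
GIB = 2 ** 30  # bytes per GiB


def address_space_gib(address_bits):
    """Total range addressable with the given pointer width, in GiB."""
    return (2 ** address_bits) / GIB


installed_gib = 4.0   # "4 gigs of ram" from the first post
reported_gib = 3.00   # "Memory: 3.00 GB physical" from the BOINC log

total = address_space_gib(32)
# Assumption: the shortfall is address space reserved for devices.
unavailable = installed_gib - reported_gib

print(f"32-bit address space: {total:.2f} GiB")
print(f"Installed but not visible to the OS: {unavailable:.2f} GiB")
```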
Gundolf Jahn
Joined: 20 Dec 07
Posts: 1069
Germany
Message 28292 - Posted: 24 Oct 2009, 13:29:27 UTC - in response to Message 28291.  

> Event Viewer isn't very forthcoming:

Ageless was speaking about BSOD-related messages, not (necessarily) BOINC-related ones.

> I'm unsure what 'tasks list' refers to in your post.

You get your task list if you click "Tasks View" on this page.

> I monitor using Asus's own AiBooster, which gives me separate readings for CPU and System.

Can you read out the temperature of your GPU?

Regards,
Gundolf
Computers aren't everything in life. (Just kidding)
mike
Joined: 24 Oct 09
Posts: 5
United Kingdom
Message 28293 - Posted: 24 Oct 2009, 14:28:52 UTC - in response to Message 28292.  
Last modified: 24 Oct 2009, 14:30:48 UTC

It's usually an instant shutdown rather than a BSOD - when I do see a blue screen, I get a reboot seconds later. Usually, it's almost like something's tripping the PC to switch off.

> > Event Viewer isn't very forthcoming:
>
> Ageless was speaking about BSOD-related messages, not (necessarily) BOINC-related ones.

I thought he meant system tasks, not SETI tasks. I fail to see how a specific CUDA task name is going to help solve the problem.

> > I'm unsure what 'tasks list' refers to in your post.
>
> You get your task list if you click "Tasks View" on this page.

According to VTune, during crunching it stabilises at 50-51 deg C, with the fan fixed at 50%. It gets much higher with the fan set to dynamic, but I hate heat. I have watched the system several times as it 'pops' and the temp doesn't appear to spike.

I'm wondering if it's an issue with current, but I'm unsure how to analyse current spikes.

> > I monitor using Asus's own AiBooster, which gives me separate readings for CPU and System.
>
> Can you read out the temperature of your GPU?
>
> Regards,
> Gundolf
Gundolf Jahn
Joined: 20 Dec 07
Posts: 1069
Germany
Message 28294 - Posted: 24 Oct 2009, 15:02:17 UTC - in response to Message 28293.  

> It's usually an instant shutdown rather than a BSOD - when I do see a blue screen, I get a reboot seconds later.

Yes, and what does the event log tell you then?

> I thought he meant system tasks, not SETI tasks. I fail to see how a specific CUDA task name is going to help solve the problem.

Not the name, the link! It shows the stderr output of the task.

Regards,
Gundolf
Computers aren't everything in life. (Just kidding)
mike
Joined: 24 Oct 09
Posts: 5
United Kingdom
Message 28296 - Posted: 24 Oct 2009, 15:39:39 UTC - in response to Message 28294.  
Last modified: 24 Oct 2009, 15:45:28 UTC

As below, not a lot:

The description for Event ID ( 1 ) in Source ( BOINC ) cannot be found. The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part of the event: BOINC error: 183, Another instance of BOINC is running.

There isn't.


The task list shows that I have a few in progress. The last completed was a few days ago and it was validated. I have no CUDA tasks with errors.

1397578492 520262779 20 Oct 2009 22:30:09 UTC 24 Oct 2009 9:20:07 UTC Completed and validated 2,265.77 632.56 114.35 83.18 SETI@home Enhanced v6.08 (cuda)


Clicking the Workunit ID on this gives 2 records:
1397578491 3750395 20 Oct 2009 22:30:09 UTC 21 Oct 2009 9:25:36 UTC Completed and validated 0.00 20,901.49 83.18 83.18 SETI@home Enhanced v6.03 (other user)
1397578492 3382555 20 Oct 2009 22:30:09 UTC 24 Oct 2009 9:20:07 UTC Completed and validated 2,265.77 632.56 114.35 83.18 SETI@home Enhanced v6.08 (cuda)


> > It's usually an instant shutdown rather than a BSOD - when I do see a blue screen, I get a reboot seconds later.
>
> Yes, and what does the event log tell you then?

> > I thought he meant system tasks, not SETI tasks. I fail to see how a specific CUDA task name is going to help solve the problem.
>
> Not the name, the link! It shows the stderr output of the task.
>
> Regards,
> Gundolf
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15626
Netherlands
Message 28298 - Posted: 24 Oct 2009, 17:13:29 UTC - in response to Message 28291.  
Last modified: 24 Oct 2009, 17:15:41 UTC

> OK, I'll try that forum too (it seems you post on this issue there anyway, Ageless).

I post in more than one (project) forum; there's nothing more to it.

I pointed you to the Seti forums, as it seems to be a Seti app problem, not a BOINC problem. Or more like a problem with your videocard, which is still not a BOINC problem. There are plenty of others at Seti who have a deep insight into what might be ailing your computer.

If BOINC and any science application on the CPU run without flaws and your system starts crashing/rebooting when you run tasks on the GPU, it's most probably a GPU or other hardware problem (including cables, PCIe/PCI/AGP bus, connectors, bulging capacitors and PSUs).

So, we can ask you to upgrade to BOINC 6.6.41 (the latest recommended), or to 6.10.16 (the latest beta), but they'll probably show the same problem since, as far as I can see, it isn't a BOINC problem.

If the science app crashed the BOINC client, it would be a BOINC problem.
Computers rebooting/BSODing when running this one science app isn't a BOINC problem. BOINC doesn't do science and doesn't use the GPU in any way; the project's science application does all that.

You can test any other project's GPU app. If any of those switch your computer off as well, it's neither a BOINC nor a science app problem, but definitely hardware.
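
The reasoning above boils down to a short triage. Below is a minimal Python sketch. The function name, parameters and returned labels are made up for illustration, and the "SETI app suspected" branch is an inference, since the post only spells out the BOINC and hardware cases.

```python
def triage(client_crashes, dies_on_seti_gpu, dies_on_other_gpu=None):
    """Rough triage following the reasoning in this post.

    client_crashes    -- the science app takes down the BOINC client itself
    dies_on_seti_gpu  -- the PC reboots/switches off during SETI CUDA tasks
    dies_on_other_gpu -- same symptom with another project's GPU app
                         (None means not tested yet)
    """
    if client_crashes:
        return "BOINC problem"
    if dies_on_other_gpu:
        return "hardware"
    if dies_on_seti_gpu and dies_on_other_gpu is None:
        return "most probably hardware; test another project's GPU app"
    if dies_on_seti_gpu:
        # Inference, not stated in the post: only the SETI app triggers it.
        return "SETI app suspected"
    return "no fault observed"


# The situation reported so far: CPU work is fine, SETI CUDA kills the PC,
# and no other project's GPU app has been tried yet.
print(triage(client_crashes=False, dies_on_seti_gpu=True))
```

Leaving `dies_on_other_gpu` as `None` until the test has actually been run keeps "not tested" distinct from "tested and fine", which is the distinction the last paragraph of the post turns on.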
mike
Joined: 24 Oct 09
Posts: 5
United Kingdom
Message 28490 - Posted: 4 Nov 2009, 22:24:50 UTC - in response to Message 28298.  
Last modified: 4 Nov 2009, 22:25:10 UTC

Moved to http://setiathome.berkeley.edu/forum_thread.php?id=55973

Jord, you may well be right: Memtest pops the PC too...

Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.