Message boards : BOINC Manager : No Heartbeat seems to be causing DLL error messages
Author | Message |
---|---|
Joined: 13 Sep 05 Posts: 10 |
I have been getting those DLL initialization errors on 'no heart beat' conditions. Take a look at this malariacontrol task. Looking at the stderr section, you will notice the heartbeat condition. This matches up with my log messages: Lair2.paperdragon.ca malariacontrol.net beta 2008-01-24 01:20:34 Task wu_84_316_89784_0_1200991208_0 exited with a DLL initialization error. Or this one from SETI beta. Log entry for it: Lair2.paperdragon.ca SETI@home Beta Test 2008-01-24 01:20:34 Task 11oc06aa.25099.530432.3.10.87_0 exited with a DLL initialization error. So for some reason the 'no heartbeat' errors are being reported as DLL errors. |
Joined: 16 Apr 06 Posts: 386 |
The mislabeling of 'status zero exit' messages as 'dll initialisation error' messages was fixed shortly after 5.10.30 was released, although these later versions are still in testing. |
Joined: 19 Jan 07 Posts: 1179 |
I have been getting those DLL initialization errors on 'no heart beat' conditions. What BOINC version do you have? |
Joined: 3 Apr 06 Posts: 547 |
The mislabeling of 'status zero exit' messages as 'dll initialisation error' messages was fixed shortly after 5.10.30 was released, although these later versions are still in testing.
I was and am still seeing the "DLL initialization error" exit messages through 5.10.41, 5.10.42 and 5.10.45, on Win XP SP2, and definitely on 'no heart beat' conditions. It happens mostly on a busy system during a wakeup from hibernation (it does not matter whether the applications are suspended or not, whereas the client in the meantime often happily communicates with various schedulers or uploads files). But occasionally it happens just while the client tries to communicate (possibly the 'single blocked thread' problem), like today:
28-Apr-2008 18:39:48 [QCN Alpha Test] [task_debug] result qcne_002178_0 checkpointed
28-Apr-2008 18:42:34 [SETI@home] [task_debug] result 24mr08ab.22129.22976.15.8.197_1 checkpointed
28-Apr-2008 18:50:42 [QCN Alpha Test] [task_debug] result qcne_002178_0 checkpointed
28-Apr-2008 18:53:24 [SETI@home] [task_debug] result 24mr08ab.22129.22976.15.8.197_1 checkpointed
28-Apr-2008 19:00:00 [---] Resuming network activity
28-Apr-2008 19:00:03 [The Lattice Project] [sched_op_debug] Fetching master file
28-Apr-2008 19:00:03 [The Lattice Project] Fetching scheduler list
28-Apr-2008 19:00:03 [Milkyway@home] Started upload of gs_560_1209170002_132848_0_0
28-Apr-2008 19:00:03 [Cels@Home] Started upload of N16-5_m50c001_160_0_S.gz_0_0
28-Apr-2008 19:00:37 [QCN Alpha Test] [task_debug] Process for qcne_002178_0 exited
28-Apr-2008 19:00:37 [QCN Alpha Test] Task qcne_002178_0 exited with a DLL initialization error.
28-Apr-2008 19:00:37 [QCN Alpha Test] If this happens repeatedly you may need to reboot your computer.
28-Apr-2008 19:00:37 [QCN Alpha Test] [task_debug] task_state=UNINITIALIZED for qcne_002178_0 from handle_exit_external
28-Apr-2008 19:00:37 [SETI@home] [task_debug] Process for 24mr08ab.22129.22976.15.8.197_1 exited
28-Apr-2008 19:00:37 [SETI@home] Task 24mr08ab.22129.22976.15.8.197_1 exited with a DLL initialization error.
28-Apr-2008 19:00:37 [SETI@home] If this happens repeatedly you may need to reboot your computer.
28-Apr-2008 19:00:37 [SETI@home] [task_debug] task_state=UNINITIALIZED for 24mr08ab.22129.22976.15.8.197_1 from handle_exit_external
28-Apr-2008 19:00:37 [Cels@Home] [task_debug] Process for N16-1_m35c001_180_2_S.gz_0 exited
28-Apr-2008 19:00:37 [Cels@Home] Task N16-1_m35c001_180_2_S.gz_0 exited with a DLL initialization error.
28-Apr-2008 19:00:37 [Cels@Home] If this happens repeatedly you may need to reboot your computer.
28-Apr-2008 19:00:37 [Cels@Home] [task_debug] task_state=UNINITIALIZED for N16-1_m35c001_180_2_S.gz_0 from handle_exit_external
28-Apr-2008 19:00:37 [QCN Alpha Test] [cpu_sched] Starting qcne_002178_0(resume)
28-Apr-2008 19:00:37 [QCN Alpha Test] [task_debug] task_state=EXECUTING for qcne_002178_0 from start
28-Apr-2008 19:00:37 [QCN Alpha Test] Restarting task qcne_002178_0 using qcnalpha version 246
28-Apr-2008 19:00:37 [SETI@home] [cpu_sched] Starting 24mr08ab.22129.22976.15.8.197_1(resume)
28-Apr-2008 19:00:37 [SETI@home] [task_debug] task_state=EXECUTING for 24mr08ab.22129.22976.15.8.197_1 from start
28-Apr-2008 19:00:37 [SETI@home] Restarting task 24mr08ab.22129.22976.15.8.197_1 using setiathome_enhanced version 527
28-Apr-2008 19:00:37 [Cels@Home] [cpu_sched] Starting N16-1_m35c001_180_2_S.gz_0(resume)
28-Apr-2008 19:00:37 [Cels@Home] [task_debug] task_state=EXECUTING for N16-1_m35c001_180_2_S.gz_0 from start
28-Apr-2008 19:00:37 [Cels@Home] Restarting task N16-1_m35c001_180_2_S.gz_0 using cels version 100
28-Apr-2008 19:00:39 [---] Project communication failed: attempting access to reference site
28-Apr-2008 19:00:50 [The Lattice Project] [sched_op_debug] Deferring communication for 1 min 0 sec
28-Apr-2008 19:00:50 [The Lattice Project] [sched_op_debug] Reason: Scheduler list fetch failed: http error
28-Apr-2008 19:00:51 [Milkyway@home] Temporarily failed upload of gs_560_1209170002_132848_0_0: http error
28-Apr-2008 19:00:51 [Milkyway@home] Backing off 1 min 0 sec on upload of gs_560_1209170002_132848_0_0
28-Apr-2008 19:00:56 [ralph@home] [sched_op_debug] Fetching master file
28-Apr-2008 19:00:56 [ralph@home] Fetching scheduler list
28-Apr-2008 19:01:08 [Cels@Home] Temporarily failed upload of N16-5_m50c001_160_0_S.gz_0_0: system connect
28-Apr-2008 19:01:08 [Cels@Home] Backing off 1 min 0 sec on upload of N16-5_m50c001_160_0_S.gz_0_0
28-Apr-2008 19:01:11 [---] Access to reference site failed - check network connection or proxy configuration.
28-Apr-2008 19:01:32 [ralph@home] [sched_op_debug] Deferring communication for 1 min 0 sec
28-Apr-2008 19:01:32 [ralph@home] [sched_op_debug] Reason: Scheduler list fetch failed: http error
Network comm was set to "auto" and was off by rule until 19:00. At that moment the client started its communication attempts, which could not succeed because the machine is behind a proxy, but everything else including DNS was fully functional. The machine was otherwise idle (just me browsing the internet), and Seti and Cels were still consuming some 80-90% of CPU until at least 19:00:25 (confirmed by Process Explorer logs). The client noticed it at 19:00:37 and declared them dead. I'd like to find out what exactly is behind these lost heartbeats during wakeup... Peter |
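For readers unfamiliar with the term, a heartbeat timeout of the kind discussed above can be sketched roughly as follows. This is only an illustration, not BOINC's actual code: the class name `HeartbeatWatchdog`, its methods, and the 30-second timeout used in the usage note are all assumptions. Time points are passed in explicitly so that a gap (for example a stalled thread, or time lost across hibernation) can be simulated.

```cpp
#include <chrono>

// Hypothetical watchdog, for illustration only (not taken from BOINC).
// It remembers when the last heartbeat arrived and reports expiry once
// more than `timeout` has elapsed without a new one.
class HeartbeatWatchdog {
public:
    using Clock = std::chrono::steady_clock;

    HeartbeatWatchdog(Clock::duration timeout, Clock::time_point start)
        : timeout_(timeout), last_beat_(start) {}

    // Record that a heartbeat was received at time `now`.
    void beat(Clock::time_point now) { last_beat_ = now; }

    // True when the silence since the last heartbeat exceeds the timeout.
    bool expired(Clock::time_point now) const {
        return now - last_beat_ > timeout_;
    }

private:
    Clock::duration timeout_;
    Clock::time_point last_beat_;
};
```

With an assumed timeout of `std::chrono::seconds(30)`, a sender that is blocked for a little over half a minute would be enough for `expired()` to return true, even though the receiving side was running normally the whole time.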
Joined: 16 Apr 06 Posts: 386 |
I've heard more reports of 5.10.45 raising the wrong message recently, so I'd have to guess that they fixed the V6 branch but didn't backfit the same (simple) fix to the V5 branch. |
Joined: 3 Apr 06 Posts: 547 |
I've heard more reports of 5.10.45 raising the wrong message recently, so I'd have to guess that they fixed the V6 branch but didn't backfit the same (simple) fix to the V5 branch.
This is quite possible. I've indeed noticed a few possibly related code changes in client/app_control.C between 5.10.14 and trunk, like [trac]changeset:14348[/trac] on 3 Dec 2007
[pre]
--- /trunk/boinc/client/app_control.C (revision 14310)
+++ /trunk/boinc/client/app_control.C (revision 14348)
@@ -178,7 +178,7 @@
 static void limbo_message(ACTIVE_TASK& at) {
 #ifdef _WIN32
-    if (at.result->exit_status = STATUS_DLL_INIT_FAILED) {
+    if (at.result->exit_status == STATUS_DLL_INIT_FAILED) {
         msg_printf(at.result->project, MSG_INFO,
             "Task %s exited with a DLL initialization error.",
             at.result->name
         );
[/pre]
(would just print "Task %s exited with zero status but no 'finished' file" instead), or [trac]changeset:14552[/trac]
[pre]
--- trunk/boinc/client/app_control.C (revision 14549)
+++ trunk/boinc/client/app_control.C (revision 14552)
@@ -269,9 +269,10 @@
 case 0x40010004: // vista shutdown?? can someone explain this?
 case STATUS_DLL_INIT_FAILED:
     // This can happen because:
-    // - The OS is shutting down, so attempting to start
-    // any new application fails automatically.
+    // - The OS is shutting down, and attempting to start
+    //   any new application fails automatically.
     // - The OS has run out of desktop heap
+    // - (reportedly) The computer has just come out of hibernation
     //
     handle_premature_exit(will_restart);
     break;
[/pre]
(just a comment confirming the problem). So it's time to give 6.1.17 a try, or maybe already the current 6.1.16? (A few comments on the email lists still keep me waiting.) Peter |
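The first changeset quoted above swaps `=` for `==`. A minimal sketch of why that single character matters: with `=`, the condition assigns the constant to `exit_status` and then tests the assigned (nonzero) value, so the DLL branch is always taken and the real exit status is overwritten. The `Result` struct and both function names below are hypothetical stand-ins, not BOINC's types; the constant is the Windows NTSTATUS value for `STATUS_DLL_INIT_FAILED`, defined locally so the sketch builds on any platform.

```cpp
#include <cstdint>
#include <string>

// NTSTATUS value of STATUS_DLL_INIT_FAILED, redefined here for portability.
constexpr std::uint32_t kStatusDllInitFailed = 0xC0000142u;

// Hypothetical stand-in for the client's per-task result record.
struct Result {
    std::uint32_t exit_status;
};

// Buggy variant: '=' assigns the constant, and the nonzero result of the
// assignment makes the condition true for every task.
std::string limbo_message_buggy(Result& r) {
    if ((r.exit_status = kStatusDllInitFailed)) {
        return "exited with a DLL initialization error";
    }
    return "exited with zero status but no 'finished' file";
}

// Fixed variant: '==' only compares, leaving exit_status untouched.
std::string limbo_message_fixed(const Result& r) {
    if (r.exit_status == kStatusDllInitFailed) {
        return "exited with a DLL initialization error";
    }
    return "exited with zero status but no 'finished' file";
}
```

Calling `limbo_message_buggy` on a result whose `exit_status` is 0 returns the DLL message and leaves `exit_status` set to the DLL constant, which is consistent with a zero-status exit being logged as a DLL initialization error.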
Joined: 29 Aug 05 Posts: 15585 |
So it's time to give 6.1.17 a try, or maybe already the current 6.1.16? (A few comments on the email lists still keep me waiting.) Good luck. Do back up your data first, and make sure you have no internet connection when you install BOINC. That way you won't burst into needless tears. ;-) |
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.