Thread 'Problem with CPU work request, BOINC 7.0.28'

Raistmer

Joined: 9 Apr 06
Posts: 302
Message 45595 - Posted: 8 Sep 2012, 21:10:37 UTC
Last modified: 8 Sep 2012, 21:12:32 UTC

09/09/2012 00:58:30 | SETI@home | [sched_op] Starting scheduler request
09/09/2012 00:58:30 | SETI@home | Sending scheduler request: To fetch work.
09/09/2012 00:58:30 | SETI@home | Requesting new tasks for CPU and ATI
09/09/2012 00:58:30 | SETI@home | [sched_op] CPU work request: 1036800.00 seconds; 4.00 devices
09/09/2012 00:58:30 | SETI@home | [sched_op] ATI work request: 139389.34 seconds; 0.00 devices
09/09/2012 00:58:38 | SETI@home | Scheduler request completed: got 0 new tasks
09/09/2012 00:58:38 | SETI@home | [sched_op] Server version 701
09/09/2012 00:58:38 | SETI@home | Project has no tasks available
09/09/2012 00:58:38 | SETI@home | Project requested delay of 303 seconds
09/09/2012 00:58:38 | SETI@home | [sched_op] Deferring communication for 5 min 3 sec
09/09/2012 00:58:38 | SETI@home | [sched_op] Reason: requested by project
09/09/2012 01:02:17 | SETI@home | Computation for task 27mr10ac.13328.12751.12.10.242_0 finished
09/09/2012 01:02:17 | SETI@home | Starting task 27mr10ac.13328.12751.12.10.230_0 using setiathome_enhanced version 610 (ati13ati) in slot 0
09/09/2012 01:02:21 | SETI@home | Computation for task 27mr10ac.26426.9888.11.10.142_0 finished
09/09/2012 01:02:21 | SETI@home | Starting task 27mr10ac.26426.9888.11.10.159_0 using setiathome_enhanced version 610 (ati13ati) in slot 1
09/09/2012 01:03:44 | SETI@home | [sched_op] Starting scheduler request
09/09/2012 01:03:44 | SETI@home | Sending scheduler request: To fetch work.
09/09/2012 01:03:44 | SETI@home | Reporting 2 completed tasks, requesting new tasks for ATI
09/09/2012 01:03:44 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
09/09/2012 01:03:44 | SETI@home | [sched_op] ATI work request: 139948.88 seconds; 0.00 devices
09/09/2012 01:03:56 | SETI@home | Scheduler request completed: got 35 new tasks
09/09/2012 01:03:56 | SETI@home | [sched_op] Server version 701
09/09/2012 01:03:56 | SETI@home | Project requested delay of 303 seconds
09/09/2012 01:03:56 | SETI@home | [sched_op] estimated total CPU task duration: 0 seconds
09/09/2012 01:03:56 | SETI@home | [sched_op] estimated total ATI task duration: 85399 seconds


The CPU was idle and no work was downloaded for it, yet on the next work request BOINC did not ask for CPU work at all. The request shows no idle CPU devices, although no CPU tasks were downloaded and the CPU was still completely idle.

Is this a new bug or a known one?
arkayn
Joined: 21 Mar 09
Posts: 33
United States
Message 45596 - Posted: 8 Sep 2012, 21:27:51 UTC - in response to Message 45595.  

It is a known issue: BOINC mainly tries to fill up the most efficient device first.

There are a lot of posts on the main SETI boards about this problem.
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5139
United Kingdom
Message 45597 - Posted: 8 Sep 2012, 21:39:35 UTC - in response to Message 45595.  

What were the values of dt and inc shown in [WFD] (the work fetch debug log)?

08/09/2012 22:36:41 | Milkyway@Home | [work_fetch] CPU: fetch share 0.000 rsc backoff (dt 2289.16, inc 4800.00)

If dt > 0.00, fetch is inhibited for that resource - no point in hammering a server with requests for work when they haven't finished writing the app yet...
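
For anyone who wants to check this over a long event log, here is a minimal Python sketch that pulls dt and inc out of a line like the one above and reports whether fetch is inhibited. The regular expression is an assumption based only on that single sample line (other client versions may format the message differently), and the function names are made up for illustration.

```python
import re

# Pattern derived only from the sample [work_fetch] line quoted above;
# treat it as an assumption, not a description of every client version.
BACKOFF_RE = re.compile(
    r"\[work_fetch\] ([^:]+): .*rsc backoff \(dt ([\d.]+), inc ([\d.]+)\)"
)

def parse_backoff(line):
    """Return (resource, dt, inc), or None if the line has no backoff info."""
    m = BACKOFF_RE.search(line)
    if m is None:
        return None
    return m.group(1), float(m.group(2)), float(m.group(3))

def fetch_inhibited(line):
    """True when the line reports a resource backoff with dt > 0."""
    parsed = parse_backoff(line)
    return parsed is not None and parsed[1] > 0.0

sample = ("08/09/2012 22:36:41 | Milkyway@Home | [work_fetch] CPU: "
          "fetch share 0.000 rsc backoff (dt 2289.16, inc 4800.00)")
print(parse_backoff(sample))
print(fetch_inhibited(sample))
```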
Raistmer

Joined: 9 Apr 06
Posts: 302
Message 45598 - Posted: 9 Sep 2012, 5:26:44 UTC - in response to Message 45596.  

It is a known issue: BOINC mainly tries to fill up the most efficient device first.

There are a lot of posts on the main SETI boards about this problem.


In my case it did just the reverse: it fills up the device that is not idle and leaves the idle one empty.

Raistmer

Joined: 9 Apr 06
Posts: 302
Message 45599 - Posted: 9 Sep 2012, 5:32:16 UTC - in response to Message 45597.  
Last modified: 9 Sep 2012, 5:34:51 UTC

What were the values of dt and inc shown in [WFD] (the work fetch debug log)?

08/09/2012 22:36:41 | Milkyway@Home | [work_fetch] CPU: fetch share 0.000 rsc backoff (dt 2289.16, inc 4800.00)

If dt > 0.00, fetch is inhibited for that resource - no point in hammering a server with requests for work when they haven't finished writing the app yet...

Sorry, I don't quite understand this.
1. It's a CPU device, and an app for CPU is present in app_info.
2. On the next request (after another 5 min of idle CPU) BOINC started to make CPU work requests again (and finally got work). So I don't understand how the "no app" example is relevant here.

More verbose log citation:
09/09/2012 00:53:15 | SETI@home | Requesting new tasks for CPU and ATI
09/09/2012 00:53:15 | SETI@home | [sched_op] CPU work request: 1036800.00 seconds; 4.00 devices
09/09/2012 00:53:15 | SETI@home | [sched_op] ATI work request: 180340.27 seconds; 0.00 devices
09/09/2012 00:53:25 | SETI@home | Scheduler request completed: got 37 new tasks
09/09/2012 00:53:25 | SETI@home | [sched_op] Server version 701
09/09/2012 00:53:25 | SETI@home | Project requested delay of 303 seconds
09/09/2012 00:53:25 | SETI@home | [sched_op] estimated total CPU task duration: 0 seconds
09/09/2012 00:53:25 | SETI@home | [sched_op] estimated total ATI task duration: 83518 seconds
09/09/2012 00:53:25 | SETI@home | [sched_op] Deferring communication for 5 min 3 sec
09/09/2012 00:53:25 | SETI@home | [sched_op] Reason: requested by project
09/09/2012 00:58:30 | SETI@home | [sched_op] Starting scheduler request
09/09/2012 00:58:30 | SETI@home | Sending scheduler request: To fetch work.
09/09/2012 00:58:30 | SETI@home | Requesting new tasks for CPU and ATI
09/09/2012 00:58:30 | SETI@home | [sched_op] CPU work request: 1036800.00 seconds; 4.00 devices
09/09/2012 00:58:30 | SETI@home | [sched_op] ATI work request: 139389.34 seconds; 0.00 devices
09/09/2012 00:58:38 | SETI@home | Scheduler request completed: got 0 new tasks
09/09/2012 00:58:38 | SETI@home | [sched_op] Server version 701
09/09/2012 00:58:38 | SETI@home | Project has no tasks available
09/09/2012 00:58:38 | SETI@home | Project requested delay of 303 seconds
09/09/2012 00:58:38 | SETI@home | [sched_op] Deferring communication for 5 min 3 sec
09/09/2012 00:58:38 | SETI@home | [sched_op] Reason: requested by project
09/09/2012 01:02:17 | SETI@home | Computation for task 27mr10ac.13328.12751.12.10.242_0 finished
09/09/2012 01:02:17 | SETI@home | Starting task 27mr10ac.13328.12751.12.10.230_0 using setiathome_enhanced version 610 (ati13ati) in slot 0
09/09/2012 01:02:21 | SETI@home | Computation for task 27mr10ac.26426.9888.11.10.142_0 finished
09/09/2012 01:02:21 | SETI@home | Starting task 27mr10ac.26426.9888.11.10.159_0 using setiathome_enhanced version 610 (ati13ati) in slot 1
09/09/2012 01:03:44 | SETI@home | [sched_op] Starting scheduler request
09/09/2012 01:03:44 | SETI@home | Sending scheduler request: To fetch work.
09/09/2012 01:03:44 | SETI@home | Reporting 2 completed tasks, requesting new tasks for ATI
09/09/2012 01:03:44 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
09/09/2012 01:03:44 | SETI@home | [sched_op] ATI work request: 139948.88 seconds; 0.00 devices
09/09/2012 01:03:56 | SETI@home | Scheduler request completed: got 35 new tasks
09/09/2012 01:03:56 | SETI@home | [sched_op] Server version 701
09/09/2012 01:03:56 | SETI@home | Project requested delay of 303 seconds
09/09/2012 01:03:56 | SETI@home | [sched_op] estimated total CPU task duration: 0 seconds
09/09/2012 01:03:56 | SETI@home | [sched_op] estimated total ATI task duration: 85399 seconds
09/09/2012 01:03:56 | SETI@home | [sched_op] handle_scheduler_reply(): got ack for task 27mr10ac.13328.12751.12.10.242_0
09/09/2012 01:03:56 | SETI@home | [sched_op] handle_scheduler_reply(): got ack for task 27mr10ac.26426.9888.11.10.142_0
09/09/2012 01:03:56 | SETI@home | [sched_op] Deferring communication for 5 min 3 sec
09/09/2012 01:03:56 | SETI@home | [sched_op] Reason: requested by project
09/09/2012 01:04:33 | SETI@home | Backing off 3 min 26 sec on download of 21my10ae.4121.178.7.10.114
09/09/2012 01:04:33 | SETI@home | Backing off 2 min 55 sec on download of 21my10ae.4121.178.7.10.47
09/09/2012 01:04:35 | | Project communication failed: attempting access to reference site
09/09/2012 01:04:36 | | Internet access OK - project servers may be temporarily down.
09/09/2012 01:05:43 | SETI@home | Backing off 4 min 10 sec on download of 21my10ae.4121.178.7.10.47
09/09/2012 01:05:45 | | Project communication failed: attempting access to reference site
09/09/2012 01:05:46 | | Internet access OK - project servers may be temporarily down.
09/09/2012 01:10:01 | SETI@home | [sched_op] Starting scheduler request
09/09/2012 01:10:01 | SETI@home | Sending scheduler request: To fetch work.
09/09/2012 01:10:01 | SETI@home | Requesting new tasks for CPU and ATI
09/09/2012 01:10:01 | SETI@home | [sched_op] CPU work request: 1036800.00 seconds; 4.00 devices
09/09/2012 01:10:01 | SETI@home | [sched_op] ATI work request: 98171.48 seconds; 0.00 devices
09/09/2012 01:10:17 | SETI@home | Scheduler request completed: got 0 new tasks
09/09/2012 01:10:17 | SETI@home | [sched_op] Server version 701
09/09/2012 01:10:17 | SETI@home | Project has no tasks available
09/09/2012 01:10:17 | SETI@home | Project requested delay of 303 seconds
09/09/2012 01:10:17 | SETI@home | [sched_op] Deferring communication for 5 min 3 sec
09/09/2012 01:10:17 | SETI@home | [sched_op] Reason: requested by project
09/09/2012 01:18:18 | SETI@home | update requested by user
09/09/2012 01:18:22 | SETI@home | [sched_op] Starting scheduler request
09/09/2012 01:18:22 | SETI@home | Sending scheduler request: Requested by user.
09/09/2012 01:18:22 | SETI@home | Requesting new tasks for CPU and ATI
09/09/2012 01:18:22 | SETI@home | [sched_op] CPU work request: 1036800.00 seconds; 4.00 devices
09/09/2012 01:18:22 | SETI@home | [sched_op] ATI work request: 99496.03 seconds; 0.00 devices
09/09/2012 01:18:32 | SETI@home | Scheduler request completed: got 45 new tasks
09/09/2012 01:18:32 | SETI@home | [sched_op] Server version 701
09/09/2012 01:18:32 | SETI@home | Project requested delay of 303 seconds
09/09/2012 01:18:32 | SETI@home | [sched_op] estimated total CPU task duration: 48749 seconds
09/09/2012 01:18:32 | SETI@home | [sched_op] estimated total ATI task duration: 101743 seconds
09/09/2012 01:18:32 | SETI@home | [sched_op] Deferring communication for 5 min 3 sec
09/09/2012 01:18:32 | SETI@home | [sched_op] Reason: requested by project
09/09/2012 01:19:01 | SETI@home | Starting task 14fe12aa.12221.2930.10.10.158_0 using setiathome_enhanced version 603 in slot 2
09/09/2012 01:19:02 | SETI@home | Starting task 25fe12ac.32441.5167.5.10.166_0 using setiathome_enhanced version 603 in slot 3
09/09/2012 01:19:03 | SETI@home | Starting task 14fe12aa.12221.2930.10.10.160_0 using setiathome_enhanced version 603 in slot 4
09/09/2012 01:19:04 | SETI@home | Starting task 21my10ae.4121.2636.7.10.195_0 using setiathome_enhanced version 603 in slot 5
09/09/2012 01:19:05 | SETI@home | Backing off 3 min 5 sec on download of 21my10ae.4121.2636.7.10.169


As one can see, BOINC stopped asking for CPU work when the GPU tasks finished. Why is the finish of a GPU task connected with the inhibition of CPU work requests?
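
To make the pattern easier to spot in a long log, a short Python sketch like the following can list each [sched_op] work request line and flag the ones where nothing was requested. It is only a sketch: the regular expression is an assumption based on the log lines quoted above, the function name is made up, and the four-line sample is copied from the log in this post.

```python
import re

# Matches lines such as:
#   "... | [sched_op] CPU work request: 1036800.00 seconds; 4.00 devices"
# The layout is assumed from the log excerpts in this thread.
REQUEST_RE = re.compile(
    r"^(\S+ \S+) \| .*\[sched_op\] (\w+) work request: "
    r"([\d.]+) seconds; ([\d.]+) devices"
)

def work_requests(log_lines):
    """Yield (timestamp, resource, seconds, devices) for each request line."""
    for line in log_lines:
        m = REQUEST_RE.match(line)
        if m:
            yield m.group(1), m.group(2), float(m.group(3)), float(m.group(4))

log = [
    "09/09/2012 00:58:30 | SETI@home | [sched_op] CPU work request: 1036800.00 seconds; 4.00 devices",
    "09/09/2012 00:58:30 | SETI@home | [sched_op] ATI work request: 139389.34 seconds; 0.00 devices",
    "09/09/2012 01:03:44 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices",
    "09/09/2012 01:03:44 | SETI@home | [sched_op] ATI work request: 139948.88 seconds; 0.00 devices",
]

rows = list(work_requests(log))
for ts, rsc, secs, devs in rows:
    # Flag requests that asked for no work at all for this resource.
    flag = "  <-- nothing requested" if secs == 0.0 else ""
    print(f"{ts}  {rsc:3s}  {secs:>12.2f} s  {devs:.2f} devices{flag}")
```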
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5139
United Kingdom
Message 45600 - Posted: 9 Sep 2012, 8:41:50 UTC - in response to Message 45599.  

What were the values of dt and inc shown in [WFD] (the work fetch debug log)?

08/09/2012 22:36:41 | Milkyway@Home | [work_fetch] CPU: fetch share 0.000 rsc backoff (dt 2289.16, inc 4800.00)

If dt > 0.00, fetch is inhibited for that resource - no point in hammering a server with requests for work when they haven't finished writing the app yet...

Sorry, I don't quite understand this.
1. It's a CPU device, and an app for CPU is present in app_info.
2. On the next request (after another 5 min of idle CPU) BOINC started to make CPU work requests again (and finally got work). So I don't understand how the "no app" example is relevant here.

...

As one can see, BOINC stopped asking for CPU work when the GPU tasks finished. Why is the finish of a GPU task connected with the inhibition of CPU work requests?

It isn't. CPU work request inhibition is caused by

Scheduler request completed: got 0 new [CPU] tasks

With [Work_Fetch_Debug] you would have seen that GPU fetch was also disabled at that point, for the same reason. 'Computation for task 27mr10ac.13328.12751.12.10.242_0 finished' cleared the request backoff for that resource type (only) - it must have been a GPU task, because the replacement task has version 610 and plan_class ati13ati. So, the next work request is for the not-backed-off resource only.

That's all according to design specification. Work fetch backs off when no work is received in response to a request. Where I agree with you is that the duration of the backoff should be related to the reason for the lack of work.
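
The sequence described above can be illustrated with a toy model in Python. This is a simplified sketch of the behaviour as explained in this post, not the BOINC client's actual code: the class and method names are hypothetical, and the 600-second backoff interval is an arbitrary illustrative value (the real client's intervals are not stated in this thread). Times are seconds since midnight, taken from the log timestamps.

```python
class WorkFetchModel:
    """Toy per-resource work-fetch backoff (illustration only)."""

    def __init__(self, resources, backoff_interval):
        self.backoff_interval = backoff_interval  # assumed fixed, in seconds
        self.backoff_until = {r: 0.0 for r in resources}

    def dt(self, resource, now):
        """Seconds of backoff remaining; 0.0 means fetch is allowed."""
        return max(0.0, self.backoff_until[resource] - now)

    def resources_to_request(self, now):
        """Only resources that are not backed off are included in a request."""
        return [r for r in self.backoff_until if self.dt(r, now) == 0.0]

    def on_scheduler_reply(self, now, requested, tasks_by_resource):
        """Back off every requested resource that received no tasks."""
        for r in requested:
            if tasks_by_resource.get(r, 0) == 0:
                self.backoff_until[r] = now + self.backoff_interval

    def on_task_finished(self, resource):
        """A finished task clears the backoff for its own resource type only."""
        self.backoff_until[resource] = 0.0


def hms(h, m, s):
    return h * 3600 + m * 60 + s

model = WorkFetchModel(["CPU", "ATI"], backoff_interval=600)
first = model.resources_to_request(hms(0, 58, 30))    # CPU and ATI requested
model.on_scheduler_reply(hms(0, 58, 38), first, {})   # got 0 new tasks
model.on_task_finished("ATI")                          # 01:02:17, GPU task done
second = model.resources_to_request(hms(1, 3, 44))    # ATI only
model.on_scheduler_reply(hms(1, 3, 56), second, {"ATI": 35})
third = model.resources_to_request(hms(1, 10, 1))     # CPU backoff has expired
print(first, second, third)
```

With these assumptions the model gives the same order of requests as the log: CPU and ATI, then ATI only, then CPU and ATI again.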

My backoff was because MilkyWay haven't finished re-writing the N-Body Simulation application for Windows yet. Another week or two in backoff would have been acceptable.

Your backoff was because the BOINC server design can't hold sufficient tasks to fulfill a work request for 2 days GPU crunching in one go, chooses (preferentially) to supply work for the faster resource, and, I suppose, gives no extra weight to the idle resource when choosing which part of a mixed request to give priority to.


Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.