"Backup" projects frequently returning work units late
Author | Message |
---|---|
Joined: 16 Mar 25 Posts: 4 |
I'm crunching for a number of different projects. These projects are my "primary" projects, and all have resource share of 100:
|
Joined: 7 Dec 24 Posts: 38 |
In reply to William Albert's message of 16 Mar 2025:

> I also don't run large caches -- BOINC is configured to store at most 0.5 days of work for each project (and it seems to only pull a few WUs at a time from the backup projects in any case),

When running more than one project, no cache (or almost no cache) is best. If running projects purely as backup projects, then no cache is best, i.e. "Store at least" 0.05 days (about 72 min) and "Store up to an additional" 0.01 days; that way it should only download a Task just before it finishes processing the current one.

Even if you do run with some sort of cache, you need to set the cache size using the "Store at least" value. The "additional days" value is always best set at 0.01 days. E.g. if you set things to 1 day and 5 additional days, you will get 6 days of work, then it will run down until it drops below 1 day, and then reload back up to 6 days' worth. If you want 6 days' worth, set it to 6 and 0.01 additional days.

And apparently not all projects honour the 0 Resource share setting, so odd things can happen.

Grant Darwin NT. |
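The fill-and-drain pattern described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not BOINC's actual work-fetch logic: the function name `simulate_buffer` is hypothetical, work is assumed to drain at exactly one minute of work per wall-clock minute, and a refill is assumed to happen instantly up to "Store at least" plus "additional days".

```python
def simulate_buffer(min_days, extra_days, sim_days=14):
    """Toy model of a work buffer with a low-water and a high-water mark.

    Assumptions (illustrative, not taken from the BOINC source):
    - the buffer drains at one minute of work per wall-clock minute;
    - when it drops below min_days, it is refilled instantly to
      min_days + extra_days.
    Returns a list of (minute_of_fetch, minutes_of_work_fetched).
    """
    low = round(min_days * 1440)            # low-water mark in minutes
    high = low + round(extra_days * 1440)   # high-water mark in minutes
    buffered = high                         # start with a full buffer
    fetches = []
    for minute in range(1, sim_days * 1440 + 1):
        buffered -= 1
        if buffered < low:
            fetches.append((minute, high - buffered))
            buffered = high
    return fetches


# "1 day + 5 additional days": rare, very large refills.
big = simulate_buffer(1, 5)
# "6 days + 0.01 additional days": frequent, tiny top-ups.
small = simulate_buffer(6, 0.01)

print(len(big), "fetches, largest:", max(a for _, a in big), "minutes")
print(len(small), "fetches, largest:", max(a for _, a in small), "minutes")
```

In this model the first setting fetches roughly five days of work in one go, while the second tops up a few minutes at a time, which is the behaviour the post recommends.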
Joined: 16 Mar 25 Posts: 4 |
I can look into adjusting the cache value, but so far I haven't had a situation where I've requested more work than can be completed by the deadline. Rather, my issue is that BOINC lets work from my backup projects sit in a waiting state for too long while it continuously fetches new work for a primary project.

Likewise, if a project weren't respecting my resource share setting, I'd expect to get work when I don't want it, or to have my worker's capacity weighted toward a particular project. While I have noticed this somewhat with LHC@Home because it's my only project where a WU requests more than one core, I haven't seen this to be a problem more generally, and BOINC sitting on WUs until they're late when there's plenty of computing capacity available seems like an entirely client-side issue. |
Joined: 28 Jun 10 Posts: 2809 |
I have only ever had this problem when mixing projects whose run times differ widely. CPDN tasks often take 3-6 days on my machine. (Still a vast improvement from the days when they could take six months or longer on a slow machine!) Because my preferred work is all either CPDN or ARP tasks from WCG, supply of which is erratic in both cases, I settle for being more interventionist, and when work supply is good from my preferred projects, I turn the others off. |
Joined: 7 Dec 24 Posts: 38 |
How many Tasks for the backup projects have been completed and Validated?

All projects seem to have issues with initial estimates (some are a bit out, others not even close), and it takes 10 completed & validated Tasks for the project server & BOINC Manager to sort out their actual processing rates and so produce accurate estimates of processing time. Significant hardware changes, new applications or major changes to the Tasks being processed for a given application can throw everything out of whack, and it takes time to re-determine the processing rate.

The smaller the cache, the more cores & threads available to BOINC, and the more time BOINC has to actually process work (i.e. Use at most 100% of CPU time, never suspend when non-BOINC usage exceeds any level, never suspend BOINC processing when there is keyboard or mouse input), the sooner that can occur. However, with backup projects, if they very rarely get called upon, and the more projects you have, the processing rate estimates for the backup projects may take months to become accurate.

work_fetch_debug, cpu_sched_debug, priority_debug, rr_simulation and time_debug are all options you can set in the Manager for the Event Log to see what the Manager is doing, and why. But be warned: some of them produce huge amounts of output, and the more projects there are, the more output...

Grant Darwin NT. |
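The same flags can also be set in the `<log_flags>` section of the client's cc_config.xml file. The snippet below is a minimal sketch that only builds the XML text for the five flags named above; the helper name `build_cc_config` is hypothetical, and the sketch deliberately does not write to disk, since an existing cc_config.xml may contain other settings that should be merged by hand.

```python
import xml.etree.ElementTree as ET

# The five Event Log flags mentioned in the post above.
DEBUG_FLAGS = [
    "work_fetch_debug",
    "cpu_sched_debug",
    "priority_debug",
    "rr_simulation",
    "time_debug",
]


def build_cc_config(flags):
    """Return cc_config.xml text with each given log flag set to 1.

    Hypothetical helper for illustration; it returns a string and
    leaves merging with any existing configuration to the user.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


print(build_cc_config(DEBUG_FLAGS))
```

Enabling one or two flags at a time is probably wise, given the warning about output volume.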
Joined: 29 Aug 05 Posts: 15604 |
In reply to William Albert's message of 17 Mar 2025:

> While I have noticed this somewhat with LHC@Home because it's my only project where a WU requests more than one core

Both LHC and Milkyway have multithreaded applications (https://lhcathome.cern.ch/lhcathome/apps.php, https://milkyway.cs.rpi.edu/milkyway/apps.php), which afaik require as many cores to be free as were available when BOINC requested the work from those projects. If those cores aren't free at a later time, that work will wait until they are. |
Joined: 7 Dec 24 Posts: 38 |
In reply to Jord's message of 17 Mar 2025:

However, if the backup project Task(s) are in danger of missing their deadlines, then they should become High Priority, and as many Tasks from the other projects as necessary should be paused to free up the number of cores needed, while the now High Priority Tasks are processed (I would have thought).

Grant Darwin NT. |
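The expectation described above can be written down as a small sketch. It is a simplified illustration of "deadline-endangered Tasks jump the queue", not the BOINC client's real scheduler (which runs a round-robin simulation); the names `Task`, `needs_high_priority` and `pick_running` are hypothetical, and every Task is assumed to need exactly one core.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    remaining_hours: float     # estimated time left to compute
    hours_to_deadline: float   # wall-clock time left before the deadline


def needs_high_priority(task, safety_margin_hours=0.0):
    """True if the Task cannot finish in time unless it runs now."""
    return task.remaining_hours + safety_margin_hours >= task.hours_to_deadline


def pick_running(tasks, cores):
    """Choose which Tasks get a core: endangered Tasks first
    (earliest deadline first), then the rest in their given order."""
    urgent = sorted(
        (t for t in tasks if needs_high_priority(t)),
        key=lambda t: t.hours_to_deadline,
    )
    normal = [t for t in tasks if not needs_high_priority(t)]
    return (urgent + normal)[:cores]


queue = [
    Task("primary-1", remaining_hours=2, hours_to_deadline=100),
    Task("primary-2", remaining_hours=2, hours_to_deadline=100),
    Task("backup-1", remaining_hours=10, hours_to_deadline=8),
]
print([t.name for t in pick_running(queue, cores=2)])
```

With two cores, the endangered backup Task displaces one of the primary Tasks in this model.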
Joined: 16 Mar 25 Posts: 4 |
In reply to Grant (SSSF)'s message of 17 Mar 2025:

> How many Tasks for the backup projects have been completed and Validated?

With the exception of DENIS@Home (which I haven't completed any work for yet due to the project being out of work for an extended period of time), I've completed hundreds of work units minimum for all of my active projects.

Additionally, while I can understand a WU being late if it takes longer to process than estimated, the estimates for the late WUs that I've seen have been reasonably accurate -- BOINC just doesn't resume them in a timely manner. In fact, for the Milkyway@Home WU that I cited as an example in my original post, the WU was still "waiting to run" even though the estimated time to completion was longer than the time remaining before the deadline. It was my understanding that BOINC should have boosted its priority, but that didn't happen until the WU was nearly expired, and the WU ended up being returned hours late (thankfully, I still got credit for it).

Thanks for the debug tips. I'll see about turning on debugging the next time BOINC pulls work from a backup project.

In reply to Jord's message of 17 Mar 2025:

Milkyway@Home has a multithreaded application, but the number of threads a WU can use can be controlled in the project preferences, and I have it set to 1 thread per WU.

In reply to Grant (SSSF)'s message of 17 Mar 2025:

That's what I thought as well, but either this isn't working properly, or something is disrupting the priority boost for some reason.

In reply to Dave's message of 17 Mar 2025:

> I have only ever had this problem when mixing projects whose run times differ widely. cpdn tasks often take 3-6 days on my machine. (Still a vast improvement from the days when they could take six months or longer on a slow machine!)
> Because my preferred work is all either CPDN or ARP tasks from WCG, supply of which is erratic in both cases, I settle for being more interventionist and when work supply is good from my preferred projects, I turn the others off.

If there were a way to tell BOINC "resume work on these WUs" and have those WUs prioritized, then this wouldn't be such an issue, because I could temporarily intervene. However, I'm not aware of any such functionality, and I've resorted in the past to setting my backup projects to "No New Work" and suspending my primary projects to allow the backup project WUs to complete in a timely manner.

I tried to let BOINC do its thing this time around because I've read advice elsewhere that BOINC's scheduler can be disrupted if one tries to micromanage it, but letting it be results in WUs being late. If there were a way to tell BOINC to prioritize WUs in the order in which they were downloaded (rather than whatever BOINC does by default), that would presumably resolve this issue and allow me to just set and forget it. |
Joined: 7 Dec 24 Posts: 38 |
> If there were a way to tell BOINC to prioritize WUs in the order which they were downloaded (rather than whatever BOINC does by default), that would presumably resolve this issue and allow me to just set and forget it.

Actually, that is the default way it processes the Tasks. First in, First out.

With multiple projects and even Resource share values, when BOINC first starts processing work for them, it will download a group of Tasks from each and often start processing the first few from each group (all depending on the number of cores available). All things being equal, eventually it would get to the point that the Tasks are processed in the order they are downloaded. But since all things aren't equal, it doesn't necessarily happen that way all the time.

It should, generally, do them in the order that they are downloaded. But if a Task is received that has a longer or shorter deadline or estimated runtime than that type of Task usually has, then it may get done sooner or later than it otherwise would be.

The change in processing order comes about from different deadlines, Resource share settings, and changing processing rate values. With some applications and data, the processing time for a given application and Task is pretty much stable. For others, Tasks can take more than twice as long, or finish in half the time, compared with the average running time for most Tasks. Some projects have much more efficient applications than others.

And while a Task for a project that is set as a backup project will have the lowest priority, even if it was downloaded earlier, it should still be completed before its deadline.

Grant Darwin NT. |
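To make the difference between the two orderings discussed here concrete, the following sketch compares a pure first-in-first-out order with an earliest-deadline-first order on the same set of Tasks. It is an illustration only and does not reproduce how the BOINC client weighs deadlines, Resource share and runtime estimates; the names `Task`, `fifo_order` and `deadline_order` and all the numbers are hypothetical.

```python
from typing import NamedTuple


class Task(NamedTuple):
    name: str
    received: int           # download order (1 = downloaded first)
    deadline_hours: float   # hours until the deadline


def fifo_order(tasks):
    """Pure first in, first out: sort by download order only."""
    return [t.name for t in sorted(tasks, key=lambda t: t.received)]


def deadline_order(tasks):
    """Earliest deadline first, falling back to download order on ties."""
    ordered = sorted(tasks, key=lambda t: (t.deadline_hours, t.received))
    return [t.name for t in ordered]


mixed = [
    Task("a", received=1, deadline_hours=240),
    Task("b", received=2, deadline_hours=24),
    Task("c", received=3, deadline_hours=72),
]
uniform = [t._replace(deadline_hours=72) for t in mixed]

print(fifo_order(mixed), deadline_order(mixed))      # orders differ
print(fifo_order(uniform), deadline_order(uniform))  # orders agree
```

In this toy model the two orders coincide only when every Task has the same deadline, and diverge as soon as deadlines differ.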
Joined: 28 Jun 10 Posts: 2809 |
The only other thing to add is that if you want BOINC to work things out, don't constantly change settings, as each time you do, it has to start again. Or, if you spend lots of time on your computer, you can keep an eye on it and micromanage it. |
Joined: 16 Mar 25 Posts: 4 |
In reply to Grant (SSSF)'s message of 17 Mar 2025:

> If there were a way to tell BOINC to prioritize WUs in the order which they were downloaded (rather than whatever BOINC does by default), that would presumably resolve this issue and allow me to just set and forget it.
> Actually, that is the default way it processes the Tasks. First in, First out.

With respect, it objectively doesn't unless you're only doing work for a single project where all WUs have the same length deadline, and the rest of your post even details the algorithm BOINC uses to determine how to schedule WUs.

In reply to Grant (SSSF)'s message of 17 Mar 2025:

> And while a Task for a project that is set as a backup project will have the lowest priority, even if it was downloaded earlier, it should still be completed before its deadline.

Unfortunately, this doesn't seem to work as expected. |
Joined: 7 Dec 24 Posts: 38 |
> With respect, it objectively doesn't unless you're only doing work for a single project where all WUs have the same length deadline, and the rest of your post even details the algorithm BOINC uses to determine how to schedule WUs.

That doesn't change the fact that the default is first in, first out, regardless of the number of projects you might have. But Resource share settings, application runtimes and differing deadlines do result in non-first-in-first-out processing in order to meet those competing requirements.

Grant Darwin NT. |
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.