Panic mode needs to start earlier or something

Profile Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2533
United Kingdom
Message 101540 - Posted: 10 Nov 2020, 13:02:48 UTC - in response to Message 101538.  

It may mean some deadlines being missed in the short term, but won't BOINC learn how long things actually take if you give it its head and ignore it for long enough?
ID: 101540
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5080
United Kingdom
Message 101541 - Posted: 10 Nov 2020, 14:16:00 UTC - in response to Message 101540.  

... won't BOINC learn how long things actually take if you give it its head and ignore it for long enough?
It depends. In principle, yes, but the definition of 'long enough' is problematic.

In the early years of BOINC (up until 2010), BOINC used a value called DCF ('duration correction factor'). That was carefully constructed to protect deadlines. If even a single task over-ran (and finished normally, to prove it wasn't stuck in a loop), then the estimates were updated instantly - on the client, so even tasks already downloaded and waiting to run were affected. One task might be wasted, but BOINC would 'know' that all the rest should go into panic mode earlier.
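
As a rough illustration of that behaviour, here is a minimal Python sketch. It is not BOINC's actual client code: the function names, the slow downward drift, and the 0.1 weight are assumptions made for the example. The only part taken from the description above is that an over-run pushes the factor up immediately.

```python
def update_dcf(dcf, estimated_seconds, actual_seconds, decay_weight=0.1):
    """Return a new duration correction factor after one task finishes normally.

    estimated_seconds is the raw (uncorrected) runtime estimate.
    decay_weight is a hypothetical value chosen for illustration.
    """
    ratio = actual_seconds / estimated_seconds
    if ratio > dcf:
        # The task over-ran the corrected estimate: jump straight up,
        # so every waiting task is re-estimated at once.
        return ratio
    # The task was quicker than expected: drift down slowly
    # (this cautious decrease is an assumption of the sketch).
    return (1 - decay_weight) * dcf + decay_weight * ratio


def corrected_estimate(dcf, estimated_seconds):
    """Runtime estimate the client would schedule with, given the current factor."""
    return dcf * estimated_seconds
```

With these assumptions, a single task that takes 2 hours against a 1 hour estimate moves the factor from 1.0 to 2.0, and every other queued task with a 1 hour estimate is then treated as a 2 hour task.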

But that method has problems. It's very difficult for projects to keep track of varying applications with varying speeds through a single value. So with multiplying projects, multiplying applications, and multiplying devices (GPUs of differing speeds), things needed to change. David, in his infinite wisdom (ahem!), decided to take all the calculations back onto the server, and work with slowly-changing averages instead of sudden jumps. The easiest way of tracking these is in the 'application details' shown for each of your computers at each project's website - look for APR ('average processing rate').

There are multiple problems with the APR approach.
1) The initial values for a new application, or a new computer, are badly designed and can be seriously wrong.
2) The 'averaging' approach makes it slow to adjust, even once it's got started.
3) Only runtimes from validated tasks are considered, so if an over-running task is disqualified for lateness, it's difficult to get back in range.
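
Points 2 and 3 can be sketched in Python as below. This is a simplified illustration, not the real server code: the function names, the exponential weighting, and the 0.01 weight are all assumptions made for the example.

```python
def update_apr(apr, flops_done, elapsed_seconds, validated, weight=0.01):
    """Return a new average processing rate (flops per second) after one task.

    weight is a hypothetical smoothing value chosen for illustration.
    """
    if not validated:
        # Point 3: a task that was not validated (for example, one
        # disqualified for lateness) contributes nothing to the average.
        return apr
    observed_rate = flops_done / elapsed_seconds
    # Point 2: each validated task only nudges the average a little.
    return (1 - weight) * apr + weight * observed_rate


def estimated_runtime(apr, flops_estimate):
    """Runtime estimate derived from the current average processing rate."""
    return flops_estimate / apr
```

Under these assumptions, a host with an APR of 10 that returns one validated task at half that speed only moves to 9.95, so the runtime estimate barely changes, and a late, unvalidated task does not move it at all.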

So, "How long is enough?" depends on each separate project, and how it manages its estimates.

Some mainstream BOINC projects (Einstein, GPUGrid) still use DCF, and will react instantly.
Most other mainstream BOINC projects will use APR.
And WCG isn't a mainstream BOINC project - they run their own server code, and do things their own way. I haven't studied it: you'll have to ask the folks over there.
ID: 101541
Profile Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2533
United Kingdom
Message 101544 - Posted: 10 Nov 2020, 16:16:34 UTC

I think I saw that when I joined PrimeGrid. Something very strange happened. I have a 0+3 hour buffer (0+0.13 days), yet PrimeGrid gave me enough tasks to run for 2 weeks, as predicted by their estimated run times! So even if those estimates were faulty, that doesn't explain why I got 2 weeks' worth. What did it use to decide I could do all those in 3 hours? In fact they ended up taking 1 week. So: I ask for 3 hours, I get an estimated 2 weeks, which takes 1 week. Any idea what happened there? Is there another time estimate I can't see?

Worth checking out the PrimeGrid fora on this. I questioned the estimates after receiving tasks that were not going to finish on time on my new Ryzen 7. If a Ryzen 7 can't finish a task on time, there is clearly something wrong. I enquired about this and was told that the deadline would be extended as it got closer. I never actually confirmed this: so little of the task was completed that I aborted it when new batches of work for CPDN came along.

It's Africa Rainfall Project over at WCG I had the problem with. I don't know why, but they give 1 week deadlines for 3/4s of the Africa tasks, but 1/4 of them only have 3.5 day deadlines. Maybe they're ones that have been sent out again and they're more in a hurry?
I ran quite a few ARP tasks from WCG when there was nothing from CPDN. The ones I got with short deadlines were all resends, shown by a number n>0 at the end of the task name. In theory, these are sent to computers that are known to return results quickly. This is important for ARP as the project aims at providing real time information to farmers there.
ID: 101544
Profile Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2533
United Kingdom
Message 101546 - Posted: 10 Nov 2020, 18:10:02 UTC

I wasn't aware it was real time; I thought they were developing better forecasting models. I can't see BOINC being fast enough for real-time weather forecasting: the tasks have a 1 week deadline, and weather forecasting has to be done on the day, surely? This is what their main page says:


Clearly I got that wrong, possibly by reading something by someone else who got it wrong.
ID: 101546


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.