Thread 'A thought about aborting WUs'

Message boards : Server programs : A thought about aborting WUs

Ananas
Joined: 27 Jun 06
Posts: 305
Germany
Message 13676 - Posted: 9 Nov 2007, 3:48:34 UTC
Last modified: 9 Nov 2007, 3:50:27 UTC

On some projects with inhomogeneous WU types, I sometimes see people abort the WUs that are known to have a slightly worse time-to-credit ratio than others.

"Picking out the raisins" is what we call that.

I think a credit penalty for aborted WUs that were never really attempted (CPU time of less than a minute or so) would be a good idea.

I'm aware that there are some real reasons to abort a WU, like a known faulty batch - but for such a batch the admin could set a "no malus" flag or abort it on the server side, either of which could be taken as a sign for "no penalty".
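
The proposed rule could be sketched roughly as follows. This is only an illustration of the idea in this post: the struct, its field names, and the 60-second threshold are hypothetical and do not correspond to anything in the BOINC server code.

```cpp
// Hypothetical record of one aborted task; none of these names exist in BOINC.
struct AbortedTask {
    double cpu_seconds;      // CPU time spent before the abort
    bool server_side_abort;  // the project itself cancelled the task
    bool batch_no_penalty;   // admin set a "no malus" flag on the batch
};

// Assumed threshold: "less than a minute or so" of CPU time.
constexpr double kMinTriedSeconds = 60.0;

// Returns true if the abort would attract the proposed credit penalty.
bool should_penalize(const AbortedTask& t) {
    // Justified aborts (faulty batch, server-side cancel) are exempt.
    if (t.server_side_abort || t.batch_no_penalty) {
        return false;
    }
    // Only tasks that were never really attempted are penalized.
    return t.cpu_seconds < kMinTriedSeconds;
}
```

With this sketch, a task aborted after 5 seconds would be penalized, while the same task in a flagged batch, or one aborted after an hour of crunching, would not.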


(now waiting for the stones to be thrown at me)
ID: 13676
Les Bayliss
Help desk expert
Joined: 25 Nov 05
Posts: 1654
Australia
Message 13677 - Posted: 9 Nov 2007, 4:32:56 UTC

Ha, no stones, mate.

It's suspected that this has happened numerous times on the BBC climate project ("I didn't like the look of the temperatures, so I aborted the model"), as well as on the main site after some model batches turned out to be noticeably slower.
Some sort of penalty would serve them right.

ID: 13677
Keck_Komputers
Joined: 29 Aug 05
Posts: 304
United States
Message 13680 - Posted: 9 Nov 2007, 7:05:19 UTC

There is already a penalty for aborted tasks, whether they are aborted manually or by the server. In either case the daily quota is reduced by 1 per abort.
BOINC WIKI

BOINCing since 2002/12/8
ID: 13680
Odd-Rod
Joined: 29 Oct 07
Posts: 13
South Africa
Message 13682 - Posted: 9 Nov 2007, 7:29:53 UTC - in response to Message 13676.  


I think a credit penalty for aborted WUs that were never really attempted (CPU time of less than a minute or so) would be a good idea.

I'm aware that there are some real reasons to abort a WU, like a known faulty batch - but for such a batch the admin could set a "no malus" flag or abort it on the server side, either of which could be taken as a sign for "no penalty".


(now waiting for the stones to be thrown at me)


A good idea in principle, but it must be carefully applied to not mistakenly penalize "justified aborts".
I trust that pebble was tossed gently enough to not hurt? ;)

As per Keck_Komputers' post:
There is a penalty for aborting tasks either manually or from server-side aborts. In either case the daily quota is reduced by 1 per abort.


Perhaps the quota reduction could be made larger for "unjustified aborts"? That way, even a wrongly applied penalty would not be too severe.

And one last question - how soon does the daily quota increase again?
Regards
Rod
ID: 13682
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15561
Netherlands
Message 13683 - Posted: 9 Nov 2007, 7:52:05 UTC - in response to Message 13682.  

And one last question - how soon does the daily quota increase again?

It doubles with each correctly returned & reported task. 1, 2, 4, 8, 16 etc.
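
The two rules mentioned so far in this thread (minus 1 per abort, doubling per valid result) can be sketched as below. The floor of 1, the cap at a project-defined maximum, and the function names are assumptions made for illustration; this is not the scheduler's actual code.

```cpp
#include <algorithm>

// Assumed lower bound so a host can always fetch at least one task per day.
constexpr int kMinQuota = 1;

// Quota after one aborted (or otherwise failed) task: reduced by 1.
int quota_after_abort(int quota) {
    return std::max(kMinQuota, quota - 1);
}

// Quota after one correctly returned and reported task: doubled,
// but never above the project's maximum (max_quota is an assumed parameter).
int quota_after_success(int quota, int max_quota) {
    // Compare against max_quota / 2 first to avoid integer overflow.
    if (quota > max_quota / 2) {
        return max_quota;
    }
    return quota * 2;
}
```

Starting from a quota of 1 with a maximum of 100, successive valid results would give 2, 4, 8, 16 and so on until the cap is reached.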
ID: 13683
Ananas
Joined: 27 Jun 06
Posts: 305
Germany
Message 13684 - Posted: 9 Nov 2007, 8:38:40 UTC
Last modified: 9 Nov 2007, 8:50:02 UTC

If people abort the long ones and crunch the short ones, the quota penalty doesn't really hurt. Especially in projects with various WU durations, they need a high quota anyway because there might be a series of short ones.
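
A small simulation makes the point concrete. It assumes the rules described earlier in the thread (minus 1 per abort, doubling per valid result) plus a floor of 1 and a cap at `max_quota`; the function name and parameters are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Replays a sequence of task outcomes and records the quota after each one.
// outcomes[i] == true  : task returned and validated (quota doubles, capped)
// outcomes[i] == false : task aborted (quota drops by 1, assumed floor of 1)
std::vector<int> simulate_quota(const std::vector<bool>& outcomes,
                                int start_quota, int max_quota) {
    std::vector<int> history;
    int q = start_quota;
    for (bool ok : outcomes) {
        if (ok) {
            // Use 64-bit arithmetic so doubling cannot overflow.
            q = static_cast<int>(std::min<long long>(max_quota, 2LL * q));
        } else {
            q = std::max(1, q - 1);
        }
        history.push_back(q);
    }
    return history;
}
```

Under these assumptions, a host at a cap of 100 that aborts three tasks and completes two short ones in between is back at 100 after the last valid result, so the penalty barely registers.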

If someone really manages to bring his quota down to 0, he can just detach, re-attach and merge (well, as long as "merge" isn't broken again *g).

There are other ways to increase the quota too (minor patch, did that myself when Akos' Einstein client crunched faster than the default quota allowed). Hasn't the number of accepted CPUs just lately been increased to eight? ;-)


p.s.: One has to keep an eye on expired WUs as well. Otherwise, instead of aborting, people could just suspend the WUs and wait for the BOINC client to abort them after the deadline. As expired (finished) WUs usually get deleted after some time, that would not be handled in the same way as an abort: the server will not accept them anymore, so they wouldn't count as aborted ones.


p.p.s.: The pebble even missed as I already had mentioned justified aborts in the thread starter (those with the "real reasons") :-)
ID: 13684
Nicolas
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 13689 - Posted: 9 Nov 2007, 13:25:11 UTC - in response to Message 13684.  

Hasn't the number of accepted CPUs just lately been increased to eight?

5 months ago, in changeset 12771.
ID: 13689
Ananas
Joined: 27 Jun 06
Posts: 305
Germany
Message 13694 - Posted: 9 Nov 2007, 15:22:24 UTC - in response to Message 13689.  
Last modified: 9 Nov 2007, 15:23:55 UTC

Hasn't the number of accepted CPUs just lately been increased to eight?

5 months ago, in changeset 12771.


Yep, that's what I mean - the BOINC server has to trust the number of CPUs reported by the client, it has no chance to check it. If a host reports more CPUs than it actually has, that results in a much higher total quota for that box.
ID: 13694
Nicolas
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 13700 - Posted: 9 Nov 2007, 16:22:06 UTC - in response to Message 13694.  

Yep, that's what I mean - the BOINC server has to trust the number of CPUs reported by the client, it has no chance to check it. If a host reports more CPUs than it actually has, that results in a much higher total quota for that box.

If you have a quad-core and make it report itself as an 8-core, you get double the quota limit. But I managed to get a practically infinite quota using a different trick (which I won't post here).

The server seems to trust the client more than it should, but I can't really think of a way to avoid it.
ID: 13700
mo.v
Joined: 13 Aug 06
Posts: 778
United Kingdom
Message 13748 - Posted: 10 Nov 2007, 23:01:38 UTC

So how do I convince a project server that my old computer has 8 cores, each 125MHz?
ID: 13748
Nicolas
Joined: 19 Jan 07
Posts: 1179
Argentina
Message 13750 - Posted: 10 Nov 2007, 23:28:11 UTC - in response to Message 13748.  

So how do I convince a project server that my old computer has 8 cores, each 125MHz?

I successfully faked a scheduler request (sending it from a script, outside of BOINC) with 2147483648 cores, and the host showed up like that on the project webpage.
ID: 13750
Ananas
Joined: 27 Jun 06
Posts: 305
Germany
Message 13752 - Posted: 10 Nov 2007, 23:39:34 UTC - in response to Message 13748.  
Last modified: 10 Nov 2007, 23:43:13 UTC

So how do I convince a project server that my old computer has 8 cores, each 125MHz?


sched_send.c :

const int MIN_SECONDS_TO_SEND = 0;
const int MAX_SECONDS_TO_SEND = (28*SECONDS_IN_DAY);
const int MAX_CPUS = 8;    // <===== this is the line in question
    // max multiplier for daily_result_quota;
    // need to change as multicore processors expand


Do you plan to crunch 8 Climate models on 125MHz each?

@Nicolas: The value is stored in the database exactly as the PC reports it, but the server-side scheduler still limits it to that MAX_CPUS constant.
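
A rough sketch of what that clamping amounts to is shown below. Only the `MAX_CPUS` constant and its role as "max multiplier for daily_result_quota" come from the snippet quoted above; the function name, the lower bound of 1, and the exact formula are assumptions, not the scheduler's real code.

```cpp
#include <algorithm>
#include <cstdint>

constexpr int MAX_CPUS = 8;  // value from the sched_send.c excerpt above

// Assumed formula: per-CPU daily quota times the clamped CPU count.
// reported_ncpus is 64-bit so that absurd client-supplied values
// (for example 2147483648, which does not fit in a 32-bit int) are representable.
std::int64_t effective_daily_quota(std::int64_t reported_ncpus,
                                   int daily_result_quota) {
    // Clamp whatever the client claims into the range [1, MAX_CPUS].
    std::int64_t cpus = std::max<std::int64_t>(
        1, std::min<std::int64_t>(reported_ncpus, MAX_CPUS));
    return cpus * daily_result_quota;
}
```

With a per-CPU quota of 100, a real quad-core would get 400, a quad-core claiming 8 cores would get 800, and a host claiming 12 or 2147483648 cores would still get only 800.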
ID: 13752
Ananas
Joined: 27 Jun 06
Posts: 305
Germany
Message 13762 - Posted: 11 Nov 2007, 18:43:48 UTC - in response to Message 13760.  
Last modified: 11 Nov 2007, 18:50:34 UTC

So how do I convince a project server that my old computer has 8 cores, each 125MHz?

I successfully faked a scheduler request (sending it from a script, outside of BOINC) with 2147483648 cores, and the host showed up like that on the project webpage.

Thanks for sharing. Would that succeed at WCG as well?


If you manage to find that information about my host (I'm not so familiar with WCG yet), you should see an Athlon MP2600+ with 12 CPUs - it has only 2 of course.

well, I found it : http://boincstats.com/stats/host_graph.php?pr=wcg&id=236800

CPU AMD Athlon(tm) MP 2600+
Number of CPU's (number of (virtual) cores) 12
ID: 13762

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.