BOINC's policy for uploads retry

Raistmer
Joined: 9 Apr 06
Posts: 302
Message 25936 - Posted: 10 Jul 2009, 11:12:09 UTC

Is there any reason why BOINC retries result uploads on a per-task basis (each result has its own timer) instead of on a per-project basis?
On fast hosts this leads to constant attempts to connect to the server, which only makes the current situation with the SETI project worse...
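
To make the concern concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a fixed retry interval and one retry timer per pending result (per-task) or one timer for the whole project (per-project). The function name and the numbers are purely illustrative and are not taken from the BOINC client, which applies its own backoff rules.

```python
def attempts_during_outage(pending_results, outage_s, retry_interval_s, per_project):
    """Count failed connection attempts one host makes while the server is down.

    Assumption: every timer fires at a fixed interval (hypothetical simplification).
    """
    # Per-project policy: a single shared timer, provided anything is pending at all.
    # Per-task policy: one independent timer for every pending result.
    timers = min(pending_results, 1) if per_project else pending_results
    retries_per_timer = outage_s // retry_interval_s
    return timers * retries_per_timer


# A fast host with 200 finished results, a one-hour outage, a retry every minute.
per_task = attempts_during_outage(200, 3600, 60, per_project=False)
per_project = attempts_during_outage(200, 3600, 60, per_project=True)
print(per_task, per_project)  # 12000 60
```

Under these assumptions the per-task policy produces as many times more connection attempts as there are pending results.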

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 25938 - Posted: 10 Jul 2009, 11:18:25 UTC - in response to Message 25936.  

I'll assume it has to do with the whole first in/first out policy that tasks run in.

Raistmer
Joined: 9 Apr 06
Posts: 302
Message 25939 - Posted: 10 Jul 2009, 11:24:15 UTC - in response to Message 25938.  

I'll assume it has to do with the whole first in/first out policy that tasks run in.

But it does not report tasks according to this policy.
Tasks are reported (i.e. their results become available for further scientific analysis) in batches, and that process is governed by a per-project retry policy.
So I still don't see any reason to retry result uploads on a per-task basis.
This leads to excessive server load and bandwidth consumption, which is becoming very critical for the SETI project now.


Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 25940 - Posted: 10 Jul 2009, 11:40:04 UTC - in response to Message 25939.  

So? What else should they do?

Raistmer
Joined: 9 Apr 06
Posts: 302
Message 25942 - Posted: 10 Jul 2009, 11:44:51 UTC - in response to Message 25940.  
Last modified: 10 Jul 2009, 11:45:17 UTC

So? What else should they do?

They should retry result uploads based on a per-project policy, not a per-task policy.
Actually, this BOINC flaw was pointed out by Richard Haselgrove back in 2005 (!): http://setiathome.berkeley.edu/forum_thread.php?id=24612&nowrap=true#209283
There has been no reaction from the BOINC dev crew since. The flaw is still there,
and it is still contributing to SETI's server problems.

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 25943 - Posted: 10 Jul 2009, 12:06:14 UTC - in response to Message 25942.  
Last modified: 10 Jul 2009, 12:06:59 UTC

Ah, you mean this part:

In the long term, this translates to a BOINC problem, not a Seti problem (contrary to what some people have been asserting in the BOINC fora). If the BOINC upload algorithm could be modified so that only one WU per project could be in the 'uploading' or 'retry [upload] in hh:mm:ss' states, and others could go into a new 'queued' state until the first one finishes, then a minor or temporary glitch wouldn't grow out of control like this one appears to have done.
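
The proposal quoted above could be sketched roughly as follows. This is only an illustration in Python, not BOINC client code: the class and method names are hypothetical, and the state names 'uploading', 'retry' and 'queued' are borrowed from the quote.

```python
from collections import deque


class ProjectUploadQueue:
    """At most one result per project is 'uploading' or in 'retry'; the rest are 'queued'."""

    def __init__(self):
        self.active = None         # the single result allowed to contact the server
        self.active_state = None   # "uploading" or "retry"
        self.queued = deque()      # results waiting their turn, in FIFO order

    def add(self, result_name):
        if self.active is None:
            self.active, self.active_state = result_name, "uploading"
        else:
            self.queued.append(result_name)

    def on_upload_failed(self):
        # Only the active result starts a retry timer; queued results stay silent.
        if self.active is not None:
            self.active_state = "retry"

    def on_upload_finished(self):
        # Promote the next queued result, if there is one.
        if self.queued:
            self.active, self.active_state = self.queued.popleft(), "uploading"
        else:
            self.active, self.active_state = None, None

    def states(self):
        result = {}
        if self.active is not None:
            result[self.active] = self.active_state
        for name in self.queued:
            result[name] = "queued"
        return result


queue = ProjectUploadQueue()
for name in ("result_1", "result_2", "result_3"):
    queue.add(name)
queue.on_upload_failed()
print(queue.states())
# {'result_1': 'retry', 'result_2': 'queued', 'result_3': 'queued'}
```

With this arrangement a server glitch costs one retry timer per project on each host, however many results are waiting.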


Otherwise it isn't too clear what you mean. A per-project upload could mean batching everything together and trying to send one big (compressed) file; it could also mean that BOINC has to hold its uploads to other projects until the first project is done uploading everything... not really useful, that.

As for the problems Seti has...
[own opinion]
I still consider that a project problem (but who am I?). If they have trouble getting work in or out, they should find a fix for their own (self-produced) problems, not impair the way BOINC works.

They can easily stop accepting new users, send out less work, send out longer running work, no longer send out CUDA work, or minimize the amount of work going out to each host, until they have a bigger pipeline coming back into the project.

Other projects with even less donation money than Seti can survive on slower connections. Let Seti then learn to use its resources more sparingly than it has been doing thus far.

BOINC <> Seti. Or, in C, BOINC != Seti.
[/own opinion]

You can always file it as an enhancement request in Trac. Make sure to clearly explain what you want, though.

Gundolf Jahn
Joined: 20 Dec 07
Posts: 1069
Germany
Message 25944 - Posted: 10 Jul 2009, 12:17:38 UTC - in response to Message 25943.  

Otherwise it isn't too clear what you mean. A per-project upload could mean batching everything together and trying to send one big (compressed) file; it could also mean that BOINC has to hold its uploads to other projects until the first project is done uploading everything... not really useful, that...

No, it was quite clear (at least to me:-), as the topic was upload retries, not uploads in general.

And I must agree that the exponential backoff doesn't work well when several hundred uploads are waiting and the retry interval resets to one minute after a few four-hour intervals (I think after ten retries in total, when the master file is refetched).
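
A minimal sketch of the schedule described above, assuming the delay doubles from one minute, is capped at four hours, and starts over after ten retries. These numbers come from recollection in this thread, not from the client source, and the function name and parameters are hypothetical. A real client may also randomize the delays.

```python
def retry_delay(retry_count, min_delay=60, max_delay=4 * 3600, reset_after=10):
    """Delay in seconds before the next retry (assumed doubling with a periodic reset)."""
    step = retry_count % reset_after          # the cycle starts over after `reset_after` retries
    return min(min_delay * 2 ** step, max_delay)


schedule = [retry_delay(n) for n in range(12)]
print(schedule)
# [60, 120, 240, 480, 960, 1920, 3840, 7680, 14400, 14400, 60, 120]
```

If every one of several hundred pending uploads follows this schedule on its own timer, each reset to one minute brings back a burst of connection attempts from a single host.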

Regards,
Gundolf
Computers aren't everything in life. (Just kidding)

Raistmer
Joined: 9 Apr 06
Posts: 302
Message 25945 - Posted: 10 Jul 2009, 12:41:43 UTC - in response to Message 25943.  
Last modified: 10 Jul 2009, 12:59:13 UTC

[own opinion]
I still consider that a project problem (but who am I?). If they have trouble getting work in or out, they should find a fix for their own (self-produced) problems, not impair the way BOINC works.

They can easily stop accepting new users, send out less work, send out longer running work, no longer send out CUDA work, or minimize the amount of work going out to each host, until they have a bigger pipeline coming back into the project.

Other projects with even less donation money than Seti can survive on slower connections. Let Seti then learn to use its resources more sparingly than it has been doing thus far.

BOINC <> Seti. Or, in C, BOINC != Seti.
[/own opinion]

You can always file it as an enhancement request in Trac. Make sure to clearly explain what you want, though.


OK, I will open a ticket. I just wanted to be sure I hadn't missed some obvious reason for this BOINC behavior.

As for your opinion: I definitely disagree ;)
BOINC, not the science project, is responsible for data transfers, and inefficiency in BOINC will hurt all projects to a greater or lesser degree.
If some projects have fewer users, well, they simply have not yet grown enough to be affected by this inefficiency. But they will be eventually. Look at the root of the problem, not at its manifestation.

And as for BOINC != SETI: well, actually BOINC is just SETI++ (in C++ terms) ;)
SETI was its parent and surely remains the biggest and [own opinion]most important[/own opinion] of all the participating projects ;)

EDIT:
http://boinc.berkeley.edu/trac/ticket/932

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15480
Netherlands
Message 25946 - Posted: 10 Jul 2009, 13:10:48 UTC - in response to Message 25945.  

BOINC, not the science project, is responsible for data transfers, and inefficiency in BOINC will hurt all projects to a greater or lesser degree.

The science project is responsible for the amount of work that any host attached to it can get at any given time. When the project knows it is having trouble getting the work back, you'd expect it to react automatically by giving out less work, or none at all, until the immediate problem is fixed.

It's like the police shutting down a highway when a major accident is blocking all lanes. Traffic at the back is diverted onto another route so that fewer vehicles get stuck in the queue. They'll let traffic flow freely again once the wreckage has been cleared and the resulting queue is moving again.

Raistmer
Joined: 9 Apr 06
Posts: 302
Message 25948 - Posted: 10 Jul 2009, 13:18:29 UTC - in response to Message 25946.  
Last modified: 10 Jul 2009, 13:19:23 UTC

This kind of issue often can't be predicted.
Yes, they can do (and are doing) what they can to resolve the situation.
But that doesn't cancel out the BOINC inefficiency that has been pointed out.
Why try to point the finger elsewhere??
Sure, if a project has a gigabit connection and cuts off 9/10 of its users, it will never run into this problem. So what? Will BOINC become more efficient in that situation? Or will it just look efficient? IMHO the latter.

Raistmer
Joined: 9 Apr 06
Posts: 302
Message 25956 - Posted: 10 Jul 2009, 18:05:27 UTC

Fixed, ticket closed.
Will await the new BOINC release with this fix.

rtX
Joined: 6 May 06
Posts: 33
United Kingdom
Message 26014 - Posted: 15 Jul 2009, 10:57:10 UTC

The manner in which BOINC currently handles data transfer to/from the servers is deficient. It serves no-one to simply say it is down to a problem with the server or project management. Servers have problems and will encounter these sorts of issues and the client should help in resolving things, not exacerbate matters.

For example, currently BOINC polls any attached project for GPU work even if there is no GPU work provided by the project. That could be stopped simply by the client being configurable on a per project basis not to request GPU work. The manner in which the client attempts to attach to servers when the server is overloaded inevitably leads to the attempted connection being treated as a DoS attack.

As the BOINC project controls development of both the client and the server software, there has to be a much better way of handling this communication. Even if the server were simply to respond on a per-project basis with "overloaded - give me x hours", it would be better than individual WUs continually retrying on their own. It is also a shame that a whole batch of WUs cannot be sent together.
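
As an illustration only, a client honoring such a per-project "overloaded" reply might look like the Python sketch below. The reply itself, the `retry_after_hours` parameter and the class name are all hypothetical; nothing here is an existing BOINC API.

```python
class ProjectBackoff:
    """One shared 'do not contact before' time for all uploads of a project."""

    def __init__(self):
        self.not_before = 0.0  # earliest time (in seconds) at which any upload may be tried

    def on_server_overloaded(self, now, retry_after_hours):
        # Hypothetical server reply: "overloaded - give me x hours".
        self.not_before = max(self.not_before, now + retry_after_hours * 3600)

    def may_contact(self, now):
        return now >= self.not_before


backoff = ProjectBackoff()
backoff.on_server_overloaded(now=0.0, retry_after_hours=2)

pending = ["result_1", "result_2", "result_3"]
# Ten minutes later none of the pending results is allowed to retry.
attempts = [name for name in pending if backoff.may_contact(now=600.0)]
print(attempts)  # []
```

The point of the design is that a single reply from the server silences every pending upload for that project, instead of each WU discovering the overload by itself.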

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.