Message boards : Projects : Is fully simultaneous BOINC possible?
Author | Message |
---|---|
Send message Joined: 31 Oct 12 Posts: 2 |
Hello, I'm an experienced programmer but new to BOINC, and I'm excited about possibly using it as the distributed computing platform for a problem I'm working on. I'm currently planning a project that by nature will need each client to download a large file (~1 GB) and then process it continually, while at the same time communicating with other peers that received "related" files so they can share some information. The actual processing can be done on relatively average hardware, but the network connectivity would require about 10 connections per client (to other peers). The network traffic would be significant (about 500 kB/s). So from the start I realize that only individuals with fast internet connections would be able to participate, but this shouldn't rule out everyone. On top of this, because of the system-wide nature of the problem, I can only compute if all files are being processed at the SAME time. This is because it should really be done on one gigantic computer with shared memory. Although that's not realistic, it is possible to split the problem into chunks that each person can process with a small amount of peer data that relates neighboring chunks. Here's the catch, though: if one person turns off their computer or decides to quit, the entire network would have to pause, their file would need to be sent to someone else (taking a max of ~30 minutes), and only then could processing continue across the entire network. This is only because one person needs to share just a bit of their data with 10 others, and those 10 others need to share theirs, and so on, creating a network-wide dependency. As you can see, this is not the typical "submit jobs to clients, wait for results, aggregate and save" workflow. This obstacle could easily be overcome with a dedicated compute cluster, but I think it would be more economical and interesting to bring in the masses. 
So basically, now that you know the context, how can I perform fully simultaneous calculations in the BOINC network efficiently? I feel like this dependency would make the network pause all the time, because I assume the probability of someone coming back to or turning off their computer at any one point is pretty high with thousands of clients. Have I misunderstood what BOINC is capable of? Has anyone else run into similar problems? I really just need some kind of dedicated power users who will guarantee a minimum amount of time they can be there, so I can plan around it. Any high-level ideas are welcome, because this is really just on paper right now. (Sorry for the wall of text.) |
Send message Joined: 31 Aug 09 Posts: 11 |
Hi! The Volpex project is trying to achieve something similar; check out their site: http://volpex.cs.uh.edu/VCP/ http://volpex.cs.uh.edu/VCP/forum_forum.php?id=2 I crunch for Ukraine |
Send message Joined: 10 Sep 05 Posts: 726 |
1) you might want to post this on the boinc_projects email list: http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_projects 2) Volpex addresses some of your issues, but it's a student project and is probably not ready for production use. 3) If the computation is synchronous (i.e. a node must wait for data from its neighbors before proceeding) then you need to worry not just about failed nodes but also about slow nodes: the global computation will end up going at the speed of the slowest node. Is there a way to make it asynchronous? 4) Because of firewalls, nodes can't generally communicate directly with other nodes. The simplest thing is to have them communicate via a central server; however, the network bandwidth of this server may be a bottleneck. 5) BOINC has a mechanism called "trickle messages" that allows a running application to communicate with a BOINC server. Let's continue this discussion on the boinc_projects email list, or email me directly. -- David |
Send message Joined: 31 Oct 12 Posts: 2 |
Thanks for the info, guys. Yeah, I probably should have posted this in the developers section. Oops. Volpex looks interesting; I will look into it some more. @David The whole thing is really a grey area between async and sync. Each node needs to stream information to other nodes, but some delay is OK, and the streamed information is kind of like a hint of what to do next, not definite instructions. So each node can continue for a little while without any streamed input. It's just that the system as a whole will become extremely unstable and completely break if one node just disappears. In that case the whole network would have to stop. Also, routing the data through a central server would quickly overwhelm my server. ;) Simply too much. I think adding a double of each node, which could remain idle and get updated every minute or so with the changes from its copy, would reduce the likelihood of a complete blowout and still wouldn't require a central data stream server. This way, when a copy node notices that its mirror has "died", it can take over immediately. Is this too much redundancy, or is this typical? |
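
The churn concern raised in the thread (a departure anywhere pauses everything, and mirrors might soften that) can be put into rough numbers. The following is only a back-of-the-envelope sketch, not anything BOINC provides: it assumes client uptimes are independent and exponentially distributed, and the function names, the 8-hour mean uptime, and the 1000-client count are all hypothetical. Only the ~30 minute re-send time comes from the original post.

```python
import math


def p_any_departure(n_nodes: int, mean_uptime_h: float, window_h: float) -> float:
    """Probability that at least one of n_nodes leaves within window_h hours.

    Assumption: uptimes are independent and exponentially distributed.
    """
    departures_per_hour = n_nodes / mean_uptime_h
    return 1.0 - math.exp(-departures_per_hour * window_h)


def stalls_per_hour_with_mirrors(n_nodes: int, mean_uptime_h: float, recovery_h: float) -> float:
    """Approximate rate of network-wide stalls when every node has one standby mirror.

    A stall is counted when a primary leaves and its mirror also leaves
    before a fresh copy is in place (recovery_h hours later).
    """
    primary_departures_per_hour = n_nodes / mean_uptime_h
    p_mirror_leaves_too = 1.0 - math.exp(-recovery_h / mean_uptime_h)
    return primary_departures_per_hour * p_mirror_leaves_too


if __name__ == "__main__":
    # Hypothetical inputs: 1000 clients, 8 h mean uptime, 30 min to re-send a file.
    n, uptime, recovery = 1000, 8.0, 0.5
    print(f"P(someone leaves within 1 minute): {p_any_departure(n, uptime, 1 / 60):.3f}")
    print(f"Stalls per hour without mirrors:   {n / uptime:.1f}")
    print(f"Stalls per hour with mirrors:      {stalls_per_hour_with_mirrors(n, uptime, recovery):.1f}")
```

Plugging in measured uptimes from real volunteers, rather than the guesses above, would show whether one mirror per node is enough or whether the re-send time dominates.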
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.