Message boards : Projects : News on Project Outages
Author | Message |
---|---|
Send message Joined: 30 Mar 20 Posts: 423 |
New Update from WCG: Update: Unfortunately, additional hardware problems on the storage server besides the RAID card are preventing us from restarting. Working with the data center on the alternative solutions. Comment: Incredible......... |
Send message Joined: 25 May 09 Posts: 1302 |
As all too often - cure the obvious problem and there are at least two more problems lurking in the shadows to come out and bite one :-( |
Send message Joined: 30 Mar 20 Posts: 423 |
@Rob: Yeah, it looks as if you're right about that. New Update from WCG: "Update #2: Unfortunately, the RAID controller was not the root cause of our storage system failure, the PCI bus failed. Data center is in the process of moving the disks to an alternate system and we will post updates as we progress. Once again, thank you for your patience." Comment: Looking at their Facebook account, there's not much "patience" there, to say thank you for :-) |
Send message Joined: 30 Mar 20 Posts: 423 |
Another day, another day to wait for new revelations of new issues that will stop WCG from restarting. This is getting old by now, really really old. It's almost a week now that WCG has been down. SPOFs in such a big project are simply not acceptable, and utterly unprofessional. I've had a lot of patience with this migration from IBM to Jurisica Lab/Krembil, but by now my patience is wearing very thin. |
Send message Joined: 28 Jun 10 Posts: 2718 |
How did Krembil come to get it? Did they put in a bid when IBM wanted out or what? It seems to me that they are grossly under-resourced for a project of this size. Were there any other options to take over from IBM? That is something I don't recall seeing any discussion of. |
Send message Joined: 29 Aug 05 Posts: 80 |
Krembil was already tied to World Community Grid and maybe researchers there saved it from oblivion. First save the patient from death and then treat the developing symptoms. https://web.archive.org/web/20210913180104/https://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=732 wrote: Additionally, we’re transferring knowledge about World Community Grid's projects and practices to the team at Krembil Research Institute. Through the Mapping Cancer Markers and Help Conquer Cancer projects, Dr. Jurisica and his team are already familiar with the power of World Community Grid. They are excited to learn even more about its inner workings and to embrace the power of citizen science. |
Send message Joined: 30 Mar 20 Posts: 423 |
Well, one of my computers logged the WCG system as down the first time on 1 Mar 2023 06.42.56 UTC. Dr Who Fan posted about it on 1 Mar 2023, 7:16:30 UTC. So, the system has been down now for at least about 6 days, 10 hours, 38+ minutes or about 154 hours (rounded down), and still counting. That must be a new record, to fix a relatively simple problem. And WCG isn't up yet. The last update from the team, was more than 20 hours ago..... Edit, added: Luke 23:34 |
Send message Joined: 28 Jun 10 Posts: 2718 |
"Krembil was already tied to World Community Grid and maybe researchers there saved it from oblivion. First save the patient from death and then treat the developing symptoms." Read the link; it still isn't clear, to me at least, whether Krembil took it on because it would have folded otherwise or whether there were any other potential takers. I shall have to do some searching and try to find out more, though if no one here knows anything I might not have much luck. |
Send message Joined: 10 May 07 Posts: 1450 |
"... ...Read the link; it still isn't clear, to me at least, whether Krembil took it on because it would have folded otherwise or whether there were any other potential takers. I shall have to do some searching and try to find out more, though if no one here knows anything I might not have much luck." I don't think information about other possible candidates for takeover of WCG besides Krembil was ever made public by IBM. Big Blue sprang it on the BOINC public community after the deal was made. |
Send message Joined: 28 Jun 10 Posts: 2718 |
Thanks, I wondered if that were the case. |
Send message Joined: 30 Mar 20 Posts: 423 |
It's 19:40 ish, in Toronto, and still no update for today. No further comments needed, for now. |
Send message Joined: 8 Mar 23 Posts: 11 |
I feel I have one more FUBAR'd unfortunate incident in me before it's time to walk... mostly because some of my machines are quite old and can't really do much of the more intensive work of other projects. I wish there were a really good alternative for medical and overall people-benefiting research. |
Send message Joined: 24 Dec 10 Posts: 37 |
DENIS, Rosetta, SiDock and TNGrid all do medical research. Paul. |
Send message Joined: 28 Jun 10 Posts: 2718 |
"DENIS, Rosetta, SiDock and TNGrid all do medical research." DENIS is currently telling me it has no work available. |
Send message Joined: 30 Mar 20 Posts: 423 |
1 day, 15 hours+ since the last update from the WCG team. Very disturbing, very bad PR for WCG. |
Send message Joined: 8 Feb 17 Posts: 7 |
Rosetta has work available. I have 4 machines busy crunching files. |
Send message Joined: 3 Mar 23 Posts: 14 |
SiDock@home has ~10k WUs (long and short), so I don't see any problem with computers sitting idle due to the WCG outage. |
Send message Joined: 30 Mar 20 Posts: 423 |
New update, 10 minutes ago: "Update #3: As of this morning, the data center continues to work on booting the temporary replacement DSS 7000 storage system. They are attempting multiple alternative strategies to resolve current failures." Edit, added: Dell DSS 7000 specifications: form factor: 4U; max capacity: 720TB (90 x 8TB HDDs); 2 x DSS server nodes (2S Intel E5 v3 series CPUs). |
Send message Joined: 28 Jun 10 Posts: 2718 |
"Update #3: As of this morning, the data center continues to work on booting the temporary replacementWould that be a size 10 boot? |
Send message Joined: 24 Dec 10 Posts: 37 |
"Dennis currently telling me it has no work available." It has unsent units currently (was 0 before but many in progress). Paul. |
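
The downtime figure quoted earlier in the thread (first logged down on 1 Mar 2023 06.42.56 UTC, then "6 days, 10 hours, 38+ minutes or about 154 hours") can be checked with a few lines of Python. This is only a sketch: the thread does not show when that post was written, so `assumed_post_time` is a hypothetical value chosen to match the quoted figure, and the helper name `downtime` is made up for illustration.

```python
from datetime import datetime, timezone


def downtime(start, now):
    """Return (days, hours, minutes, total_whole_hours) elapsed between two datetimes."""
    total_seconds = int((now - start).total_seconds())
    total_minutes = total_seconds // 60
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    # Whole hours, rounded down, as in the post.
    return days, hours, minutes, total_seconds // 3600


# Timestamp reported in the thread for the first logged failure.
first_logged_down = datetime(2023, 3, 1, 6, 42, 56, tzinfo=timezone.utc)

# ASSUMPTION: the posting time is not shown in the thread; this value is
# picked so that the elapsed time matches the figure quoted in the post.
assumed_post_time = datetime(2023, 3, 7, 17, 20, 56, tzinfo=timezone.utc)

days, hours, minutes, total_hours = downtime(first_logged_down, assumed_post_time)
print(f"{days} days, {hours} hours, {minutes} minutes (about {total_hours} hours)")
```

Swapping `assumed_post_time` for `datetime.now(timezone.utc)` would give the running total for an outage that is still in progress.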
Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.