Thread 'News on Project Outages'

Message boards : Projects : News on Project Outages

Jim1348

Joined: 8 Nov 10
Posts: 310
United States
Message 100358 - Posted: 20 Aug 2020, 10:34:39 UTC - in response to Message 100347.  

QuChemPedIA is back up.
Work units are flowing, and website is accessible.
ID: 100358
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15571
Netherlands
Message 100359 - Posted: 20 Aug 2020, 11:57:50 UTC

Power outage affecting CPDN

Hi All,

There is currently a power outage across different areas of Oxford. The result of this is that all our servers are now offline and the project is currently unreachable. When power is restored, I will bring back all the services.

Best regards,

Andy
ID: 100359
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2719
United Kingdom
Message 100373 - Posted: 21 Aug 2020, 10:54:16 UTC - in response to Message 100359.  

Power outage affecting CPDN

Hi All,

There is currently a power outage across different areas of Oxford. The result of this is that all our servers are now offline and the project is currently unreachable. When power is restored, I will bring back all the services.

Best regards,

Andy


Zip files are still being uploaded, at least for the most recent Linux models, as they go to a server somewhere else in the world. However, everything at Oxford still appears to be down.
ID: 100373
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5130
United Kingdom
Message 100375 - Posted: 21 Aug 2020, 11:12:36 UTC - in response to Message 100373.  

Zip files are still being uploaded, at least for the most recent Linux models, as they go to a server somewhere else in the world. However, everything at Oxford still appears to be down.

I've just uploaded a zip to upload11.cpdn.org (192.171.139.103). Out of curiosity, I tried to find the location of that IP address. The site I chose offered four different database answers: Swindon, London, Nottingham (all in England), and Currie in Scotland.
ID: 100375
boboviz
Help desk expert
Joined: 12 Feb 11
Posts: 419
Italy
Message 100436 - Posted: 24 Aug 2020, 10:20:44 UTC

Boinc@Tacc is down for maintenance.
But I don't know for how long.
ID: 100436
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5130
United Kingdom
Message 100437 - Posted: 24 Aug 2020, 10:35:10 UTC - in response to Message 100436.  

Judging by the statistics shown at https://www.boincstats.com/stats/188/project/detail/lastDays, BOINC @ TACC has been down since 18 July.
ID: 100437
Dr Who Fan
Joined: 10 May 07
Posts: 1450
United States
Message 100445 - Posted: 24 Aug 2020, 12:56:00 UTC - in response to Message 100359.  

Any update on Climate Prediction (CPDN)?
ID: 100445
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15571
Netherlands
Message 100446 - Posted: 24 Aug 2020, 13:59:02 UTC - in response to Message 100445.  

I asked and got the following back:


Hi Jord,

There is no news I am afraid. The Engineering Department IT Support is working on bringing services online. Access to the department is very limited due to access limitations put in place due to the Coronavirus. So at the moment we are very much in the hands of Engineering IT Support.

Best regards,

Andy
ID: 100446
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15571
Netherlands
Message 100470 - Posted: 25 Aug 2020, 21:15:39 UTC

Andy Bowery, CPDN head honcho wrote:

Hi All,

Just to give you an update on this: the project continues to be offline. A power outage across parts of Oxford on Thursday caused by a tree falling on power lines took out power to the machine room where the project's servers are based. Power was restored to the machine room, however the Department of Engineering IT Support have since had a lot of problems restoring the network to the machines based there. As a result the project continues to be offline. Engineering IT Support are continuing to work on the issue and will update us when they have more information.

Best regards,

Andy
ID: 100470
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2719
United Kingdom
Message 100477 - Posted: 26 Aug 2020, 13:30:35 UTC - in response to Message 100470.  

Andy Bowery, CPDN head honcho wrote:

Hi All,

Just to give you an update on this: the project continues to be offline. A power outage across parts of Oxford on Thursday caused by a tree falling on power lines took out power to the machine room where the project's servers are based. Power was restored to the machine room, however the Department of Engineering IT Support have since had a lot of problems restoring the network to the machines based there. As a result the project continues to be offline. Engineering IT Support are continuing to work on the issue and will update us when they have more information.

Best regards,

Andy

My understanding is that once Oxford IT people have done their bit, most if not all of what Andy needs to do can be done by remote access.
ID: 100477
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15571
Netherlands
Message 100478 - Posted: 26 Aug 2020, 13:46:25 UTC - in response to Message 100477.  

Probably, but the hardware it runs on must be OK. Server hardware in particular doesn't like being fully on one moment and off the next. So I do hope they had a good UPS in place when the power outage started, so that everything could be powered down gradually.
ID: 100478
Dave
Help desk expert
Joined: 28 Jun 10
Posts: 2719
United Kingdom
Message 100479 - Posted: 26 Aug 2020, 15:37:40 UTC - in response to Message 100478.  

Probably, but the hardware it runs on must be OK. Especially server hardware doesn't like it when one moment it's full on and the next off. So I do hope they had a good UPS when the power outage started to be able to power down everything gradually.


Me too. Though I know the backup system was improved last year, so long term I'm not too worried. Testing site and main site zips are still getting uploaded, as they go somewhere other than Oxford. No idea whether this will hold up any new work, as I don't know if the current batch of testing is a prelude to new main site work.
ID: 100479
Richard Haselgrove
Volunteer tester
Help desk expert
Joined: 5 Oct 06
Posts: 5130
United Kingdom
Message 100480 - Posted: 26 Aug 2020, 16:21:24 UTC - in response to Message 100479.  

I've just finished one from the short dev site batch. Final log:

26/08/2020 15:02:03 | cpdnboinc_dev | [sched_op] Fetching master file
26/08/2020 15:02:03 | cpdnboinc_dev | Fetching scheduler list
26/08/2020 15:04:05 | cpdnboinc_dev | [sched_op] Deferring communication for 1 days 00:00:00
26/08/2020 15:04:05 | cpdnboinc_dev | [sched_op] Reason: 8 consecutive failures fetching scheduler list
26/08/2020 16:38:12 | cpdnboinc_dev | Started upload of hadam4h_s002_201505_4_d248_000008643_0_r20107065_restart.zip
26/08/2020 16:38:14 | cpdnboinc_dev | Finished upload of hadam4h_s002_201505_4_d248_000008643_0_r20107065_restart.zip
26/08/2020 16:38:32 | cpdnboinc_dev | Started upload of hadam4h_s002_201505_4_d248_000008643_0_r20107065_4.zip
26/08/2020 16:40:13 | cpdnboinc_dev | Finished upload of hadam4h_s002_201505_4_d248_000008643_0_r20107065_4.zip
26/08/2020 17:09:58 | cpdnboinc_dev | Computation for task hadam4h_s002_201505_4_d248_000008643_0 finished
26/08/2020 17:10:00 | cpdnboinc_dev | Started upload of hadam4h_s002_201505_4_d248_000008643_0_r20107065_out.zip
26/08/2020 17:10:03 | cpdnboinc_dev | Finished upload of hadam4h_s002_201505_4_d248_000008643_0_r20107065_out.zip
So, we've run for so long under this blackout that the local client has moved to the maximum backoff between attempts to report: 24 hours. Once you hear that the project is back up, a manual update may help to get things going again.
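
As a rough illustration of why the client ends up waiting a full day, here is a minimal Python sketch of a capped exponential backoff, together with the command line for a manual update. The doubling rule, the 60-second starting delay and the function names are assumptions made for illustration and are not taken from the BOINC client source; only the 24-hour ceiling comes from the log above. `boinccmd --project URL update` is the usual command-line way to request an update, and the URL shown is a placeholder, not the real project URL.

```python
# Illustrative only: NOT the BOINC client's actual backoff algorithm.
MAX_BACKOFF_SECONDS = 24 * 60 * 60   # the 24-hour ceiling seen in the log
ASSUMED_FIRST_DELAY_SECONDS = 60     # assumption: delay after the first failure


def backoff_delay(consecutive_failures):
    """Return a delay in seconds that doubles per failure, capped at 24 hours."""
    if consecutive_failures <= 0:
        return 0
    delay = ASSUMED_FIRST_DELAY_SECONDS * 2 ** (consecutive_failures - 1)
    return min(delay, MAX_BACKOFF_SECONDS)


def manual_update_command(project_url):
    """Build the boinccmd invocation that asks the client to contact a project now."""
    return ["boinccmd", "--project", project_url, "update"]


if __name__ == "__main__":
    for failures in (1, 4, 8, 12):
        print(failures, "failures ->", backoff_delay(failures), "seconds")
    # Placeholder URL: substitute the project's real master URL.
    print(" ".join(manual_update_command("https://example.org/project/")))
```

In BOINC Manager the equivalent is to select the project on the Projects tab and press Update, which clears the deferral and contacts the scheduler straight away.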
ID: 100480
Dr Who Fan
Joined: 10 May 07
Posts: 1450
United States
Message 100561 - Posted: 1 Sep 2020, 17:32:21 UTC - in response to Message 100477.  

...My understanding is that once Oxford IT people have done their bit, most if not all of what Andy needs to do can be done by remote access.

Does anyone have any news or an update on the status of the CPDN servers and project?
ID: 100561
Les Bayliss
Help desk expert
Joined: 25 Nov 05
Posts: 1654
Australia
Message 100566 - Posted: 1 Sep 2020, 19:03:44 UTC

This just in:

Hi All,

An update: the Department of Engineering IT Support have now replaced the firewall box with a new box. This has now enabled us to restore the main services of the climateprediction.net project. The main website, the dev project, main project and the backup project are back on line.

Best regards,

Andy
ID: 100566
Pierre A Renaud
Joined: 19 Jan 18
Posts: 66
Canada
Message 100662 - Posted: 9 Sep 2020, 8:10:34 UTC
Last modified: 9 Sep 2020, 8:11:57 UTC

Planned Maintenance on Thursday, September 10, 2020
https://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=642

We will be doing a file system check on our servers on Thursday, September 10 beginning at 14:00 UTC. We anticipate that the work will take approximately 24 hours.
ID: 100662
Pierre A Renaud
Joined: 19 Jan 18
Posts: 66
Canada
Message 100761 - Posted: 16 Sep 2020, 10:54:13 UTC

Planned Maintenance on Wednesday, September 16, 2020
https://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=646

A six-hour outage begins later today at 17:00 UTC.
ID: 100761
Charles Raimondi
Joined: 1 Oct 20
Posts: 2
Message 100915 - Posted: 1 Oct 2020, 21:25:30 UTC

My machine DESKTOP-724EV7H has not received any new tasks from Einstein@Home since yesterday, 30 September 2020, but my machine is receiving work from Rosetta@home.

Coincidentally or not, my machine received a new Windows 10 update last night. I think I dealt with that by making a Windows Firewall exception for BOINC.src, which allowed work from Rosetta@home, but no tasks are coming in from Einstein@Home.

Is Einstein@Home out of work, or down, or is there a change I need to make in my BOINC set up to resume getting work from Einstein@Home?
ID: 100915
Jimbocous
Joined: 1 Oct 15
Posts: 394
United States
Message 100916 - Posted: 1 Oct 2020, 21:55:15 UTC - in response to Message 100915.  


Is Einstein@Home out of work, or down, or is there a change I need to make in my BOINC set up to resume getting work from Einstein@Home?

Getting Einstein work here just fine on both machines working it.
ID: 100916
Bryn Mawr
Help desk expert
Joined: 31 Dec 18
Posts: 301
United Kingdom
Message 100921 - Posted: 2 Oct 2020, 6:23:33 UTC - in response to Message 100915.  


Is Einstein@Home out of work, or down, or is there a change I need to make in my BOINC set up to resume getting work from Einstein@Home?


Turn on the work fetch debug logging and see if you are asking for Einstein work.

On new set-ups Rosetta is notorious for downloading far too much work and swamping the cache. It is very possible that the work fetch is saying it cannot cope with the jobs in hand and won’t get more until the cache empties.
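
For anyone unsure how to switch that logging on: the flag lives in the `<log_flags>` section of `cc_config.xml` in the BOINC data directory, and is called `work_fetch_debug`. Below is a minimal Python sketch that only prints the fragment. The helper name `build_cc_config` is made up for this example, and the script deliberately does not write the file, on the assumption that you may already have a `cc_config.xml` with other settings that should be merged by hand.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config.xml text with each named log flag set to 1."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        ET.SubElement(log_flags, name).text = "1"
    # ET.indent is only available from Python 3.9; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the fragment so it can be merged into an existing file by hand.
    print(build_cc_config(["work_fetch_debug"]))
```

After saving the file, have the client re-read it (Options, then Read config files in the Manager's advanced view, or `boinccmd --read_cc_config`), and the Event Log should then show whether any work is being requested from Einstein@Home.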
ID: 100921

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.