Thread 'News on Project Outages'

Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15561
Netherlands
Message 50643 - Posted: 26 Sep 2013, 11:17:56 UTC

Volunteer-supplied posts about project outages will continue to appear here. Please keep it on topic: post only about the project outage itself or the return of the project. Want to have an in-depth discussion? There's a whole forum outside this thread where you can start a new thread.

May we please ask that when you post, you do so without adding your signature? With thanks.

To do so, when you make a post in this thread, uncheck the "Add my signature to this reply" box and then make your post. Your signature will still be enabled on other posts, just not this one. You can also edit your post within an hour and take the signature out.
We ask that you post without a signature to keep the purpose of this thread clear: News about project outages, not about how many credits you have or what team you want us to join.
It's not that difficult to do and it gives the moderators even less work to do, as we don't have to remove your post. So please...
ID: 50643 · Report as offensive
MarkJ
Volunteer tester
Help desk expert

Joined: 5 Mar 08
Posts: 272
Australia
Message 50644 - Posted: 26 Sep 2013, 11:21:16 UTC

Carrying on from the old thread, Asteroids is now back up after their extended server swap.
ID: 50644 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15561
Netherlands
Message 50645 - Posted: 26 Sep 2013, 11:22:51 UTC

A new thread. This time I will enforce that anyone posting with his signature on will have his post removed without remorse. If you find that you don't have to follow the request to not post with your signature in this one thread, then I don't think you have anything to say that others need to read.

And really, in the case of branjo, a 3 word sentence with a signature that takes up 3/4 of my 23" screen at 1920x1080 pixels? What were you thinking?

As such, everyone is twice warned now. No need to quickly post your message. It takes one extra click to not add your signature to this reply, must be less than a second. You have no excuse that you didn't do so, or after mistakenly posting couldn't come back and edit your post to take the signature off.
ID: 50645 · Report as offensive
Claggy

Joined: 23 Apr 07
Posts: 1112
United Kingdom
Message 50650 - Posted: 26 Sep 2013, 12:41:02 UTC

Apparently Orbit@home is rising from the dead:

orbit@home is upgrading!

We are upgrading the orbit@home server. A new version of the system should be online in fall 2013.


Claggy
ID: 50650 · Report as offensive
Keith T
Joined: 26 Feb 07
Posts: 71
United Kingdom
Message 50693 - Posted: 1 Oct 2013, 13:19:15 UTC

ID: 50693 · Report as offensive
David S
Joined: 15 Jan 13
Posts: 766
United States
Message 50694 - Posted: 1 Oct 2013, 13:28:24 UTC - in response to Message 50693.  
Last modified: 1 Oct 2013, 13:29:34 UTC

SETI and SETI Beta are both down
http://downforeveryoneorjustme.com/http://setiweb.ssl.berkeley.edu/beta/
http://downforeveryoneorjustme.com/http://setiathome.berkeley.edu/

It surprises me that this site is up. Doesn't Boinc also run off the UCB servers?

[edit]
Oops, sorry, forgot to turn off my sig right after reading the rule about it.
ID: 50694 · Report as offensive
Gary Charpentier
Joined: 23 Feb 08
Posts: 2493
United States
Message 50697 - Posted: 1 Oct 2013, 13:36:28 UTC

Seti may be into the regular Tuesday by now ...

For those that don't have the link / info ...
http://systemstatus.berkeley.edu/
Unscheduled Outage – Emergency Shutdown Data Center Servers
Posted by CSS IT ~jg
On 9/30/2013 at 6:42 pm PST
Modified on 9/30/2013 at 10:29 pm PST
Modified by CSS IT ~jg
Posted in Unscheduled Outage

Outage Type: UNSCHEDULED OUTAGE
Date Submitted: Monday, September 30, 2013
Outage Start/End Time: 18:15 – TBD
Groups Impacted: campus
Equipment: Data Center Servers

Description: UPDATE 1000pm – The Control-M job scheduling program will be restored at 10:30pm. Batch jobs for the following systems will likely be delayed impacting CalAnswers, HCM, PPS, BAIRS and BFS. All other enterprise services continue to be restored and monitored throughout the evening. Next update will be tomorrow morning.

0900- All Oracle and MySQL Production and Development database servers have been restored. Most enterprise systems are anticipated to be restored within the next hour.

0800pm- Email Services have been restored. Temperature is dropping to an acceptable level to begin restoring all services. Updates will continue as services are restored.

19:15 - Power is returning to the Data Center but systems remain shutdown, temperature is slowly dropping. Servers continue to be monitored.

Due to the widespread campus power outage, the Data Center servers are experiencing extreme heat, impacting central servers residing in the Data Center. With temperatures rising, all servers are being gracefully shut down.

Services Impacted:

Calmail
Bconnected
Applications hosted on VM
All Development Database Systems
Telecom Gateways impacting Voice Services
BSpace
BCourses
CalPlanning
BAIRS
Blu Portal
BFS
HR Systems
Comet System
Sagebrush (Billing System)
PROSam
CalCentral
ImageNow
Control-M
Footprints ticket system
Informatica

CMR:TBD
ID: 50697 · Report as offensive
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5129
United Kingdom
Message 50699 - Posted: 1 Oct 2013, 13:37:52 UTC - in response to Message 50694.  

SETI and SETI Beta are both down
http://downforeveryoneorjustme.com/http://setiweb.ssl.berkeley.edu/beta/
http://downforeveryoneorjustme.com/http://setiathome.berkeley.edu/

It surprises me that this site is up. Doesn't Boinc also run off the UCB servers?

Same location, same power supply (it was down earlier), but a different server. And without such a complicated back-office network of linked database and storage servers, much easier to re-start cleanly without human intervention.
ID: 50699 · Report as offensive
Jord
Volunteer tester
Help desk expert
Joined: 29 Aug 05
Posts: 15561
Netherlands
Message 50701 - Posted: 1 Oct 2013, 14:02:43 UTC

The cause of the electrical shutdown was a campus-wide power outage and transformer explosion.
ID: 50701 · Report as offensive
Peter
Joined: 7 Sep 09
Posts: 167
Canada
Message 50708 - Posted: 1 Oct 2013, 16:50:44 UTC - in response to Message 50643.  

Physics@home is down

Website: http://physicshome.tk/physics/
ID: 50708 · Report as offensive
Dr Who Fan
Joined: 10 May 07
Posts: 1443
United States
Message 50716 - Posted: 1 Oct 2013, 17:19:27 UTC - in response to Message 50708.  
Last modified: 1 Oct 2013, 17:19:49 UTC

Just got work from physics in the last hour and the web site is up and running, so it is UP. Maybe it is just your computers or ISP having problems accessing them.
ID: 50716 · Report as offensive
Peter
Joined: 7 Sep 09
Posts: 167
Canada
Message 50720 - Posted: 1 Oct 2013, 17:38:49 UTC - in response to Message 50716.  
Last modified: 1 Oct 2013, 18:01:23 UTC

Since early this morning none of my browsers can reach it and tests say the site is down.

BOINC also says it is down:

01/10/2013 1:36:18 PM | physics@home | update requested by user
01/10/2013 1:36:20 PM | physics@home | Sending scheduler request: Requested by user.
01/10/2013 1:36:20 PM | physics@home | Requesting new tasks for CPU
01/10/2013 1:36:36 PM | physics@home | Scheduler request failed: Couldn't resolve host name
01/10/2013 1:36:40 PM | | Project communication failed: attempting access to reference site
01/10/2013 1:36:42 PM | | Internet access OK - project servers may be temporarily down.


It's strange I have no problems reaching any other website or project.

Sure we're talking the same project?

http://www.physicshome.tk/physics/

I've even tried through a proxy server which bypasses my ISP.

Edit: Further tests - the website is down. I think you are not talking about the same website.
ID: 50720 · Report as offensive
Dr Who Fan
Joined: 10 May 07
Posts: 1443
United States
Message 50721 - Posted: 1 Oct 2013, 18:10:46 UTC - in response to Message 50720.  

Are you using the new project address (http://physicshome.tk/physics/)?

If yes, try flushing your computer's DNS cache:
  1) Open a command line window: press Windows key + R, then type "CMD" without the quotes.
  2) On the command line type "ipconfig /flushdns" without the quotes.
  3) Close the command line window.
  4) Wait about 1 minute and retry communications/updating the project within BOINC.
If no, have you detached and re-attached to the project since they changed the server signing key about 2 days ago (this was done about 1 day after the new web address went live)?

If that does not help, you might need to stop BOINC, wait a minute or two and restart BOINC and try updating the project again.

If all else fails feel free to post back here.
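
As a side check, it can help to test name resolution on its own, since a "Couldn't resolve host name" error points at DNS rather than at the project's web server. Here is a minimal sketch in Python using only the standard library. The function name `can_resolve` and its `resolver` parameter are made up for illustration and have nothing to do with BOINC itself:

```python
import socket
from typing import Callable


def can_resolve(hostname: str, resolver: Callable = socket.getaddrinfo) -> bool:
    """Return True if the hostname resolves to at least one address.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so
    the function can be exercised without touching the network.
    """
    try:
        # getaddrinfo(host, port) returns a list of address tuples,
        # or raises socket.gaierror when the name cannot be resolved.
        return len(resolver(hostname, None)) > 0
    except socket.gaierror:
        return False
```

Calling `can_resolve("physicshome.tk")` before and after the flush would show whether the lookup itself is what fails. A True result only means the name resolves, not that the project server is answering.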
ID: 50721 · Report as offensive
Peter
Joined: 7 Sep 09
Posts: 167
Canada
Message 50727 - Posted: 1 Oct 2013, 18:38:28 UTC - in response to Message 50721.  
Last modified: 1 Oct 2013, 18:38:54 UTC

Yes I had already reattached the other day.

No luck and now I can't reattach because the communication failed. I tried 3 separate "is it just me or is that site down" - type websites and all of them say it's down and they are dotted all over the globe.

This is very strange.
ID: 50727 · Report as offensive
Peter
Joined: 7 Sep 09
Posts: 167
Canada
Message 50729 - Posted: 1 Oct 2013, 18:45:40 UTC - in response to Message 50727.  

Perhaps you could do me a favour and tell me what this post says on their forums. It's an answer in a thread I started which I last accessed last night.

http://www.physicshome.tk/physics/forum_thread.php?id=57
ID: 50729 · Report as offensive
Dr Who Fan
Joined: 10 May 07
Posts: 1443
United States
Message 50730 - Posted: 1 Oct 2013, 18:49:57 UTC - in response to Message 50729.  

Site is now down for me too, I will try again in ~1 Hour and 'Private Message' you so we don't clog the forum here.
ID: 50730 · Report as offensive
Peter
Joined: 7 Sep 09
Posts: 167
Canada
Message 50732 - Posted: 1 Oct 2013, 18:52:26 UTC - in response to Message 50730.  

OK, that's a good idea. Thanks.
ID: 50732 · Report as offensive
Peter
Joined: 7 Sep 09
Posts: 167
Canada
Message 50733 - Posted: 1 Oct 2013, 18:56:01 UTC - in response to Message 50732.  

physics@home is back up again....for me at least.
ID: 50733 · Report as offensive
SekeRob2

Joined: 6 Jul 10
Posts: 585
Italy
Message 50751 - Posted: 2 Oct 2013, 14:41:34 UTC - in response to Message 50733.  
Last modified: 2 Oct 2013, 14:43:49 UTC

This looks like having been the state at QMC@home for at least the last 36 hours:

LAPSED-02

14075 QMC@HOME 10/2/2013 1:16:53 PM Sending scheduler request: To report completed tasks.
14076 QMC@HOME 10/2/2013 1:16:53 PM Reporting 1 completed tasks
14077 QMC@HOME 10/2/2013 1:16:53 PM Not requesting tasks: "no new tasks" requested via Manager
14078 QMC@HOME 10/2/2013 1:16:53 PM [sched_op] CPU work request: 0.00 seconds; 0.00 devices
14079 QMC@HOME 10/2/2013 1:16:56 PM Scheduler request completed
14080 QMC@HOME 10/2/2013 1:16:56 PM Server error: feeder not running
14081 QMC@HOME 10/2/2013 1:16:56 PM Project requested delay of 3600 seconds
14082 QMC@HOME 10/2/2013 1:16:56 PM [sched_op] Deferring communication for 01:00:00
14083 QMC@HOME 10/2/2013 1:16:56 PM [sched_op] Reason: project is down
14084 QMC@HOME 10/2/2013 1:16:56 PM [sched_op] Deferring communication for 03:29:20
14085 QMC@HOME 10/2/2013 1:16:56 PM [sched_op] Reason: project is down

The result file did upload before this, and the 3600-second delay requested by the server suggests that some pieces are working, but at least the scheduler and the feeder are not... no new tasks are coming when work fetch is allowed [switched off, since it wants to pull 400,000 seconds of work for 8 CPUs].

edit: 8 cores occupied by the currently available CMN jobs is actually about 400,000 seconds "per task" and that's too much.
ID: 50751 · Report as offensive
Thyme Lawn

Joined: 2 Sep 05
Posts: 103
United Kingdom
Message 50816 - Posted: 9 Oct 2013, 10:35:59 UTC - in response to Message 50751.  
Last modified: 9 Oct 2013, 10:38:17 UTC

This looks like having been the state at QMC@home for at least the last 36 hours:

The website has no direct link to it, but the server status page does exist and it tallies with what my work requests are being told:

08-Oct-2013 22:24:40 [QMC@HOME] Sending scheduler request: To fetch work.
08-Oct-2013 22:24:40 [QMC@HOME] Requesting new tasks for CPU
08-Oct-2013 22:24:41 [QMC@HOME] Scheduler request completed: got 0 new tasks
08-Oct-2013 22:24:41 [QMC@HOME] Server error: feeder not running
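
For anyone saving these event-log lines, here is a rough sketch of how the "Server error" replies could be picked out automatically. It assumes the `DD-Mon-YYYY HH:MM:SS [PROJECT] message` layout of the lines above and English month abbreviations. The function names and the regular expression are illustrative only and are not part of BOINC:

```python
import re
from datetime import datetime

# Matches lines such as:
# 08-Oct-2013 22:24:41 [QMC@HOME] Server error: feeder not running
LINE_RE = re.compile(
    r"^(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) \[([^\]]*)\] (.*)$"
)


def parse_line(line):
    """Return (timestamp, project, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    # %b parses the month abbreviation; this assumes an English/C locale.
    when = datetime.strptime(m.group(1), "%d-%b-%Y %H:%M:%S")
    return when, m.group(2), m.group(3)


def server_errors(lines):
    """Yield parsed entries whose message starts with 'Server error:'."""
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and parsed[2].startswith("Server error:"):
            yield parsed
```

Fed the four lines above, `server_errors` would yield only the "feeder not running" entry for QMC@HOME. Logs in the other layouts seen in this thread would need their own pattern.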
ID: 50816 · Report as offensive

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.