Thread 'Anything and Everything to do with (WCG) World Community Grid'

Grumpy Swede
Joined: 30 Mar 20
Posts: 500
Sweden
Message 116745 - Posted: 29 Aug 2025, 12:03:02 UTC

WCG Operational Status update https://www.cs.toronto.edu/~juris/jlab/wcg.html (click operational status heading) - August 27

August 27, 2025
MAM1 7.07 updates:
The addition of spdlog as a dependency to replace the previous debug level printouts with more useful output for those who like to look at stderr.txt before it gets cleaned up by the BOINC client.
Some kludgey math, flags, and options are now set in the application's main function to try to rein in the dependency chain Ensmallen -> Armadillo -> OpenMP/OpenBLAS, whose nested thread creation was causing suspension of concurrently running tasks under the BOINC client and using more CPU than the plan class and --nthreads parameter dictated. Essentially, bad behaviour. Thanks for posting feedback on the first few batches released; I will endeavour to address any further feedback if MDMG/MAM1 continues to over-schedule w.r.t. the plan class or otherwise behaves badly.
Added two built-in, configurable options for adjusting the learning rate when using the LibTorch backend, which were observed to improve the model's avg. loss progression during cross-validation in fewer epochs. They correspond to:
https://docs.pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.ReduceLROnPlateau.html
https://docs.pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.CosineAnnealingLR.html
Other fixes, additional work on features that will release in later versions after the migration.
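
For anyone unfamiliar with the two learning-rate schedules linked above, here is a minimal pure-Python sketch of the idea behind each. It is not the MAM1 or LibTorch code: the names cosine_annealing_lr and PlateauReducer, and all default values, are made up for illustration, and the plateau logic is simplified compared to the real PyTorch scheduler (no threshold, cooldown, or mode options).

```python
import math


def cosine_annealing_lr(base_lr, epoch, t_max, eta_min=0.0):
    """Cosine annealing: decay smoothly from base_lr (epoch 0) to eta_min (epoch t_max)."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * epoch / t_max))


class PlateauReducer:
    """Simplified reduce-on-plateau (hypothetical helper, not the PyTorch class).

    Multiplies the learning rate by `factor` once the monitored loss has failed
    to improve for more than `patience` consecutive epochs.
    """

    def __init__(self, lr, factor=0.1, patience=2, min_lr=0.0):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.min_lr = min_lr
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        if loss < self.best:
            # New best loss: remember it and reset the stall counter.
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                # Stalled for too long: shrink the learning rate, but not below min_lr.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr


if __name__ == "__main__":
    # Cosine schedule over 10 epochs, starting at 0.1.
    for epoch in range(0, 11, 5):
        print(epoch, cosine_annealing_lr(0.1, epoch, t_max=10))

    # Plateau schedule fed a loss that stops improving after the first epoch.
    reducer = PlateauReducer(lr=0.1, factor=0.5, patience=2)
    for loss in [1.0, 1.0, 1.0, 1.0]:
        print(loss, reducer.step(loss))
```

The cosine schedule lowers the rate on a fixed timetable, while the plateau schedule only reacts when the validation loss stalls, which is presumably why both are offered as options.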
Grumpy Swede
Joined: 30 Mar 20
Posts: 500
Sweden
Message 116748 - Posted: 29 Aug 2025, 22:34:10 UTC

New update: https://www.cs.toronto.edu/~juris/jlab/wcg.html (click operational status heading) - August 29.
Also pushed to the BOINC client.

August 29, 2025
Full migration of WCG from the Graham to Nibi cloud facilities will be completed between 3:00 and 5:00 p.m. on August 31st, 2025.
Sharcnet will then power down all hardware at Graham.
We have put in a ticket with UHN Digital to move our DNS records to the new IP addresses we have been allocated in Nibi cloud, and all storage, networking, and compute resources are already provisioned at Nibi.
We continue testing QA and Prod on the new infrastructure.
We will experience some downtime as *.worldcommunitygrid.org URLs switch over. We will be bringing down workunit creation scripting, BOINC server components, and upload/download servers in sequence, halting the database, performing a final rsync and then bringing down the website, forums, and internal services over the next 48h.
In the best case, our DNS records will be switched over on the 31st and everything behind the load balancer will be up and running. However, we want to prepare users for the possibility of additional downtime as we stand up prod on Nibi.

Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.