CRC checks?

Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2518
United Kingdom
Message 111048 - Posted: 14 Feb 2023, 8:22:24 UTC

I know there is the option to turn off cyclic redundancy checks on image files. Are these checks done on other files that get downloaded for different task types before the actual tasks are downloaded?
ID: 111048
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 111051 - Posted: 14 Feb 2023, 9:36:01 UTC - in response to Message 111048.  

There are two sets of files: "project" files and "task/workunit" files. The project files are typically downloaded when you attach to a project, and often repeated at subsequent contacts. Image files are usually sent in the first group, and are often in a lossy compressed format like JPEG, for eye candy in simple view. Intermediate hosts on the internet download path can re-compress them to save bandwidth, which is what the CRC option is designed to allow.

In general, task data will be sent in some format like zip, which doesn't introduce lossy changes - the unzipped file should be an exact match to the original. I saw the original posts which possibly triggered this question. I have a nagging fear that BOINC sometimes miscounts when a large download is interrupted part way through and restarted from an intermediate point, but that tends to show up as 'xxxxx expected, yyyyy received' errors - possibly only with debug logging. I'll keep an eye open.
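
To illustrate the point about lossless formats, here is a minimal sketch. It uses Python's zlib as a stand-in for zip, and the function names are invented for this example; none of it is taken from BOINC's own code. It shows that a lossless round trip returns the original bytes exactly, and how a short transfer could be reported in the 'expected / received' style mentioned above.

```python
import zlib


def roundtrip_is_exact(data: bytes) -> bool:
    """Compress and decompress; lossless compression must return identical bytes."""
    compressed = zlib.compress(data)
    return zlib.decompress(compressed) == data


def size_report(expected_size: int, received: bytes) -> str:
    """Hypothetical helper: compare the advertised size with what actually arrived."""
    if len(received) == expected_size:
        return "ok"
    return f"{expected_size} expected, {len(received)} received"
```

A re-compressed JPEG would fail the first kind of check, because lossy re-encoding changes the bytes even when the picture looks the same.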
ID: 111051
Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2518
United Kingdom
Message 111053 - Posted: 14 Feb 2023, 11:28:37 UTC - in response to Message 111051.  

> There are two sets of files: "project" files and "task/workunit" files. The project files are typically downloaded when you attach to a project, and often repeated at subsequent contacts. Image files are usually sent in the first group, and are often in a lossy compressed format like JPEG, for eye candy in simple view. Intermediate hosts on the internet download path can re-compress them to save bandwidth, which is what the CRC option is designed to allow.
My brain immediately went to disk image files for VB tasks! I was clearly overthinking things.

It seems to me that BOINC should ideally do a CRC check on all downloaded files and retry if the check doesn't match.
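
Roughly what I have in mind, as a sketch only. This is not how BOINC does it: `fetch_verified`, its parameters, and the idea that the server supplies an expected CRC-32 are all assumptions made up for illustration.

```python
import zlib
from typing import Callable


def fetch_verified(fetch: Callable[[], bytes], expected_crc: int,
                   max_attempts: int = 3) -> bytes:
    """Call fetch() until the CRC-32 of the result matches, up to max_attempts."""
    for _ in range(max_attempts):
        data = fetch()
        if zlib.crc32(data) == expected_crc:
            return data  # checksum matches, accept the download
    raise IOError(f"checksum mismatch after {max_attempts} attempts")
```

Here `fetch` is whatever performs one download attempt; a real client would also want a delay between retries.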

I wonder if there would be any mileage in putting in a feature request on GitHub?
ID: 111053
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 111054 - Posted: 14 Feb 2023, 11:47:07 UTC - in response to Message 111053.  

> It seems to me that BOINC should ideally do a CRC check on all downloaded files and retry if the check doesn't match.
I think it already does, but I'll check the sources.
ID: 111054
Dave
Help desk expert

Joined: 28 Jun 10
Posts: 2518
United Kingdom
Message 111055 - Posted: 14 Feb 2023, 12:52:44 UTC - in response to Message 111054.  

Thanks, no point in asking for something already there!
ID: 111055
Richard Haselgrove
Volunteer tester
Help desk expert

Joined: 5 Oct 06
Posts: 5077
United Kingdom
Message 111056 - Posted: 14 Feb 2023, 13:30:40 UTC - in response to Message 111055.  

All my CPDN data files (where this query first arose) have MD5 checksums - either calculated by the server when the file was first loaded to form part of a workunit, or (for an upload file) by the client, when that stage of the computation has completed. Application files are checked against a more sophisticated digital signature.

https://github.com/BOINC/boinc/blob/master/client/cs_files.cpp#L127 has this comment:

//  verify_contents
//      if true, validate the contents of the file based either on =
//      the digital signature of the file or its MD5 checksum.
//      Otherwise just check its existence and size.
(and more)

That looks good enough for now - we can dig deeper if the errors continue.
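
For anyone who wants to see the idea in that comment spelled out, here is a rough Python sketch of the same logic. It is not a translation of the BOINC source: `check_file`, `md5_of_file` and their parameters are names chosen for this example, and the digital-signature branch is left out entirely.

```python
import hashlib
import os


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_file(path: str, expected_size: int, expected_md5: str,
               verify_contents: bool) -> bool:
    """Always check existence and size; check the MD5 only if verify_contents is set."""
    if not os.path.isfile(path):
        return False
    if os.path.getsize(path) != expected_size:
        return False
    if verify_contents:
        return md5_of_file(path) == expected_md5.lower()
    return True
```

Note that with `verify_contents` off, a file of the right size but with corrupted contents would still pass.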
ID: 111056


Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.