Thread 'Python assimilator with multiple upload files.'

bdice | Joined: 5 Oct 15 | Posts: 4 | Message 64679 - Posted: 5 Oct 2015, 3:57:53 UTC

Hello! I am a long-time fan of BOINC and have recently begun using it for my own scientific research in computational materials science. I am developing my result assimilator program in Python, using assimilator.py as the model.

My client software is uploading multiple files to the project/upload/ directory, but I don't know how to locate them all when using the Python assimilator. My assimilate_handler function looks like this:

    def assimilate_handler(self, wu, results, canonical_result):
        """
        This method is called for each workunit (wu) that needs to be
        processed. A canonical result is not guaranteed and several error
        conditions may be present on the wu. Call report_errors(wu) when
        overriding this method.

        Note that the -noinsert flag (self.noinsert) must be accounted for
        when overriding this method.
        """
        # Append one log entry per workunit; the file is closed automatically.
        with open('log.txt', 'a') as f:
            f.write('wu.name: %s\n' % wu.name)
            for r in results:
                f.write(' result: %s\n' % Assimilator.get_file_path(self, r))
            if canonical_result is not None:
                f.write(' has canonical_result: %s\n' % canonical_result)
            f.write('wu.__dict__: %s\n\n' % str(wu.__dict__))

Unfortunately, when outputting the results, I get this:

    wu.name: PROJECTNAME-a965a898c8e74af17240b1a80aad6356-1444013135
     result: /var/lib/PROJECTNAME/project/upload/293/PROJECTNAME-a965a898c8e74af17240b1a80aad6356-1444013135_0_0
     result: /var/lib/PROJECTNAME/project/upload/89/PROJECTNAME-a965a898c8e74af17240b1a80aad6356-1444013135_1_0

Both of those are the first uploaded file, once for each of the two results (the jobs automatically run twice for error-checking/validation, I assume). How could I find the paths to the other files, which end in _0_1, _0_2, etc.? I know that they're hidden in other directories in project/upload/, but I'm not sure how to retrieve them.

Juha | Volunteer developer, Volunteer tester, Help desk expert | Joined: 20 Nov 12 | Posts: 801 | Message 64706 - Posted: 5 Oct 2015, 19:58:34 UTC - in response to Message 64679.

The Python side doesn't seem to provide any nice methods to get a list of all output files. You could use get_file_path() as a starting point. Just loop over all the regex matches instead of taking only the first one.

Alternatively, if the output files of the canonical result are all you need, then the script_assimilator looks like it could work.
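For illustration, looping over all the regex matches might look roughly like the sketch below. It assumes each output file is listed in the result's xml_doc_in as a <file_name> element; the sample XML, the file names, and the helper name get_all_file_names are made up for this example and are not part of BOINC.

```python
import re

# Hypothetical xml_doc_in content for a single result with three output
# files. The real layout on a given project may differ.
SAMPLE_XML_DOC_IN = """
<file_info><name>job_0_0</name></file_info>
<file_info><name>job_0_1</name></file_info>
<file_info><name>job_0_2</name></file_info>
<result>
    <file_ref><file_name>job_0_0</file_name><open_name>energy.txt</open_name></file_ref>
    <file_ref><file_name>job_0_1</file_name><open_name>traj.xyz</open_name></file_ref>
    <file_ref><file_name>job_0_2</file_name><open_name>log.txt</open_name></file_ref>
</result>
"""


def get_all_file_names(xml_doc_in):
    """Return every <file_name> value in document order, not just the first."""
    # The non-greedy (.*?) keeps each match inside a single element.
    return re.findall(r'<file_name>(.*?)</file_name>', xml_doc_in)


if __name__ == '__main__':
    for name in get_all_file_names(SAMPLE_XML_DOC_IN):
        print(name)
```

Each returned name could then be turned into an absolute path the same way get_file_path() does it for the first file.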

bdice | Joined: 5 Oct 15 | Posts: 4 | Message 65008 - Posted: 21 Oct 2015, 4:12:45 UTC - in response to Message 64706.

I added a couple of functions to get the full list of output files. The first function takes the same approach as the file path resolver in the default assimilator.py script, but it accepts a file name directly and returns its full path, so that I can read the file in another function. The second function returns a dictionary where the keys are the "open names" specified in the template XML and the values are the absolute paths. I use these key/value pairs to perform specific actions on each file, based on its "open name."

    import boinc_path_config
    import os
    import xml.etree.ElementTree as ET
    from assimilator import Assimilator

    def get_absolute_path(self, name):
        # Resolve an uploaded file name to its absolute path inside the
        # hashed (fanout) upload directory.
        fanout = int(self.config.uldl_dir_fanout)
        hashed = self.filename_hash(name, fanout)
        updir = self.config.upload_dir
        result = os.path.join(updir, hashed, name)
        return result

    def get_multiple_file_paths(self, canonical_result):
        # Map each "open name" to the absolute path of its uploaded file.
        result_files = dict()
        # Wrap xml_doc_in in a single root element so it parses as one document.
        rootxml = ET.fromstringlist(['<root>', canonical_result.xml_doc_in, '</root>'])
        resultxml = rootxml.find('result')
        for file_ref in resultxml.iter('file_ref'):
            file_name = file_ref.find('file_name').text
            open_name = file_ref.find('open_name').text
            result_files[open_name] = self.get_absolute_path(file_name)
        return result_files
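The parsing step and the per-file dispatch described above can be tried outside a BOINC server with a sketch like the following. Everything project-specific here is an assumption for illustration: the sample XML, the open names, the helper map_open_names, and the two handler functions are hypothetical, and the sketch maps open names to bare file names rather than absolute paths because it has no project config to hash against.

```python
import xml.etree.ElementTree as ET

# Hypothetical xml_doc_in for one canonical result (not real project data).
SAMPLE_XML_DOC_IN = """
<file_info><name>job_0_0</name></file_info>
<file_info><name>job_0_1</name></file_info>
<result>
    <file_ref><file_name>job_0_0</file_name><open_name>energy.txt</open_name></file_ref>
    <file_ref><file_name>job_0_1</file_name><open_name>traj.xyz</open_name></file_ref>
</result>
"""


def map_open_names(xml_doc_in):
    """Return {open_name: file_name} for every file_ref under <result>."""
    # The fragment has several top-level elements, so wrap it in one root.
    root = ET.fromstring('<root>' + xml_doc_in + '</root>')
    result = root.find('result')
    mapping = {}
    for file_ref in result.iter('file_ref'):
        mapping[file_ref.find('open_name').text] = file_ref.find('file_name').text
    return mapping


# Hypothetical per-file actions, selected by open name.
def handle_energy(name):
    return 'parse energies from ' + name


def handle_trajectory(name):
    return 'archive trajectory ' + name


HANDLERS = {'energy.txt': handle_energy, 'traj.xyz': handle_trajectory}


def dispatch(mapping):
    """Run the matching handler for each known open name, sorted by name."""
    return [HANDLERS[open_name](mapping[open_name])
            for open_name in sorted(mapping) if open_name in HANDLERS]


if __name__ == '__main__':
    for line in dispatch(map_open_names(SAMPLE_XML_DOC_IN)):
        print(line)
```

Keying the handlers by open name rather than by upload file name keeps the assimilator independent of the workunit-specific names that BOINC generates.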

ChristianB | Volunteer developer, Volunteer tester | Joined: 4 Jul 12 | Posts: 321 | Message 65024 - Posted: 21 Oct 2015, 18:25:44 UTC

Can you make a pull request on GitHub with your changes? Thanks.

Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.