Thread 'Python assimilator with multiple upload files.'

Message boards : Server programs : Python assimilator with multiple upload files.
bdice

Joined: 5 Oct 15
Posts: 4
United States
Message 64679 - Posted: 5 Oct 2015, 3:57:53 UTC

Hello! I am a long-time fan of BOINC and have recently begun using it for my own scientific research in computational materials science.

I am developing my result assimilator program in Python using the model assimilator.py. My client software is uploading multiple files to the project/upload/ directory, but I don't know how to locate them all when using the Python assimilator.

My assimilate_handler function looks like this:

    def assimilate_handler(self, wu, results, canonical_result):
        """
        This method is called for each workunit (wu) that needs to be
        processed. A canonical result is not guaranteed and several error
        conditions may be present on the wu. Call report_errors(wu) when
        overriding this method.

        Note that the -noinsert flag (self.noinsert) must be accounted for when
        overriding this method.
        """
        # 'with' closes the log even if a write fails, and avoids a NameError
        # during cleanup if open() itself raises.
        with open('log.txt', 'a') as f:
            f.write('wu.name: %s\n' % wu.name)
            for r in results:
                # get_file_path() yields only one upload path per result.
                f.write(' *result: %s\n' % self.get_file_path(r))
            if canonical_result is not None:
                f.write(' *has canonical_result: %s\n' % canonical_result)
            f.write('wu.__dict__: %s\n\n' % str(wu.__dict__))


Unfortunately, when outputting the results, I get this:
wu.name: PROJECTNAME-a965a898c8e74af17240b1a80aad6356-1444013135
 *result: /var/lib/PROJECTNAME/project/upload/293/PROJECTNAME-a965a898c8e74af17240b1a80aad6356-1444013135_0_0
 *result: /var/lib/PROJECTNAME/project/upload/89/PROJECTNAME-a965a898c8e74af17240b1a80aad6356-1444013135_1_0


Each of those paths points to the first uploaded file of one result; there are two results because the job automatically runs twice (for error-checking/validation, I assume). How can I find the paths to the other files, whose names end in _0_1, _0_2, etc.? I know that they're stored in other subdirectories of project/upload/, but I'm not sure how to retrieve them.
Juha
Volunteer developer
Volunteer tester
Help desk expert

Joined: 20 Nov 12
Posts: 801
Finland
Message 64706 - Posted: 5 Oct 2015, 19:58:34 UTC - in response to Message 64679.  

The Python side doesn't seem to provide any convenient method for getting a list of all output files.

You could use get_file_path() as a starting point. Instead of taking only one regex match, loop over all of them.

Alternatively, if the output files of the canonical result are all you need, then script_assimilator looks like it could work.
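
One way to read the suggestion above is to collect every file name from a result's xml_doc_in rather than only one. The sketch below is an illustration under assumptions: the `<file_name>` pattern and the sample XML are guesses at what get_file_path() matches and at what xml_doc_in contains, and find_output_file_names is a hypothetical helper, not part of assimilator.py.

```python
import re

# Assumed sample of result.xml_doc_in; a real document may hold more fields.
SAMPLE_XML_DOC_IN = """
<result>
    <name>job_0</name>
    <file_ref>
        <file_name>job_0_0</file_name>
        <open_name>energy.txt</open_name>
    </file_ref>
    <file_ref>
        <file_name>job_0_1</file_name>
        <open_name>trajectory.dat</open_name>
    </file_ref>
</result>
"""


def find_output_file_names(xml_doc_in):
    """Return every <file_name> value in document order (hypothetical helper)."""
    # Non-greedy group so each tag pair is captured separately.
    return re.findall(r'<file_name>(.*?)</file_name>', xml_doc_in)


names = find_output_file_names(SAMPLE_XML_DOC_IN)
print(names)
```

Each name could then go through the same fan-out hashing that get_file_path() applies to build the full upload path.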
bdice

Joined: 5 Oct 15
Posts: 4
United States
Message 65008 - Posted: 21 Oct 2015, 4:12:45 UTC - in response to Message 64706.  

I added a couple of functions to get the full list of output files. The first follows much the same approach as the file path resolver in the default assimilator.py script, but it takes a file name and returns the full path, so that I can read the file in another function. The second returns a dictionary whose keys are the "open names" specified in the template XML and whose values are the absolute paths. I use these key/value pairs to perform specific actions on each file, based on its "open name."

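import boinc_path_config  # project path setup; keep before other BOINC imports
import os
import xml.etree.ElementTree as ET
from assimilator import Assimilator

# The two methods below belong inside the Assimilator subclass.
# (The class name here is only a placeholder.)
class MyAssimilator(Assimilator):

    def get_absolute_path(self, name):
        """Return the absolute upload path for one output file name."""
        fanout = int(self.config.uldl_dir_fanout)
        hashed = self.filename_hash(name, fanout)
        updir = self.config.upload_dir
        return os.path.join(updir, hashed, name)

    def get_multiple_file_paths(self, canonical_result):
        """Map each open_name of the canonical result to its absolute path."""
        result_files = dict()
        # xml_doc_in holds several top-level elements, so wrap it in a
        # single root element before parsing.
        rootxml = ET.fromstringlist(['<root>', canonical_result.xml_doc_in, '</root>'])
        resultxml = rootxml.find('result')
        for file_ref in resultxml.iter('file_ref'):
            file_name = file_ref.find('file_name').text
            open_name = file_ref.find('open_name').text
            result_files[open_name] = self.get_absolute_path(file_name)
        return result_files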
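
The parsing step can be tried without a running project. The following is a minimal sketch, assuming xml_doc_in has the shape of the sample string; the sample XML, the open names, and parse_open_names are made up for illustration, and the values returned are bare file names rather than absolute paths.

```python
import xml.etree.ElementTree as ET

# Assumed sample of canonical_result.xml_doc_in (several top-level elements).
SAMPLE_XML_DOC_IN = """
<file_info>
    <name>job_0_0</name>
</file_info>
<file_info>
    <name>job_0_1</name>
</file_info>
<result>
    <name>job_0</name>
    <file_ref>
        <file_name>job_0_0</file_name>
        <open_name>energy.txt</open_name>
    </file_ref>
    <file_ref>
        <file_name>job_0_1</file_name>
        <open_name>trajectory.dat</open_name>
    </file_ref>
</result>
"""


def parse_open_names(xml_doc_in):
    """Map each open_name to its file_name (hypothetical helper)."""
    # Wrap the fragment in one root element so it parses as a document.
    root = ET.fromstringlist(['<root>', xml_doc_in, '</root>'])
    result = root.find('result')
    if result is None:
        # No <result> element: nothing to map.
        return {}
    mapping = {}
    for file_ref in result.iter('file_ref'):
        mapping[file_ref.find('open_name').text] = file_ref.find('file_name').text
    return mapping


files = parse_open_names(SAMPLE_XML_DOC_IN)
for open_name, file_name in sorted(files.items()):
    print('%s -> %s' % (open_name, file_name))
```

Keying the dictionary by open name keeps per-file handling independent of the order in which the files appear in the XML.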
ChristianB
Volunteer developer
Volunteer tester

Joined: 4 Jul 12
Posts: 321
Germany
Message 65024 - Posted: 21 Oct 2015, 18:25:44 UTC

Can you make a pull request on GitHub with your changes? Thanks.

Copyright © 2024 University of California.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.