Message boards : Questions and problems : DCF Integrator
Author | Message |
---|---|
Joined: 20 Jan 09 Posts: 70 |
I don't pretend to know the inner workings of Boinc; all I can do is describe this problem tonight. All week long my host 3378680 has been crunching CPU and GPU work with the cache level set at 4 days. It has been working and downloading work to keep the cache at that level all week. Tonight I find this host downloading hundreds of work units. And I do mean many hundreds in one fell swoop. I jumped in and set No New Work. My cache level now stands at more than 9 days after all these downloads. As I said, I do not know what happened or changed. I do know that if there is an integrator on the <duration_correction_factor>, it needs to be about 10 times larger, if indeed that is what happened. It's the only thing I know of that could have done such a thing. Also posted on the Boinc alpha testing list. Thanks for listening. |
Joined: 29 Aug 05 Posts: 15625 |
Your host where? On Einstein, GPUGrid, Seti, Milkyway, Collatz, Aqua or some other project? Which version of BOINC are you using? Or is that a secret? What is your DCF for that host? And while we're at it... REMINDER TO ALL ALPHA TESTERS: |
Joined: 20 Jan 09 Posts: 70 |
Jord, of course it's on Seti. Boinc 6.10.15 was running at this time but I have seen this problem occur with every version of Boinc I have ever used. Giving you the DCF is useless as it changes every time a work unit is completed. |
Joined: 29 Aug 05 Posts: 15625 |
"Of course it's on Seti." Nothing much "of course" about that. You posted the same thing to the Alpha email list. Other than the post itself you gave no information. The developers there ask people to test 6.10 on Collatz, so "of course" just doesn't cut it. "Boinc 6.10.15 was running at this time but I have seen this problem occur with every version of Boinc I have ever used." Ah, and how come you hadn't given those snippets of information? Are we so sentient that we just know what is happening on your system? Or do you think that BOINC, just like Windows, will send secret information about your use of it to a central database somewhere? (How's that for a conspiracy theory?) "Giving you the DCF is useless as it changes every time a work unit is completed." That may be, but the DCF you have at this moment feeds into the estimate of the work you ask for and get. So having a number would be nice, if you are going to talk about it anyway. If you can't provide it, post a log with all three cc_config.xml flags (<cpu_sched_debug>, <work_fetch_debug> and <rr_simulation>), as you may well be in EDF (earliest deadline first) mode now that you got 9 days' worth of work on a 4-day request. For good measure, turn on <dcf_debug> as well, which will (weirdly enough) provide debug information on what is happening to the DCF. The more information you give, the better help you can get. When you go to the doctor to tell him it hurts somewhere in your body, you also provide as much information as you can about what kind of pain it is and where, so he can help you better. So... turn on the debug flags in cc_config.xml, allow new tasks, see if you can reproduce what you saw at the time of posting to the list, then post the log about that to the list. Or here. Or both. And while you're at it, open client_state.xml, find one or more of the affected Seti tasks and post their <rsc_fpops_est> and <rsc_fpops_bound> values. |
Joined: 20 Jan 09 Posts: 70 |
Well, I can see that this is totally a waste of my time. There is no way of knowing what happened to the DCF or what its value was at the time it made all the work requests. Perhaps this problem will resolve itself if it's ignored. |
Joined: 29 Aug 05 Posts: 15625 |
"There is no way of knowing what happened to the DCF or what its value was at the time it made all the work requests." So for the future, add the correct debug flags. Then you have a log of them. It's really not that difficult. "Perhaps this problem will resolve itself if it's ignored." Sure, stick your head in the sand. Don't run alpha software if you cannot be bothered to give more information, run with debug flags and keep (massive) logs for longer periods of time. If you just want to run BOINC, stick with the recommended version. I really don't get it: you willingly run the latest development software, you have this problem, you want it fixed, but you won't give more information or run with debug flags. Then how do you think the developers (most of whom are in or en route to Spain for the Workshop) are going to fix it? By magic? Just taking your word for it? Not good enough. |
Joined: 20 Jan 09 Posts: 70 |
Too late. As I type this that client is busy downloading hundreds more. I have put it on No new work. In fact all my boxes are going on NNW. |
Joined: 20 Jan 09 Posts: 70 |
Jord, I have no programming knowledge or data to report to you. I am only reporting unusual behavior I have observed with Boinc. I have had a steady diet of VLAR work (assigned to the CPU) lately on this machine, and now I am crunching VLAR randomly from the work list. It seems that every time a VLAR completes on the CPU, suddenly a load of work is requested for the GPU. ALL the incoming work from these 2 download bursts ended up assigned to the GPU. Again, NO DATA, just observations. Is that OK to report? |
Joined: 20 Dec 07 Posts: 1069 |
Are you using the rescheduler? [edit]And since you are using optimised applications, did you set the <flops> directive in app_info.xml?[/edit] |
Joined: 29 Aug 05 Posts: 15625 |
One last try then. Add a cc_config.xml file to your BOINC Data directory. Put these lines into it: <cc_config> <log_flags> <cpu_sched_debug>1</cpu_sched_debug> <work_fetch_debug>1</work_fetch_debug> <rr_simulation>1</rr_simulation> <dcf_debug>1</dcf_debug> </log_flags> </cc_config> Save the file and make sure it has the .xml extension, not something else. Exit and restart BOINC. Let it run like this for a minute. Run with Allow New Tasks for a minute or two. Set NNT. Post the whole log. There's no programming knowledge needed, it's not rocket scientry. Should you find some truth in Gundolf's post, then it's app related, not BOINC. Then you should post about it on the Seti forums or the Lunatics forums. And still give them more info than "Help I have a problem". |
Joined: 20 Jan 09 Posts: 70 |
"Are you using the rescheduler?" Yes I reschedule only LAR work to the CPU where it is done more efficiently. Just over 2 hours. Yes I have <flops> sections in the app_info file to force Boinc to display the actual work time required for each work unit. |
Joined: 20 Jan 09 Posts: 70 |
Jord, I can do as you asked but what will be the name of the resulting log file? |
Joined: 29 Aug 05 Posts: 147 |
Jord, They appear in your messages tab. |
Joined: 20 Jan 09 Posts: 70 |
OK, running as Jord wanted with that file. I see something called time_stats_log.xml; I guess that is the file. Right now the CPUs are all crunching VLAR work and the GPU is finishing work units at intervals of about 3 to 12 minutes, depending on the work. I will manually log the DCF when each work unit ends for the next few hours and maybe you guys can find something. VLAR takes just over 2 hours on the CPU and it looks like they won't finish for another hour or so. |
Joined: 29 Aug 05 Posts: 15625 |
"I see something called time_stats_log.xml I guess that is the file." No. BOINC Manager will show 1,000 lines of text in the Messages window (Simple view) or Messages tab (Advanced view). All normal messages are written to and stored in stdoutdae.txt in your data directory, which by default will grow to 2MB before it switches over to a new file and saves the old one as a *.old file. |
Joined: 20 Jan 09 Posts: 70 |
"I see something called time_stats_log.xml I guess that is the file." OK, it's running and making logs then. I will let it go. As I said, right now the 4 CPUs will complete 4 VLAR work units in about one hour. Meanwhile the GPU is crunching work units that take 3 to 11 minutes each. I will keep an eye on it. Will the logs show the DCF or should I manually record it? |
Joined: 29 Aug 05 Posts: 15625 |
With the <dcf_debug> flag on, it will record changes to the DCF after each task has finished. It shows in lines like this: 20-Oct-09 15:07:54 Milkyway@home [dcf] DCF: 1.148090->1.130739, raw_ratio 0.974578, adj_ratio 0.848870 |
Joined: 20 Jan 09 Posts: 70 |
OK, will let it go as is and see what happens in 1 to 1.5 hours as these 4 VLAR work units complete. By the way, it is not uploading completed work at this time. Is that normal with these flags set? Edit: never mind. I usually run with return results immediately on. With this config it's off. Waiting... |
Joined: 20 Jan 09 Posts: 70 |
Jord, it may take days to capture this event in the logs. Is there any problem in running this configuration long term until it occurs? |
Joined: 20 Dec 07 Posts: 1069 |
The only problem will be that the log file(s) will grow dramatically. Regards, Gundolf [edit]Since SETI is currently shut down for maintenance, I can't check, but there should be a thread (by Richard Haselgrove?) with a warning that your problem might occur when using the rescheduler. "Yes I reschedule only LAR work to the CPU where it is done more efficiently. Just over 2 hours." Perhaps the value you have set for <flops> isn't right, since it should prevent the DCF from varying that much. "Yes I have <flops> sections in the app_info file to force Boinc to display the actual work time required for each work unit."[/edit] |
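
For anyone following this thread with <dcf_debug> switched on, here is a minimal Python sketch for pulling the DCF changes out of a saved log instead of recording them by hand. It assumes every relevant line looks like the single Milkyway@home example quoted earlier in the thread; other client versions may format the message differently, so treat the regular expression as a guess to adjust. The names `DcfChange`, `parse_dcf_line`, `dcf_changes` and `load_changes` are made up for this example and are not part of BOINC.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Pattern modelled on the one [dcf] line quoted in the thread (an assumption:
# the real log may differ between client versions).
DCF_LINE = re.compile(
    r"^(?P<stamp>\S+ \S+) (?P<project>.+?) \[dcf\] DCF: "
    r"(?P<old>[\d.]+)->(?P<new>[\d.]+), "
    r"raw_ratio (?P<raw>[\d.]+), adj_ratio (?P<adj>[\d.]+)"
)


class DcfChange(NamedTuple):
    """One recorded change of the duration correction factor."""
    stamp: str
    project: str
    old: float
    new: float
    raw_ratio: float
    adj_ratio: float


def parse_dcf_line(line: str) -> Optional[DcfChange]:
    """Return a DcfChange for a [dcf] log line, or None for any other line."""
    match = DCF_LINE.match(line.strip())
    if match is None:
        return None
    return DcfChange(
        stamp=match["stamp"],
        project=match["project"].strip("[]"),  # tolerate a bracketed name
        old=float(match["old"]),
        new=float(match["new"]),
        raw_ratio=float(match["raw"]),
        adj_ratio=float(match["adj"]),
    )


def dcf_changes(lines: Iterable[str]) -> Iterator[DcfChange]:
    """Yield every DCF change found in an iterable of log lines."""
    for line in lines:
        change = parse_dcf_line(line)
        if change is not None:
            yield change


def load_changes(path: str) -> list:
    """Read a saved log (for example a copy of stdoutdae.txt) from disk."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(dcf_changes(handle))


# The example line quoted in the thread.
sample = ("20-Oct-09 15:07:54 Milkyway@home [dcf] DCF: 1.148090->1.130739, "
          "raw_ratio 0.974578, adj_ratio 0.848870")
change = parse_dcf_line(sample)
print(change.project, change.old, change.new)
```

Pointing `load_changes` at a copy of the log and looking at the `old` and `new` values of each entry should show whether the DCF moved sharply just before a large work request.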
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.