Author | Message |
TRuEQ & TuVaLu
Send message Joined: 23 May 11 Posts: 108
|
Finally
2013-02-17 08:40:45 | SETI@home | Not requesting tasks: some download is stalled
Which is the reason you're 150 jobs short of what you expect there to be. Abort the stalled download, or, in the Tasks tab, abort the relevant task that's stuck.
That's all I see of note, and not being an expert, maybe there's more.
Thank you very much for your feedback here. I hope someone can fix this.
Please get rid of the "don't need" thing, if it's not needed for testing/debugging anymore.
And make the client keep requesting new work even if SETI transfers are stalled, which they will be until more bandwidth is allocated to the servers.
But that has been asked for several years now and probably will be asked for a long time in the future.
Fix these 2 things and multi-GPU crunchers will carry on better.
ID: 47796 · |
|
Claggy
Send message Joined: 23 Apr 07 Posts: 1112
|
Please get rid of the "don't need" option, if it's not needed for testing/debugging anymore.
The "don't need" is an informational message, it is not an option. If BOINC feels it either has enough work in total, or has done enough work for a project (for its resource share), then it doesn't need any more.
BOINC used to give no informational messages about why it wasn't asking for work, so you didn't know why; now it does.
Claggy
ID: 47797 · |
|
TRuEQ & TuVaLu
Send message Joined: 23 May 11 Posts: 108
|
You're right on that, Claggy. The config is interesting in that TRuEQ & TuVaLu has set 12 concurrent downloads from any project, quite a bunch, and 'assumed' [which is bad] that the download was altogether borged permanently... not recoverable.
I wrote in another thread that only 1 download should be allowed for every computer.
Then everyone would have more download "pipes" to the server.
When installing and investigating the BM 7.0.4+ options I saw this multi-transfer option and tested it.
So far no problem with it.
I get the same stalled SETI transfers and the server backoffs as before.
The problem will come when a lot of people discover this multi-transfer option: it will need more download "pipes" from the server, which it doesn't have.
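For reference, the per-project transfer limit being discussed lives in cc_config.xml; here is a minimal sketch of that options section (the values are only illustrative, not a recommendation):
<cc_config>
  <options>
    <!-- total simultaneous file transfers across all projects -->
    <max_file_xfers>8</max_file_xfers>
    <!-- simultaneous transfers per project; raising this is the "multi transfer" setting mentioned above -->
    <max_file_xfers_per_project>2</max_file_xfers_per_project>
  </options>
</cc_config>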
ID: 47798 · |
|
TRuEQ & TuVaLu
Send message Joined: 23 May 11 Posts: 108
|
Not sure about the wisdom of controlling operations by having both an app_config.xml and an app_info.xml. The app_config is a replacement control file for app_info, so maybe there's a conflict between these 2.
They work fine together. But if I'm using both, I'd put everything in app_info that is defined there, and only spill over into app_config for the tags which can't be defined anywhere else - basically, just max_concurrent at this stage.
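As an illustration of the advice just quoted, an app_config.xml carrying only max_concurrent might look like this sketch (the app name is a placeholder and has to match the project's real application name):
<app_config>
  <app>
    <!-- placeholder; use the project's actual app name -->
    <name>astropulse_v6</name>
    <!-- the only tag kept out of app_info.xml -->
    <max_concurrent>2</max_concurrent>
  </app>
</app_config>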
When I have work it runs fine.
The GPUs get loaded as planned, with no problems.
The problem, as with BM 7.0.x up to .24 (I think), is getting work from the servers properly.
With these 2 behaviours, "don't need" and not downloading when there are stalled transfers,
it seems like being back at the start... (almost).
That is why I created this thread.
ID: 47799 · |
|
TRuEQ & TuVaLu
Send message Joined: 23 May 11 Posts: 108
|
Regarding the thread title:
"don't need" is not a server response. It is a client-generated information message, explaining why the client didn't request new work: as is clear from the context
2013-02-17 08:36:04 | Moo! Wrapper | [sched_op] Starting scheduler request
2013-02-17 08:36:04 | Moo! Wrapper | [work_fetch] request: CPU (0.00 sec, 0.00 inst) ATI (0.00 sec, 0.00 inst)
2013-02-17 08:36:04 | Moo! Wrapper | Sending scheduler request: Requested by user.
2013-02-17 08:36:04 | Moo! Wrapper | Not requesting tasks: don't need
2013-02-17 08:36:04 | Moo! Wrapper | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
2013-02-17 08:36:04 | Moo! Wrapper | [sched_op] ATI work request: 0.00 seconds; 0.00 devices
If the OP could amend the thread title, and focus on the real issue, there might be something there we can look into: I'm not quite sure why no work would be needed, from that mess of a log file.
If you need other config from cc_config.xml logging just let me know.
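For reference, the [sched_op] and [work_fetch] lines quoted in this thread come from the corresponding debug flags; a minimal sketch of that log_flags section in cc_config.xml:
<cc_config>
  <log_flags>
    <!-- produces the [sched_op] lines (scheduler requests and replies) -->
    <sched_op_debug>1</sched_op_debug>
    <!-- produces the [work_fetch] lines (work fetch decisions) -->
    <work_fetch_debug>1</work_fetch_debug>
  </log_flags>
</cc_config>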
ID: 47800 · |
|
TRuEQ & TuVaLu
Send message Joined: 23 May 11 Posts: 108
|
Regarding the thread title:
"don't need" is not a server response. It is a client-generated information message, explaining why the client didn't request new work: as is clear from the context
2013-02-17 08:36:04 | Moo! Wrapper | [sched_op] Starting scheduler request
2013-02-17 08:36:04 | Moo! Wrapper | [work_fetch] request: CPU (0.00 sec, 0.00 inst) ATI (0.00 sec, 0.00 inst)
2013-02-17 08:36:04 | Moo! Wrapper | Sending scheduler request: Requested by user.
2013-02-17 08:36:04 | Moo! Wrapper | Not requesting tasks: don't need
2013-02-17 08:36:04 | Moo! Wrapper | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
2013-02-17 08:36:04 | Moo! Wrapper | [sched_op] ATI work request: 0.00 seconds; 0.00 devices
If the OP could amend the thread title, and focus on the real issue, there might be something there we can look into: I'm not quite sure why no work would be needed, from that mess of a log file.
Is the OP me or someone else?
Can I change the title? How, and what should it say?
ID: 47801 · |
|
TRuEQ & TuVaLu
Send message Joined: 23 May 11 Posts: 108
|
Please get rid of the "don't need" option, if it's not needed for testing/debugging anymore.
The "don't need" is an informational message, it is not an option. If BOINC feels it either has enough work in total, or has done enough work for a project (for its resource share), then it doesn't need any more.
BOINC used to give no informational messages about why it wasn't asking for work, so you didn't know why; now it does.
Claggy
Well, the "new" work cache setting I have 3days + 1day of work.
But I only have like maybe 1-2 days of work and the clients response is don't need".
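For reference, a 3 + 1 day cache corresponds to the work buffer tags below; a sketch of the relevant part of global_prefs_override.xml, assuming the cache is set locally rather than on the website (values illustrative):
<global_preferences>
  <!-- "store at least" this many days of work (the 259200 s target in the work_fetch log) -->
  <work_buf_min_days>3.0</work_buf_min_days>
  <!-- "store up to an additional" this many days -->
  <work_buf_additional_days>1.0</work_buf_additional_days>
</global_preferences>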
This "don't need" must be new to the 7.0.4+ versions.
I ran 7.0.28 and all of my options, work fetch and queue worked.
But I wanted to run more than 1 cal_ati Beta AP task, so I added an app_config.xml to my SETI Beta folder and upgraded to BM 7.0.4+.
Now I see how I can fix this for me.
When I've finished the SETI Beta tasks I will downgrade to the functioning 7.0.28 version and continue as before.
That will solve this.
Thank you for your feedback.
ID: 47802 · |
|
TRuEQ & TuVaLu
Send message Joined: 23 May 11 Posts: 108
|
Where do I find a moderator here to change the thread name to "GPU starved and client gives answer 'don't need'"?
I made this post only to click the red cross and be able to contact a moderator.
ID: 47803 · |
|
Richard Haselgrove Volunteer tester Help desk expert
Send message Joined: 5 Oct 06 Posts: 5139
|
Where do I find a moderator here to change the thread name to "GPU starved and client gives answer 'don't need'"?
I made this post only to click the red cross and be able to contact a moderator.
You can change it yourself by editing that last post.
ID: 47805 · |
|
TRuEQ & TuVaLu
Send message Joined: 23 May 11 Posts: 108
|
ID: 47806 · |
|
TRuEQ & TuVaLu
Send message Joined: 23 May 11 Posts: 108
|
Where do I find a moderator here to change the thread name to "GPU starved and client gives answer 'don't need'"?
I made this post only to click the red cross and be able to contact a moderator.
You can change it yourself by editing that last post.
Now I see it.
I wonder why I hadn't noticed that option when editing before.
:)
ID: 47807 · |
|
TRuEQ & TuVaLu
Send message Joined: 23 May 11 Posts: 108
|
I removed the app_config.xml as suggested.
So now I am running only 1 SETI Beta task on GPU 0,
and running 1 Moo! Wrapper task on 3 GPUs (GPU 1, multi=1), which means use all GPUs for that task,
and as many SETI APs as I am able to download on GPU 0 and 2.
I clicked Update on SETI with 1 stalled download.
It says: transfers stalled = no new tasks.
I clicked Update on Moo! Wrapper, which has 1 task in the queue.
The queue is set to 3+1 days.
2013-02-17 18:16:09 | Moo! Wrapper | update requested by user
2013-02-17 18:16:10 | Moo! Wrapper | Sending scheduler request: Requested by user.
2013-02-17 18:16:10 | Moo! Wrapper | Not requesting tasks: project is not highest priority
2013-02-17 18:16:13 | Moo! Wrapper | Scheduler request completed
A new message for me: "Project is not highest priority"...
Any more ideas before I downgrade to 7.0.28 again?
ID: 47822 · |
|
Richard Haselgrove Volunteer tester Help desk expert
Send message Joined: 5 Oct 06 Posts: 5139
|
Any more ideas before I downgrade to 7.0.28 again?
Learn to read your own logs. Here's a single work fetch cycle from lower down, and its outcome:
2013-02-17 08:45:38 | | [work_fetch] work fetch start
2013-02-17 08:45:38 | | [work_fetch] ATI: buffer_low: yes; sim_excluded_instances 0
2013-02-17 08:45:38 | | [work_fetch] set_request(): ninst 3 nused_total 1.000000 nidle_now 0.000000 fetch share 1.000000 req_inst 0.000000
2013-02-17 08:45:38 | | [work_fetch] ------- start work fetch state -------
2013-02-17 08:45:38 | | [work_fetch] target work buffer: 259200.00 + 43200.00 sec
2013-02-17 08:45:38 | | [work_fetch] --- project states ---
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] REC 69099.529 prio -1.000982 can req work
2013-02-17 08:45:38 | OProject@Home | [work_fetch] REC 82.896 prio -0.000236 can't req work: scheduler RPC backoff (backoff: 515.56 sec)
2013-02-17 08:45:38 | SETI@home | [work_fetch] REC 59664.099 prio -0.778463 can't req work: scheduler RPC backoff (backoff: 45.02 sec)
2013-02-17 08:45:38 | SETI@home Beta Test | [work_fetch] REC 46196.671 prio -4.096825 can't req work: "no new tasks" requested via Manager
2013-02-17 08:45:38 | WUProp@Home | [work_fetch] REC 0.002 prio -0.000009 can't req work: non CPU intensive
2013-02-17 08:45:38 | FreeHAL@home | [work_fetch] REC 0.020 prio 0.000000 can't req work: non CPU intensive
2013-02-17 08:45:38 | | [work_fetch] --- state for CPU ---
2013-02-17 08:45:38 | | [work_fetch] shortfall 1209096.29 nidle 3.00 saturated 0.00 busy 0.00
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] fetch share 0.000 (blocked by prefs) (no apps)
2013-02-17 08:45:38 | OProject@Home | [work_fetch] fetch share 0.000
2013-02-17 08:45:38 | SETI@home | [work_fetch] fetch share 0.000 (blocked by prefs) (no apps)
2013-02-17 08:45:38 | SETI@home Beta Test | [work_fetch] fetch share 0.000 (blocked by prefs)
2013-02-17 08:45:38 | | [work_fetch] --- state for ATI ---
2013-02-17 08:45:38 | | [work_fetch] shortfall 842913.07 nidle 0.00 saturated 6016.78 busy 0.00
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] fetch share 1.000
2013-02-17 08:45:38 | OProject@Home | [work_fetch] fetch share 0.000 (no apps)
2013-02-17 08:45:38 | SETI@home | [work_fetch] fetch share 0.000
2013-02-17 08:45:38 | SETI@home Beta Test | [work_fetch] fetch share 0.000
2013-02-17 08:45:38 | | [work_fetch] ------- end work fetch state -------
2013-02-17 08:45:38 | Moo! Wrapper | [sched_op] Starting scheduler request
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] request: CPU (0.00 sec, 0.00 inst) ATI (842913.07 sec, 0.00 inst)
2013-02-17 08:45:38 | Moo! Wrapper | Sending scheduler request: To report completed tasks.
2013-02-17 08:45:38 | Moo! Wrapper | Reporting 1 completed tasks
2013-02-17 08:45:38 | Moo! Wrapper | Requesting new tasks for ATI
2013-02-17 08:45:38 | Moo! Wrapper | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
2013-02-17 08:45:38 | Moo! Wrapper | [sched_op] ATI work request: 842913.07 seconds; 0.00 devices
2013-02-17 08:45:43 | Moo! Wrapper | Scheduler request completed: got 10 new tasks
At that time, OProject had the highest priority (note that all numbers are negative); SETI@home had the middle priority; and Moo! Wrapper had the lowest priority.
Both OP and SETI were being delayed before attempting to pester their servers again ('RPC backoff' - Remote Procedure Call). Moo was fetchable - work was requested and allocated. That's how it works.
If you try to bypass normal scheduling by clicking the update button, BOINC will only request work from the highest priority project. We have had bugs with that: it should be the highest priority fetchable project, and I think it's fixed now in v7.0.52. You would have to match up the work fetch cycle in the log with the time you clicked the button. Please do that in the peace and comfort of your own home: we don't need a new log snippet for every twist and turn in your search.
ID: 47823 · |
|
TRuEQ & TuVaLu
Send message Joined: 23 May 11 Posts: 108
|
Any more ideas before I downgrade to 7.0.28 again?
Learn to read your own logs. Here's a single work fetch cycle from lower down, and its outcome:
2013-02-17 08:45:38 | | [work_fetch] work fetch start
2013-02-17 08:45:38 | | [work_fetch] ATI: buffer_low: yes; sim_excluded_instances 0
2013-02-17 08:45:38 | | [work_fetch] set_request(): ninst 3 nused_total 1.000000 nidle_now 0.000000 fetch share 1.000000 req_inst 0.000000
2013-02-17 08:45:38 | | [work_fetch] ------- start work fetch state -------
2013-02-17 08:45:38 | | [work_fetch] target work buffer: 259200.00 + 43200.00 sec
2013-02-17 08:45:38 | | [work_fetch] --- project states ---
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] REC 69099.529 prio -1.000982 can req work
2013-02-17 08:45:38 | OProject@Home | [work_fetch] REC 82.896 prio -0.000236 can't req work: scheduler RPC backoff (backoff: 515.56 sec)
2013-02-17 08:45:38 | SETI@home | [work_fetch] REC 59664.099 prio -0.778463 can't req work: scheduler RPC backoff (backoff: 45.02 sec)
2013-02-17 08:45:38 | SETI@home Beta Test | [work_fetch] REC 46196.671 prio -4.096825 can't req work: "no new tasks" requested via Manager
2013-02-17 08:45:38 | WUProp@Home | [work_fetch] REC 0.002 prio -0.000009 can't req work: non CPU intensive
2013-02-17 08:45:38 | FreeHAL@home | [work_fetch] REC 0.020 prio 0.000000 can't req work: non CPU intensive
2013-02-17 08:45:38 | | [work_fetch] --- state for CPU ---
2013-02-17 08:45:38 | | [work_fetch] shortfall 1209096.29 nidle 3.00 saturated 0.00 busy 0.00
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] fetch share 0.000 (blocked by prefs) (no apps)
2013-02-17 08:45:38 | OProject@Home | [work_fetch] fetch share 0.000
2013-02-17 08:45:38 | SETI@home | [work_fetch] fetch share 0.000 (blocked by prefs) (no apps)
2013-02-17 08:45:38 | SETI@home Beta Test | [work_fetch] fetch share 0.000 (blocked by prefs)
2013-02-17 08:45:38 | | [work_fetch] --- state for ATI ---
2013-02-17 08:45:38 | | [work_fetch] shortfall 842913.07 nidle 0.00 saturated 6016.78 busy 0.00
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] fetch share 1.000
2013-02-17 08:45:38 | OProject@Home | [work_fetch] fetch share 0.000 (no apps)
2013-02-17 08:45:38 | SETI@home | [work_fetch] fetch share 0.000
2013-02-17 08:45:38 | SETI@home Beta Test | [work_fetch] fetch share 0.000
2013-02-17 08:45:38 | | [work_fetch] ------- end work fetch state -------
2013-02-17 08:45:38 | Moo! Wrapper | [sched_op] Starting scheduler request
2013-02-17 08:45:38 | Moo! Wrapper | [work_fetch] request: CPU (0.00 sec, 0.00 inst) ATI (842913.07 sec, 0.00 inst)
2013-02-17 08:45:38 | Moo! Wrapper | Sending scheduler request: To report completed tasks.
2013-02-17 08:45:38 | Moo! Wrapper | Reporting 1 completed tasks
2013-02-17 08:45:38 | Moo! Wrapper | Requesting new tasks for ATI
2013-02-17 08:45:38 | Moo! Wrapper | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
2013-02-17 08:45:38 | Moo! Wrapper | [sched_op] ATI work request: 842913.07 seconds; 0.00 devices
2013-02-17 08:45:43 | Moo! Wrapper | Scheduler request completed: got 10 new tasks
At that time, OProject had the highest priority (note that all numbers are negative); SETI@home had the middle priority; and Moo! Wrapper had the lowest priority.
Both OP and SETI were being delayed before attempting to pester their servers again ('RPC backoff' - Remote Procedure Call). Moo was fetchable - work was requested and allocated. That's how it works.
If you try to bypass normal scheduling by clicking the update button, BOINC will only request work from the highest priority project. We have had bugs with that: it should be the highest priority fetchable project, and I think it's fixed now in v7.0.52. You would have to match up the work fetch cycle in the log with the time you clicked the button. Please do that in the peace and comfort of your own home: we don't need a new log snippet for every twist and turn in your search.
But even if OProject is the highest priority to fetch work from...
Weren't CPU projects and GPU projects treated differently when running the work fetch cycle?
And OProject ALX is an NCI project, and those were also treated differently from ordinary CPU projects.
But that was in the early days of BM 7.x.x.
And where can I read about the RPC process during the work fetch cycle?
I was mostly curious why the "don't need" answer came when the work cache was far from full.
And why SETI wasn't able to request more tasks when there were transfers stalled in the queue, which they are most of the day.
I can't write code in C++, so I can't do it better than the devs.
I hope I didn't bother too much.
//Me out
ID: 47824 · |
|
TRuEQ & TuVaLu
Send message Joined: 23 May 11 Posts: 108
|
I suspended a couple of SETI tasks to give room to 1 SETI Beta task.
And see this...
2013-02-18 17:25:28 | SETI@home | Sending scheduler request: To report completed tasks.
2013-02-18 17:25:28 | SETI@home | Reporting 1 completed tasks
2013-02-18 17:25:28 | SETI@home | Not requesting tasks: some task is suspended via Manager
2013-02-18 17:25:34 | SETI@home | Scheduler request completed
But I still want to download the limit of 100 tasks to my queue.
I just resumed the tasks that were suspended, so the problem is solved.
I am still evaluating this .52 version....
I am about to run some Einstein soon.
And after that I will run WCG with an app_config.xml to have it run 2 tasks on each GPU, with the CPU setting 0.45 and the GPU setting 0.5 (see the sketch below).
I will run the projects one by one since it feels more stable that way.
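A minimal sketch of what that WCG app_config.xml could look like, assuming the usual gpu_versions tags (the app name is a placeholder and must match WCG's real GPU application name):
<app_config>
  <app>
    <!-- placeholder; replace with the actual WCG GPU app name -->
    <name>hcc1</name>
    <gpu_versions>
      <!-- 0.5 GPU per task = 2 tasks per GPU -->
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.45</cpu_usage>
    </gpu_versions>
  </app>
</app_config>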
ID: 47837 · |
|