Message boards : Questions and problems : Cores vs Threads ? (hyperthread matter?)
Author | Message |
---|---|
Joined: 9 Jan 10 Posts: 18 |
For BOINC projects (since they are computationally intensive), does hyperthreading matter? For example, take a computer with 4 cores/4 threads versus one with 2 cores/4 threads (due to HT). I think the 4-core/4-thread system will get more work done, but how much more, as a percentage, than the 2-core/4-thread system? |
Joined: 18 Jan 08 Posts: 36 |
The answer is highly application dependent, and also will vary with the particular CPU implementation. In careful direct comparisons on the same system, I've commonly seen a net throughput improvement on the order of 10 to 20% when running with HT versus running with HT disabled. But there have certainly been cases well outside that range (including a pathological case in which running HT actually lowered net throughput on one short series of Einstein third-party apps). If, on the other hand, you are comparing completely different architectures or generations, then the HT portion of the comparison is of minor importance compared to everything else. |
Joined: 5 Oct 06 Posts: 5149 |
"The answer is highly application dependent, and also will vary with the particular CPU implementation." Peter, would it be fair to say that those earlier comparisons were done on NetBurst-era HT processors? Have you had any chance to repeat them on the Core iN range, or do you know anyone else who has? |
Joined: 9 Jan 10 Posts: 18 |
I looked around using NetBurst and related keywords, and found: Hyper-Threading on vSphere http://vpivot.com/2010/03/06/hyper-threading-on-vsphere/ (summary: suggests enabling HT, for a slight 10% to 24% increase), though the comments point to some debate: http://vpivot.com/2010/03/17/vsphere-4-0-hyper-threading-and-terminal-services/ Intel engineers stage CPU coup (2-year-old article) http://www.techworld.com.au/article/257165/intel_engineers_stage_cpu_coup So my guess is that HT is good, and that I should think of a 4-core/8-thread PC as being up to 30% faster than a 4-core/4-thread PC? |
Joined: 18 Jan 08 Posts: 36 |
"Would it be fair to say that those earlier comparisons were done on NetBurst-era HT processors? Have you had any chance to repeat them on the Core iN range, or do you know anyone else who has?" I wrote a long answer yesterday in this thread. Not sure if it was moderated away, or whether I failed to click on the post button after previewing it. I'll recast the text part of my answer: my own comparisons, on a system I personally owned, were done on a Gallatin, which is the large-cache variant of Northwood, which in turn was the next-process implementation of Willamette (with some appreciable improvement). So, yes, marketing called them all NetBurst, and they all were from a diseased branch of the Intel microprocessor tree--now happily cut off in favor of the vastly better Conroe and Nehalem branches. I don't currently operate any hosts capable of HT. But my most recent measurements of this kind were on msattler's Frozen Nehi. The first was before it got frozen, and sadly, it also had only one of the three channels of RAM populated at the time, rendering the results of rather limited application. Still, they showed a quite modest hyperthreading productivity benefit in two Angle Ranges which had quite a bit of work at the time, and a slight disadvantage in another Angle Range region. On the chance that embedded images are forbidden here, but links permitted, I'll include a couple of links this time: "Single RAM channel Nehalem HT comparison by AR" and "same comparison--expanded view near 0.4 AR". Much later in the Frozen Nehi's life, Mark undertook another comparison--this time on Astropulse, and this time running with RAM channels fully populated with high-performance RAM, overclocked as Mark would. With RAM starvation not getting in the way to nearly the degree seen in those first comparisons, HT showed a modest but highly consistent productivity improvement--ballpark 10%. Link: "Astropulse comparison on fully populated and overclocked system" |
Joined: 5 Oct 06 Posts: 5149 |
Thanks. IIRC, some of the earlier NetBurst experiments showed much better results with dissimilar tasks - one SETI with one Einstein was a favourite pairing. I know Mark is an avowedly one-project cruncher (SETI first, the rest nowhere) - and shortly to become a no-project cruncher, such is his disgust at the cack-handed way the latest BOINC server updates have been rolled out - so presumably mixed project pairing wasn't part of your tests with him. We've also lost Tony (mmciastro), who used to do similar testing with mainly AMD processors (if I may be permitted to use those letters in this company!). So I wonder if the opening question has yet been answered with iN technology and diverse projects? (Posting from a Willamette, as it happens.) |
Joined: 18 Jan 08 Posts: 36 |
"IIRC, some of the earlier NetBurst experiments showed much better results with dissimilar tasks - one SETI with one Einstein was a favourite pairing." No objection from me--as a former employee I have more concrete reasons to dislike Intel than most people do, though AMD fans tend to have serious blind spots to the flaws and less-than-competitive aspects of that product set. "So I wonder if the opening question has yet been answered with iN technology and diverse projects?" I've looked at app diversification benefit (specifically for Einstein and ordinary SETI) myself, as it happens, and even got the honor of having my results pointed to several times by one Joe Segur, and commented on by you. Those results were observed on a Q6600 (4-core Conroe). But I never looked at the question of application diversity benefit vs. hyperthreading (Conroe does not do HT). You are right in assuming there was nothing but SETI on Mark's system when I monitored it. I've certainly done nothing on i7 behavior in the face of app diversity, still less on the HT interaction. |
Joined: 9 Jan 10 Posts: 18 |
The serious number stat crunchers can give a much better answer, but my 'really rough' comparisons looking at the WGC project stats seem to indicate that HT may help by about 20% over non-HT. (Well, it's comparing HT-enabled i7s to non-HT Harpertowns, so my comparison is weak.) P.S. I do appreciate reading your responses. The IT folks at work can't even fathom apps that normally run at 100% of the CPU, except to parrot that they must be badly written. |
Joined: 19 Apr 09 Posts: 23 |
Darwincollins, explain to them that BOINC doesn't actually run at 100%; it uses the extra CPU cycles that aren't being used by anything else. That brings your CPU usage up to 100%. As other work asks for more CPU time, BOINC gets out of the way and takes a smaller share of the cycles. To get back on topic though, some of the people with i7s have tried running both with and without HT on. With HT the work units were slower, but there were 8 of them at a time because they were running two WUs on each core. I believe they decided it was better with HT on, but not twice as fast. If I remember right, it was about like having two extra cores, speed-wise. |
Joined: 13 Aug 06 Posts: 778 |
"The answer is highly application dependent, and also will vary with the particular CPU implementation." Last year I saw the tasks web page of a hyperthreaded computer running 8 CPDN HadAM3P climate models that were not designed for hyperthreading. IIRC it was a decent computer, but the models were advancing, I think, 8 times more slowly than on my C2D 6600. I've never seen a slower speed on any other computer. The lesson is not to use HT for CPDN models until a type is developed specially for it. This is planned, but not soon. Or if you do try it, check what's happening. |
Joined: 18 Jan 08 Posts: 36 |
"Last year I saw the tasks web page of a hyperthreaded computer running 8 CPDN HadAM3P climate models that were not designed for hyperthreading. IIRC it was a decent computer but the models were advancing I think 8 times more slowly than on my C2D 6600. I've never seen a slower speed on any other computer." I don't know what coding for HT benefit would mean, other than trying to get a smaller working set and other reductions in RAM footprint. If you saw a really dramatic slowdown, and there was not something non-comparable going on, then the most obvious possibility would be that the HT variant, with double the memory demand, pushed the system into heavy enough disk-swapping to slow it severely. That is not what was going on in the case I observed. My systems generally have substantial RAM relative to the demands of the BOINC projects I've used them on. But performance degradation when an appreciable amount of memory activity spills down to the next speed tier, whether from cache to RAM, or from RAM to disk, is really severe. So, depending on configuration and code, this could be a source of HT performance loss well below break-even. Not that HT is required to get this effect. Someone was running a monster server possessing a large number (at least eight, I think) of the Intel hex-core Dunnington chips (designed in India) on BOINC a while back. The total throughput per core and the execution time per result were just awful. I think the problem was that the system configuration provided far less RAM bandwidth per core than the smaller, also Penryn-generation, systems to which I compared it, so the processors spent most of their time waiting for RAM requests to complete. |
Joined: 5 Oct 06 Posts: 5149 |
I think Mo meant to draw a distinction with 'multithreading': the distinction is probably between eight separate applications, running eight different tasks with eight different datasets, and a single application spawning eight threads, all working on different aspects of the same task and accessing only one dataset. Hopefully the latter case would suffer from far less memory bus contention. Those CPDN HadAM3P jobs have a heavy memory demand at the best of times: I've got one at the moment which is holding almost 220 MB in RAM even while 'waiting to run'. |
Joined: 4 Sep 10 Posts: 4 |
Just wondering: if I turn HT off on my i7 930, will I get a boost in the performance of non-BOINC apps whilst crunching at 100%? As it stands now, I'm having trouble running things whilst BOINC is at full load. Would it be better to keep hyperthreading and reduce BOINC to use 7 cores? Also, I believe you can overclock higher with HT off as it won't get as hot! Could I compensate for no HT with a good OC? |
Joined: 4 Sep 10 Posts: 4 |
Hey, let me answer my own question... YES! I am seeing an improvement with other programs, such as Media Player Classic, Firefox and Opera, when HT is off and BOINC is at 100%. Audio in Media Player Classic and/or VLC would pop and crackle and wasn't playable until BOINC was off. Now that seems to be fixed with HT off. I've also been able to OC by 200 MHz with no real change in temps... waiting for my H50 to return so I can OC more. Hope this helps some ppl, Jim. |
Joined: 9 Jan 10 Posts: 18 |
I have heard similar ideas over on GPUGrid. |
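
For anyone who wants to plug their own numbers into the opening question, here is a rough back-of-envelope sketch in Python. It is only arithmetic, not a benchmark. It assumes every physical core has the same non-HT speed, and that HT adds a fixed fractional gain per core (the 10 to 20% reported above, or the "up to 30%" guess). The function names `core_equivalents` and `ht_task_slowdown` are made up for this illustration; real results depend on the application, the CPU and the memory system, as the posts above stress.

```python
def core_equivalents(physical_cores: int, ht_enabled: bool, ht_gain: float) -> float:
    """Throughput in units of one non-HT core.

    ht_gain is the assumed fractional throughput gain from HT per core
    (0.10 means +10%). This is an assumption, not a measured value.
    """
    per_core = 1.0 + ht_gain if ht_enabled else 1.0
    return physical_cores * per_core


def ht_task_slowdown(ht_gain: float) -> float:
    """Factor by which each task runs longer with HT on.

    With HT, two tasks share one core that delivers (1 + ht_gain) of
    non-HT throughput, so each task gets (1 + ht_gain) / 2 of a core.
    """
    return 2.0 / (1.0 + ht_gain)


if __name__ == "__main__":
    for gain in (0.10, 0.20, 0.30):
        quad_no_ht = core_equivalents(4, False, gain)  # 4 cores / 4 threads
        dual_ht = core_equivalents(2, True, gain)      # 2 cores / 4 threads
        advantage = quad_no_ht / dual_ht - 1.0
        print(
            f"HT gain {gain:.0%}: 4c/4t does about {advantage:.0%} more work "
            f"than 2c/4t; each task under HT takes about "
            f"{ht_task_slowdown(gain):.2f}x as long"
        )
```

Under these assumptions a 2-core/4-thread machine behaves like roughly 2.2 to 2.4 plain cores, so the 4-core/4-thread machine comes out well ahead. The i7 report of "about like having two extra cores" on a 4-core chip would correspond to a gain of 0.5 in this model, which is higher than the other figures in the thread, so treat all of these as ballpark inputs rather than predictions.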
Copyright © 2025 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.