Mac Pro Penryn 8 core 2.8 Ghz 2 GB Ram

Post your testing results with HandBrake.
Duke
Posts: 5
Joined: Wed Jan 09, 2008 5:55 am

Post by Duke »

Machine Type: Mac Pro Penryn
CPU Speed: 2.8 GHz
Number of CPUs: 8
Rip Format (MP4, AVI etc): M4V
Encoder: HandBrake
Video Size & settings (640x480, anamorphic, deinterlace etc): Apple TV preset, 2 pass, turbo first pass
Quality / Bit Rate: Apple TV preset, 2500
1 or 2 Pass: 2 pass
Min/Max or Average Frames Per Second (FPS): 225 fps average, running HandBrake in 2 instances. This consumes 85-90% of all processor capacity.
jbrjake
Veteran User
Posts: 4805
Joined: Wed Dec 13, 2006 1:38 am

Post by jbrjake »

Wow, I wasn't expecting benchmarks on these machines so soon!

Out of curiosity, why did you run multiple instances of HB? And was that the average fps on your turbo 1st pass or the real 2nd pass?
Duke
Posts: 5
Joined: Wed Jan 09, 2008 5:55 am

Post by Duke »

When running one instance the processor utilization was around 600% and with two instances it was 850-900%.

First pass was 225 fps; second pass was 170 fps total.
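
For anyone who wants a single end-to-end figure from those two numbers, here is a rough sketch. It assumes both passes process the same number of frames and run one after the other, which the thread does not confirm, and the function name is made up for illustration.

```python
def effective_two_pass_fps(first_pass_fps, second_pass_fps):
    """Source frames finished per second of wall time across both passes.

    Assumes each pass touches every frame exactly once and that the
    passes run back to back (an assumption, not a HandBrake statistic).
    """
    seconds_per_frame = 1.0 / first_pass_fps + 1.0 / second_pass_fps
    return 1.0 / seconds_per_frame


# Rates reported above: 225 fps turbo first pass, 170 fps second pass.
print(f"{effective_two_pass_fps(225, 170):.1f} fps overall")
```

Under those assumptions the overall rate lands a little under 100 fps, since the slower second pass dominates the total time.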
Cavalicious
Moderator
Posts: 1804
Joined: Mon Mar 26, 2007 12:07 am

Post by Cavalicious »

Hmm...now where are those lottery tickets?
Duke
Posts: 5
Joined: Wed Jan 09, 2008 5:55 am

Post by Duke »

I should clarify:

When running 2 instances of HB, the processor utilization is 85-90% for the application; added to the OS and other running apps, this yields a total utilization of 90-95%. The total CPU reading in Activity Monitor for both instances combined is 700-720% (800% being the highest possible, obviously).
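
The two ways of quoting CPU load here (per-core percentages that add up to 800 on this machine versus a share of the whole machine) can be reconciled with a small conversion. This is only a sketch; the function name is hypothetical and the 8-core default simply mirrors the machine described in this thread.

```python
def share_of_total_capacity(per_core_percent, cores=8):
    """Convert a reading where 100% means one full core into a
    percentage of the whole machine (cores * 100% = fully loaded)."""
    return per_core_percent / (cores * 100.0) * 100.0


# Combined reading for both instances quoted above: 700-720 out of 800.
low = share_of_total_capacity(700)
high = share_of_total_capacity(720)
print(f"{low:.1f}% to {high:.1f}% of total capacity")
```

That works out to roughly 87.5-90% of the machine, which is consistent with the 85-90% figure given for the application.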
Deleted User 134

Post by Deleted User 134 »

Ouch, that's swift!
sasha
Posts: 38
Joined: Mon Jan 29, 2007 3:16 pm

Re: Mac Pro Penryn 8 core 2.8 Ghz 2 GB Ram

Post by sasha »

Out of curiosity, could you try encoding a clip with the Deux Six Quatre preset @ 3000 kbps?

It is the setting I personally use and I’m wondering how this beast deals with it.

Thanks!
eddyg
Veteran User
Posts: 798
Joined: Mon Apr 23, 2007 3:34 am

Re:

Post by eddyg »

Duke wrote:When running one instance the processor utilization was around 600% and with two instances it was 850-900%.

First pass was 225 fps; second pass was 170 fps total.
There's only so much threading that you can do before interdependencies create a limit. After that it's just clock speed and efficiency of processing that will make a difference. I wonder if SSE4 will make much of an impact once the compilers and x264 get on board.

I've been thinking recently that the current thread model may require two modes, one for one or two CPUs, and another for more than that. I'm not convinced that we can use a one-size-fits-all approach and assume linear scaling from 1 to n. I don't have any design in mind, it's just a nagging feeling at present. Maybe I need an extra 4 cores to test this out on, and see whether there is a linear path from 1 to 8 that makes sense in between.

Cheers, Ed.
eddyg
Veteran User
Posts: 798
Joined: Mon Apr 23, 2007 3:34 am

Re:

Post by eddyg »

Duke wrote:I should clarify:

When running 2 instances of HB, the processor utilization is 85-90% for the application; added to the OS and other running apps, this yields a total utilization of 90-95%. The total CPU reading in Activity Monitor for both instances combined is 700-720% (800% being the highest possible, obviously).
This sounds as if HB is basically tuned for 4 cores and isn't scaling well to 8. Not that surprising given that I tuned the latest version using 4 cores, and then assumed a linear scale backwards to 1, and forwards to 8.

Then there is the fact that I'm sure x264 doesn't scale up that well past a certain number of threads. Let's face it, by definition video compression has interdependencies, both intra-frame and inter-frame, so it could be hard to split the workload.
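
One textbook way to put numbers on this kind of ceiling is Amdahl's law. It is not mentioned in this thread and nobody here has measured the serial fraction of HB or x264, so the parallel fractions below are purely hypothetical; the sketch only shows how quickly the gain from 4 to 8 cores shrinks when part of the work cannot be split.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup over one core when only `parallel_fraction`
    of the work can be spread across `cores` (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical parallel fractions, chosen only for illustration.
for p in (0.90, 0.95, 0.99):
    s4 = amdahl_speedup(p, 4)
    s8 = amdahl_speedup(p, 8)
    print(f"parallel={p:.2f}: 4 cores {s4:.2f}x, 8 cores {s8:.2f}x, "
          f"gain from doubling {s8 / s4:.2f}x")
```

Even a small serial share keeps the step from 4 to 8 cores well short of 2x, which fits the observation that one instance does not saturate all 8 cores.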

I'd be curious to see where HB blocks with the profiling tools on an 8 core to determine where the bottleneck lies. If anyone with an 8 core feels like it - go for it, and let me know where the bottleneck lies.

For starters, I know it isn't the MPEG2 decoding, which will run at 1500 fps on a 2.6 GHz Intel core.

Cheers, Ed.
Duke
Posts: 5
Joined: Wed Jan 09, 2008 5:55 am

Re: Mac Pro Penryn 8 core 2.8 Ghz 2 GB Ram

Post by Duke »

If I had the technical knowledge to do what you are asking I would gladly participate. I will help out in any way I can!

Also, would the Deux Six Quatre preset @ 3000 kbps be better than the Apple TV preset at 2500 kbps? Encoding time doesn't matter to me as I just let the thing run while I am at work. I have tried to find out more about this setting but haven't had much luck.

Thx
jbrjake
Veteran User
Posts: 4805
Joined: Wed Dec 13, 2006 1:38 am

Re: Mac Pro Penryn 8 core 2.8 Ghz 2 GB Ram

Post by jbrjake »

Duke wrote:Also, would the Deux Six Quatre preset @ 3000 kbps be better than the Apple TV preset at 2500 kbps?
No. I don't want to tell sasha what to do, but I added that preset, and in my opinion it's ridiculous to use it with anything over 2000. And I set it to 1600 on purpose.

The point of using slow x264 options is to reduce the bitrate needed to get a given quality. But when you use an absurdly excessive bitrate, you're not going to see much visible improvement over a faster encode using less intensive options, like the AppleTV ones.
sasha
Posts: 38
Joined: Mon Jan 29, 2007 3:16 pm

Re: Mac Pro Penryn 8 core 2.8 Ghz 2 GB Ram

Post by sasha »

I have chosen to use the Deux Six Quatre preset @ 3000 kbps since personally I prefer the result.

On darker and grayish areas I really started noticing bigger blocked areas that became less distracting if I increased the bitrate. Therefore I decided to use the Deux Six Quatre setting @ 3000 kbps, along with, of course, the ability to do AC3 pass-through. Personally I’m quite pleased with this setting.

At the moment I’m getting an average of 6 to 7 fps on the second pass during encoding on my Dual G5 2.7 GHz (4.5 GB RAM) and I was just wondering what Duke’s machine was able to do with this setting.

I'm using a Mac Mini as a media centre with a 750 GB Mini Max for storage.

Cheers!
Sasha
canoehead
Posts: 11
Joined: Wed Aug 22, 2007 8:56 pm

Re: Mac Pro Penryn 8 core 2.8 Ghz 2 GB Ram

Post by canoehead »

Hi - I just got one of the new Mac Pros - I was previously a Windows guy, and I am just blown away by how much better the Mac version of HB is (it goes without saying the Mac Pro is screaming fast). Not only does it not crash all the time, but the options are just better organized - and the 6 channel MP4s work properly. Anyway, I generally encode video to be played on a PS3 connected to a 46" LCD, and am generally willing to accept a somewhat bigger file size in return for high quality. I've been encoding shows or movies with a lot of action at 3500 (basically PS3 presets, then change to 3500, 2 pass with turbo - deinterlace on fast if the source material requires it), and I have convinced myself that this reduces macroblocking of grays/blacks and gives less motion blur. Am I just kidding myself here?

At any rate, thanks for a great product.
bilbo--baggins
Posts: 18
Joined: Sun Apr 15, 2007 8:50 am

Re: Mac Pro Penryn 8 core 2.8 Ghz 2 GB Ram

Post by bilbo--baggins »

I'm using HandBrake 0.9.1 with an 8 core 2.8 GHz Harpertown Penryn, 4 GB RAM. For the Apple TV setting, with Slowest deinterlace, I'm getting about 5.6 fps encode and HandBrake is using around 130% in Activity Monitor with very little else running. Any ideas why the huge discrepancy? I was really expecting something much faster. Could the deinterlace not be properly multithreaded or something?
eddyg
Veteran User
Posts: 798
Joined: Mon Apr 23, 2007 3:34 am

Re: Mac Pro Penryn 8 core 2.8 Ghz 2 GB Ram

Post by eddyg »

bilbo--baggins wrote:I'm using HandBrake 0.9.1 with an 8 core 2.8 GHz Harpertown Penryn, 4 GB RAM. For the Apple TV setting, with Slowest deinterlace, I'm getting about 5.6 fps encode and HandBrake is using around 130% in Activity Monitor with very little else running. Any ideas why the huge discrepancy? I was really expecting something much faster. Could the deinterlace not be properly multithreaded or something?
Correct - deinterlace is not multi-threaded AFAIK - turn it on and you have a bottleneck.

Cheers, Ed.
bilbo--baggins
Posts: 18
Joined: Sun Apr 15, 2007 8:50 am

Re: Mac Pro Penryn 8 core 2.8 Ghz 2 GB Ram

Post by bilbo--baggins »

eddyg wrote:Correct - deinterlace is not multi-threaded AFAIK - turn it on and you have a bottleneck.

Cheers, Ed.
Excellent. So hopefully I should get significantly better results for stuff that doesn't need deinterlacing, given that it will remove this bottleneck, and not deinterlacing is much faster anyway. Will give it a try with something more modern!
cheerful
Posts: 7
Joined: Fri Apr 25, 2008 4:51 pm

Re: Mac Pro Penryn 8 core 2.8 Ghz 2 GB Ram

Post by cheerful »

I just want to share my experiences with the community. Pardon me in advance for not really following the usual posting procedures. I appreciate all the diligence, time and effort spent by both the community and the developers/managers.

Similar system to that of the thread starter, Duke. Running off an external 120 GB Samsung 5400 rpm HDD (between 20 and 35 GB of free space).

- HandBrake v0.9.2, Mac OS 10.5.2
- Using a slightly modified Television preset (nb: I notice that this preset is highly similar to the favoured Deux Six Quatre preset discussed in this thread)
- 2-pass encoding, turbo first pass NOT enabled.
- Deinterlace and Weak Denoise ARE enabled by default.
- Picture dimensions adjusted to suit the respective video sources.
- Bitrate varies from 480 kbps to 1280 kbps (again to suit the respective video sources).
- Target size: 630 MB (typically used for shows which have lots of action scenes).
- Input source: 704 x 576 or 352 x 288 --- Output size: 704 x 560 or 320 x 256
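
The link between a target size and the resulting video bitrate in the list above is simple arithmetic. The sketch below is not HandBrake's own calculation: it ignores container overhead, treats 1 MB as 1024 x 1024 bytes, and uses a made-up 90 minute running time and 160 kbps audio track purely as example inputs.

```python
def video_bitrate_kbps(target_mb, duration_s, audio_kbps=160.0):
    """Approximate video bitrate that fits `target_mb` over `duration_s`.

    Assumptions (not taken from HandBrake): 1 MB = 1024 * 1024 bytes,
    1 kbit = 1000 bits, no container overhead, constant audio bitrate.
    """
    total_kbits = target_mb * 1024 * 1024 * 8 / 1000.0
    return total_kbits / duration_s - audio_kbps


# 630 MB target from the list above; the duration is a hypothetical example.
print(f"{video_bitrate_kbps(630, 90 * 60):.0f} kbps")
```

With those example inputs the result falls inside the 480-1280 kbps range quoted above; shorter sources push it up and longer ones push it down.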

Generally, average framerates are between 38 and 45 fps.
However, I do note that at certain times the average fps is between 120 and 200 (sorry, I can't recall the exact figures, but it is certainly above 100; nb: Deinterlace and Weak Denoise ARE enabled). Typically this occurs when the input source is 352 x 288, although not all the time. For the 704 x 576 input source, the fps is definitely 38 to 45 for this modified TV preset.

As for the CPU usage, it typically falls between 560% and 670%. Only 1 instance of HandBrake is running.
The above framerates also hold true when I perform the encode from the optical drive (Optiarc).

I've also noticed a very interesting abnormality --- there were times when, even though HandBrake was using less than 400%, the system appeared bogged down and I experienced severe unresponsiveness. E.g. when I click on the menu bar in the Finder, the click doesn't register instantaneously. This is a new Mac Pro and only HandBrake 0.9.2 has been added to the system. In fact, the system has not even gone online. I transferred HandBrake using a fresh thumb drive.

The other thing, which many people have already pointed out: a minor/major memory leak occurs after some time of encoding use, especially when there are multiple jobs in the queue.

I hope my sharing of this information will be useful to the future development and improvement of HandBrake. This is the very least of what I can offer at the moment.

Once again, thanks for everything.

p.s. Certainly I would love the 3.2 GHz machine. In fact, I do wish I could hold out till the raw GHz of 12 is out. I figured that with that number I would be able to encode pretty much at my desired speed. My previous machine is/was (resting in peace now) a TiPB G4 400.
Eug
Posts: 27
Joined: Tue Apr 10, 2007 2:01 am

Multi-core, multi-queue, and memory leaks.

Post by Eug »

cheerful wrote:I've also noticed a very interesting abnormality --- there were times when, even though HandBrake was using less than 400%, the system appeared bogged down and I experienced severe unresponsiveness. E.g. when I click on the menu bar in the Finder, the click doesn't register instantaneously. This is a new Mac Pro and only HandBrake 0.9.2 has been added to the system. In fact, the system has not even gone online. I transferred HandBrake using a fresh thumb drive.

The other thing, which many people have already pointed out: a minor/major memory leak occurs after some time of encoding use, especially when there are multiple jobs in the queue.
How many do you have in the queue? I never have more than 2, but one of the reasons is that my machine is an older 2-core iMac. I'm sure that will change once I get a 4-core iMac. With a faster clock speed and that many cores (and seeing how well HB scales with multiple cores), I could see myself making queues of 4-5 jobs. (I wonder what HyperThreading will do to the speed too.)

I anxiously await 2009, for a new release of HB, and a new iMac.
cheerful
Posts: 7
Joined: Fri Apr 25, 2008 4:51 pm

Re: Multi-core, multi-queue, and memory leaks.

Post by cheerful »

Eug wrote:How many do you have in the queue? I never have more than 2, but one of the reasons is that my machine is an older 2-core iMac. I'm sure that will change once I get a 4-core iMac. With a faster clock speed and that many cores (and seeing how well HB scales with multiple cores), I could see myself making queues of 4-5 jobs. (I wonder what HyperThreading will do to the speed too.)

I anxiously await 2009, for a new release of HB, and a new iMac.
Thanks for the tip! I'm looking forward to the new Mac Pros. 2.8 GHz with 8 cores still ain't sufficient, but these machines are pretty efficient in terms of speed, reliability and energy.

p.s. At times I have more than 9 in the queue, but on average between 3 and 4.
seong
Posts: 2
Joined: Wed Mar 28, 2007 8:59 pm

Re: Decoder

Post by seong »

eddyg wrote:For starters, I know it isn't the MPEG2 decoding, which will run at 1500 fps on a 2.6 GHz Intel core.
:shock: Am I missing a way to make decoding super fast? Libmpeg2 is single-threaded, cannot be multi-threaded by the application, and has a fairly low speed limit. So I thought my observation below was true...

In a test using a standalone decoding program (MPEG2 video stream extracted from a DVD, preloaded in memory), the best I got was 300 fps (without the color conversion of libmpeg2). And using HB with a custom PSP profile, I'm getting around 290 fps with 70~80% total CPU load, so HB is giving me about the same speed as a standalone decoder. This is on a dual Xeon 2.0 GHz (Harpertown), 1333 FSB, 4 GB, Vista 64-bit, 15k SAS.

Still in the process of confirming, but based on high-level observation: in the case of the custom profile I'm using, the CPU load ratio between decoding and H.264 encoding is roughly 1:5. If the ratio is true, then decoding won't be a bottleneck on 4 cores, but it will be a bottleneck on 8 cores.
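
The 4-core versus 8-core claim can be checked with a toy model. The sketch assumes the decoder occupies one core by itself and that encoding splits perfectly over the remaining cores; both are simplifications not confirmed in this thread, and the function name is invented for illustration. Only the 1:5 cost ratio comes from the observation above.

```python
def limiting_stage(cores, decode_cost=1.0, encode_cost=5.0):
    """Return which stage caps throughput in a two-stage pipeline.

    decode_cost / encode_cost are relative CPU costs per frame (1:5 as
    observed above). Decode is single-threaded; encode is assumed to
    scale perfectly over the other cores (an idealisation).
    """
    decode_time = decode_cost                  # stuck on one core
    encode_time = encode_cost / (cores - 1)    # spread over the rest
    return "decode" if decode_time >= encode_time else "encode"


for n in (4, 8):
    print(f"{n} cores: limited by {limiting_stage(n)}")
```

In this model the encoder is the limit on 4 cores and the single-threaded decoder becomes the limit on 8, with the crossover at 6 cores for a 1:5 ratio.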