dual socket xeon saturation x265

Discussion of the HandBrake command line interface (CLI)
webeindustry
Posts: 4
Joined: Wed Jul 20, 2016 2:05 am

dual socket xeon saturation x265

Post by webeindustry » Wed Jul 20, 2016 2:09 am

I'm trying to encode with x265 on a dual-socket Xeon. It has 18 cores/36 threads per CPU, for a total of 72 threads.

It appears I can only saturate ~15-25% of this system's capacity with standard settings. How do I go about using all 72 threads with HandBrake?

I am on Ubuntu Server 16.04, so it's all CLI of course. These are Broadwell CPUs. I've tried a couple of things so far: HandBrakeCLI, and ffmpeg with various settings. They seem to show about the same usage. Using the --threads option changes nothing with ffmpeg.

Has anyone had luck saturating dual sockets with HandBrake using the x265 encoder?

Ritsuka
HandBrake Team
Posts: 1034
Joined: Fri Jan 12, 2007 11:29 am

Re: dual socket xeon saturation x265

Post by Ritsuka » Wed Jul 20, 2016 5:26 am

x265 performance doesn't scale up linearly with more threads. Your only option is to run multiple encodes.
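A minimal sketch of the "run multiple encodes" approach, assuming HandBrakeCLI is on the PATH and each source file becomes one independent job. The helper names (`build_command`, `encode_all`), the quality value, and the output naming are illustrative assumptions, not settings from this thread.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(src, out_dir, quality=22):
    """Build one HandBrakeCLI x265 command line for a single source file."""
    src, out_dir = Path(src), Path(out_dir)
    dst = out_dir / (src.stem + ".mkv")
    # -i input, -o output, -e video encoder, -q constant quality
    return ["HandBrakeCLI", "-i", str(src), "-o", str(dst),
            "-e", "x265", "-q", str(quality)]


def encode_all(sources, out_dir, jobs=4, dry_run=False):
    """Run up to `jobs` encodes at once and return the commands used."""
    commands = [build_command(s, out_dir) for s in sources]
    if dry_run:
        # Only show what would be run; nothing is launched.
        return commands
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # Each worker thread simply blocks on one external encoder process.
        list(pool.map(lambda cmd: subprocess.run(cmd, check=True), commands))
    return commands
```

Calling `encode_all(files, "out", jobs=4)` keeps four encoder processes busy until the list is exhausted; the right value for `jobs` depends on the machine and is best found by measuring.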

rollin_eng
Veteran User
Posts: 3039
Joined: Wed May 04, 2011 11:06 pm

Re: dual socket xeon saturation x265

Post by rollin_eng » Wed Jul 20, 2016 6:56 am

Could you please post your logs? Instructions can be found here:

viewtopic.php?f=6&t=31236

webeindustry
Posts: 4
Joined: Wed Jul 20, 2016 2:05 am

Re: dual socket xeon saturation x265

Post by webeindustry » Wed Jul 20, 2016 9:23 pm

I think Ritsuka answered the question, though I will go ahead and do a quick sample run to provide the community with some logs. I'm flipping over to Windows 10 afterwards for more testing.

It seems I will need to run ~4-6 encodes at a time to make good use of these processors. I've decided to do this once an evening, along with late-afternoon and overnight 3D rendering for projects (until the real purpose of this server is ready for production).


Image

And here again at the last second.

Image

Log is here:
http://pastebin.com/HmUKhdqk

I'm actually amazed at the power efficiency of these processors. The Kill A Watt never went above 2 A at 120 V. The system maxes out at 3.5 A (420 W) under Prime95. It appears the best bet for the highest x265 performance is an unlocked 6-10 core Intel CPU with an aftermarket cooler, running at 4 GHz+.

webeindustry
Posts: 4
Joined: Wed Jul 20, 2016 2:05 am

Re: dual socket xeon saturation x265

Post by webeindustry » Sun Jul 24, 2016 4:02 pm

I thought someone might find this of interest. This is ffmpeg, with the same settings on two different rigs.

On the top is a W3380, so a Westmere 6-core at 3.3 GHz.
On the bottom are the dual 2697s.

It appears there's about a 45% increase. The gist I get from this is that the best bet might be an overclocked 6-core if it's used solely for encoding.

Image

This is 6 encodes at once:

Image

It seems there's about a 20% degradation in fps with 6 encodes vs. 1.

Load goes from ~200 W with a single encode to ~365 W with six.
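A rough worked calculation from the figures above, assuming the ~20% degradation means each of the six encodes runs about 20% slower than a lone encode would. That reading is an assumption, and the function name is illustrative.

```python
def aggregate_speedup(instances, per_instance_slowdown):
    """Total throughput relative to one encode running alone."""
    return instances * (1.0 - per_instance_slowdown)


# Six encodes, each assumed to be ~20% slower than a single encode.
speedup = aggregate_speedup(6, 0.20)      # roughly 4.8x total throughput

# Wall power reported in the post: ~200 W (one encode) vs. ~365 W (six).
power_ratio = 365 / 200                   # roughly 1.8x the power draw

# Frames encoded per watt, relative to the single-encode case.
efficiency_gain = speedup / power_ratio   # roughly 2.6x

print(f"throughput x{speedup:.2f}, power x{power_ratio:.2f}, "
      f"frames per watt x{efficiency_gain:.2f}")
```

Under that assumption, six parallel encodes deliver several times the total throughput for less than double the power.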

webeindustry
Posts: 4
Joined: Wed Jul 20, 2016 2:05 am

Re: dual socket xeon saturation x265

Post by webeindustry » Thu Jul 28, 2016 3:26 pm

I got it rockin' at ~70% utilization with HandBrake + x265 on Windows 10 by disabling HT. It turns out Windows can only allocate up to 64 logical processors per processor group, and most programs are not processor-group aware, so they can't span more than one group. Disabling HT takes me from 72 logical cores to 36 physical cores, which is under 64 and so fits in a single processor group.

Not sure if HandBrake fixed this recently, or a newer ffmpeg library did, but x265 now loads more cores than x264. I'm getting 250+ fps on the rig with it set to slowest and q 18. It takes a few minutes to encode a show.
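The processor-group arithmetic can be sketched as follows. The 64-processor limit per group is the Windows limit described in the post; the even split across groups is a simplifying assumption, since the actual layout depends on how Windows maps sockets and NUMA nodes to groups.

```python
import math

MAX_LOGICAL_PER_GROUP = 64  # Windows limit on logical processors per group


def min_processor_groups(logical_processors):
    """Smallest number of processor groups needed to hold all processors."""
    return math.ceil(logical_processors / MAX_LOGICAL_PER_GROUP)


def visible_fraction(logical_processors):
    """Share of the machine a group-unaware program can use.

    Assumes processors are split evenly across groups (an assumption).
    """
    return 1.0 / min_processor_groups(logical_processors)


for label, count in [("HT on", 72), ("HT off", 36)]:
    print(label, count, "logical ->",
          min_processor_groups(count), "group(s),",
          f"{visible_fraction(count):.0%} visible to a group-unaware program")
```

With HT on, 72 logical processors need two groups, so a program confined to one group sees only part of the machine. With HT off, all 36 fit in one group.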

nhyone
Bright Spark User
Posts: 198
Joined: Fri Jul 24, 2015 4:13 am

Re: dual socket xeon saturation x265

Post by nhyone » Mon Aug 01, 2016 5:43 am

When running multiple instances, are you passing any parameters to x265 to restrict the threads each instance uses?

For example, if you are running 4 instances, I think it would be more efficient (overall) if you set each instance to use 9 distinct cores via processor affinity rather than let them fight over 36 cores.

As for HT, you should be able to avoid using the HT cores by using processor affinity as well. An HT core has only about 15% of the compute power of a real core (based on my tests on Ivy Bridge and Haswell CPUs), so it's best to think of them as extras (i.e. 9 cores + a 15% boost instead of 18 logical cores).
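A minimal sketch of the affinity idea on Linux, assuming the `taskset` utility from util-linux is available. The helper names are illustrative, and treating logical CPUs 0-35 as the 36 physical cores is an assumption: how logical CPU numbers map to physical cores and HT siblings varies by system, so check with `lscpu -e` first.

```python
def partition_cores(core_ids, instances):
    """Split core_ids into `instances` contiguous, disjoint chunks."""
    core_ids = list(core_ids)
    size, extra = divmod(len(core_ids), instances)
    chunks, start = [], 0
    for i in range(instances):
        # The first `extra` chunks take one additional core each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(core_ids[start:end])
        start = end
    return chunks


def pinned_command(cmd, cores):
    """Prefix cmd with taskset so the process runs only on `cores`."""
    cpu_list = ",".join(str(c) for c in cores)
    return ["taskset", "-c", cpu_list] + list(cmd)


# Four instances over 36 cores: each gets its own set of 9.
# IDs 0-35 are assumed to be one logical CPU per physical core.
core_sets = partition_cores(range(36), 4)
example = pinned_command(["HandBrakeCLI", "-i", "in.mkv", "-o", "out.mkv"],
                         core_sets[0])
```

Each resulting command can be launched as its own process, so the instances no longer compete for the same cores. Leaving the HT siblings out of every set is how affinity can be used to avoid them.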