Re: Optimal cores for Apple 2160p60 4K HEVC Surround preset?

Posted: Tue Jun 30, 2020 12:28 am
by mduell
8Ringer wrote: Mon Jun 29, 2020 8:35 pm
Is it possible to get that performance bump with a single encode? Or is x265 encoding really just limited to not fully using all 24 threads? And before you snipe back (as I know you're going to do), the fact that there is 12% more performance left on the table with 2 encodes proves that the CPU isn't out of execution resources as you glibly stated above, mduell. Are there tweaks that can be done, or is this the best I can hope for? If that's it, then that's fine; I'm just curious whether I can optimize my workflow and wring everything out of these CPUs. While 12% is nothing in the scheme of things, if it's on the table I'd like to leverage it if I can.

Sure, what are you trying to optimize for:
FPS? Pick a faster preset.
CPU usage? Try both slower and faster presets to see where it ends up. Try removing the advanced options you've picked, or picking a different aq-mode.
Seeing a significant benefit from x265 over x264? Upgrade to 4K content.

If the CPU doesn't have the right execution resources available at the right time, your encode isn't going any faster. Doing 2 encodes at the same time is going to shift those demands, since now you've got another decode thread running.

Re: Optimal cores for Apple 2160p60 4K HEVC Surround preset?

Posted: Tue Jun 30, 2020 3:41 am
by 8Ringer
I'm not trying to optimize for anything except maximum thread usage during the encode stage. The preset, as I've set it up at least, walks the line between encoder speed and quality so changing it isn't something I'm particularly interested in doing, and I'm happy with the encoding speed. Compared to the i5-3450 rig I was using previously, this machine is a rocketship.

I'm using x265, in particular, because it's generally a good bit more efficient than x264, and a lower average bitrate was something I was hoping to achieve to avoid transcoding in my Plex environment when streaming remotely: lower bitrate equals more streams I can squeeze out of my paltry upload bandwidth. The decoding burden on the other end isn't an issue, as any device I'm streaming to supports native HEVC decode. x264 encodes much faster (and curiously saturates all 24 threads even if HT isn't helping all that much, ~12% faster with HT enabled), but the tradeoff is a larger file size and higher bitrate for a given quality. The majority of my content is 1080p, but I'll be upgrading to 4K eventually. I just don't have a 4K TV, or anything 4K compatible, so it's not been a priority yet.

I mean, if I'm just hitting the limit of thread scaling for this CPU combination then that's fine, and I'm cool with the performance, but a small part of me just dislikes leaving performance on the table if I can avoid it.

My other thought is getting a pair of E5-2637 v3s or E5-2643 v3s and chucking them in, as the base clocks are so much higher that it could potentially bring a good chunk of performance. My current CPUs stay quite cool under full load, so there's plenty of cooling headroom, but the 2620s are pretty limited due to how they're configured, and Xeons just don't overclock, so that's out of the question. But then again, spending $300 on a pair of CPUs just to eke out more encoder performance is sort of a waste, realistically, as those likely come with a power efficiency penalty, even at idle.

Re: Optimal cores for Apple 2160p60 4K HEVC Surround preset?

Posted: Tue Jun 30, 2020 4:12 am
by 8Ringer
Also, I realize that me filling this thread with data and exposition when it's really about something else is sort of a tangent. I'll make another thread later on rather than continue in here.

Re: Optimal cores for Apple 2160p60 4K HEVC Surround preset?

Posted: Thu Sep 10, 2020 6:36 pm
by mike693
In case anyone is interested, I am writing to follow up with my results. Sorry for bringing this thread back after so long; I had a lot going on this spring and didn’t replace the Mac until now. I ended up getting a refurbished iMac Pro (3.2GHz 8-Core Xeon W).

I observe the following results with the “Apple 2160p60 4K HEVC Surround” preset.

For a 720p source, around 500% of CPU is utilized. The 8 physical cores are just over halfway utilized. The 8 virtual cores are partially utilized. Transcoding performance is usually double real-time (i.e., a one-hour video could transcode in half an hour).

For a 1080p source, between 950% and 1150% of CPU is utilized. The 8 physical cores are fully utilized. The 8 virtual cores are partially utilized. Transcoding performance is usually a bit faster than real-time.

For a 2160p source, almost all 1600% of CPU is utilized. All virtual and physical cores are close to 100% most of the time. Transcoding performance is usually about half real-time.

I assume the following may be true.

For a 720p source, additional physical cores would not materially increase frames per second.

For a 1080p source, a few additional physical cores might materially increase frames per second.

For a 2160p source, additional physical cores almost certainly will materially increase frames per second.

Thanks for everything!

Re: Optimal cores for Apple 2160p60 4K HEVC Surround preset?

Posted: Thu Sep 10, 2020 6:45 pm
by s55
I wouldn't necessarily associate core count with performance. For example, on a non-Xeon system that only has dual-channel memory, you could very easily hit a RAM *bandwidth* bottleneck that slows down the encode and fails to fully utilise a high core count CPU.

It can vary quite a bit depending on source material resolution, size, type, tracks and output settings.

Re: Optimal cores for Apple 2160p60 4K HEVC Surround preset?

Posted: Thu Sep 10, 2020 6:45 pm
by mduell
Your post doesn't really reflect how SMT works: there are 2n virtual threads scheduled onto n physical cores. There aren't separate groups of virtual cores and physical cores.
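
One way to read the percentages reported earlier in the thread under this model is as a fraction of all 2n hardware threads. The sketch below assumes the monitoring tool counts 100% per hardware thread, which the 1600% ceiling on an 8-core machine suggests; the function name `overall_utilization` is made up for illustration.

```python
def overall_utilization(cpu_percent: float, physical_cores: int,
                        threads_per_core: int = 2) -> float:
    """Return the fraction (0.0 to 1.0) of all hardware threads in use.

    Assumes cpu_percent is reported with 100% meaning one fully busy
    hardware thread, so an n-core SMT CPU tops out at n * 2 * 100%.
    """
    hardware_threads = physical_cores * threads_per_core
    return cpu_percent / (hardware_threads * 100.0)


if __name__ == "__main__":
    # Figures quoted in the thread for an 8-core, 16-thread Xeon W.
    for label, percent in [("720p", 500), ("1080p low", 950),
                           ("1080p high", 1150), ("2160p", 1600)]:
        share = overall_utilization(percent, physical_cores=8)
        print(f"{label}: {share:.1%} of 16 hardware threads")
```

The result says nothing about which physical core a thread landed on; it only expresses the reported load against the total thread count.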

Re: Optimal cores for Apple 2160p60 4K HEVC Surround preset?

Posted: Fri Sep 11, 2020 1:19 pm
by mike693
@s55
Thanks for that. For future reference, is there a good rule of thumb for memory bandwidth requirements (i.e., something like a multiple of file size per second?), or is it far too complex?

I usually source 1080p24 (4-5GB file) or 2160p24 (bigger) with multiple audio tracks, sub tracks, and chapters in an MKV, outputting to the stock preset Apple 2160p60 4K HEVC Surround in an MP4. (I can probably turn off interlace detection and decomb.)

@mduell
Thank you. I will try to articulate that better next time. :)

Re: Optimal cores for Apple 2160p60 4K HEVC Surround preset?

Posted: Fri Sep 11, 2020 4:35 pm
by mduell
The memory bandwidth requirements would be on the uncompressed video; no relation to the compressed video size.

Re: Optimal cores for Apple 2160p60 4K HEVC Surround preset?

Posted: Fri Sep 11, 2020 9:51 pm
by Rodeo
IIRC it's aligned_width * aligned_height * bits_per_sample * frame_rate (in bits per second), where bits_per_sample depends on the source's color format (at least 12).

Aligned width and height can vary, but aligning to a multiple of 128 should cover almost every case (e.g. 1920x1080 becomes 1920x1152, 1920x794 becomes 1920x896, etc.).
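
As a rough illustration of the formula above, here is a minimal Python sketch. The function names `align_up` and `raw_video_bits_per_second` are made up for this example, and the defaults (128 alignment, 12 bits per sample) are just the figures mentioned in this post. The result is the size of the uncompressed stream per second under that formula, not a measurement of how much memory traffic an encoder actually generates.

```python
def align_up(value: int, multiple: int = 128) -> int:
    """Round value up to the next multiple (128 is the figure from the post)."""
    return -(-value // multiple) * multiple  # ceiling division, then scale back


def raw_video_bits_per_second(width: int, height: int, frame_rate: float,
                              bits_per_sample: int = 12,
                              alignment: int = 128) -> float:
    """aligned_width * aligned_height * bits_per_sample * frame_rate."""
    aligned_width = align_up(width, alignment)
    aligned_height = align_up(height, alignment)
    return aligned_width * aligned_height * bits_per_sample * frame_rate


if __name__ == "__main__":
    # Source formats mentioned in the thread: 1080p24 and 2160p24.
    for width, height, fps in [(1920, 1080, 24), (3840, 2160, 24)]:
        bits = raw_video_bits_per_second(width, height, fps)
        print(f"{width}x{height}@{fps}: {bits / 8 / 1e6:.1f} MB/s uncompressed")
```

A source with a higher bit depth or less chroma subsampling would need a larger `bits_per_sample` than the default of 12.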