CPU Encode much larger than GPU Encode

General questions or discussion about HandBrake, Video and/or audio transcoding, trends etc.
wyliec2
Posts: 35
Joined: Sat Apr 11, 2020 3:06 pm

CPU Encode much larger than GPU Encode

Post by wyliec2 »

Description of problem or question:

After getting a 3080 TI GPU in mid-2021, I have started experimenting with NVenc encodes versus straight CPU encodes for 4K content using H265.

As generally expected, I have found that NVenc encodes are much faster but also larger than the CPU encodes. NVenc is usually 15-20 times faster, with output sizes 50% to 100% larger than the CPU encodes.

One movie is an enormous outlier to this: its CPU-encoded output MKV is not that much smaller than the full ISO file.

The movie is SAVING PRIVATE RYAN (NOTE: I own the physical media for all content – I use a media server for convenience).

My standard CPU encode uses:
- Encoder H.265 10-bit (Passthru Common Metadata is checked)
- constant framerate/same as source
- RF 18
- Encoder profile SLOW
- Other video settings are AUTO
- Audio is always the best quality passthru – in this case TrueHD 7.1 + TrueHD Passthru

For SAVING PRIVATE RYAN the output results are:

CPU Encode
MKV = 87463 MB / 3.82 avg FPS throughput

GPU Encode
MKV = 30499 MB / 59.85 avg FPS throughput

So the CPU encode is approaching 3x LARGER than the GPU output. The full movie ISO including all languages and extras is only 95 GB, and the CPU MKV output is over 90% of that size.
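For reference, the ratios implied by the figures above can be worked out in a few lines. This is only a sketch using the numbers reported in this post; it assumes decimal units (1 GB = 1000 MB), which the post does not state.

```python
# Figures reported above for SAVING PRIVATE RYAN.
cpu_mb, cpu_fps = 87463, 3.82    # x265 10-bit, RF 18, SLOW
gpu_mb, gpu_fps = 30499, 59.85   # NVenc H.265
iso_gb = 95                      # full ISO; assumption: 1 GB = 1000 MB

size_ratio = cpu_mb / gpu_mb               # how much larger the CPU output is
speed_ratio = gpu_fps / cpu_fps            # how much faster the GPU encode ran
iso_fraction = cpu_mb / (iso_gb * 1000)    # CPU output as a share of the ISO

print(f"CPU output is {size_ratio:.2f}x the GPU output")
print(f"GPU encode ran {speed_ratio:.1f}x faster")
print(f"CPU output is {iso_fraction:.0%} of the ISO size")
```

With these inputs the CPU file comes out a little under 3x the GPU file, the GPU speedup falls inside the usual 15-20x range, and the CPU MKV is just over 90% of the ISO.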

I know the common reality with Handbrake is that every movie is different (encode speed, output size, etc) and I expect this “is what it is” – but the results are so outside the norms I’ve experienced, I thought I would present it for thoughts/comments.

The logs are below.

Steps to reproduce the problem (If Applicable):

Encode the identified movie using NVenc and CPU - compare outputs

HandBrake version (e.g., 1.0.0):

1.5.1

Operating system and version (e.g., Ubuntu 16.04 LTS, macOS 10.13 High Sierra, Windows 10 Creators Update):

Windows 10 Pro; CPU AMD 5950X; GPU 3080 TI

HandBrake Activity Log ***required*** (see How-to get an activity log)

CPU Encode
https://pastebin.com/iHeqwimy

GPU Encode
https://pastebin.com/DhCAW5w0
mduell
Veteran User
Posts: 7684
Joined: Sat Apr 21, 2007 8:54 pm

Re: CPU Encode much larger than GPU Encode

Post by mduell »

The quality scales aren't the same. Try x265 around RF 28 and see how that compares to your GPU encode.

Also RF 18 is an unreasonable setting at 4K, and for that content in particular is unsurprisingly not reducing the size.
wyliec2
Posts: 35
Joined: Sat Apr 11, 2020 3:06 pm

Re: CPU Encode much larger than GPU Encode

Post by wyliec2 »

I understand the RF scales are not comparable - I use RF 18 for my standard CPU setting because it provides quite acceptable results for me - SAVING PRIVATE RYAN is an extreme outlier which is why I brought it up.

Here are full ISO, GPU and CPU results from several 4K encodes:
- 1917 ISO: 94 GB GPU: 19053 MB CPU: 10179 MB
- SOUL ISO: 63 GB GPU: 16083 MB CPU: 7757 MB
- PATRIOTS DAY ISO: 97 GB GPU: 22136 MB CPU: 11948 MB
- BIG LEBOWSKI ISO: 62 GB GPU: 20795 MB CPU: 10893 MB
- TROLLS ISO: 77 GB GPU: 12918 MB CPU: 6654 MB

Again, these all use my standard setting (including RF18) and were done on the same machine. The above is typical for dozens of 4K encodes I've done:
- The GPU output is approximately double the CPU encode
- Output MKV of 10-15 GB is entirely acceptable for me (NOTE: the TrueHD 7.1 adds substantial size regardless of the video settings)
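To put a number on "approximately double", here is a small sketch that computes the GPU/CPU size ratio for the five titles listed above, with SAVING PRIVATE RYAN from the first post added for contrast. The values are copied from this thread; the TROLLS GPU figure is assumed to be in MB like the others.

```python
# (GPU MB, CPU MB) per title, as reported in this thread.
encodes = {
    "1917": (19053, 10179),
    "SOUL": (16083, 7757),
    "PATRIOTS DAY": (22136, 11948),
    "BIG LEBOWSKI": (20795, 10893),
    "TROLLS": (12918, 6654),            # GPU unit assumed to be MB
    "SAVING PRIVATE RYAN": (30499, 87463),
}

# Ratio above 1 means the GPU output is larger than the CPU output.
ratios = {title: gpu / cpu for title, (gpu, cpu) in encodes.items()}

for title, ratio in ratios.items():
    print(f"{title:<20} GPU/CPU = {ratio:.2f}")
```

The five typical titles all land between roughly 1.85 and 2.1, while SAVING PRIVATE RYAN comes out near 0.35, which is the inversion being asked about.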

While the above examples are 'typical,' I do have examples where the GPU and CPU outputs come out to similar sizes. Despite this, they all represent significant size reduction compared to the full ISO files.

The SAVING PRIVATE RYAN outlier here is that the CPU output is nearly 3x the size of the GPU output and represents minimal size reduction compared to the full ISO. This is something I have never seen, and I wondered if there were any thoughts or similar experiences among Handbrake users.
rollin_eng
Veteran User
Posts: 4347
Joined: Wed May 04, 2011 11:06 pm

Re: CPU Encode much larger than GPU Encode

Post by rollin_eng »

Saving Private Ryan is notorious for producing high bitrate encodes, Aliens is similar.
mduell
Veteran User
Posts: 7684
Joined: Sat Apr 21, 2007 8:54 pm

Re: CPU Encode much larger than GPU Encode

Post by mduell »

Sure, this is expected and widely experienced with that particular movie. It's super grainy and high motion, so if you try to preserve everything it gets very large.

The GPU encode isn't doing any better on efficiency than usual, it's just giving up on detail retention.
wyliec2
Posts: 35
Joined: Sat Apr 11, 2020 3:06 pm

Re: CPU Encode much larger than GPU Encode

Post by wyliec2 »

Thanks for the responses - that explains it! :)
Rodeo
HandBrake Team
Posts: 12942
Joined: Tue Mar 03, 2009 8:55 pm

Re: CPU Encode much larger than GPU Encode

Post by Rodeo »

mduell wrote: Wed Jan 19, 2022 5:17 pm The GPU encode isn't doing any better on efficiency than usual, it's just giving up on detail retention.
Spot on :-)
rollin_eng
Veteran User
Posts: 4347
Joined: Wed May 04, 2011 11:06 pm

Re: CPU Encode much larger than GPU Encode

Post by rollin_eng »

From a technical point of view, why doesn’t the GPU just throw more bits at it?
wyliec2
Posts: 35
Joined: Sat Apr 11, 2020 3:06 pm

Re: CPU Encode much larger than GPU Encode

Post by wyliec2 »

FWIW - I played with the NVenc RF values increasing quality from 18 to 16, 14, 12, 10 and finally 4. I wondered if there was a setting that would pick up the grain that the CPU was encoding. There was virtually no change in output size - the original RF18 produced 30499 MB and the largest file with the RF changes was 30542 MB. Not sure if it was a reasonable expectation that increasing quality on GPU encode would make a difference??

Conversely, I'm running CPU encodes at RF 24, 26 and 28. I started the RF28 first and it looks to be running faster - estimated completion at 9.5 hours. The RF18 encode took close to 18 hours. My 5950X is dedicated for special use so I can set up encodes and just let them run and get clean encode times without being affected by other applications.

Another FWIW - when doing CPU encodes I played with the 'threads=nn' parameter but it didn't seem to have any effect - the 16 core/32 thread processor averages around 85% regardless of setting.
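For anyone wanting to repeat an RF sweep like the one above from the command line, the following is a minimal sketch that only builds the HandBrakeCLI argument lists; it does not run anything. The file names are hypothetical, and audio, subtitle and title-selection options are left out, so this is not the exact job that was queued in the GUI.

```python
import shlex

def x265_command(source, rf, preset="slow"):
    """Build a HandBrakeCLI argument list for one x265 10-bit encode."""
    return [
        "HandBrakeCLI",
        "-i", source,                     # hypothetical input path
        "-o", f"output_rf{rf}.mkv",       # hypothetical output name
        "-e", "x265_10bit",               # software H.265 10-bit encoder
        "-q", str(rf),                    # constant quality (RF) value
        "--encoder-preset", preset,
        "--cfr",                          # constant framerate
    ]

# One command per RF value tried in the post above.
commands = [x265_command("source.iso", rf) for rf in (24, 26, 28)]

for cmd in commands:
    print(shlex.join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` to run the encodes one after another on a machine with HandBrakeCLI installed.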
Rodeo
HandBrake Team
Posts: 12942
Joined: Tue Mar 03, 2009 8:55 pm

Re: CPU Encode much larger than GPU Encode

Post by Rodeo »

rollin_eng wrote: Wed Jan 19, 2022 6:33 pm From a technical point of view, why doesn’t the GPU just throw more bits at it?
Its rate control algorithm is probably less flexible than that of a software encoder.
wyliec2 wrote: Thu Jan 20, 2022 4:34 am Conversely, I'm running CPU encodes at RF 24, 26 and 28. I started the RF28 first and it looks to be running faster - estimated completion at 9.5 hours. The RF18 encode took close to 18 hours. My 5950X is dedicated for special use so I can set up encodes and just let them run and get clean encode times without being affected by other applications.
The higher the bitrate, the more time spent in arithmetic coding, among other things. It's quite processor-intensive.
wyliec2 wrote: Thu Jan 20, 2022 4:34 am Another FWIW - when doing CPU encodes I played with the 'threads=nn' parameter but it didn't seem to have any effect - the 16 core/32 thread processor averages around 85% regardless of setting.
HEVC and x265's threading model is more complex than that of older codecs. There are thread pools made up of frame threads but also WPP (wavefront parallel processing), and sometimes slices as well; you usually need to adjust all of the parameters to see any effect -- best to leave it alone in most cases.
wyliec2
Posts: 35
Joined: Sat Apr 11, 2020 3:06 pm

Re: CPU Encode much larger than GPU Encode

Post by wyliec2 »

Again, thanks for the explanations!!

The RF28 encode completed - execution time was 9:19 (hours:mins) for 14972 MB output. Original RF18 encode took 17:40 with the 87463 MB output.
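Putting those two CPU runs side by side, along with the original GPU output, gives the following. This is a sketch based only on the figures quoted in this thread; the sizes include the passthru audio, so the video-only ratios would be somewhat more extreme.

```python
def to_seconds(hh_mm):
    """Convert an 'hours:mins' string such as '17:40' to seconds."""
    hours, minutes = hh_mm.split(":")
    return int(hours) * 3600 + int(minutes) * 60

rf18_time, rf18_mb = to_seconds("17:40"), 87463   # CPU, RF 18
rf28_time, rf28_mb = to_seconds("9:19"), 14972    # CPU, RF 28
gpu_mb = 30499                                     # NVenc output from the first post

time_ratio = rf18_time / rf28_time     # how much longer RF 18 took
size_ratio = rf18_mb / rf28_mb         # how much larger the RF 18 file is
gpu_vs_rf28 = gpu_mb / rf28_mb         # GPU output relative to CPU RF 28

print(f"RF 18 took {time_ratio:.2f}x as long as RF 28")
print(f"RF 18 output is {size_ratio:.2f}x the RF 28 output")
print(f"GPU output is {gpu_vs_rf28:.2f}x the CPU RF 28 output")
```

On these numbers RF 18 took a bit under twice as long and produced a file close to six times larger, while the GPU output is about double the CPU RF 28 output, which is back in line with the GPU/CPU ratio seen on the other titles.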

For thread usage, my only concern was if too many might be detrimental - I think I had some H264 discussions regarding this. Version 1.4.2 H265 was throwing a log message "...application requested 17 threads. No more than 16 is recommended". This looks to be gone in 1.5.1 logs.
Rodeo
HandBrake Team
Posts: 12942
Joined: Tue Mar 03, 2009 8:55 pm

Re: CPU Encode much larger than GPU Encode

Post by Rodeo »

x265 does not need as many frame threads as x264, because of WPP. If you let the encoder do its thing (i.e. don't use the threads options), you should be fine.

The latest x265 forcibly limits the number of frame threads to a maximum of 16, hence the warning is gone (but if you don't use the threads options, you'll get a much lower count of frame threads anyway, no more than 6 -- again, x265 threading doesn't work like that of x264).
wyliec2
Posts: 35
Joined: Sat Apr 11, 2020 3:06 pm

Re: CPU Encode much larger than GPU Encode

Post by wyliec2 »

A few days ago, I had run some encodes with thread count settings of 1) no parameter; 2) threads=8; 3) threads=12; and 4) threads=16.

The encode times and processor utilization were essentially identical on all 4 runs (+/- 1%).

These runs all involved the same 4K source, HB 1.5.1, H265 10-bit encodes, RF=18, SLOW preset.

I appreciate your patience in answering my questions!! I retired from a career of building/coding/optimizing systems and with hobbies of high-end A/V and high-performance desktops I'm a bit OCD on trying to find the best methodologies....
mduell
Veteran User
Posts: 7684
Joined: Sat Apr 21, 2007 8:54 pm

Re: CPU Encode much larger than GPU Encode

Post by mduell »

rollin_eng wrote: Wed Jan 19, 2022 6:33 pm From a technical point of view, why doesn’t the GPU just throw more bits at it?
wyliec2 wrote: Thu Jan 20, 2022 4:34 am FWIW - I played with the NVenc RF values increasing quality from 18 to 16, 14, 12, 10 and finally 4. I wondered if there was a setting that would pick up the grain that the CPU was encoding. There was virtually no change in output size - the original RF18 produced 30499 MB and the largest file with the RF changes was 30542 MB.
It sounds like NVENC is hitting some internal limit on how many bits it's willing or able to throw at the video, so the quality setting is effectively ignored.
Rodeo
HandBrake Team
Posts: 12942
Joined: Tue Mar 03, 2009 8:55 pm

Re: CPU Encode much larger than GPU Encode

Post by Rodeo »

wyliec2 wrote: Thu Jan 20, 2022 11:45 pm A few days ago, I had run some encodes with thread count settings of 1) no parameter; 2) threads=8; 3) threads=12; and 4) threads=16.
That is expected, since frame threads are only one component of x265's threading model. That's why it makes sense to just leave the encoder alone to decide what kind and how many of each thread type to use.