Request for (cpu-)benchmarks

Speed kills.
N!ghtW4lk3r
Posts: 18
Joined: Mon May 19, 2014 5:14 pm

Request for (cpu-)benchmarks

Post by N!ghtW4lk3r » Mon Feb 27, 2017 1:57 pm

Hey guys,

first of all:
I don't know where to post this. For now it is just a question; in the future it will hopefully become a collection of benchmarks. So feel free to move it if you think another subforum would be a better fit.

Since I plan to upgrade my CPU (and other hardware) from an i7-3770 to an i7-7700(K) or Ryzen, I'm curious about some HandBrake CPU benchmarks...

therefore i downloaded the blender-rendering "bbb_sunflower_1080p_30fps_normal.mp4" (link later in this post). Encoding this file with my current i7-3770 (source and destination on the same hdd) with the following settings took 14m34s.

ui-settings:

Code:

settings
    QuickSync:          disabled
    x264 granularity:   .25 (default)
picture
    size:               1920x1080
    cropping:           custom, 0:0:0:0
video
    codec:              x264
    fps:                constant, same-as-source
    optimize:           slower, film, high, 4.1
    quality:            constant, 18.5
audio:
    1st stream:         pass-through (2.0)
    2nd stream:         ignore (5.1)
filters: none
container: mkv
command line:

Code:

set input=q:\bench\bbb_sunflower_1080p_30fps_normal.mp4
set output=q:\bench\out.mkv
set log=q:\bench\log.txt
set handbrake=%programfiles%\handbrake\handbrakecli.exe
"%handbrake%" --disable-qsv-decoding --format av_mkv --crop 0:0:0:0 --width 1920 --height 1080 --encoder x264 --encoder-preset slower --encoder-tune film --encoder-profile high --encoder-level 4.1 --quality 18.5 --rate 30 --cfr -x level=4.1:deblock=-1,-1:psy-rd=1,0.15:ref=4:analyse=all:b-adapt=2:direct=auto:me=umh:rc-lookahead=60:subme=9:trellis=2:vbv-bufsize=78125:vbv-maxrate=62500 --audio 1 --aencoder copy -i "%input%" -o "%output%" 2> "%log%"
pause
So if you have 10-20 minutes to run the same encode and post your time (plus logs and hardware) here, that would be really nice :)
To get the encoding time, open the log file, search for "reader: done", and calculate the difference between the timestamp on that line and the timestamp on the line before it. The average fps is on the next line. On Windows 10 (and possibly 7, 8, and 8.1) the logs can be found here:

Code:

C:\Users\%username%\AppData\Roaming\HandBrake\logs
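
If you'd rather not do the timestamp subtraction by hand, here is a minimal Python sketch of the "reader: done" rule described above. It assumes every relevant log line starts with a `[HH:MM:SS]` stamp, as in HandBrake 1.0.x logs; the function name `encoding_time` and the sample lines are only for illustration.

```python
import re
from datetime import datetime, timedelta

# Timestamp prefix such as "[10:01:31]" at the start of a log line
STAMP = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]")

def encoding_time(log_text):
    """Time between the "reader: done" line and the previous timestamped line."""
    previous = None
    for line in log_text.splitlines():
        match = STAMP.match(line)
        if not match:
            continue  # lines without a timestamp (e.g. x264 info) are skipped
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        if "reader: done" in line:
            if previous is None:
                return None
            elapsed = stamp - previous
            if elapsed < timedelta(0):
                elapsed += timedelta(days=1)  # the encode ran past midnight
            return elapsed
        previous = stamp
    return None  # no "reader: done" line found

sample = """[09:54:06] sync: first pts video is 6000
[09:54:06] sync: Chapter 1 at frame 1 time 6000
[10:01:31] reader: done. 1 scr changes
[10:01:34] work: average encoding speed for job is 42.620026 fps"""
print(encoding_time(sample))  # 0:07:25
```

Encodes longer than 24 hours would need the date as well, which these logs do not carry.
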
Downloads (if it's not OK to post direct links or links to other software, feel free to edit my post):
http://distribution.bbb3d.renderfarming ... normal.mp4

If your CPU is already listed here, post your results anyway. This will hopefully confirm the benchmarks.

Disclaimer: many things can affect benchmark results, so even if you own the listed hardware you may get different results. Possible reasons include the operating system, security software, updates, drivers, other running software, and hardware (mainboard, RAM timings, SSD). Because of that, these benchmarks can ONLY give a HINT about the differences!

EDIT1: added i7-4770 benchmark, thx @ rollin_eng
EDIT2: added i7-6700 benchmark, thx @ RobD
EDIT3: removed cpu-z and encoding log
EDIT4: added i7-5820k, E5-2670, E5-2670v2 and E5-2660v3 benchmarks, thx @ Admiral_Akbar and nhyone

results
[os]; [ram]; [cpu]; [overclocked]; [source]; [destination]; [handbrake-version]; [fps]; [encoding-time]; [ui-or-cli]; [username]
Windows 10 v1607; 16GB; i7-3770 (IvyBridge); no; hdd1; hdd1; 1.0.3; 21.69fps; 14m34s; cli; N!ghtW4lk3r
Windows 8.1; 32GB; i7-4770 (Haswell); ?; ssd1; ssd1; 1.0.3; 24.56fps; 12m52s; cli; rollin_eng
Windows 10 v1607; 32GB; i7-6700 (Skylake); stock 3.40GHz; ssd; ssd; 1.0.3; 28.45fps; 11m07s; cli; RobD
Windows 8.1; 16GB; i7-5820k (HaswellE); overclocked 4.3GHz; hdd; hdd; 20170311203819-3a4beb1-master; 36.398fps; 8m45s; cli; Admiral_Akbar
?; ?; E5-2670 (SandyBridge); ?; LAN; LAN; ?; 41.055fps; ?; ?; nhyone
?; ?; E5-2670v2 (IvyBridge); ?; LAN; LAN; ?; 49.841fps; ?; ?; nhyone
Linux; ?; E5-2660v3 (Haswell); ?; LAN; LAN; 1.0.2; 42.620fps; 7m25s; ?; nhyone
Last edited by N!ghtW4lk3r on Fri Mar 24, 2017 10:26 am, edited 5 times in total.

Woodstock
Veteran User
Posts: 1836
Joined: Tue Aug 27, 2013 6:39 am

Re: Request for (cpu-)benchmarks

Post by Woodstock » Mon Feb 27, 2017 3:25 pm

If your objective is to figure out what speed comes from which CPU's features, you should run the test using a specific command line (rather than a GUI preset).

N!ghtW4lk3r
Posts: 18
Joined: Mon May 19, 2014 5:14 pm

Re: Request for (cpu-)benchmarks

Post by N!ghtW4lk3r » Tue Feb 28, 2017 4:23 pm

What I want to know is how much another CPU (e.g. i7-7700K or Ryzen 7 1700X or ...) will reduce encoding time using the same settings.

Hopefully this question isn't too stupid, but what's the benefit of using the command line? The only advantage I see is that it's easier to reuse my settings, since you only need to copy and paste the command, adjust the file paths and execute it. But my personal experience is that posting long, cryptic command lines stops most people from reading...

But anyway, this is my windows-command line.

Code:

set input=q:\bench\bbb_sunflower_1080p_30fps_normal.mp4
set output=q:\bench\out.mkv
set log=q:\bench\log.txt
set handbrake=%programfiles%\handbrake\handbrakecli.exe
"%handbrake%" --disable-qsv-decoding --format av_mkv --strict-anamorphic --crop 0:0:0:0 --width 1920 --height 1080 --encoder x264 --encoder-preset slower --encoder-tune film --encoder-profile high --encoder-level 4.1 --quality 18.5 --rate 30 --cfr -x level=4.1:deblock=-1,-1:psy-rd=1,0.15:ref=4:analyse=all:b-adapt=2:direct=auto:me=umh:rc-lookahead=60:subme=9:trellis=2:vbv-bufsize=78125:vbv-maxrate=62500 --audio 1 --aencoder copy -i "%input%" -o "%output%" 2> "%log%"
pause
But I have two problems:
  • The file is not binary-identical to the one created by the UI. The only difference I see in the settings dump is "Framerate Shaper (1:27000000:900000)" for the command line versus "Framerate Shaper (mode=1)" for the UI.
  • Is it possible to write log files? Redirecting the process output to a file with pipes only writes the encoding progress... (solved)
EDIT:
changed command line to create a log file
Last edited by N!ghtW4lk3r on Wed Mar 01, 2017 3:21 pm, edited 1 time in total.

Woodstock
Veteran User
Posts: 1836
Joined: Tue Aug 27, 2013 6:39 am

Re: Request for (cpu-)benchmarks

Post by Woodstock » Tue Feb 28, 2017 4:54 pm

When using the command line, you have to redirect stderr to the file. For Windows/linux/unix systems (which should include Mac), that would be "2>error" (where "error" is the name of the file you want it written to) on the end of the command line.
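
For anyone launching the encode from a script instead of a shell, the same redirection can be sketched in Python. This is only an illustration: `run_with_log` is a made-up helper name, and the child process here is a stand-in that writes one line to stderr, to be swapped for the real HandBrakeCLI command.

```python
import os
import subprocess
import sys
import tempfile

def run_with_log(command, log_path):
    """Run command and send its stderr to log_path, like `2> log_path` in a shell."""
    # Binary mode: the child writes straight to the file descriptor.
    with open(log_path, "wb") as log_file:
        completed = subprocess.run(command, stderr=log_file)
    return completed.returncode

# Stand-in child process; replace with the HandBrakeCLI command line as a list.
demo = [sys.executable, "-c", "import sys; sys.stderr.write('reader: done\\n')"]
log_path = os.path.join(tempfile.gettempdir(), "hb_demo_log.txt")
exit_code = run_with_log(demo, log_path)
```

stdout is left alone, so the encoding progress still shows up in the console while the log goes to the file.
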

Using the command line is "better" because it is consistent across all operating systems. There are subtle differences in the GUI implementations which can affect the speed.

rollin_eng
Veteran User
Posts: 2049
Joined: Wed May 04, 2011 11:06 pm

Re: Request for (cpu-)benchmarks

Post by rollin_eng » Tue Feb 28, 2017 6:08 pm

I had to alter your cli as the strict anamorphic is not valid.

Code:

set input=d:\bbb_sunflower_1080p_30fps_normal.mp4
set output=d:\out.mkv
set handbrake=%programfiles%\handbrake\handbrakecli.exe
"%handbrake%" --disable-qsv-decoding --format av_mkv --crop 0:0:0:0 --width 1920 --height 1080 --encoder x264 --encoder-preset slower --encoder-tune film --encoder-profile high --encoder-level 4.1 --quality 18.5 --rate 30 --cfr -x level=4.1:deblock=-1,-1:psy-rd=1,0.15:ref=4:analyse=all:b-adapt=2:direct=auto:me=umh:rc-lookahead=60:subme=9:trellis=2:vbv-bufsize=78125:vbv-maxrate=62500 --audio 1 --aencoder copy -i "%input%" -o "%output%" 2>>d:\out.txt
pause
i7 4770 with 32gig ram, windows 8.1, read/write to temp ssd, HB 1.0.3

12min 52sec.
24.560808 fps.

N!ghtW4lk3r
Posts: 18
Joined: Mon May 19, 2014 5:14 pm

Re: Request for (cpu-)benchmarks

Post by N!ghtW4lk3r » Wed Mar 01, 2017 3:19 pm

@woodstock:
nice. the "2>" did the trick :)

@rollin_eng:
thx. added your results to my first post.
The anamorphic switch is strange. I ran my original batch with the anamorphic switch and everything went fine.

rollin_eng
Veteran User
Posts: 2049
Joined: Wed May 04, 2011 11:06 pm

Re: Request for (cpu-)benchmarks

Post by rollin_eng » Wed Mar 01, 2017 3:45 pm

Are you using 1.0.3 CLI?

N!ghtW4lk3r
Posts: 18
Joined: Mon May 19, 2014 5:14 pm

Re: Request for (cpu-)benchmarks

Post by N!ghtW4lk3r » Thu Mar 02, 2017 3:15 pm

I think so. The UI executable has a version attribute of 1.0.3. The CLI executable has no version attribute, and no switch that I can see to display version information.

EDIT:
My fault. I did not know that the command-line version is not installed along with the ui version. seems that I was using some old stuff :oops:

rollin_eng
Veteran User
Posts: 2049
Joined: Wed May 04, 2011 11:06 pm

Re: Request for (cpu-)benchmarks

Post by rollin_eng » Thu Mar 02, 2017 3:24 pm

Handbrakecli --version

N!ghtW4lk3r
Posts: 18
Joined: Mon May 19, 2014 5:14 pm

Re: Request for (cpu-)benchmarks

Post by N!ghtW4lk3r » Thu Mar 02, 2017 3:36 pm

Yeah. The current version has this switch; my old version does not. It must be (very) old, since I can't remember installing the CLI version... Anyway, it seems I have to re-run my benchmark...

RobD
Posts: 7
Joined: Tue Jan 17, 2017 5:58 pm

Re: Request for (cpu-)benchmarks

Post by RobD » Thu Mar 02, 2017 8:36 pm

Used the command-line code that rollin_eng posted a few posts above me.

Win10 v1607; 32GB; i7-6700; stock 3.40GHz; SSD; SSD; 1.0.3; 28.451494 fps; 11:07; RobD

Admiral_Akbar
New User
Posts: 1
Joined: Tue Mar 21, 2017 12:51 am

Re: Request for (cpu-)benchmarks

Post by Admiral_Akbar » Tue Mar 21, 2017 12:57 am

Used the command-line code that rollin_eng posted.

Win 8.1; 16GB; i7-5820k; Overclocked 4.3GHz; HDD; HDD; 20170311203819-3a4beb1-master; 36.397678 fps; 8:45; Cli; Admiral_Akbar

nhyone
Bright Spark User
Posts: 162
Joined: Fri Jul 24, 2015 4:13 am

Re: Request for (cpu-)benchmarks

Post by nhyone » Wed Mar 22, 2017 7:33 am

CPU is often the bottleneck for video encoding. Looking at a random CPU benchmark, i7-7700K @ 4.2 GHz is 40% faster than i7-3770 @ 3.4 GHz. I would expect it to be in the ballpark for HandBrake.
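
As a back-of-the-envelope check, here is a small Python sketch of what a 40% throughput gain would mean for the 14m34s baseline from the first post. It assumes the synthetic benchmark gain carries over to HandBrake one-to-one, which is exactly the ballpark assumption above and nothing more.

```python
def projected_time(seconds, speedup_percent):
    """Scale an encode time by a throughput gain given in percent."""
    return seconds / (1 + speedup_percent / 100)

baseline = 14 * 60 + 34  # 14m34s on the i7-3770
estimate = projected_time(baseline, 40)
minutes, secs = divmod(round(estimate), 60)
print(f"{minutes}m{secs:02d}s")  # 10m24s
```
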

If you are encoding HEVC, Haswell and above will be much faster due to AVX2.

I have two suggestions:

1. Do you need to use slower preset? Can you live with veryfast? It has the best speed/size tradeoff, at the expense of some quality. It is around 9x faster than slower.

2. Do you need 1080p? I would keep the source and encode to 720p (for 10", 27" to 40" screen) or 480p (<8" screen). It saves both size and bandwidth. For 12-24" screens, some people can accept 720p, some can't -- due to viewing distance.

nhyone
Bright Spark User
Posts: 162
Joined: Fri Jul 24, 2015 4:13 am

Re: Request for (cpu-)benchmarks

Post by nhyone » Thu Mar 23, 2017 6:30 am

Wow, I got really interesting results.

Code:

                   Sandy B                    Ivy B                  Haswell
                   E5-2670 2.60 GHz           E5-2670v2 2.50 GHz     E5-2660v3 2.60 GHz
                #  fps     cpu size        #  fps     cpu size       fps     cpu size
1080p slower   32   41.055 40% 393.8 MB   40   49.841 40% 395.1 MB    42.620 30% 395.0 MB
1080p slower   16   27.704 60% 392.5 MB   20   30.025 60% 392.3 MB    33.674 60% 392.4 MB
1080p slower    8   23.083 80% 379.1 MB   10   27.279 70% 379.2 MB    31.446 90% 379.2 MB
1080p slower    4   13.506  -  379.1 MB    4   13.430  -  379.1 MB    15.457  -  379.1 MB

Code:

                   Sandy B                    Ivy B                  Haswell
                   E5-2670 2.60 GHz           E5-2670v2 2.50 GHz     E5-2660v3 2.60 GHz
                #  fps     cpu size        #  fps     cpu size       fps     cpu size
1080p veryfast 32  187.802 45% 379.4 MB   40  196.326 45% 379.4 MB   135.237 30% 379.4 MB
1080p veryfast 16  125.227 75% 374.4 MB   20  148.385 75% 376.9 MB   116.606 55% 376.9 MB
1080p veryfast  8  102.107 85% 364.0 MB   10  128.762 80% 365.9 MB    94.910 65% 365.9 MB
1080p veryfast  4   64.640 90% 362.1 MB    4   63.421 90% 362.1 MB    69.276 90% 362.1 MB

Code:

                   Sandy B                    Ivy B                  Haswell
                   E5-2670 2.60 GHz           E5-2670v2 2.50 GHz     E5-2660v3 2.60 GHz
                #  fps     cpu size        #  fps     cpu size       fps     cpu size
1080p slower   32   41.055 40% 393.8 MB   40   49.841 40% 395.1 MB    42.620 30% 395.2 MB
 720p slower   32   75.724 50% 261.2 MB   40   84.813 50% 261.2 MB    77.997 40% 261.2 MB
 480p slower   32  119.323 40% 152.3 MB   40  128.311 30% 152.5 MB   126.017 40% 152.5 MB
Details
  • 3 machines (simultaneously), 1 file server, accessed over 1000Base-T network
  • Processors (E5-2670, 8 cores per CPU): #32 = 2 CPU x 8 cores HyperThread, #16 = 1 CPU x 8 cores HT, #8 = 1 CPU x 8 cores, #4 = 1 CPU x 4 cores; using CPU affinity
  • Processors (E5-2670v2 and E5-2660v3, 10 cores per CPU): #40 = 2 CPU x 10 cores HT, #20 = 1 CPU x 10 cores HT, #10 = 1 CPU x 10 cores, #4 = 1 CPU x 4 cores; using CPU affinity
  • For slower preset, the user-provided encoding options are used. The only diff from the default preset is "ref=4"
  • For veryfast preset, the default encoding options are used
  • "threads=" is used to limit the threads used
  • cpu% is for the active processors only. If 10 out of 40 processors are used, the other 30 are ~0%
  • Size is in MB (1,000,000 bytes)
Results
  • x264 does not scale well to high number of threads; it cannot even saturate the processors used
  • Performance is sometimes worse on Haswell [CPU contention, throttling or memory bandwidth issue?]
  • veryfast preset can only achieve 3-4x speed-up for high #processors used
  • File size depends on the number of threads
  • Results for veryfast preset, 480p and 720p are for comparison
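
To put a number on "does not scale well", here is a small Python sketch that compares the measured speed-up with ideal linear scaling, using the E5-2670 1080p slower figures from the first table. The 4-processor run is taken as the baseline, and "linear" counts HyperThreaded logical processors as full cores, which is generous; the helper name `scaling` is just for this example.

```python
# fps for the 1080p slower runs on the E5-2670, keyed by logical processors used
fps_by_processors = {4: 13.506, 8: 23.083, 16: 27.704, 32: 41.055}

def scaling(fps_by_processors):
    """Return (processors, speed-up vs smallest run, fraction of linear scaling)."""
    base_n = min(fps_by_processors)
    base_fps = fps_by_processors[base_n]
    rows = []
    for n in sorted(fps_by_processors):
        speedup = fps_by_processors[n] / base_fps
        ideal = n / base_n
        rows.append((n, speedup, speedup / ideal))
    return rows

for n, speedup, efficiency in scaling(fps_by_processors):
    print(f"{n:>2} processors: {speedup:.2f}x speed-up, {efficiency:.0%} of linear")
```

Going from 4 to 32 processors gives roughly a 3x speed-up on 8x the processors.
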

Log for the E5-2660v3, 40 threads, 1080p, slower encoding:

Code:

[09:54:05] hb_init: starting libhb thread
[09:54:05] thread 7ff7cfd60700 started ("libhb")
HandBrake 1.0.2 (2017020700) - Linux x86_64 - https://handbrake.fr
40 CPUs detected
Opening bbb_sunflower_1080p_30fps_normal.mp4...
[09:54:05] CPU: Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
[09:54:05]  - Intel microarchitecture Haswell
[09:54:05]  - logical processor count: 40
[09:54:05] hb_scan: path=bbb_sunflower_1080p_30fps_normal.mp4, title_index=1
udfread ERROR: ECMA 167 Volume Recognition failed
disc.c:274: failed opening UDF image bbb_sunflower_1080p_30fps_normal.mp4
disc.c:352: error opening file BDMV/index.bdmv
disc.c:352: error opening file BDMV/BACKUP/index.bdmv
[09:54:05] bd: not a bd - trying as a stream/file instead
libdvdnav: Using dvdnav version 5.0.1
libdvdread: Encrypted DVD support unavailable.
libdvdread:DVDOpenFileUDF:UDFFindFile /VIDEO_TS/VIDEO_TS.IFO failed
libdvdread:DVDOpenFileUDF:UDFFindFile /VIDEO_TS/VIDEO_TS.BUP failed
libdvdread: Can't open file VIDEO_TS.IFO.
libdvdnav: vm: failed to read VIDEO_TS.IFO
[09:54:05] dvd: not a dvd - trying as a stream/file instead
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'bbb_sunflower_1080p_30fps_normal.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 1
    compatible_brands: isomavc1
    creation_time   : 2013-12-16 17:44:39
    title           : Big Buck Bunny, Sunflower version
    artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
    comment         : Creative Commons Attribution 3.0 - http://bbb3d.renderfarming.net
    genre           : Animation
    composer        : Sacha Goedegebure
  Duration: 00:10:34.60, start: 0.000000, bitrate: 3481 kb/s
    Stream #0:0(und): Video: h264 (High) [avc1 / 0x31637661]
      yuv420p, 1920x1080 [PAR 1:1 DAR 16:9], 2998 kb/s
      30 fps, 30k tbn (default)
    Metadata:
      creation_time   : 2013-12-16 17:44:39
      handler_name    : GPAC ISO Video Handler
    Stream #0:1(und): Audio: mp3 [mp4a / 0x6134706D]
      48000 Hz, 2 channels, s16p, 160 kb/s (default)
    Metadata:
      creation_time   : 2013-12-16 17:44:42
      handler_name    : GPAC ISO Audio Handler
    Stream #0:2(und): Audio: ac3 [ac[45]3 / 0x332D6361]
      48000 Hz, 5.1, fltp, 320 kb/s (default)
    Metadata:
      creation_time   : 2013-12-16 17:44:42
      handler_name    : GPAC ISO Audio Handler
    Side data:
      audio service type: main
[09:54:05] scan: decoding previews for title 1
[09:54:05] scan: audio 0x1: mp3, rate=48000Hz, bitrate=160000 Unknown (MP3) (2.0 ch)
[09:54:06] scan: audio 0x2: ac3, rate=48000Hz, bitrate=320000 Unknown (AC3) (5.1 ch)

Scanning title 1 of 1, preview 4, 40.00 %
Scanning title 1 of 1, preview 10, 100.00 %[09:54:06] scan: 10 previews, 1920x1080, 30.000 fps, autocrop = 0/0/0/0, aspect 16:9, PAR 1:1
[09:54:06] libhb: scan thread found 1 valid title(s)
+ Using preset: CLI Default
+ title 1:
  + stream: bbb_sunflower_1080p_30fps_normal.mp4
  + duration: 00:10:34
  + size: 1920x1080, pixel aspect: 1/1, display aspect: 1.78, 30.000 fps
  + autocrop: 0/0/0/0
  + support opencl: no
  + chapters:
    + 1: cells 0->0, 0 blocks, duration 00:10:34
  + audio tracks:
    + 1, Unknown (MP3) (2.0 ch) (iso639-2: und)
    + 2, Unknown (AC3) (5.1 ch) (iso639-2: und), 48000Hz, 320000bps
  + subtitle tracks:
[09:54:06] 1 job(s) to process
[09:54:06] json job:
{
    "Audio": {
        "AudioList": [
            {
                "Encoder": 1125984256,
                "Track": 0
            }
        ],
        "CopyMask": [
            "copy:aac",
            "copy:ac3",
            "copy:eac3",
            "copy:dtshd",
            "copy:dts",
            "copy:mp3",
            "copy:truehd",
            "copy:flac"
        ],
        "FallbackEncoder": "fdk_aac"
    },
    "Destination": {
        "ChapterList": [
            {
                "Name": ""
            }
        ],
        "ChapterMarkers": false,
        "File": "out/bbb_sunflower_1080p_30fps_normal.mach500.threads40.slower.1080p.out.mkv",
        "Mp4Options": {
            "IpodAtom": false,
            "Mp4Optimize": false
        },
        "Mux": "mkv"
    },
    "Filters": {
        "FilterList": [
            {
                "ID": 6,
                "Settings": {
                    "mode": 1,
                    "rate": "27000000/900000"
                }
            },
            {
                "ID": 11,
                "Settings": {
                    "crop-bottom": 0,
                    "crop-left": 0,
                    "crop-right": 0,
                    "crop-top": 0,
                    "height": 1080,
                    "width": 1920
                }
            }
        ]
    },
    "Metadata": {
        "Artist": "Blender Foundation 2008, Janus Bager Kristensen 2013",
        "Comment": "Creative Commons Attribution 3.0 - http://bbb3d.renderfarming.net",
        "Composer": "Sacha Goedegebure",
        "Genre": "Animation",
        "Name": "Big Buck Bunny, Sunflower version"
    },
    "PAR": {
        "Den": 1,
        "Num": 1
    },
    "SequenceID": 0,
    "Source": {
        "Angle": 0,
        "Path": "bbb_sunflower_1080p_30fps_normal.mp4",
        "Range": {
            "End": 1,
            "Start": 1,
            "Type": "chapter"
        },
        "Title": 1
    },
    "Subtitle": {
        "Search": {
            "Burn": true,
            "Default": false,
            "Enable": false,
            "Forced": false
        },
        "SubtitleList": []
    },
    "Video": {
        "ColorMatrixCode": 0,
        "Encoder": "x264",
        "Level": "4.1",
        "OpenCL": false,
        "Options": "threads=40:level=4.1:deblock=-1,-1:psy-rd=1,0.15:ref=4:analyse=all:b-adapt=2:direct=auto:me=umh:rc-lookahead=60:subme=9:trellis=2:vbv-bufsize=78125:vbv-maxrate=62500",
        "Preset": "slower",
        "Profile": "high",
        "QSV": {
            "AsyncDepth": 4,
            "Decode": false
        },
        "Quality": 18.5,
        "Tune": "film",
        "Turbo": false,
        "TwoPass": false
    }
}
[09:54:06] starting job
[09:54:06] Auto Passthru: allowed codecs are AAC, AC3, E-AC3, TrueHD, DTS, DTS-HD, MP3, FLAC
[09:54:06] Auto Passthru: fallback is AAC (FDK)
[09:54:06] Auto Passthru: using MP3 Passthru for track 1
[09:54:06] job configuration:
[09:54:06]  * source
[09:54:06]    + bbb_sunflower_1080p_30fps_normal.mp4
[09:54:06]    + title 1, chapter(s) 1 to 1
[09:54:06]    + container: mov,mp4,m4a,3gp,3g2,mj2
[09:54:06]    + data rate: 3481 kbps
[09:54:06]  * destination
[09:54:06]    + out/bbb_sunflower_1080p_30fps_normal.mach500.threads40.slower.1080p.out.mkv
[09:54:06]    + container: Matroska (libavformat)
[09:54:06]  * video track
[09:54:06]    + decoder: h264
[09:54:06]      + bitrate 2998 kbps
[09:54:06]    + filters
[09:54:06]      + Framerate Shaper (mode=1:rate=27000000/900000)
[09:54:06]        + frame rate: 30.000 fps -> constant 30.000 fps
[09:54:06]      + Crop and Scale (width=1920:height=1080:crop-top=0:crop-bottom=0:crop-left=0:crop-right=0)
[09:54:06]        + source: 1920 * 1080, crop (0/0/0/0): 1920 * 1080, scale: 1920 * 1080
[09:54:06]    + Output geometry
[09:54:06]      + storage dimensions: 1920 x 1080
[09:54:06]      + pixel aspect ratio: 1 : 1
[09:54:06]      + display dimensions: 1920 x 1080
[09:54:06]    + encoder: H.264 (libx264)
[09:54:06]      + preset:  slower
[09:54:06]      + tune:    film
[09:54:06]      + options: threads=40:level=4.1:deblock=-1,-1:psy-rd=1,0.15:ref=4:analyse=all:b-adapt=2:direct=auto:me=umh:rc-lookahead=60:subme=9:trellis=2:vbv-bufsize=78125:vbv-maxrate=62500
[09:54:06]      + profile: high
[09:54:06]      + level:   4.1
[09:54:06]      + quality: 18.50 (RF)
[09:54:06]  * audio track 1
[09:54:06]    + decoder: Unknown (MP3) (2.0 ch) (track 1, id 0x1)
[09:54:06]      + bitrate: 160 kbps, samplerate: 48000 Hz
[09:54:06]    + MP3 Passthru
[09:54:06] sync: expecting 19038 video frames
[09:54:06] encx264: min-keyint: 30, keyint: 300
[09:54:06] encx264: encoding at constant RF 18.500000
[09:54:06] encx264: unparsed options: threads=40:level=4.1:deblock=-1,-1:psy-rd=1,0.15:ref=4:analyse=all:b-adapt=2:direct=auto:me=umh:rc-lookahead=60:subme=9:trellis=2:vbv-bufsize=78125:vbv-maxrate=62500
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 AVX2 LZCNT BMI2
x264 [info]: profile High, level 4.1
[09:54:06] sync: first pts audio 0x1 is 0
[09:54:06] sync: first pts video is 6000
[09:54:06] sync: Chapter 1 at frame 1 time 6000
[10:01:31] reader: done. 1 scr changes
[10:01:34] work: average encoding speed for job is 42.620026 fps
[10:01:34] vfr: 19038 frames output, 0 dropped and 2 duped for CFR/PFR
[10:01:34] vfr: lost time: 0 (0 frames)
[10:01:34] vfr: gained time: 0 (0 frames) (0 not accounted for)
[10:01:34] mp3-decoder done: 26425 frames, 0 decoder errors
[10:01:34] h264-decoder done: 19036 frames, 0 decoder errors
[10:01:34] sync: got 19036 frames, 19038 expected
[10:01:34] sync: framerate min 30.000 fps, max 30.000 fps, avg 30.000 fps
x264 [info]: frame I:154   Avg QP:12.00  size:323523
x264 [info]: frame P:6421  Avg QP:17.00  size: 39587
x264 [info]: frame B:12463 Avg QP:22.54  size:  6259
x264 [info]: consecutive B-frames:  5.2% 13.0% 28.8% 53.0%
x264 [info]: mb I  I16..4: 13.1% 63.4% 23.5%
x264 [info]: mb P  I16..4:  1.8%  7.0%  0.8%  P16..4: 28.0%  8.0%  6.9%  0.4%  0.2%    skip:46.8%
x264 [info]: mb B  I16..4:  0.3%  0.8%  0.1%  B16..8: 21.4%  2.2%  0.5%  direct: 1.1%  skip:73.6%  L0:41.0% L1:49.4% BI: 9.5%
x264 [info]: 8x8 transform intra:70.7% inter:47.3%
x264 [info]: direct mvs  spatial:99.8% temporal:0.2%
x264 [info]: coded y,uvDC,uvAC intra: 59.4% 57.6% 30.7% inter: 7.3% 6.6% 1.4%
x264 [info]: i16 v,h,dc,p: 32% 19%  8% 41%
x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 19% 12% 16%  5%  8% 13%  8%  9% 10%
x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 18% 11% 10%  7% 12% 14% 10%  8% 10%
x264 [info]: i8c dc,h,v,p: 48% 22% 17% 13%
x264 [info]: Weighted P-Frames: Y:4.1% UV:1.9%
x264 [info]: ref P L0: 62.7% 15.7% 15.1%  6.0%  0.5%  0.0%
x264 [info]: ref B L0: 81.1% 15.9%  3.0%
x264 [info]: ref B L1: 97.3%  2.7%
x264 [info]: kb/s:4815.76
[10:01:36] mux: track 0, 19038 frames, 382003974 bytes, 4815.43 kbps, fifo 1024
[10:01:36] mux: track 1, 26425 frames, 12684000 bytes, 159.89 kbps, fifo 2048
[10:01:36] libhb: work result = 0

Encode done!
HandBrake has exited.
Edited: added slower 4 cores, veryfast results.

Edit 2: it looks like the increase in output file size is due to lookahead-threads (40 processors = 10 lookahead threads). OTOH, it could be the bottleneck at high #processors. Speed vs size tradeoff.

PS: I just realized the source file is 276.1 MB, so the output is bigger than the input! CRF 18.5 asks for more quality than this source needs.

N!ghtW4lk3r
Posts: 18
Joined: Mon May 19, 2014 5:14 pm

Re: Request for (cpu-)benchmarks

Post by N!ghtW4lk3r » Fri Mar 24, 2017 10:40 am

@Admiral_Akbar: thx

@nhyone: thx :)

HEVC would be nice. But not all of my devices (TV, tablet, ...) support hardware HEVC decoding. So I could possibly store in HEVC and let my NAS transcode it. For a single stream this might work, but accessing multiple files at the same time? I don't think so. Testing HEVC (slower preset) with my current CPU and the Blender movie, the fps stays below 3 (medium preset: around 13 fps).

1) Since this is about digitizing and not about transcoding, quality comes before speed.

2) Well, the idea is to be able to access all my movies easily from any device. So yes, I keep the original source on disc. Since there are multiple target devices from 7" up to 55", I store in 1080p, and if I know the target device will be 7" or 10" I can (temporarily) downsize the file to 720p with a faster preset.

3) Yes, CRF 18.5 is too much for this file. As I said, I use these settings for digitizing 1080p movies, and I used the Blender movie because it is publicly available and everyone can access it. This (animated) file is not really representative of my usual case: if I encode a movie with my settings I'm often below 10 fps. But I am also assuming that if, e.g., an i7-4770 is about 13% faster than an i7-3770 when encoding an animated file, the improvement when encoding a real movie should be around 6-7% (~50% less than the improvement for an animated movie)...
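
The 13% figure can be reproduced from the average fps values in the results list of the first post. A minimal Python sketch (the halving to 6-7% for real movies remains a guess and is not derived here):

```python
# Average fps from the results list (HandBrake 1.0.3, same command line)
results = {"i7-3770": 21.69, "i7-4770": 24.56, "i7-6700": 28.45}

def gain_percent(fps, baseline_fps):
    """Throughput gain over the baseline, in percent."""
    return (fps / baseline_fps - 1) * 100

for cpu, fps in results.items():
    print(f"{cpu}: {gain_percent(fps, results['i7-3770']):+.1f}%")
```
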

4) Yeah, the file-size thing is interesting. So is the fact that with the slower preset and all cores the encoding times rank E5-2670v2 < E5-2660v3 < E5-2670, but with the veryfast preset the E5-2670 seems to be much better than the E5-2660v3.

nhyone
Bright Spark User
Posts: 162
Joined: Fri Jul 24, 2015 4:13 am

Re: Request for (cpu-)benchmarks

Post by nhyone » Sat Mar 25, 2017 5:08 pm

The increase in file size isn't strange in hindsight. It is the tradeoff of lookahead: encoding speed vs size efficiency. For smallest size, it is best to use lookahead of one thread, which then limits encoding speed (because the encoding threads are starved).

Haswell+ CPUs have a 40% speed increase due to the use of AVX2 in x265 v1.9. It is extremely significant.

The low CPU usage and Haswell inversion were surprising to me. I had not seen it before. Previously, my results were pretty standard:
  • Overall, Haswell is 5% faster than IvyB, and IvyB is 10% faster than SandyB, and SandyB just blows X5650 out of the water
  • Clock-for-clock, per-core, it's a different story. Haswell is 10% faster than IvyB; IvyB is 3% faster than SandyB
  • veryfast is 3x faster than medium and medium is 3x faster than slower. This might have been specific to my test video/settings and I generalized it wrongly. It still has the best time/size/quality trade-off, though
  • Encoding speed increases linearly. HyperThread speeds up by 15%. If two physical CPUs are used, there may be no penalty, or up to 25%! (Speed-up is 1.5x instead of the expected 2x)
  • CPU usage is in the range of 60-80%
Because the Haswell machine performed so poorly, I tested it on another Haswell machine, but got the same results. I then checked the physical machine to make sure the memory configuration was optimal. It should be -- all 8 memory slots were filled in (4 per CPU). 4-channel doesn't matter much, but the slots must be filled in correctly.

Maybe when I have some time, I'll investigate why I got these strange results. :D

nhyone
Bright Spark User
Posts: 162
Joined: Fri Jul 24, 2015 4:13 am

Re: Request for (cpu-)benchmarks

Post by nhyone » Tue Apr 04, 2017 2:34 am

I ran some more tests (CRF 18.5, veryfast, high, level 4.1, tune film) on a Haswell machine:

Code:

        Th/LA  cpu   fps         size
2x10x2  40/1    15%   83.138443  367,969,283
        40/2    20%   89.687965  368,693,090
        40/4    25%  105.647507  372,379,622
        40/10   35%  140.687973  381,441,796
        40/20   40%  167.331696  386,208,589
        40/40   40%  149.477509  386,198,677
        40/60   40%  174.871857  386,172,894
        40/80   40%  172.604691  386,191,572
There are two physical CPUs, each with 10 cores (HyperThreaded).

Lookahead thread is indeed the first limiter. But there are three problems with using more lookahead threads: (i) it is still not able to saturate the CPU, (ii) encoding efficiency goes down, (iii) 2-CPU overhead [not apparent here].
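
The speed/size trade-off in the 40-thread table above can be expressed relative to the single-lookahead-thread run. A small Python sketch using exactly those figures (the helper name `relative_to_first` is only for this example):

```python
# (lookahead threads, fps, output size in bytes) for the 40-thread runs above
runs = [
    (1, 83.138443, 367_969_283),
    (2, 89.687965, 368_693_090),
    (4, 105.647507, 372_379_622),
    (10, 140.687973, 381_441_796),
    (20, 167.331696, 386_208_589),
    (40, 149.477509, 386_198_677),
    (60, 174.871857, 386_172_894),
    (80, 172.604691, 386_191_572),
]

def relative_to_first(runs):
    """Speed factor and size growth (percent) of each run versus the first one."""
    _, base_fps, base_size = runs[0]
    return [(la, fps / base_fps, (size / base_size - 1) * 100)
            for la, fps, size in runs]

for la, speed, growth in relative_to_first(runs):
    print(f"LA {la:>2}: {speed:.2f}x speed, {growth:+.1f}% size")
```

With 20 lookahead threads the encode is about twice as fast as with one, for about 5% more output.
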

How about using just one CPU?

Code:

        Th/LA  cpu   fps         size
1x10x1  10/1    55%   80.206764  362,821,037    # using one set of cores
1x10x2  20/1    30%   72.592583  368,378,414    # w/HT
1x10x2  20/2    40%   86.008881  369,142,986    # w/HT
(CPU% is for the active CPUs only.)

We need 2 lookahead threads to feed 20 processors, but size goes up. Even with one lookahead thread, size depends on the number of threads. 20 will use more space than 10.

And lastly, how about going slow:

Code:

        Th/LA  cpu   fps         size
1x 1x1   1/1   100%   22.973820  362,678,693
1x 1x2   1/1    70%   24.140602  362,678,693    # using one core w/HT
1x 2x1   1/1    70%   27.276686  362,678,693    # using two cores
1x 2x2   4/1    60%   46.623272  362,610,642    # using two cores w/HT
1x 3x2   6/1    40%   60.376286  362,628,534
(CPU% is for the active CPUs only.)

Surprisingly, using several encoding threads is more space efficient than single-threading, but it is very slight and could be specific to this video.


Basically, the takeaways for a large multi-core machine:
  • Encode in parallel
  • Use just one lookahead thread for the smallest size
  • Use 4-8 threads per encoding for maximum CPU efficiency
  • Don't use cores from different CPUs on the same encoding [not apparent from the results here]
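
A rough Python sketch of how these takeaways could be scripted: several encodes at once, each limited to a few x264 threads and one lookahead thread. Treat it as an illustration under assumptions. The helper names are invented, `lookahead-threads` is the x264 option assumed here for the lookahead thread count, the binary is assumed to be on the PATH as `HandBrakeCLI`, and nothing pins jobs to one physical CPU, so the last point would still need CPU affinity set separately.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def build_command(source, output, threads, lookahead_threads=1):
    """Build one HandBrakeCLI call limited to a fixed number of x264 threads."""
    x264_options = f"threads={threads}:lookahead-threads={lookahead_threads}"
    return [
        "HandBrakeCLI",
        "--format", "av_mkv",
        "--encoder", "x264",
        "--encoder-preset", "veryfast",
        "--quality", "18.5",
        "-x", x264_options,
        "-i", source,
        "-o", output,
    ]

def plan_jobs(sources, logical_cpus, threads_per_job):
    """Return (number of simultaneous jobs, one command per source file)."""
    simultaneous = max(1, logical_cpus // threads_per_job)
    commands = [
        build_command(src, src.rsplit(".", 1)[0] + ".out.mkv", threads_per_job)
        for src in sources
    ]
    return simultaneous, commands

def run_all(sources, logical_cpus=40, threads_per_job=8):
    """Run the planned encodes, at most `simultaneous` at a time."""
    simultaneous, commands = plan_jobs(sources, logical_cpus, threads_per_job)
    with ThreadPoolExecutor(max_workers=simultaneous) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))
```

On a 40-processor machine with 8 threads per job, `plan_jobs` allows five encodes at a time; `run_all` is the only part that actually starts HandBrakeCLI.
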

Log for 40 threads / 10 lookahead threads:

Code:

[17:10:43] hb_init: starting libhb thread
[17:10:43] thread 7f9457526700 started ("libhb")
HandBrake 1.0.3 (2017032400) - Linux x86_64 - https://handbrake.fr
40 CPUs detected
Opening bbb_sunflower_1080p_30fps_normal.mp4...
[17:10:43] CPU: Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz
[17:10:43]  - Intel microarchitecture Haswell
[17:10:43]  - logical processor count: 40
[17:10:43] hb_scan: path=bbb_sunflower_1080p_30fps_normal.mp4, title_index=1
udfread ERROR: ECMA 167 Volume Recognition failed
disc.c:274: failed opening UDF image bbb_sunflower_1080p_30fps_normal.mp4
disc.c:352: error opening file BDMV/index.bdmv
disc.c:352: error opening file BDMV/BACKUP/index.bdmv
[17:10:43] bd: not a bd - trying as a stream/file instead
libdvdnav: Using dvdnav version 5.0.1
libdvdread: Encrypted DVD support unavailable.
libdvdread:DVDOpenFileUDF:UDFFindFile /VIDEO_TS/VIDEO_TS.IFO failed
libdvdread:DVDOpenFileUDF:UDFFindFile /VIDEO_TS/VIDEO_TS.BUP failed
libdvdread: Can't open file VIDEO_TS.IFO.
libdvdnav: vm: failed to read VIDEO_TS.IFO
[17:10:43] dvd: not a dvd - trying as a stream/file instead
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'bbb_sunflower_1080p_30fps_normal.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 1
    compatible_brands: isomavc1
    creation_time   : 2013-12-16 17:44:39
    title           : Big Buck Bunny, Sunflower version
    artist          : Blender Foundation 2008, Janus Bager Kristensen 2013
    comment         : Creative Commons Attribution 3.0 - http://bbb3d.renderfarming.net
    genre           : Animation
    composer        : Sacha Goedegebure
  Duration: 00:10:34.60, start: 0.000000, bitrate: 3481 kb/s
    Stream #0:0(und): Video: h264 (High) [avc1 / 0x31637661]
      yuv420p, 1920x1080 [PAR 1:1 DAR 16:9], 2998 kb/s
      30 fps, 30k tbn (default)
    Metadata:
      creation_time   : 2013-12-16 17:44:39
      handler_name    : GPAC ISO Video Handler
    Stream #0:1(und): Audio: mp3 [mp4a / 0x6134706D]
      48000 Hz, 2 channels, s16p, 160 kb/s (default)
    Metadata:
      creation_time   : 2013-12-16 17:44:42
      handler_name    : GPAC ISO Audio Handler
    Stream #0:2(und): Audio: ac3 [ac[45]3 / 0x332D6361]
      48000 Hz, 5.1, fltp, 320 kb/s (default)
    Metadata:
      creation_time   : 2013-12-16 17:44:42
      handler_name    : GPAC ISO Audio Handler
    Side data:
      audio service type: main
[17:10:43] scan: decoding previews for title 1
[17:10:43] scan: audio 0x1: mp3, rate=48000Hz, bitrate=160000 Unknown (MP3) (2.0 ch)
[17:10:43] scan: audio 0x2: ac3, rate=48000Hz, bitrate=320000 Unknown (AC3) (5.1 ch)

Scanning title 1 of 1, preview 4, 40.00 %
Scanning title 1 of 1, preview 10, 100.00 %
[17:10:44] scan: 10 previews, 1920x1080, 30.000 fps, autocrop = 0/0/0/0, aspect 16:9, PAR 1:1
[17:10:44] libhb: scan thread found 1 valid title(s)
+ Using preset: CLI Default
+ title 1:
  + stream: bbb_sunflower_1080p_30fps_normal.mp4
  + duration: 00:10:34
  + size: 1920x1080, pixel aspect: 1/1, display aspect: 1.78, 30.000 fps
  + autocrop: 0/0/0/0
  + support opencl: no
  + chapters:
    + 1: cells 0->0, 0 blocks, duration 00:10:34
  + audio tracks:
    + 1, Unknown (MP3) (2.0 ch) (iso639-2: und)
    + 2, Unknown (AC3) (5.1 ch) (iso639-2: und), 48000Hz, 320000bps
  + subtitle tracks:
[17:10:44] 1 job(s) to process
[17:10:44] json job:
{
    "Audio": {
        "AudioList": [
            {
                "Encoder": 1125984256,
                "Track": 0
            }
        ],
        "CopyMask": [
            "copy:aac",
            "copy:ac3",
            "copy:eac3",
            "copy:dtshd",
            "copy:dts",
            "copy:mp3",
            "copy:truehd",
            "copy:flac"
        ],
        "FallbackEncoder": "fdk_aac"
    },
    "Destination": {
        "ChapterList": [
            {
                "Name": ""
            }
        ],
        "ChapterMarkers": false,
        "File": "out/bbb_sunflower_1080p_30fps_normal.m512.threads40_10.veryfast.1080p.out.mkv",
        "Mp4Options": {
            "IpodAtom": false,
            "Mp4Optimize": false
        },
        "Mux": "mkv"
    },
    "Filters": {
        "FilterList": [
            {
                "ID": 6,
                "Settings": {
                    "mode": 1,
                    "rate": "27000000/900000"
                }
            },
            {
                "ID": 11,
                "Settings": {
                    "crop-bottom": 0,
                    "crop-left": 0,
                    "crop-right": 0,
                    "crop-top": 0,
                    "height": 1080,
                    "width": 1920
                }
            }
        ]
    },
    "Metadata": {
        "Artist": "Blender Foundation 2008, Janus Bager Kristensen 2013",
        "Comment": "Creative Commons Attribution 3.0 - http://bbb3d.renderfarming.net",
        "Composer": "Sacha Goedegebure",
        "Genre": "Animation",
        "Name": "Big Buck Bunny, Sunflower version"
    },
    "PAR": {
        "Den": 1,
        "Num": 1
    },
    "SequenceID": 0,
    "Source": {
        "Angle": 0,
        "Path": "bbb_sunflower_1080p_30fps_normal.mp4",
        "Range": {
            "End": 1,
            "Start": 1,
            "Type": "chapter"
        },
        "Title": 1
    },
    "Subtitle": {
        "Search": {
            "Burn": true,
            "Default": false,
            "Enable": false,
            "Forced": false
        },
        "SubtitleList": []
    },
    "Video": {
        "ColorMatrixCode": 0,
        "Encoder": "x264",
        "Level": "4.1",
        "OpenCL": false,
        "Options": "threads=40:lookahead-threads=10",
        "Preset": "veryfast",
        "Profile": "high",
        "QSV": {
            "AsyncDepth": 4,
            "Decode": false
        },
        "Quality": 18.5,
        "Tune": "film",
        "Turbo": false,
        "TwoPass": false
    }
}
[17:10:44] starting job
[17:10:44] Auto Passthru: allowed codecs are AAC, AC3, E-AC3, TrueHD, DTS, DTS-HD, MP3, FLAC
[17:10:44] Auto Passthru: fallback is AAC (FDK)
[17:10:44] Auto Passthru: using MP3 Passthru for track 1
[17:10:44] job configuration:
[17:10:44]  * source
[17:10:44]    + bbb_sunflower_1080p_30fps_normal.mp4
[17:10:44]    + title 1, chapter(s) 1 to 1
[17:10:44]    + container: mov,mp4,m4a,3gp,3g2,mj2
[17:10:44]    + data rate: 3481 kbps
[17:10:44]  * destination
[17:10:44]    + out/bbb_sunflower_1080p_30fps_normal.m512.threads40_10.veryfast.1080p.out.mkv
[17:10:44]    + container: Matroska (libavformat)
[17:10:44]  * video track
[17:10:44]    + decoder: h264
[17:10:44]      + bitrate 2998 kbps
[17:10:44]    + filters
[17:10:44]      + Framerate Shaper (mode=1:rate=27000000/900000)
[17:10:44]        + frame rate: 30.000 fps -> constant 30.000 fps
[17:10:44]      + Crop and Scale (width=1920:height=1080:crop-top=0:crop-bottom=0:crop-left=0:crop-right=0)
[17:10:44]        + source: 1920 * 1080, crop (0/0/0/0): 1920 * 1080, scale: 1920 * 1080
[17:10:44]    + Output geometry
[17:10:44]      + storage dimensions: 1920 x 1080
[17:10:44]      + pixel aspect ratio: 1 : 1
[17:10:44]      + display dimensions: 1920 x 1080
[17:10:44]    + encoder: H.264 (libx264)
[17:10:44]      + preset:  veryfast
[17:10:44]      + tune:    film
[17:10:44]      + options: threads=40:lookahead-threads=10
[17:10:44]      + profile: high
[17:10:44]      + level:   4.1
[17:10:44]      + quality: 18.50 (RF)
[17:10:44]  * audio track 1
[17:10:44]    + decoder: Unknown (MP3) (2.0 ch) (track 1, id 0x1)
[17:10:44]      + bitrate: 160 kbps, samplerate: 48000 Hz
[17:10:44]    + MP3 Passthru
[17:10:44] sync: expecting 19038 video frames
[17:10:44] encx264: min-keyint: 30, keyint: 300
[17:10:44] encx264: encoding at constant RF 18.500000
[17:10:44] encx264: unparsed options: threads=40:lookahead-threads=10:ref=1:level=4.1:trellis=0:deblock=-1,-1:mixed-refs=0:weightp=1:subme=2:psy-rd=1,0.15:vbv-maxrate=62500:vbv-bufsize=78125:rc-lookahead=10
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX FMA3 AVX2 LZCNT BMI2
x264 [info]: profile High, level 4.1
[17:10:44] sync: first pts audio 0x1 is 0
[17:10:44] sync: first pts video is 6000
[17:10:44] sync: Chapter 1 at frame 1 time 6000
[17:13:04] reader: done. 1 scr changes
[17:13:05] work: average encoding speed for job is 135.756454 fps
[17:13:05] vfr: 19038 frames output, 0 dropped and 2 duped for CFR/PFR
[17:13:05] vfr: lost time: 0 (0 frames)
[17:13:05] vfr: gained time: 0 (0 frames) (0 not accounted for)
[17:13:05] mp3-decoder done: 26425 frames, 0 decoder errors
[17:13:05] h264-decoder done: 19036 frames, 0 decoder errors
[17:13:05] sync: got 19036 frames, 19038 expected
[17:13:05] sync: framerate min 30.000 fps, max 30.000 fps, avg 30.000 fps
x264 [info]: frame I:158   Avg QP:13.50  size:272210
x264 [info]: frame P:6559  Avg QP:17.86  size: 37082
x264 [info]: frame B:12321 Avg QP:21.33  size:  6672
x264 [info]: consecutive B-frames:  6.2% 17.9% 13.7% 62.2%
x264 [info]: mb I  I16..4: 16.9% 23.3% 59.8%
x264 [info]: mb P  I16..4:  3.9%  4.0%  1.0%  P16..4: 26.4% 10.4%  6.5%  0.0%  0.0%    skip:47.8%
x264 [info]: mb B  I16..4:  0.5%  0.5%  0.1%  B16..8:  9.5%  3.1%  0.6%  direct: 3.8%  skip:82.0%  L0:36.7% L1:45.6% BI:17.7%
x264 [info]: 8x8 transform intra:41.3% inter:39.4%
x264 [info]: coded y,uvDC,uvAC intra: 51.9% 50.3% 22.2% inter: 8.4% 8.3% 1.0%
x264 [info]: i16 v,h,dc,p: 56% 23% 15%  7%
x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 26% 20% 20%  4%  6% 10%  5%  5%  5%
x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 25% 17% 12%  6%  8% 10%  7%  8%  7%
x264 [info]: i8c dc,h,v,p: 55% 20% 18%  7%
x264 [info]: Weighted P-Frames: Y:3.0% UV:1.5%
x264 [info]: kb/s:4644.59
[17:13:07] mux: track 0, 19038 frames, 368426106 bytes, 4644.27 kbps, fifo 1024
[17:13:07] mux: track 1, 26425 frames, 12684000 bytes, 159.89 kbps, fifo 2048
[17:13:07] libhb: work result = 0

Encode done!
HandBrake has exited.
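
For anyone who wants to compare runs without reading each log by hand, here is a minimal Python sketch that pulls the average fps and the per-track mux sizes out of a log like the one above. The function name and regexes are my own and only match the line formats shown in this HandBrake 1.0.3 log; other versions may word these lines differently.

```python
import re

# Patterns based on the "work:" and "mux:" lines in the log above.
SPEED_RE = re.compile(r"average encoding speed for job is ([\d.]+) fps")
MUX_RE = re.compile(r"mux: track (\d+), (\d+) frames, (\d+) bytes, ([\d.]+) kbps")


def summarize_log(text):
    """Return the average fps and per-track mux statistics found in a log."""
    summary = {"fps": None, "tracks": {}}
    for line in text.splitlines():
        m = SPEED_RE.search(line)
        if m:
            summary["fps"] = float(m.group(1))
            continue
        m = MUX_RE.search(line)
        if m:
            track, frames, size, kbps = m.groups()
            summary["tracks"][int(track)] = {
                "frames": int(frames),
                "bytes": int(size),
                "kbps": float(kbps),
            }
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py log.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        print(summarize_log(fh.read()))
```

Track 0 is the video track in the log above, so its byte count is the figure to compare between thread settings.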

N!ghtW4lk3r
Posts: 18
Joined: Mon May 19, 2014 5:14 pm

Re: Request for (cpu-)benchmarks

Post by N!ghtW4lk3r » Sun Apr 09, 2017 8:42 am

So that means that high-end server CPUs with more physical cores don't have any 'real' advantage over current desktop/gaming processors in x264 video encoding? At least if we're looking at encoding speed and file size. That's... unexpected... :shock:
But I think this is a general encoding problem, due to linear processing, right? If the encoder split the source material (depending on physical/logical CPU count and content length), encoded the parts in parallel and then merged the results back together, the encoding time should be much lower. The file size might increase slightly because of the cut/merge points, though. But this is a little off-topic :)
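
Just to make the split idea concrete, a small Python sketch of the planning step only: it divides the frame count into contiguous chunks of nearly equal length. The function name is hypothetical, and the actual cutting and merging would need extra tooling that is not shown here. Every chunk also has to start with its own keyframe, which is where the small size overhead would come from.

```python
def plan_chunks(total_frames, fps, parts):
    """Split total_frames into `parts` contiguous chunks of near-equal length.

    Returns a list of (start_frame, frame_count, start_seconds) tuples.
    This only plans the cut points; it does not cut or merge anything.
    """
    if parts < 1 or total_frames < parts:
        raise ValueError("need at least one frame per part")
    base, extra = divmod(total_frames, parts)
    chunks = []
    start = 0
    for i in range(parts):
        # The first `extra` chunks get one additional frame each.
        count = base + (1 if i < extra else 0)
        chunks.append((start, count, start / fps))
        start += count
    return chunks


if __name__ == "__main__":
    # 19038 frames at 30 fps, as in the log above, split four ways.
    for start, count, seconds in plan_chunks(19038, 30, 4):
        print(f"start frame {start}, {count} frames, starts at {seconds:.2f} s")
```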

Woodstock
Veteran User
Posts: 1836
Joined: Tue Aug 27, 2013 6:39 am

Re: Request for (cpu-)benchmarks

Post by Woodstock » Sun Apr 09, 2017 6:10 pm

Well, once you get past 8 physical cores, that would be true. That's when you have to throw multiple encodes at it. :)

mduell
Veteran User
Posts: 5379
Joined: Sat Apr 21, 2007 8:54 pm

Re: Request for (cpu-)benchmarks

Post by mduell » Sun Apr 09, 2017 6:27 pm

Or filters, or high res content, or slower encoding settings, etc.
