Optimize your Handbrake compile for your hardware

masterasia
Posts: 6
Joined: Sat May 23, 2009 1:07 am

Post by masterasia »

This guide is for people who have already successfully compiled HandBrake.

First, upgrade your gcc to 4.4.0. I followed the steps here: http://trac.handbrake.fr/wiki/CygWin/, but used the gcc 4.4.0 bz archive instead of the 4.2.4 one in the guide.

Then follow these steps. Launch bash:

cd handbrake-source/build

At the bash prompt, type:

make xxx.clean

xxx is the module name: x264, mpeg2dec, lame, etc., depending on which module you wish to optimize. (I didn't optimize xvid since I don't really use that module.)

This flushes out your previous compile but keeps the module source code.
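
To clean several modules in one go, the step above can be wrapped in a small loop. This is only a sketch: clean_modules is a made-up helper name, and the MAKE override is an assumption added so the commands can be previewed before running them for real.

```shell
# Hypothetical helper: run "make <module>.clean" for every module name given.
# Set MAKE=echo to print the targets instead of running make.
clean_modules() {
    for module in "$@"; do
        "${MAKE:-make}" "${module}.clean" || return 1
    done
}

# Usage, from inside handbrake-source/build:
#   clean_modules x264 mpeg2dec lame
```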

Using your favourite text editor, look inside handbrake-source/build/contrib/xxx (xxx being the module name) for the file named "makefile" and open it. There can be more than one makefile in each module, so edit them all. For some modules, like x264 and ffmpeg, look for a .mak file instead. Do read the code a bit and you will understand how the CFLAGS are put together within the module.

After you open the file, look for the lines that set CFLAGS (or OPTFLAGS for some modules).

Change -mcpu=pentiumpro to -mtune=core2 (the right value depends on your processor type), or add the flag if it does not exist.

Note that there can be more than one CFLAGS line, so edit them all.
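
The search-and-replace part can be scripted instead of done by hand. The following is a rough sketch, not the method used in this post: retune_makefiles is a hypothetical helper, it assumes the files are named makefile (any case) or *.mak, and it only swaps an existing -mcpu=pentiumpro. Makefiles that lack the flag still need to be edited manually. A .orig copy of each patched file is kept.

```shell
# Hypothetical helper: replace -mcpu=pentiumpro in every makefile / *.mak
# below a directory. The second argument is the new flag (default -mtune=core2).
retune_makefiles() {
    dir=$1
    new_flag=${2:--mtune=core2}
    find "$dir" -type f \( -iname 'makefile' -o -name '*.mak' \) |
    while IFS= read -r file; do
        # -e keeps grep from treating the pattern as an option
        if grep -q -e '-mcpu=pentiumpro' "$file"; then
            cp "$file" "$file.orig"
            sed "s/-mcpu=pentiumpro/$new_flag/g" "$file.orig" > "$file"
            echo "patched: $file"
        fi
    done
}

# Usage:
#   retune_makefiles handbrake-source/build/contrib/mpeg2dec
```

The new flag is pasted straight into the sed expression, so it must not contain a slash.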

The various optimization flags can be found here:

http://gcc.gnu.org/onlinedocs/gcc-4.4.0 ... el-Options

After you have edited the makefile, save it and re-run make to recompile HandBrake. Do note that not all optimization flags work. For example, -msse4.1 broke the ffmpeg build; once I removed the flag, it compiled OK. It's just trial and error.
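
A quick way to weed out flags that gcc itself rejects, before spending time on a full rebuild, is to compile a trivial program with them. This is a sketch with a hypothetical helper name (flag_ok) and an assumed CC override. It only shows that the compiler accepts the flags, not that a module such as ffmpeg will actually build with them, so the trial and error still applies.

```shell
# Hypothetical helper: succeed if the compiler accepts the given flags
# when compiling a trivial C program. CC defaults to gcc.
flag_ok() {
    out=$(mktemp) || return 1
    if echo 'int main(void) { return 0; }' |
        "${CC:-gcc}" "$@" -x c -c -o "$out" - 2>/dev/null
    then
        status=0
    else
        status=1
    fi
    rm -f "$out"
    return "$status"
}

# Usage:
#   flag_ok -mtune=core2 -msse4.1 && echo "accepted by gcc"
```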

I am not sure how much performance you can squeeze out of gcc optimizations, since much of x264 is assembly code. My guess is that the other modules could use some optimization, especially those that still use the outdated -mcpu=pentiumpro. I did notice a massive speed-up in gcc compile times when I used -mtune=core2.

I haven't noticed a significant fps increase, since I use very high profile settings to encode (with decomb), but I do notice that my system is much more responsive with HandBrake encoding in the background than before the optimizations.
Last edited by masterasia on Wed May 27, 2009 5:40 pm, edited 2 times in total.
masterasia
Posts: 6
Joined: Sat May 23, 2009 1:07 am

Re: Optimize your Handbrake compile for your hardware

Post by masterasia »

Updates:

Reconfigured and recompiled yasm, gmp and mpfr with gcc 4.4.0, and recompiled perl and python as well (not sure that would improve HandBrake, but heck).
Added optimization flags to yasm by editing its makefile.

I have used more aggressive optimization flags on all relevant HandBrake modules:

-march=core2 -msse4.1 -pipe -mfpmath=sse -malign-double -fomit-frame-pointer

I am not a programming guru; I just pieced these flags together by reading and googling around. Not sure if they are the ideal flags to use for Core 2.

Notes:

There are many makefiles inside mpeg2dec and other modules; do edit them all.

-march=core2 is more aggressive than -mtune=core2: it lets gcc use Core 2 instructions, so the binary may not run on older CPUs, while -mtune only tunes the code for Core 2. Some modules, like faac and ffmpeg, only accept -mtune=core2; anything extra breaks compiling.

End result: there is at least a 20-25% improvement in encoding fps! My guess is that most of the gains came from optimizing ffmpeg & mpeg2dec!
KonaBlend
Novice
Posts: 72
Joined: Tue Nov 04, 2008 2:35 am

Re: Optimize your Handbrake compile for your hardware

Post by KonaBlend »

masterasia wrote:End result: There is at least 20-25% improve encoding fps! my guess that the most gains came from optimizing ffmpeg & mpeg2dec!
Smells like a highly dubious claim. First, you need to tell us before/after details. Details being HB version, input media, and encoding parameters. Without any of this, such claims are meaningless.
masterasia
Posts: 6
Joined: Sat May 23, 2009 1:07 am

Re: Optimize your Handbrake compile for your hardware

Post by masterasia »

Hi Konablend,

I agree that without any statistics to show, I am just blowing smoke. Sadly I do not own any kind of popular western DVD/Blu-ray disc to reproduce results.

If anyone is willing to benchmark, I am willing to send them my win32 svn2455 build (SSE4.1 required).
My config:
Input media: DVD disc (Asian drama/movies) converted to an ISO image, mounted with MagicISO virtual disc
Windows 7 RC 7100 x86
Core 2 Duo E7200 Wolfdale (SSE4.1), overclocked to 3.29GHz

HandBrake svn2455 compiled with GCC 4.4.0, Cygwin 1.5
HandBrake process priority set to Normal
HandBrake encoding preset: built-in Constant Quality preset with Decomb on:
ref=3:mixed-refs=1:bframes=3:b-pyramid=1:weightb=1:deblock=-2,-1:trellis=1:analyse=all:8x8dct=1:me=umh:subq=9:psy-rd=1,1
KonaBlend
Novice
Posts: 72
Joined: Tue Nov 04, 2008 2:35 am

Re: Optimize your Handbrake compile for your hardware

Post by KonaBlend »

Post a full log of your session. And you still haven't clarified very important things. You are comparing 2 binaries, A and B: what is the HB version for A and what is the HB version for B? And which gcc version was used for A and which for B? Etc.
masterasia
Posts: 6
Joined: Sat May 23, 2009 1:07 am

Re: Optimize your Handbrake compile for your hardware

Post by masterasia »

Yup, two compiled handbrakecli.exe binaries:

Mounted ISO image (no DVD-ROM spin-up delays), set to the built-in HandBrake High Profile Television preset, 2-pass, decomb, detelecine.

No multitasking. All background tasks ended. Windows Search off, Windows Defender off, System Restore off, Media Sharing off, NIC disabled, indexing disabled.

gcc 4.2.4 per the HandBrake Cygwin guide, HandBrake svn2430 compiled on 20 May 2009:

### Windows GUI svn2455 2009052701 
### Running: Microsoft Windows NT 6.1.7100.0 
###
### CPU: Intel(R) Core(TM)2 Duo CPU     E7200  @ 2.53GHz 
### Ram: 2047 MB 
### Screen: 1920x1200 
### Temp Dir: C:\Users\masterasia\AppData\Local\Temp\ 
### Install Dir: C:\Users\masterasia\Desktop\Release 
### Data Dir: C:\Users\masterasia\AppData\Roaming\HandBrake\HandBrake\0.9.3.5 
#########################################

### CLI Query:  -i "H:\VIDEO_TS" -t 3 -c 2 -o "C:\Users\masterasia\Desktop\VIDEO_TS-3-2.mkv" -f mkv -p  --detelecine --decomb -e x264 -b 1300 -2  -T  -a 1 -6 auto -R Dolby Pro Logic II -B auto -D 160 --markers="C:\Users\masterasia\AppData\Local\Temp\VIDEO_TS-3-2-3-chapters.csv" -x ref=3:mixed-refs=1:bframes=6:weightb=1:direct=auto:b-pyramid=1:me=umh:subq=9:analyse=all:8x8dct=1:trellis=1:nr=150:no-fast-pskip=1:psy-rd=1,1 -v 2

#########################################

[12:08:54] hb_init: checking cpu count
[12:08:54] hb_init: starting libhb thread
cygwin warning:
HandBrake svn2430 (2009052001)  MS-DOS style path detected: C:/Users/masterasia/AppData/Local/Temp/hb.2104
  Preferred POSIX equivalent is: /cygdrive/c/Users/masterasia/AppData/Local/Temp/hb.2104
 - Cygwin i686  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
 - http://handbrake.fr
2 CPUs detected
Opening H:\VIDEO_TS...
[12:08:54] hb_scan: path=H:\VIDEO_TS, title_index=3
[12:08:54] scan: trying to open with libdvdread
libdvdread: Encrypted DVD support unavailable.
libdvdread: Couldn't find device name.
[12:08:54] scan: DVD has 3 title(s)
[12:08:54] scan: scanning title 3
[12:08:54] scan: opening IFO for VTS 1
[12:08:54] pgc_id: 3, pgn: 1: pgc: 0x1c1f5d0
[12:08:54] scan: vts=1, ttn=3, cells=0->11, blocks=0->1985313, 1985314 blocks
[12:08:54] scan: duration is 01:08:37 (4117266 ms)
[12:08:54] scan: checking audio 1
[12:08:54] scan: id=80bd, lang=Japanese (AC3), 3cc=jpn ext=0
[12:08:54] scan: title 3 has 12 chapters
[12:08:54] scan: chap 1 c=0->0, b=0->4856 (4857), 15147 ms
[12:08:54] scan: chap 2 c=1->1, b=4857->51185 (46329), 108255 ms
[12:08:54] scan: chap 3 c=2->2, b=51186->211639 (160454), 330354 ms
[12:08:54] scan: chap 4 c=3->3, b=211640->428581 (216942), 446487 ms
[12:08:54] scan: chap 5 c=4->4, b=428582->615122 (186541), 386450 ms
[12:08:54] scan: chap 6 c=5->5, b=615123->750861 (135739), 280450 ms
[12:08:54] scan: chap 7 c=6->6, b=750862->1054587 (303726), 623631 ms
[12:08:54] scan: chap 8 c=7->7, b=1054588->1199512 (144925), 299568 ms
[12:08:54] scan: chap 9 c=8->8, b=1199513->1387865 (188353), 387398 ms
[12:08:54] scan: chap 10 c=9->9, b=1387866->1598710 (210845), 431607 ms
[12:08:54] scan: chap 11 c=10->10, b=1598711->1749202 (150492), 310312 ms
[12:08:54] scan: chap 12 c=11->11, b=1749203->1985313 (236111), 497602 ms
[12:08:54] scan: aspect = 0
[12:08:54] scan: decoding previews for title 3
[12:08:54] scan: title angle(s) 1
[12:08:54] scan: audio 0x80bd: AC-3, rate=48000Hz, bitrate=192000 Japanese (AC3) (2.0 ch)
[12:08:54] scan: 10 previews, 720x480, 29.970 fps, autocrop = 2/0/0/0, aspect 4:3, PAR 8:9
[12:08:54] Title is likely interlaced or telecined (5 out of 10 previews). You should do something about that.
[12:08:54] scan: title (0) job->width:640, job->height:480
[12:08:54] libhb: scan thread found 1 valid title(s)
+ title 3:
  + vts 1, ttn 3, cells 0->11 (1985314 blocks)
  + duration: 01:08:37
  + size: 720x480, aspect: 1.33, 29.970 fps
  + autocrop: 2/0/0/0
  + chapters:
    + 1: cells 0->0, 4857 blocks, duration 00:00:15
    + 2: cells 1->1, 46329 blocks, duration 00:01:48
    + 3: cells 2->2, 160454 blocks, duration 00:05:30
    + 4: cells 3->3, 216942 blocks, duration 00:07:26
    + 5: cells 4->4, 186541 blocks, duration 00:06:26
    + 6: cells 5->5, 135739 blocks, duration 00:04:40
    + 7: cells 6->6, 303726 blocks, duration 00:10:24
    + 8: cells 7->7, 144925 blocks, duration 00:05:00
    + 9: cells 8->8, 188353 blocks, duration 00:06:27
    + 10: cells 9->9, 210845 blocks, duration 00:07:12
    + 11: cells 10->10, 150492 blocks, duration 00:05:10
    + 12: cells 11->11, 236111 blocks, duration 00:08:18
  + audio tracks:
    + 1, Japanese (AC3) (2.0 ch), 48000Hz, 192000bps
  + subtitle tracks:
  + combing detected, may be interlaced or telecined
Reading chapter markers from file C:\Users\masterasia\AppData\Local\Temp\VIDEO_TS-3-2-3-chapters.csv
Invalid sample rate 0, using input rate 48000
Modified x264 options for pass 1 to append turbo options: ref=3:mixed-refs=1:bframes=6:weightb=1:direct=auto:b-pyramid=1:me=umh:subq=9:analyse=all:8x8dct=1:trellis=1:nr=150:no-fast-pskip=1:psy-rd=1,1:ref=1:subme=1:me=dia:analyse=none:trellis=0:no-fast-pskip=0:8x8dct=0:weightb=0
[12:08:54] 2 job(s) to process
[12:08:54] starting job
[12:08:54] job configuration:
[12:08:54]  * source
[12:08:54]    + H:\VIDEO_TS
[12:08:54]    + title 3, chapter(s) 2 to 2
[12:08:54]  * destination
[12:08:54]    + C:\Users\masterasia\Desktop\VIDEO_TS-3-2.mkv
[12:08:54]    + container: Matroska (.mkv)
[12:08:54]      + chapter markers
[12:08:54]  * video track
[12:08:54]    + decoder: mpeg2
[12:08:54]      + bitrate 8500 kbps
[12:08:54]    + frame rate: same as source (around 29.970 fps)
[12:08:54]    + strict anamorphic
[12:08:54]      + storage dimensions: 720 * 480 -> 720 * 478, crop 2/0/0/0
[12:08:54]      + pixel aspect ratio: 8 / 9
[12:08:54]      + display dimensions: 640 * 478
[12:08:54]    + filters
[12:08:54]      + Detelecine (pullup) (default settings)
[12:08:54]      + Decomb (default settings)
[12:08:54]    + encoder: x264
[12:08:54]      + options: ref=3:mixed-refs=1:bframes=6:weightb=1:direct=auto:b-pyramid=1:me=umh:subq=9:analyse=all:8x8dct=1:trellis=1:nr=150:no-fast-pskip=1:psy-rd=1,1:ref=1:subme=1:me=dia:analyse=none:trellis=0:no-fast-pskip=0:8x8dct=0:weightb=0
[12:08:54]      + bitrate: 1300 kbps, pass: 1
[12:08:54]  * audio track 0
[12:08:54]    + decoder: Japanese (AC3) (2.0 ch) (track 1, id 80bd)
[12:08:54]      + bitrate: 192 kbps, samplerate: 48000 Hz
[12:08:54]    + AC3 passthrough
libdvdread: Encrypted DVD support unavailable.
libdvdread: Couldn't find device name.
[12:08:54] reader: first SCR 146
[12:08:54] yadif thread started for segment 0
[12:08:54] yadif thread started for segment 1
[12:08:54] decomb thread started for segment 0
[12:08:54] mpeg2: "Chapter 2" (2) at frame 0 time 9009
[12:08:54] decomb thread started for segment 1
[12:08:54] encx264: keyint-min: 30, keyint-max: 300
[12:08:54] encx264: encoding with stored aspect 8/9
x264 [warning]: width or height not divisible by 16 (720x478), compression will suffer.
x264 [info]: using SAR=8/9
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.1 Cache64
x264 [info]: profile Main, level 3.0
No accelerated IMDCT transform found
[12:08:54] output track 0: ac3 in sync after skipping 579 bytes
[12:08:54] sync: expecting 3274 video frames
[12:08:54] sync: first pts is 9009
[12:09:56] reader: end of chapter 2 (media 2) reached at media chapter 3
[12:09:56] reader: done. 1 scr changes
[12:09:58] sync: got 3259 frames, 3274 expected
[12:09:58] work: average encoding speed for job is 51.023483 fps
[12:09:58] mpeg2 done: 3260 frames
[12:09:58] render: lost time: 90090 (30 frames)
[12:09:58] render: gained time: 90090 (120 frames) (0 not accounted for)
[12:09:58] render: average dropped frame duration: 3003
x264 [info]: slice I:16    Avg QP:22.28  size: 24437  PSNR Mean Y:42.15 U:47.79 V:47.45 Avg:43.25 Global:39.84
x264 [info]: slice P:1107  Avg QP:24.32  size:  8404  PSNR Mean Y:38.97 U:46.15 V:45.47 Avg:40.24 Global:37.79
x264 [info]: slice B:2105  Avg QP:25.66  size:  2906  PSNR Mean Y:38.93 U:46.40 V:45.95 Avg:40.22 Global:37.75
x264 [info]: consecutive B-frames:  8.3% 18.8% 10.0% 30.8% 16.5%  7.1%  8.5%
x264 [info]: mb I  I16..4: 42.7%  0.0% 57.3%
x264 [info]: mb P  I16..4: 15.3%  0.0%  0.0%  P16..4: 65.5%  0.0%  0.0%  0.0%  0.0%    skip:19.2%
x264 [info]: mb B  I16..4:  0.9%  0.0%  0.0%  B16..8: 27.6%  0.0%  0.0%  direct:19.7%  skip:51.8%  L0:27.5% L1:57.2% BI:15.3%
x264 [info]: final ratefactor: 24.32
x264 [info]: direct mvs  spatial:99.5%  temporal:0.5%
x264 [info]: coded y,uvDC,uvAC intra:47.0% 56.3% 15.1% inter:17.7% 13.8% 0.4%
x264 [info]: SSIM Mean Y:0.9508703
x264 [info]: PSNR Mean Y:38.962 U:46.319 V:45.795 Avg:40.241 Global:37.776 kb/s:1174.34
[12:09:58] decomb: yadif deinterlaced 1579 | blend deinterlaced 596 | unfiltered 1053 | total 3228
[12:09:58] starting job
[12:09:58] job configuration:
[12:09:58]  * source
[12:09:58]    + H:\VIDEO_TS
[12:09:58]    + title 3, chapter(s) 2 to 2
[12:09:58]  * destination
[12:09:58]    + C:\Users\masterasia\Desktop\VIDEO_TS-3-2.mkv
[12:09:58]    + container: Matroska (.mkv)
[12:09:58]      + chapter markers
[12:09:58]  * video track
[12:09:58]    + decoder: mpeg2
[12:09:58]      + bitrate 8500 kbps
[12:09:58]    + frame rate: same as source (around 29.970 fps)
[12:09:58]    + strict anamorphic
[12:09:58]      + storage dimensions: 720 * 480 -> 720 * 478, crop 2/0/0/0
[12:09:58]      + pixel aspect ratio: 8 / 9
[12:09:58]      + display dimensions: 640 * 478
[12:09:58]    + filters
[12:09:58]      + Detelecine (pullup) (default settings)
[12:09:58]      + Decomb (default settings)
[12:09:58]    + encoder: x264
[12:09:58]      + options: ref=3:mixed-refs=1:bframes=6:weightb=1:direct=auto:b-pyramid=1:me=umh:subq=9:analyse=all:8x8dct=1:trellis=1:nr=150:no-fast-pskip=1:psy-rd=1,1
[12:09:58]      + bitrate: 1300 kbps, pass: 2
[12:09:58]  * audio track 0
[12:09:58]    + decoder: Japanese (AC3) (2.0 ch) (track 1, id 80bd)
[12:09:58]      + bitrate: 192 kbps, samplerate: 48000 Hz
[12:09:58]    + AC3 passthrough
libdvdread: Encrypted DVD support unavailable.
libdvdread: Couldn't find device name.
[12:09:58] yadif thread started for segment 0
[12:09:58] yadif thread started for segment 1
[12:09:58] reader: first SCR 146
[12:09:58] decomb thread started for segment 0
[12:09:58] decomb thread started for segment 1
[12:09:58] encx264: keyint-min: 30, keyint-max: 300
[12:09:58] encx264: encoding with stored aspect 8/9
x264 [warning]: width or height not divisible by 16 (720x478), compression will suffer.
x264 [info]: using SAR=8/9
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.1 Cache64
[12:09:58] mpeg2: "Chapter 2" (2) at frame 0 time 9009
x264 [info]: profile High, level 3.0
No accelerated IMDCT transform found
[12:09:58] output track 0: ac3 in sync after skipping 579 bytes
[12:09:58] sync: expecting 3274 video frames
[12:09:58] sync: first pts is 9009
[12:12:35] reader: end of chapter 2 (media 2) reached at media chapter 3
[12:12:35] reader: done. 1 scr changes
[12:12:37] sync: got 3259 frames, 3274 expected
[12:12:37] work: average encoding speed for job is 20.358496 fps
[12:12:37] mux: track 0, 3228 frames, 17481032 bytes, 1286.45 kbps, fifo 32
[12:12:37] mux: track 1, 3397 frames, 2608896 bytes, 191.99 kbps, fifo 256
[12:12:37] mpeg2 done: 3260 frames
[12:12:37] render: lost time: 90090 (30 frames)
[12:12:37] render: gained time: 90090 (120 frames) (0 not accounted for)
[12:12:37] render: average dropped frame duration: 3003
x264 [info]: slice I:16    Avg QP:22.20  size: 30695  PSNR Mean Y:42.60 U:49.48 V:49.11 Avg:43.89 Global:42.18
x264 [info]: slice P:1107  Avg QP:25.53  size:  9534  PSNR Mean Y:39.45 U:47.14 V:46.48 Avg:40.76 Global:38.82
x264 [info]: slice B:2105  Avg QP:27.12  size:  3057  PSNR Mean Y:39.37 U:47.27 V:46.82 Avg:40.69 Global:38.52
x264 [info]: consecutive B-frames:  8.3% 18.8% 10.0% 30.8% 16.5%  7.1%  8.5%
x264 [info]: mb I  I16..4: 12.2% 69.7% 18.1%
x264 [info]: mb P  I16..4:  1.3%  3.9%  0.8%  P16..4: 54.6% 13.3% 12.2%  0.2%  0.3%    skip:13.4%
x264 [info]: mb B  I16..4:  0.0%  0.2%  0.1%  B16..8: 44.1%  0.8%  1.6%  direct: 2.6%  skip:50.5%  L0:37.0% L1:54.4% BI: 8.6%
x264 [info]: 8x8 transform  intra:66.3%  inter:76.9%
x264 [info]: direct mvs  spatial:96.2%  temporal:3.8%
x264 [info]: coded y,uvDC,uvAC intra:73.5% 76.0% 30.9% inter:23.6% 24.7% 0.6%
x264 [info]: ref P L0  64.3% 23.7% 12.1%
x264 [info]: ref B L0  84.7% 15.3%
x264 [info]: ref B L1  93.3%  6.7%
x264 [info]: SSIM Mean Y:0.9574788
x264 [info]: PSNR Mean Y:39.416 U:47.239 V:46.713 Avg:40.733 Global:38.632 kb/s:1298.35
[12:12:37] decomb: yadif deinterlaced 1579 | blend deinterlaced 596 | unfiltered 1053 | total 3228
[12:12:37] libhb: work result = 0

Rip done!
HandBrake has exited.


 ############ End of Log ############## 

gcc 4.4.0 with CFLAGS optimizations, HandBrake svn2455 compiled on 27 May 2009:

### Windows GUI svn2455 2009052701 
### Running: Microsoft Windows NT 6.1.7100.0 
###
### CPU: Intel(R) Core(TM)2 Duo CPU     E7200  @ 2.53GHz 
### Ram: 2047 MB 
### Screen: 1920x1200 
### Temp Dir: C:\Users\masterasia\AppData\Local\Temp\ 
### Install Dir: C:\Users\masterasia\Desktop\Release 
### Data Dir: C:\Users\masterasia\AppData\Roaming\HandBrake\HandBrake\0.9.3.5 
#########################################

### CLI Query:  -i "H:\VIDEO_TS" -t 3 -c 2 -o "C:\Users\masterasia\Desktop\VIDEO_TS-3-2.mkv" -f mkv -p  --detelecine --decomb -e x264 -b 1300 -2  -T  -a 1 -6 auto -R Dolby Pro Logic II -B auto -D 160 --markers="C:\Users\masterasia\AppData\Local\Temp\VIDEO_TS-3-2-3-chapters.csv" -x ref=3:mixed-refs=1:bframes=6:weightb=1:direct=auto:b-pyramid=1:me=umh:subq=9:analyse=all:8x8dct=1:trellis=1:nr=150:no-fast-pskip=1:psy-rd=1,1 -v 2

#########################################

[12:04:36] hb_init: checking cpu count
[12:04:36] hb_init: starting libhb thread
cygwin warning:
  MS-DOS style path detected: C:/Users/masterasia/AppData/Local/Temp/hb.3888
  Preferred POSIX equivalent is: /cygdrive/c/Users/masterasia/AppData/Local/Temp/hb.3888
HandBrake svn2455 (2009052701)  CYGWIN environment variable option "nodosfilewarning" turns off this warning.
  Consult the user's guide for more details about POSIX paths:
    http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
 - Cygwin i686 - http://handbrake.fr
2 CPUs detected
Opening H:\VIDEO_TS...
[12:04:36] hb_scan: path=H:\VIDEO_TS, title_index=3
[12:04:36] scan: trying to open with libdvdread
libdvdread: Encrypted DVD support unavailable.
libdvdread: Couldn't find device name.
[12:04:36] scan: DVD has 3 title(s)
[12:04:36] scan: scanning title 3
[12:04:36] scan: opening IFO for VTS 1
[12:04:36] pgc_id: 3, pgn: 1: pgc: 0x1052f660
[12:04:36] scan: vts=1, ttn=3, cells=0->11, blocks=0->1985313, 1985314 blocks
[12:04:36] scan: duration is 01:08:37 (4117266 ms)
[12:04:36] scan: checking audio 1
[12:04:36] scan: id=80bd, lang=Japanese (AC3), 3cc=jpn ext=0
[12:04:36] scan: title 3 has 12 chapters
[12:04:36] scan: chap 1 c=0->0, b=0->4856 (4857), 15147 ms
[12:04:36] scan: chap 2 c=1->1, b=4857->51185 (46329), 108255 ms
[12:04:36] scan: chap 3 c=2->2, b=51186->211639 (160454), 330354 ms
[12:04:36] scan: chap 4 c=3->3, b=211640->428581 (216942), 446487 ms
[12:04:36] scan: chap 5 c=4->4, b=428582->615122 (186541), 386450 ms
[12:04:36] scan: chap 6 c=5->5, b=615123->750861 (135739), 280450 ms
[12:04:36] scan: chap 7 c=6->6, b=750862->1054587 (303726), 623631 ms
[12:04:36] scan: chap 8 c=7->7, b=1054588->1199512 (144925), 299568 ms
[12:04:36] scan: chap 9 c=8->8, b=1199513->1387865 (188353), 387398 ms
[12:04:36] scan: chap 10 c=9->9, b=1387866->1598710 (210845), 431607 ms
[12:04:36] scan: chap 11 c=10->10, b=1598711->1749202 (150492), 310312 ms
[12:04:36] scan: chap 12 c=11->11, b=1749203->1985313 (236111), 497602 ms
[12:04:36] scan: aspect = 0
[12:04:36] scan: decoding previews for title 3
[12:04:36] scan: title angle(s) 1
[12:04:36] scan: audio 0x80bd: AC-3, rate=48000Hz, bitrate=192000 Japanese (AC3) (2.0 ch)
[12:04:36] scan: 10 previews, 720x480, 29.970 fps, autocrop = 2/0/0/0, aspect 4:3, PAR 8:9
[12:04:36] Title is likely interlaced or telecined (5 out of 10 previews). You should do something about that.
[12:04:36] scan: title (0) job->width:640, job->height:480
[12:04:36] libhb: scan thread found 1 valid title(s)
+ title 3:
  + vts 1, ttn 3, cells 0->11 (1985314 blocks)
  + duration: 01:08:37
  + size: 720x480, aspect: 1.33, 29.970 fps
  + autocrop: 2/0/0/0
  + chapters:
    + 1: cells 0->0, 4857 blocks, duration 00:00:15
    + 2: cells 1->1, 46329 blocks, duration 00:01:48
    + 3: cells 2->2, 160454 blocks, duration 00:05:30
    + 4: cells 3->3, 216942 blocks, duration 00:07:26
    + 5: cells 4->4, 186541 blocks, duration 00:06:26
    + 6: cells 5->5, 135739 blocks, duration 00:04:40
    + 7: cells 6->6, 303726 blocks, duration 00:10:24
    + 8: cells 7->7, 144925 blocks, duration 00:05:00
    + 9: cells 8->8, 188353 blocks, duration 00:06:27
    + 10: cells 9->9, 210845 blocks, duration 00:07:12
    + 11: cells 10->10, 150492 blocks, duration 00:05:10
    + 12: cells 11->11, 236111 blocks, duration 00:08:18
  + audio tracks:
    + 1, Japanese (AC3) (2.0 ch), 48000Hz, 192000bps
  + subtitle tracks:
  + combing detected, may be interlaced or telecined
Reading chapter markers from file C:\Users\masterasia\AppData\Local\Temp\VIDEO_TS-3-2-3-chapters.csv
Invalid sample rate 0, using input rate 48000
Modified x264 options for pass 1 to append turbo options: ref=3:mixed-refs=1:bframes=6:weightb=1:direct=auto:b-pyramid=1:me=umh:subq=9:analyse=all:8x8dct=1:trellis=1:nr=150:no-fast-pskip=1:psy-rd=1,1:ref=1:subme=1:me=dia:analyse=none:trellis=0:no-fast-pskip=0:8x8dct=0:weightb=0
[12:04:36] 2 job(s) to process
[12:04:36] starting job
[12:04:36] job configuration:
[12:04:36]  * source
[12:04:36]    + H:\VIDEO_TS
[12:04:36]    + title 3, chapter(s) 2 to 2
[12:04:36]  * destination
[12:04:36]    + C:\Users\masterasia\Desktop\VIDEO_TS-3-2.mkv
[12:04:36]    + container: Matroska (.mkv)
[12:04:36]      + chapter markers
[12:04:36]  * video track
[12:04:36]    + decoder: mpeg2
[12:04:36]      + bitrate 8500 kbps
[12:04:36]    + frame rate: same as source (around 29.970 fps)
[12:04:36]    + strict anamorphic
[12:04:36]      + storage dimensions: 720 * 480 -> 720 * 478, crop 2/0/0/0
[12:04:36]      + pixel aspect ratio: 8 / 9
[12:04:36]      + display dimensions: 640 * 478
[12:04:36]    + filters
[12:04:36]      + Detelecine (pullup) (default settings)
[12:04:36]      + Decomb (default settings)
[12:04:36]    + encoder: x264
[12:04:36]      + options: ref=3:mixed-refs=1:bframes=6:weightb=1:direct=auto:b-pyramid=1:me=umh:subq=9:analyse=all:8x8dct=1:trellis=1:nr=150:no-fast-pskip=1:psy-rd=1,1:ref=1:subme=1:me=dia:analyse=none:trellis=0:no-fast-pskip=0:8x8dct=0:weightb=0
[12:04:36]      + bitrate: 1300 kbps, pass: 1
[12:04:36]  * audio track 0
[12:04:36]    + decoder: Japanese (AC3) (2.0 ch) (track 1, id 80bd)
[12:04:36]      + bitrate: 192 kbps, samplerate: 48000 Hz
[12:04:36]    + AC3 passthrough
libdvdread: Encrypted DVD support unavailable.
libdvdread: Couldn't find device name.
[12:04:36] yadif thread started for segment 0
[12:04:36] yadif thread started for segment 1
[12:04:36] decomb thread started for segment 0
[12:04:36] decomb thread started for segment 1
[12:04:36] encx264: keyint-min: 30, keyint-max: 300
[12:04:36] encx264: encoding with stored aspect 8/9
x264 [warning]: width or height not divisible by 16 (720x478), compression will suffer.
x264 [info]: using SAR=8/9
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.1 Cache64
x264 [info]: profile Main, level 3.0
No accelerated IMDCT transform found
[12:04:36] sync: expecting 3274 video frames
[12:04:36] reader: first SCR 146
[12:04:36] output track 0: ac3 in sync after skipping 579 bytes
[12:04:36] mpeg2: "Chapter 2" (2) at frame 0 time 9009
[12:04:36] sync: first pts is 9009
[12:05:30] reader: end of chapter 2 (media 2) reached at media chapter 3
[12:05:30] reader: done. 1 scr changes
[12:05:32] sync: got 3259 frames, 3274 expected
[12:05:32] work: average encoding speed for job is 58.038410 fps
[12:05:32] mpeg2 done: 3260 frames
[12:05:32] render: lost time: 90090 (30 frames)
[12:05:32] render: gained time: 90090 (120 frames) (0 not accounted for)
[12:05:32] render: average dropped frame duration: 3003
x264 [info]: slice I:16    Avg QP:22.28  size: 24115  PSNR Mean Y:42.19 U:47.79 V:47.45 Avg:43.29 Global:39.88
x264 [info]: slice P:1107  Avg QP:24.31  size:  8405  PSNR Mean Y:38.98 U:46.16 V:45.47 Avg:40.25 Global:37.80
x264 [info]: slice B:2105  Avg QP:25.65  size:  2908  PSNR Mean Y:38.93 U:46.41 V:45.96 Avg:40.22 Global:37.76
x264 [info]: consecutive B-frames:  8.3% 18.8% 10.0% 30.8% 16.5%  7.1%  8.5%
x264 [info]: mb I  I16..4: 37.5%  0.0% 62.5%
x264 [info]: mb P  I16..4: 15.2%  0.0%  0.0%  P16..4: 65.6%  0.0%  0.0%  0.0%  0.0%    skip:19.2%
x264 [info]: mb B  I16..4:  0.9%  0.0%  0.0%  B16..8: 27.6%  0.0%  0.0%  direct:19.7%  skip:51.8%  L0:27.5% L1:57.3% BI:15.2%
x264 [info]: final ratefactor: 24.31
x264 [info]: direct mvs  spatial:99.2%  temporal:0.8%
x264 [info]: coded y,uvDC,uvAC intra:46.9% 56.2% 15.0% inter:17.7% 13.8% 0.4%
x264 [info]: SSIM Mean Y:0.9509096
x264 [info]: PSNR Mean Y:38.962 U:46.333 V:45.800 Avg:40.241 Global:37.781 kb/s:1174.35
[12:05:32] decomb: yadif deinterlaced 1579 | blend deinterlaced 596 | unfiltered 1053 | total 3228
[12:05:32] starting job
[12:05:32] job configuration:
[12:05:32]  * source
[12:05:32]    + H:\VIDEO_TS
[12:05:32]    + title 3, chapter(s) 2 to 2
[12:05:32]  * destination
[12:05:32]    + C:\Users\masterasia\Desktop\VIDEO_TS-3-2.mkv
[12:05:32]    + container: Matroska (.mkv)
[12:05:32]      + chapter markers
[12:05:32]  * video track
[12:05:32]    + decoder: mpeg2
[12:05:32]      + bitrate 8500 kbps
[12:05:32]    + frame rate: same as source (around 29.970 fps)
[12:05:32]    + strict anamorphic
[12:05:32]      + storage dimensions: 720 * 480 -> 720 * 478, crop 2/0/0/0
[12:05:32]      + pixel aspect ratio: 8 / 9
[12:05:32]      + display dimensions: 640 * 478
[12:05:32]    + filters
[12:05:32]      + Detelecine (pullup) (default settings)
[12:05:32]      + Decomb (default settings)
[12:05:32]    + encoder: x264
[12:05:32]      + options: ref=3:mixed-refs=1:bframes=6:weightb=1:direct=auto:b-pyramid=1:me=umh:subq=9:analyse=all:8x8dct=1:trellis=1:nr=150:no-fast-pskip=1:psy-rd=1,1
[12:05:32]      + bitrate: 1300 kbps, pass: 2
[12:05:32]  * audio track 0
[12:05:32]    + decoder: Japanese (AC3) (2.0 ch) (track 1, id 80bd)
[12:05:32]      + bitrate: 192 kbps, samplerate: 48000 Hz
[12:05:32]    + AC3 passthrough
libdvdread: Encrypted DVD support unavailable.
libdvdread: Couldn't find device name.
[12:05:32] reader: first SCR 146
[12:05:32] yadif thread started for segment 0
[12:05:32] yadif thread started for segment 1
[12:05:32] decomb thread started for segment 0
[12:05:32] decomb thread started for segment 1
[12:05:32] encx264: keyint-min: 30, keyint-max: 300
[12:05:32] encx264: encoding with stored aspect 8/9
x264 [warning]: width or height not divisible by 16 (720x478), compression will suffer.
x264 [info]: using SAR=8/9
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.1 Cache64
[12:05:32] mpeg2: "Chapter 2" (2) at frame 0 time 9009
x264 [info]: profile High, level 3.0
No accelerated IMDCT transform found
[12:05:32] output track 0: ac3 in sync after skipping 579 bytes
[12:05:32] sync: expecting 3274 video frames
[12:05:32] sync: first pts is 9009
[12:08:02] reader: end of chapter 2 (media 2) reached at media chapter 3
[12:08:02] reader: done. 1 scr changes
[12:08:04] sync: got 3259 frames, 3274 expected
[12:08:04] work: average encoding speed for job is 21.433277 fps
[12:08:05] mux: track 0, 3228 frames, 17480614 bytes, 1286.42 kbps, fifo 32
[12:08:05] mux: track 1, 3397 frames, 2608896 bytes, 191.99 kbps, fifo 256
[12:08:05] mpeg2 done: 3260 frames
[12:08:05] render: lost time: 90090 (30 frames)
[12:08:05] render: gained time: 90090 (120 frames) (0 not accounted for)
[12:08:05] render: average dropped frame duration: 3003
x264 [info]: slice I:16    Avg QP:22.19  size: 30716  PSNR Mean Y:42.58 U:49.48 V:49.13 Avg:43.88 Global:42.17
x264 [info]: slice P:1107  Avg QP:25.54  size:  9542  PSNR Mean Y:39.45 U:47.14 V:46.50 Avg:40.77 Global:38.82
x264 [info]: slice B:2105  Avg QP:27.25  size:  3052  PSNR Mean Y:39.37 U:47.27 V:46.85 Avg:40.70 Global:38.52
x264 [info]: consecutive B-frames:  8.3% 18.8% 10.0% 30.8% 16.5%  7.1%  8.5%
x264 [info]: mb I  I16..4: 12.4% 69.3% 18.3%
x264 [info]: mb P  I16..4:  1.3%  3.9%  0.8%  P16..4: 54.6% 13.2% 12.3%  0.2%  0.3%    skip:13.4%
x264 [info]: mb B  I16..4:  0.0%  0.2%  0.1%  B16..8: 44.1%  0.8%  1.6%  direct: 2.6%  skip:50.5%  L0:36.9% L1:54.5% BI: 8.6%
x264 [info]: 8x8 transform  intra:66.3%  inter:76.9%
x264 [info]: direct mvs  spatial:96.4%  temporal:3.6%
x264 [info]: coded y,uvDC,uvAC intra:73.5% 75.9% 30.9% inter:23.6% 24.7% 0.6%
x264 [info]: ref P L0  64.4% 23.6% 12.0%
x264 [info]: ref B L0  84.6% 15.4%
x264 [info]: ref B L1  93.3%  6.7%
x264 [info]: SSIM Mean Y:0.9574847
x264 [info]: PSNR Mean Y:39.415 U:47.238 V:46.739 Avg:40.736 Global:38.633 kb/s:1298.31
[12:08:05] decomb: yadif deinterlaced 1579 | blend deinterlaced 596 | unfiltered 1053 | total 3228
[12:08:05] libhb: work result = 0

Rip done!
HandBrake has exited.


 ############ End of Log ##############  
KonaBlend
Novice
Posts: 72
Joined: Tue Nov 04, 2008 2:35 am

Re: Optimize your Handbrake compile for your hardware

Post by KonaBlend »

So that's getting there -- but if possible you should compile the exact same svn of HB for both sides of the test; for example, between svn2430 and svn2455 there was at least 1 bump in x264 version. And other changes too. The characteristics of the test look reasonable, except I think there's one aspect of it that may emphasize compiler versions/cflags more than others -- and that is using some filters. iirc, they are C so they have a tendency to benefit more; so if you did the test without decomb there might be even less of a difference. But then again, some people do need to use filters often, I'm just not one of them.

Aside from those issues, your logs show a 5.3% improvement in the second pass (20.36 -> 21.43 fps) from hb-r2430/gcc424 -> hb-r2455/gcc440+flags. Which is not bad, but it's also nowhere near the 20-25% you claimed earlier. Of that 5% improvement, I would hazard a guess that it can generally be attributed as < 1% to the x264 bump, 1-2% to the gcc bump, and 1-2% to aggressive (but limiting) cflags.
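
The percentage can be checked against the "average encoding speed" lines in the two logs with a bit of awk. This is only a sketch: job_fps and pct_gain are hypothetical helper names, and the log file name in the usage comment is a placeholder.

```shell
# Print the fps figure from each "average encoding speed" line of a log.
job_fps() {
    awk '/average encoding speed for job is/ { print $(NF-1) }' "$1"
}

# Percentage gain going from an old fps figure to a new one.
pct_gain() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new / old - 1) * 100 }'
}

# Usage:
#   job_fps gcc424.log                # one fps value per pass
#   pct_gain 20.358496 21.433277      # pass 2, prints 5.3
#   pct_gain 51.023483 58.038410      # pass 1 (turbo), prints 13.7
```

The first pass, which runs with the turbo options, shows a larger gap than the second, but both are well short of 20-25%.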

If you do have time for some more tests, out of curiosity it would be interesting to see the following comparisons as it isolates gcc versions:

1. hb-r2455/gcc424
2. hb-r2455/gcc440
3. hb-r2455/gcc440+cflags .
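
One way to run such a comparison is to time the identical job on each binary. This is a minimal sketch under stated assumptions: bench_binaries is a made-up helper, the binary names in the usage comment are hypothetical, the output file name is a placeholder, and the options are a trimmed version of the CLI query in the logs (the -x string is left out for brevity and should be added back for a real run).

```shell
# Hypothetical helper: run the same encode with each binary and print
# the wall-clock time in whole seconds.
bench_binaries() {
    input=$1
    shift
    for bin in "$@"; do
        start=$(date +%s)
        "$bin" -i "$input" -t 3 -c 2 -o "bench-$(basename "$bin").mkv" -f mkv \
            --detelecine --decomb -e x264 -b 1300 -2 -T \
            >/dev/null 2>&1 || { echo "$bin: failed"; continue; }
        end=$(date +%s)
        echo "$bin: $((end - start)) s"
    done
}

# Usage (binary names are placeholders):
#   bench_binaries "H:\VIDEO_TS" ./hb-gcc424.exe ./hb-gcc440.exe ./hb-gcc440-cflags.exe
```

Whole seconds are coarse, so the fps lines in the HandBrake logs remain the better figure to compare.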

Also we have another relevant thread with some results: http://forum.handbrake.fr/viewtopic.php?f=4&t=10446 .
masterasia
Posts: 6
Joined: Sat May 23, 2009 1:07 am

Re: Optimize your Handbrake compile for your hardware

Post by masterasia »

I think I shall focus more on the effects of cflags on gcc 4.4.0 rather than recompiling back to gcc 4.2.4.

I googled around and made up my own ultimate cflags for Core 2:
-march=core2 -mtune=core2 -msse4.1 -pipe -mfpmath=sse -malign-double -O3 -ffast-math -fomit-frame-pointer -funroll-loops -fpeel-loops -ftree-loop-linear -fno-tree-pre -ftree-vectorize -minline-all-stringops -fivopts

I think I might switch to -O2. I read that it's best to keep code size small so it fits in the processor cache, especially since Core 2's access to main memory is so much slower without an integrated memory controller.

Additional findings:

Handbrake's main libhb uses:

make/include/gcc.defs

GCC.args.O.speed = -O3
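
Trying a different level for libhb means changing that one line, by hand or with sed. A sketch only: set_speed_opt is a made-up helper, and it assumes the line looks exactly as quoted above. A .orig copy of the file is kept.

```shell
# Hypothetical helper: rewrite the GCC.args.O.speed line in a gcc.defs file.
# The second argument is the optimization level (default -O2).
set_speed_opt() {
    defs=$1
    level=${2:--O2}
    cp "$defs" "$defs.orig" &&
    sed "s/^\(GCC\.args\.O\.speed[[:space:]]*=[[:space:]]*\).*/\1$level/" \
        "$defs.orig" > "$defs"
}

# Usage, from the top of the HandBrake source tree:
#   set_speed_opt make/include/gcc.defs -O2
```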

Side note:
I managed to successfully compile HandBrake on PlayStation 3 Ubuntu Linux 9.10 alpha. Overall desktop performance improves a lot after you enable ps3vram as swap. Sadly the performance of HandBrake on the PlayStation 3 is less than impressive since x264 doesn't make use of the SPEs. At best it is like a Pentium 4 2.8GHz or less. Also, playing around with GCC compiler flags doesn't help much.
KonaBlend
Novice
Posts: 72
Joined: Tue Nov 04, 2008 2:35 am

Re: Optimize your Handbrake compile for your hardware

Post by KonaBlend »

masterasia wrote:I think i shall focus more on the effects of cflags on gcc4.40 rather than recompiling back to gcc 4.2.4.
Then you really should at least provide comparison between hb-r2455/gcc440 and hb-r2455/gcc440+cflags . Otherwise it's meaningless. Also you are passing a lot of redundant options and using options that are useless without profiling.