Linux and SSE3
I recently noticed that HandbrakeCLI 0.9.2 running on Ubuntu 7.10 doesn't seem to be using my processor's SSE3 (or SSSE3) capabilities, while the Windows version does use them.
I'm encoding using the AppleTV preset and a command line that is as simple as:
HandbrakeCLI -i ./anymovie/VIDEO_TS -o converted.m4v --preset="AppleTV"
I run the same conversion on a dual-boot system in Windows XP SP2 and Ubuntu 7.10. I'm using an Intel Q6600 @ 3 GHz with 4 GB of memory.
Running Windows I get:
x264 [info]: using cpu capabilities: MMX MMXEXT SSE SSE2 SSE3 SSSE3 Cache64
Running Ubuntu I get:
x264 [info]: using cpu capabilities: MMX MMXEXT SSE SSE2 Cache64
I do see a performance difference between the two conversions (~85 fps for XP vs. ~75 fps for Ubuntu), although it could be due to a lot of things other than the use of the SSE3 extensions!
Is this a known issue? Ubuntu related?
Regards,
John
PS - Thanks for a fantastic app! It not only does a great job moving my DVD library onto my AppleTV - it is the best thing I have found as a torture test for Overclocking stability!!!
Last edited by jhsweeney on Fri Apr 11, 2008 4:16 am, edited 1 time in total.
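To make the difference between the two log lines above explicit, here is a minimal Python sketch that parses the "using cpu capabilities" lines and lists what the Ubuntu run is missing. The function name `parse_capabilities` is hypothetical and only for illustration; the two log lines are copied from the post.

```python
def parse_capabilities(log_line):
    """Return the set of capability names from an x264 'using cpu capabilities' line."""
    marker = "using cpu capabilities:"
    _, found, rest = log_line.partition(marker)
    if not found:
        raise ValueError("not an x264 capabilities line")
    return set(rest.split())


# Log lines as quoted in the original post.
windows_line = "x264 [info]: using cpu capabilities: MMX MMXEXT SSE SSE2 SSE3 SSSE3 Cache64"
ubuntu_line = "x264 [info]: using cpu capabilities: MMX MMXEXT SSE SSE2 Cache64"

# Capabilities reported on Windows but not on Ubuntu.
missing_on_ubuntu = sorted(parse_capabilities(windows_line) - parse_capabilities(ubuntu_line))
print(missing_on_ubuntu)
```

For these two lines the difference is exactly SSE3 and SSSE3; everything else is reported on both systems.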
Re: Linux and SSE3
Oh yeah, one more piece of information, I'm using the downloaded version of HandbrakeCLI - not one I compiled myself.
Re: Linux and SSE3
Maybe try compiling HandBrake yourself (make sure you have yasm installed!) and see whether it reports that it's using SSE3?
Re: Linux and SSE3
Hey I'm having the same problem. I'm running a Q6600 which has SSSE3 capabilities but handbrake seems to be ignoring them. Did you find a solution?
Re: Linux and SSE3
Um this thread already contains the solution. Use yasm.
Re: Linux and SSE3
jbrake - I'll try it (recompiling with yasm) but as I said in the original post I was using the downloaded (pre-compiled) version -- and the program identified that my processor had MMX, SSE2, etc. I thought if the image was compiled without using yasm no acceleration would be present.
Re: Linux and SSE3
In fairness, I'm not sure if even a yasm-based compile will use features that were not present on the machine that the binary was compiled on - and I am quite certain my machine does not support SSE3. If the bootstrapper instead intelligently autodetects the capabilities of the machine at runtime, I have no reasonable explanation.
Rodney
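One way to separate "the CPU lacks the feature" from "the binary does not use it" is to look at what the kernel reports for the CPU. The sketch below parses /proc/cpuinfo-style text. On Linux, SSE3 is listed under the flag name `pni` and SSSE3 as `ssse3`. The helper names and the sample text are hypothetical; on a real system you would pass in the contents of /proc/cpuinfo instead.

```python
def cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def simd_support(flags):
    # Linux reports SSE3 as "pni" (Prescott New Instructions); SSSE3 is "ssse3".
    return {"SSE3": "pni" in flags, "SSSE3": "ssse3" in flags}


# Shortened, made-up sample; a real flags line is much longer.
sample = "model name\t: example CPU\nflags\t\t: fpu mmx sse sse2 pni ssse3\n"
print(simd_support(cpu_flags(sample)))
```

If the flags are present here but missing from the x264 "using cpu capabilities" line, the limitation is in the build rather than in the hardware.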
Re: Linux and SSE3
rhester wrote:
In fairness, I'm not sure if even a yasm-based compile will use features that were not present on the machine that the binary was compiled on - and I am quite certain my machine does not support SSE3.

Right... that's why the solution is for the user to recompile with yasm on their own machine, with its capabilities.
Re: Linux and SSE3
Once again jbrake is correct! I recompiled on my system (with yasm) and SSE3 and SSSE3 capabilities were enabled!
Some interesting comparisons...
I used the following options on the command line to convert with three different versions of the software on the same system:
-i ./ripped_dvd/VIDEO_TS -o episode.mv4 -t 8 --preset="AppleTV"
OS         // SW Rev   // Speed (fps) // Output size // CPU capabilities used
Linux      // 0.9.2+** // 79          // 989 MB      // MMX, MMXEXT, SSE2, SSE3, SSSE3, Cache64
Linux      // 0.9.2    // 72          // 992 MB      // MMX, MMXEXT, SSE2, Cache64
Windows XP // 0.9.2    // 82          // 992 MB      // MMX, MMXEXT, SSE2, SSE3, SSSE3, Cache64
**Built using jam from svn://svn.HandBrake.fr/trunk on 4/14/08
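To put the frame rates above in relative terms, the following short Python sketch computes each build's speed against the downloaded Linux binary. The fps figures are taken from the table; the dictionary keys and the `percent_faster` helper are just illustrative names.

```python
# Frames per second from the comparison table.
results = {
    "Linux 0.9.2 (download)": 72,
    "Linux 0.9.2+ (rebuilt with yasm)": 79,
    "Windows XP 0.9.2": 82,
}


def percent_faster(fps, baseline_fps):
    """Relative speed difference of fps over baseline_fps, in percent."""
    return (fps - baseline_fps) / baseline_fps * 100.0


baseline = results["Linux 0.9.2 (download)"]
for name, fps in results.items():
    print(f"{name}: {fps} fps, {percent_faster(fps, baseline):+.1f}% vs. the downloaded Linux build")
```

This is a single run per configuration, so the percentages are only a rough indication.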
Re: Linux and SSE3
I have managed to compile, but I still can't get SSE3 enabled (yasm is installed). Compilation goes fine with two different svn revisions, so far 1447 and 1449. I have also compiled the 0.9.2 source code from the download page, but still no SSE3. My box is running Ubuntu 8.04 on an Intel Q9300. Any suggestions?
Re: Linux and SSE3
I finally got to the bottom of it. It turns out that not just any yasm works (it might have to do with me having a relatively new CPU). The default yasm from Ubuntu 8.04 is version 0.5.0, which does not enable SSE3 on the Q9300. So I checked out yasm 0.7.0 from svn, and now x264 reports SSE3. This might be useful to those running, or about to get, very new CPUs.
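Since the fix above came down to the yasm version, a quick version check before building can save a wasted compile. This is a minimal Python sketch that compares a version string against 0.7.0, the version reported to work in this post. The helper names are hypothetical, and the exact text that `yasm --version` prints is an assumption; the sample strings below are illustrative, not captured output.

```python
import re

MIN_YASM = (0, 7, 0)  # version reported to work in the post above


def parse_version(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number found")
    return tuple(int(part) for part in match.group(1).split("."))


def new_enough(version_text, minimum=MIN_YASM):
    """True if the version found in version_text is at least minimum."""
    version = parse_version(version_text)
    # Pad short versions such as "0.7" with zeros so tuple comparison is fair.
    padded = version + (0,) * max(0, len(minimum) - len(version))
    return padded >= minimum


# Illustrative strings; feed in the first line of `yasm --version` on a real system.
for sample in ("yasm 0.5.0", "yasm 0.7.0"):
    print(sample, "->", "ok" if new_enough(sample) else "too old")
```

Whether 0.7.0 is the true minimum is not established in the thread; it is only the version that was confirmed to work on the Q9300.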
Re: Linux and SSE3
thanks jmarius, my encodes are now avg 100 fps!
Re: Linux and SSE3
Good tips re the yasm compile, but the increase in speed seems quite negligible.
Were the above results from the first or second pass?
For what it's worth, here are the results on a dual Opteron 2 GHz running Red Hat.
Options
========
-e x264 -E faac -p -x level=4.1:ref=3:mixed-refs:bframes=3:b-rdo:bime:weightb:direct=auto:subme=5:trellis=1:me=umh:merange=12 -B 192 -R 44.1 -D 1 -a 1 -6 stereo -Y 576 -2 -T -P -t
Output
======
Modified x264 options for pass 1 to append turbo options: level=4.1:ref=3:mixed-refs:bframes=3:b-rdo:bime:weightb:direct=auto:subme=5:trellis=1:me=umh:merange=12:ref=1:subme=1:me=dia:analyse=none:trellis=0:no-fast-pskip=0:8x8dct=0:weightb=0
x264 [info]: SSIM Mean Y:0.9552489
x264 [info]: PSNR Mean Y:39.726 U:47.847 V:48.405 Avg:41.097 Global:39.839 kb/s:993.28
x264 [info]: using SAR=341/240
x264 [info]: using cpu capabilities: MMX MMXEXT SSE SSE2 3DNow!
Encoding: task 1 of 2, 99.96 % (108.38 fps, avg 91.94 fps, ETA 00h00m00s)
No accelerated IMDCT transform found
Re: Linux and SSE3
FYI:
For current SVN of HB and Ubuntu 7.10 Gutsy, the yasm tarball in the Intrepid Ibex repos (which will become 8.10) works. You don't need to track yasm's SVN.
Since 8.04 Hardy shipped with the same version of yasm as 7.10, this should work there, too.
Re: Linux and SSE3
Minbari wrote:
Good tips re the yasm compile, but the increase in speed seems quite negligible.
[...]
For what it's worth, here are the results on a dual Opteron 2 GHz running Red Hat.

Looks like you were just missing out on 3DNow!, since the binary download was built with MMX and SSE2. So it's not so surprising that it didn't help much on your hardware.
The precompiled binary leaves a lot more performance on the floor on Core 2 CPUs, since they support SSSE3, which seems to help a lot.
BTW, Opterons (pre-Barcelona/Phenom architecture) only have a 64-bit SSE execution engine, so they have to execute 128-bit SIMD instructions in two halves. Core 2, and current AMD designs based on the quad-core Barcelona, have 128-bit execution engines, so they do a 128-bit SIMD op just as fast as a 64-bit one, e.g. MULSD (multiply scalar double-precision) vs. MULPD (multiply packed double-precision, operating on two numbers at once). That's just another reason why more extensive SIMD doesn't help as much on AMD K8 cores.