Linux and SSE3

Support for HandBrake on Linux, Solaris, and other Unix-like platforms
jhsweeney
Posts: 11
Joined: Fri Feb 16, 2007 5:44 pm

Linux and SSE3

Post by jhsweeney »

I recently noticed that HandbrakeCLI 0.9.2 running on Ubuntu 7.10 doesn't seem to be using my processor's SSE3 (or SSSE3) capabilities, while the Windows version does use them.

I'm encoding using the AppleTV preset and a command line that is as simple as:

HandbrakeCLI -i ./anymovie/VIDEO_TS -o converted.m4v --preset="AppleTV"

I run the same conversion on a dual-boot system in Windows XP SP2 and Ubuntu 7.10. I'm using an Intel Q6600 @ 3 GHz w/ 4 GB of memory.

Running Windows I get:
x264 [info]: using cpu capabilities: MMX MMXEXT SSE SSE2 SSE3 SSSE3 Cache64

Running Ubuntu I get:
x264 [info]: using cpu capabilities: MMX MMXEXT SSE SSE2 Cache64

I do see a performance difference between the two conversions (~85 fps for XP vs 75 fps for Ubuntu), although it could be due to a lot of things other than the use of the SSE3 extensions!
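For anyone who wants to compare two such capability lines mechanically, here is a minimal sketch in Python. It only parses the two x264 log lines quoted above and reports which capabilities appear in one but not the other. The function name `parse_capabilities` is made up for this illustration and is not part of HandBrake or x264.

```python
PREFIX = "using cpu capabilities:"


def parse_capabilities(log_line):
    """Return the set of capability names found on one x264 info line."""
    _, sep, tail = log_line.partition(PREFIX)
    if not sep:
        raise ValueError("not a 'using cpu capabilities' line")
    return set(tail.split())


# The two log lines quoted in the post above
windows = parse_capabilities(
    "x264 [info]: using cpu capabilities: MMX MMXEXT SSE SSE2 SSE3 SSSE3 Cache64"
)
ubuntu = parse_capabilities(
    "x264 [info]: using cpu capabilities: MMX MMXEXT SSE SSE2 Cache64"
)

# Capabilities reported on Windows but not on Ubuntu
missing_on_ubuntu = sorted(windows - ubuntu)
print(missing_on_ubuntu)
```

The set difference makes it easy to see that only SSE3 and SSSE3 are missing from the Ubuntu run; everything else matches.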

Is this a known issue? Ubuntu related?

Regards,
John

PS - Thanks for a fantastic app! It not only does a great job moving my DVD library onto my AppleTV - it is the best thing I have found as a torture test for Overclocking stability!!!
Last edited by jhsweeney on Fri Apr 11, 2008 4:16 am, edited 1 time in total.
jhsweeney
Posts: 11
Joined: Fri Feb 16, 2007 5:44 pm

Re: Linux and SSE3

Post by jhsweeney »

Oh yeah, one more piece of information: I'm using the downloaded version of HandbrakeCLI, not one I compiled myself.
Polygon
Novice
Posts: 72
Joined: Wed Oct 24, 2007 1:36 pm

Re: Linux and SSE3

Post by Polygon »

Maybe try compiling HandBrake yourself (make sure you have yasm installed!) and see whether it reports that it's using SSE3?
vv8d6n0s
Posts: 1
Joined: Mon Apr 14, 2008 3:52 pm

Re: Linux and SSE3

Post by vv8d6n0s »

Hey I'm having the same problem. I'm running a Q6600 which has SSSE3 capabilities but handbrake seems to be ignoring them. Did you find a solution?
jbrjake
Veteran User
Posts: 4805
Joined: Wed Dec 13, 2006 1:38 am

Re: Linux and SSE3

Post by jbrjake »

Um this thread already contains the solution. Use yasm.
jhsweeney
Posts: 11
Joined: Fri Feb 16, 2007 5:44 pm

Re: Linux and SSE3

Post by jhsweeney »

jbrjake - I'll try it (recompiling with yasm), but as I said in the original post I was using the downloaded (pre-compiled) version, and the program identified that my processor had MMX, SSE2, etc. I thought that if the image had been compiled without yasm, no acceleration would be present at all.
rhester
Veteran User
Posts: 2888
Joined: Tue Apr 18, 2006 10:24 pm

Re: Linux and SSE3

Post by rhester »

In fairness, I'm not sure if even a yasm-based compile will use features that were not present on the machine that the binary was compiled on - and I am quite certain my machine does not support SSE3. If the bootstrapper instead intelligently autodetects the capabilities of the machine at runtime, I have no reasonable explanation.

Rodney
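To separate "what the CPU supports" from "what the binary was built to use", it can help to check what the kernel itself reports. The sketch below is only an illustration in Python and is not x264's own detection code. It relies on the fact that Linux lists SSE3 in /proc/cpuinfo under the flag name `pni` and SSSE3 under `ssse3`. The function names and the sample text are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Collect the flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def simd_support(flags):
    """Map the Linux flag names onto the names x264 prints."""
    # Linux reports SSE3 as "pni" (Prescott New Instructions).
    return {"SSE3": "pni" in flags, "SSSE3": "ssse3" in flags}


# Hypothetical excerpt for illustration, not taken from any poster's machine
sample = "processor\t: 0\nflags\t\t: fpu mmx sse sse2 pni ssse3\n"
report = simd_support(cpu_flags(sample))
print(report)
```

On a real system the text would come from `open("/proc/cpuinfo").read()`. If the flags are present there but x264 does not list them, the limitation is in the build rather than in the hardware.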
jbrjake
Veteran User
Posts: 4805
Joined: Wed Dec 13, 2006 1:38 am

Re: Linux and SSE3

Post by jbrjake »

rhester wrote:In fairness, I'm not sure if even a yasm-based compile will use features that were not present on the machine that the binary was compiled on - and I am quite certain my machine does not support SSE3.
Right... that's why the solution is for the user to recompile with yasm on their own machine, with its capabilities.
jhsweeney
Posts: 11
Joined: Fri Feb 16, 2007 5:44 pm

Re: Linux and SSE3

Post by jhsweeney »

Once again jbrjake is correct! I recompiled on my system (with yasm) and the SSE3 and SSSE3 capabilities were enabled!

Some interesting comparisons...

I used the following command line to convert using three different versions of the software on the same system:

-i ./ripped_dvd/VIDEO_TS -o episode.m4v -t 8 --preset="AppleTV"

OS          SW Rev    Speed (fps)  Output size  CPU capabilities used
Linux       0.9.2+**  79           989 MB       MMX, MMXEXT, SSE2, SSE3, SSSE3, Cache64
Linux       0.9.2     72           992 MB       MMX, MMXEXT, SSE2, Cache64
Windows XP  0.9.2     82           992 MB       MMX, MMXEXT, SSE2, SSE3, SSSE3, Cache64

**Built using jam from svn://svn.HandBrake.fr/trunk on 4/14/08
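To put the frame rates from the table in relative terms, here is a small Python sketch that computes each run's gain over the stock Linux 0.9.2 download. The numbers are copied from the table; the labels and the choice of baseline are only for illustration.

```python
# Frame rates copied from the comparison table above
results = {
    "Linux 0.9.2+ (rebuilt with yasm)": 79,
    "Linux 0.9.2 (download)": 72,
    "Windows XP 0.9.2": 82,
}

# Baseline: the pre-compiled Linux binary without SSE3/SSSE3
baseline_fps = results["Linux 0.9.2 (download)"]


def percent_gain(fps, base):
    """Relative speed difference versus the baseline, in percent."""
    return (fps - base) / base * 100


gains = {label: round(percent_gain(fps, baseline_fps), 1) for label, fps in results.items()}
for label, gain in gains.items():
    print(f"{label}: {gain:+.1f}%")
```

These are single runs on one title, so the percentages are a rough indication rather than a benchmark.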
jmarius
Posts: 4
Joined: Sun Apr 22, 2007 3:57 pm

Re: Linux and SSE3

Post by jmarius »

I have managed to compile, but I still can't get SSE3 enabled (yasm is installed). Compilation goes fine with two different svn revisions so far, 1447 and 1449. I have also compiled the 0.9.2 source code from the download page, but still no SSE3. My box is running Ubuntu 8.04 on an Intel Q9300. Any suggestions?
jmarius
Posts: 4
Joined: Sun Apr 22, 2007 3:57 pm

Re: Linux and SSE3

Post by jmarius »

I finally got to the bottom of it. It turns out that not just any yasm works (it might have to do with me having a relatively new CPU). The default yasm from Ubuntu 8.04 is version 0.5.0, which does not enable SSE3 on the Q9300. So I checked out yasm 0.7.0 from svn, and now x264 reports SSE3. This might be useful to those who are running, or about to get, very new CPUs.
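A quick way to check whether an installed yasm is at least as new as the one that worked here is to compare version numbers as tuples rather than as strings. The Python sketch below assumes the version appears somewhere in the text as a dotted number. The 0.7.0 threshold comes only from the report above, not from any official requirement, and the function names are hypothetical.

```python
import re

# Threshold taken from the report in this thread, not from official docs
MIN_YASM = (0, 7, 0)


def version_tuple(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError("no version number found in: %r" % text)
    return tuple(int(part) for part in match.group().split("."))


def new_enough(text, minimum=MIN_YASM):
    """True if the version found in text is at least the minimum."""
    return version_tuple(text) >= minimum


# The two versions mentioned above; the "yasm " prefix is an assumed format
print(new_enough("yasm 0.5.0"))
print(new_enough("yasm 0.7.0"))
```

On a real system the text would be the first line printed by `yasm --version`; the exact format of that line is an assumption here, which is why the sketch searches for the number instead of splitting on fixed positions.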
elchubi
Posts: 4
Joined: Sun May 04, 2008 1:24 pm

Re: Linux and SSE3

Post by elchubi »

thanks jmarius, my encodes are now avg 100 fps!
Minbari
Posts: 2
Joined: Sun Jun 22, 2008 7:28 am

Re: Linux and SSE3

Post by Minbari »

Good tips re the yasm compile, but the increase in speed seems quite negligible.

Were the above results from the first or second pass?

For what it's worth, here are the results on a dual Opteron 2 GHz running Red Hat.
Options
========
-e x264 -E faac -p -x level=4.1:ref=3:mixed-refs:bframes=3:b-rdo:bime:weightb:direct=auto:subme=5:trellis=1:me=umh:merange=12 -B 192 -R 44.1 -D 1 -a 1 -6 stereo -Y 576 -2 -T -P -t

Output
======
Modified x264 options for pass 1 to append turbo options: level=4.1:ref=3:mixed-refs:bframes=3:b-rdo:bime:weightb:direct=auto:subme=5:trellis=1:me=umh:merange=12:ref=1:subme=1:me=dia:analyse=none:trellis=0:no-fast-pskip=0:8x8dct=0:weightb=0

x264 [info]: SSIM Mean Y:0.9552489
x264 [info]: PSNR Mean Y:39.726 U:47.847 V:48.405 Avg:41.097 Global:39.839 kb/s:993.28
x264 [info]: using SAR=341/240
x264 [info]: using cpu capabilities: MMX MMXEXT SSE SSE2 3DNow!
Encoding: task 1 of 2, 99.96 % (108.38 fps, avg 91.94 fps, ETA 00h00m00s)
No accelerated IMDCT transform found
tlunde
Posts: 31
Joined: Fri Dec 15, 2006 9:52 pm

Re: Linux and SSE3

Post by tlunde »

FYI:

For the current SVN of HB on Ubuntu 7.10 Gutsy, the yasm tarball from the Intrepid Ibex repos (the release that will become 8.10) will work. You don't need to track yasm's SVN.

Since 8.04 Hardy shipped with the same version of yasm as 7.10, this should work for that, too.
pcordes
Posts: 14
Joined: Mon Aug 04, 2008 12:12 am

Re: Linux and SSE3

Post by pcordes »

Minbari wrote: Good tips re the yasm compile, but the increase in speed seems quite negligible.
[...]
For what it's worth, here are the results on a dual Opteron 2 GHz running Red Hat.
Looks like you were just missing out on 3DNow!, since the binary download was built with MMX and SSE2. So it's not so surprising that it didn't help much on your hardware.

The precompiled binary leaves a lot more performance on the floor on Core 2 CPUs, since they support SSSE3 which seems to help a lot.

BTW, Opterons (pre-Barcelona/Phenom architecture) only have a 64-bit SSE execution engine, so they have to do 128-bit SIMD instructions in two halves. Core 2, and current AMD designs based on the quad-core Barcelona, have 128-bit execution engines, so they do a 128-bit SIMD op just as fast as a 64-bit SIMD op (e.g. MULSD, multiply scalar double-precision, vs MULPD, multiply packed double-precision, which operates on two numbers at once). That's just another reason why more extensive SIMD doesn't help so much on AMD K8 cores.
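The "two halves" point can be written down as a toy model in Python: the number of passes through the execution engine is the operand width divided by the engine width, rounded up. This is a deliberate simplification for illustration only. It ignores latency, pipelining and everything else about the real microarchitectures, and the function name is made up.

```python
import math


def engine_passes(operand_bits, engine_bits):
    """Toy model: passes needed to push one SIMD operand through the engine."""
    return math.ceil(operand_bits / engine_bits)


# MULSD works on one 64-bit double, MULPD on a 128-bit pair of doubles
k8_scalar = engine_passes(64, 64)       # 64-bit engine, scalar op
k8_packed = engine_passes(128, 64)      # 64-bit engine, packed op done in two halves
core2_packed = engine_passes(128, 128)  # 128-bit engine, packed op in one pass

print(k8_scalar, k8_packed, core2_packed)
```

In this model the packed op costs the K8 twice as many passes as the scalar op while handling twice the data, so the per-number throughput does not improve, whereas a 128-bit engine handles the packed op in a single pass.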