
NLMeans denoise filter

Posted: Sat May 24, 2014 3:02 am
by BradleyS
Update: HandBrake 1.0.0 and later switched to a key=value parameter setting. 6.0:1:7:3:2:0 (NLMeans Medium) passed as a custom parameter setting would now be:


The parameters format is as follows:



NLMeans was committed as revision 6216. A number of small updates and fixes have been made since, and threading was added in revision 6397, greatly improving performance.

The original patch was CLI only, enabled with --nlmeans and optionally, --nlmeans-tune. Current HandBrake also provides full GUI support.

As of revision 7402, --nlmeans with no parameters is the same as --nlmeans="medium" (with or without --nlmeans-tune="none"), or 6:1:7:3:2:0. The initial NLMeans implementation and HandBrake release versions 0.10.x used 8:1:7:3:2:0, as does the rest of this document.

As of release version 0.10.x and development version 7402 (2015-08-17), the presets (rows) and tunes (columns) map to the following custom parameters:


            none           film                 animation
ultralight  1.5:1:7:3:2:0  1.5:0.9:7:3:2:0:2.4  2.5:0.15:5:7:2:0:2.00
light       3.0:1:7:3:2:0  3.0:0.8:7:3:2:0:4.0  3.0:0.15:5:7:3:0:2.25
medium      6.0:1:7:3:2:0  6.0:0.8:7:3:2:0:8.0  5.0:0.15:5:7:4:0:4.00
strong       10:1:7:3:2:0  8.0:0.6:7:3:2:0:10    10:0.15:5:7:4:0:8.00

                           grain
ultralight                 0.0:0.9:7:3:2:0:2.4 (same as 0:0:0:0:0:0:2.4:0.9:7:3:2:0)
light                      0.0:0.8:7:3:2:0:3.5 (same as 0:0:0:0:0:0:3.5:0.8:7:3:2:0)
medium                     0.0:0.8:7:3:2:0:6.0 (same as 0:0:0:0:0:0:6.0:0.8:7:3:2:0)
strong                     0.0:0.6:7:3:2:0:8.0 (same as 0:0:0:0:0:0:8.0:0.6:7:3:2:0)

                           high motion
ultralight                 1.5:0.9:7:3:2:0:2.40:0.9:7:5:1:0
light                      3.0:0.9:7:3:2:0:3.25:0.8:7:5:1:0
medium                     6.0:0.8:7:3:2:0:6.00:0.7:7:5:1:0
strong                     8.0:0.6:7:3:2:0:6.75:0.5:7:5:1:0
Revised original notes below have been kept for posterity.


NLMeans is a denoise filter that produces higher quality output than HandBrake's old one (hqdn3d).

Unlike hqdn3d, which reduces noise via lowpass filtering (removal of high frequency information), NLMeans achieves its result by averaging multiple patches of similar pixels together; the similarities remain and variance (usually noise) is attenuated. This method does a good job of reducing unwanted noise without clobbering the high frequency range, restoring the appearance of structure and detail found in the original source.
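
To make the patch-averaging idea concrete, here is a minimal sketch of textbook non-local means for a single pixel of a single grayscale frame. It is not HandBrake's implementation: the function name, the exp(-distance / strength^2) weight formula, and the border clamping are assumptions chosen for illustration, and the parameters only loosely mirror the filter's strength, origin weight, patch size, and range.

```python
import math


def nlmeans_pixel(img, y, x, strength=8.0, origin_weight=1.0, patch=7, search=3):
    """Denoise one pixel of a 2D list of floats by weighted patch averaging.

    Illustrative textbook version only; the weight formula is an assumption.
    """
    if strength <= 0:
        return img[y][x]  # a strength of 0 leaves the pixel untouched
    h, w = len(img), len(img[0])
    pr, sr = patch // 2, search // 2

    def px(r, c):
        # Clamp coordinates so patches near the border stay inside the image.
        return img[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]

    def patch_distance(r0, c0, r1, c1):
        # Mean squared difference between the patches centred on two pixels.
        total = 0.0
        for dr in range(-pr, pr + 1):
            for dc in range(-pr, pr + 1):
                d = px(r0 + dr, c0 + dc) - px(r1 + dr, c1 + dc)
                total += d * d
        return total / (patch * patch)

    weighted_sum = 0.0
    weight_total = 0.0
    for dy in range(-sr, sr + 1):
        for dx in range(-sr, sr + 1):
            if dy == 0 and dx == 0:
                # The patch compared with itself: its influence is set directly.
                weight = origin_weight
            else:
                dist = patch_distance(y, x, y + dy, x + dx)
                weight = math.exp(-dist / (strength * strength))
            weighted_sum += weight * px(y + dy, x + dx)
            weight_total += weight
    if weight_total == 0:
        return img[y][x]
    return weighted_sum / weight_total
```

Similar patches get weights near 1 and dissimilar ones get weights near 0, so structure that repeats across the search window is reinforced while uncorrelated variation averages out.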

Thanks to Dr. Dirk Farin for his GPL implementation of NLMeans for ffmpeg, the basis for this work.

Presets and tunes

The four strength presets are ultralight, light, medium, and strong.
- Ultralight does its best to remove a small amount of noise without substantially affecting the look of the original picture.
- The others are fairly self explanatory.

The five tunes are film, grain, highmotion, animation, and none (or omitted).
- Film is suitable for all normal / live action content and removes slightly more chroma noise than luma noise.
- Grain passes the luma completely untouched and is nearly identical to film concerning chroma, only slightly weaker on account of leaving the luma noise. The appearance of grain is well preserved when used in conjunction with x264's grain tune. For a very slight grain reduction, try film on ultralight.
- Highmotion employs spatial-only filtering for the chroma channels in order to avoid color smearing between frames with very high motion (something which only becomes an issue at very strong settings) and is otherwise identical to film.
- Animation is good for cel animation and tries harder to remove artifacts around edges while preserving their overall fidelity.
- None uses equal strength for luma and chroma, and does not make any special adjustments for different types of content.

For the CLI, presets are specified e.g., --nlmeans="medium" --nlmeans-tune="film". Custom (numeric) settings may be passed to --nlmeans, in which case --nlmeans-tune is ignored. Failure to specify any setting to --nlmeans uses the original filter default, 8:1:7:3:2:0.
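
As a small illustration of the rules above, the following sketch builds the two command-line options from a preset or custom string. Only the --nlmeans and --nlmeans-tune options and the preset/tune names come from this post; the helper name nlmeans_args and its validation are hypothetical.

```python
PRESETS = ("ultralight", "light", "medium", "strong")
TUNES = ("none", "film", "grain", "highmotion", "animation")


def nlmeans_args(setting="medium", tune="none"):
    """Return the NLMeans options as a list of command-line arguments.

    Hypothetical helper: it only mirrors the behaviour described in the post.
    """
    if setting in PRESETS:
        if tune not in TUNES:
            raise ValueError(f"unknown tune: {tune!r}")
        args = [f"--nlmeans={setting}"]
        if tune != "none":
            args.append(f"--nlmeans-tune={tune}")
        return args
    # Anything else is treated as a custom numeric string such as 8:1:7:3:2:0.
    for value in setting.split(":"):
        float(value)  # raises ValueError if a field is not numeric
    # The tune is ignored for custom settings, so it is not emitted.
    return [f"--nlmeans={setting}"]


print(nlmeans_args("medium", "film"))
print(nlmeans_args("8:1:7:3:2:0", "film"))
```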

Custom parameters

Everything below is concerning custom settings and is only useful for advanced tweaking. Use the preset system, use the preset system, use the preset system, and be happy. Unless of course, you have a really difficult source and know what you're doing.

The default parameters are equivalent to --nlmeans="8:1:7:3:2:0". The six parameters are:

Strength
Origin Weight
Patch Size (context width)
Range (spatial search window width)
Frames (temporal search depth)
Prefilter Mode

Extended syntax for discrete channel parameters is available. Simply append additional parameters to target the additional channels in order Y, Cb, Cr. Parameter values cascade, e.g. to target both chroma channels equally, it is sufficient to specify values for Cb's parameters. To target all channels, specify Y only.
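
The cascade can be sketched as a small parser. This is an illustration, not HandBrake's own parsing code: the function and field names are made up, and the rule that a channel inherits any unspecified value from the preceding channel is an assumption inferred from the examples in this thread (such as 2.25:1:7:3:2:0:8 setting only the chroma strength).

```python
FIELDS = ("strength", "origin_weight", "patch_size", "range", "frames", "prefilter")
CASTS = (float, float, int, int, int, int)
DEFAULTS = (8.0, 1.0, 7, 3, 2, 0)  # the default stated in this post, 8:1:7:3:2:0


def parse_nlmeans(custom):
    """Expand a colon-separated custom string into per-channel settings."""
    tokens = [t.strip() for t in custom.split(":")] if custom.strip() else []
    if len(tokens) > 3 * len(FIELDS):
        raise ValueError("at most 18 values (6 per channel) are expected")
    channels = {}
    previous = dict(zip(FIELDS, DEFAULTS))
    for index, channel in enumerate(("Y", "Cb", "Cr")):
        given = tokens[index * 6:(index + 1) * 6]
        current = dict(previous)  # start from the preceding channel (cascade)
        for name, cast, token in zip(FIELDS, CASTS, given):
            current[name] = cast(float(token))
        channels[channel] = current
        previous = current
    return channels


for channel, settings in parse_nlmeans("2.25:1:7:3:2:0:8").items():
    print(channel, settings)
```

With this rule, specifying Y only targets all channels, and specifying Y plus Cb targets both chroma channels equally.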


Extended syntax in detail:
--nlmeans="lumaY_strength   : lumaY_origin_tune   : lumaY_patch_size   : lumaY_range   : lumaY_frames   : lumaY_prefilter   :
           chromaB_strength : chromaB_origin_tune : chromaB_patch_size : chromaB_range : chromaB_frames : chromaB_prefilter :
           chromaR_strength : chromaR_origin_tune : chromaR_patch_size : chromaR_range : chromaR_frames : chromaR_prefilter"

The following are equivalent:

The following disable processing of the Cr channel:

The following disables processing of the Y channel and
specifies the default settings for both Cb and Cr:

Strength. 0.00 to unlimited, default: 8.00.

Controls how much noise is removed. Higher values produce smoother images. Values between 3 and 10 are usually appropriate. An initial value between 6 and 8 is recommended.

In the original NLMeans algorithm and this implementation, strength (technically, "averaging weight decay") is not constant, meaning that other settings also have considerable influence on the amount of noise removed. Be mindful of this, especially when widening your search range; a reduction in overall strength may be needed to compensate.

Origin Weight. 0.00 to 1.00, default: 1.00.

For every search, there is the case where the patch of pixels being compared to is the same as the source. This parameter adjusts the amount of influence that single comparison has on the overall result. Higher values promote the origin patch and lower values reduce its influence.

Most sources benefit from a slight reduction from the default value of 1, which will likely be lower in the future, once I've tested enough material to make an informed decision on what that value should be.

Try values between 0.50 for noisier inputs and 1.00 for cleaner inputs. Lower values for animation can significantly reduce artifacts around edges; try values between 0.15 and 0.50. Most importantly, tune this parameter last. Lower values may result in smoother video, so be sure to revisit the strength parameter.

Patch Size. Odd number, default: 7.

Sets the dimensions of the patches to be compared. Sane values are odd numbers 3 through 9. The default, 7, produces a patch size of 7x7 or 49 pixels. Likewise, 5 produces a patch size of 5x5 or 25 pixels, and 3 produces a patch size of 3x3 or 9 pixels. A value of 1 is usually not very beneficial, and values too large will reduce fine detail.

Search Range. Odd number, default: 3.

Sets the dimensions of the spatial search window. Sane values are odd numbers 3 through 31, typically 3, 5, or 7.

Each patch is compared against many others. This parameter restricts those comparisons to an area surrounding the original patch, rather than searching the entire frame. A range of 3 creates a 3x3 window, yielding 9 patch comparisons. With a 7x7 patch size, that's 441 pixel comparisons.

Use extreme caution with values higher than about 15, as the computation required grows with the square of the range. A 15x15 search window with 7x7 patch size must compare more than 11,000 pixels per patch. At 31x31, the number climbs to over 47,000 and speed is expressed in minutes per frame.
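
The cost arithmetic is simply window area times patch area. A quick sketch for checking a combination before committing to a long encode (the function name is made up, and this counts a single frame only, ignoring the Frames parameter):

```python
def pixel_comparisons(patch_size, search_range):
    """Pixel comparisons per patch for one frame: window area times patch area."""
    return (search_range ** 2) * (patch_size ** 2)


for search_range in (3, 5, 7, 15, 31):
    print(f"range {search_range:>2}, patch 7: "
          f"{pixel_comparisons(7, search_range)} pixel comparisons per patch")
```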

Frames. 1 to 32, default: 2.

How many video frames to compare against, also known as temporal filtering. Sane values for normal content are 1 to 3. Animation may benefit from higher values, not usually more than about 10. Very high values will produce visible ghosting.

Prefilter Mode. Enum, default: 0 (off).

Selects type of prefiltering to use for weight analysis: mean or median, each with two kernel sizes. The prefiltered image is only used for analysis and is not applied to the output (unless passthru is specified).

Prefiltering can dramatically improve the decisions NLMeans makes during weight analysis. By referencing a partially denoised copy of the original image, the algorithm can form a better opinion about what is detail and what is noise. This is especially useful for very noisy sources and usually commands a reduction in overall strength (for luma say, from 6-8 to 3 when using the mean 3x3 filter), since brute force is no longer needed to remove most of the noise. I've observed quality improvements upwards of 20% when used appropriately.

A minor side-effect of prefiltering is that ultra-fine detail may be slightly attenuated. The reduce strength modes (256, 512), which can be combined, may limit this effect. Regardless, prefiltering is not recommended for animation.

The edge boost mode (1024) is useful for bringing back blurred edges in the prefiltered image (still not recommended for animation). Passthru mode (2048) skips NLMeans filtering and outputs the prefiltered image instead.

Prefilter modes are:

(0) Off
(1) Mean 3x3 filter
(2) Mean 5x5 filter
(4) Median 3x3 filter
(8) Median 5x5 filter
(256) Reduce strength by 25%
(512) Reduce strength by 50%
(1024) Edge boost
(2048) Passthru

To use multiple modes together, simply add their values. For example: to prefilter using the mean 3x3 filter at 50% strength and recover some lost edge detail using edge boost, use mode 1537 (1 + 512 + 1024). To see what that prefilter looks like without any additional processing, enable passthru for a final mode of 3585 (1537 + 2048). In conjunction with the default settings, this would look something like --nlmeans="8:1:7:3:2:1537" or --nlmeans="8:1:7:3:2:3585".
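
Since every mode value is a distinct power of two, a combined mode can be treated as a set of bit flags. The sketch below composes and decodes modes on that assumption; the names are hypothetical, and it does not check whether HandBrake accepts every combination (for example, two kernels at once).

```python
PREFILTER_FLAGS = {
    1: "mean 3x3",
    2: "mean 5x5",
    4: "median 3x3",
    8: "median 5x5",
    256: "reduce strength by 25%",
    512: "reduce strength by 50%",
    1024: "edge boost",
    2048: "passthru",
}


def combine_prefilter(*values):
    """Add individual mode values into a single prefilter mode."""
    mode = 0
    for value in values:
        if value not in PREFILTER_FLAGS:
            raise ValueError(f"unknown prefilter value: {value}")
        mode |= value
    return mode


def describe_prefilter(mode):
    """List the individual modes contained in a combined prefilter mode."""
    if mode == 0:
        return ["off"]
    if mode < 0 or mode & ~sum(PREFILTER_FLAGS):
        raise ValueError(f"mode {mode} contains unknown values")
    return [name for value, name in PREFILTER_FLAGS.items() if mode & value]


mode = combine_prefilter(1, 512, 1024)
print(mode, describe_prefilter(mode))
print(mode + 2048, describe_prefilter(mode + 2048))
```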

The default settings are pretty good for film and similar sources. A good workflow for tweaking is:

1. Start with the defaults or an educated guess.
2. Adjust strength. Moderate reduction in noise is recommended; strong settings out of the gate may impair this workflow.
3. Decrease patch size if fine detail is lost (assuming your initial strength is reasonable). Increase patch size if coarse or splotchy noise is present (values beyond 9 are not typically recommended).
4. Increase search range slightly to potentially increase quality. This is especially useful for sources with repetitive features. Stop when quality diminishes or performance becomes too poor.
5. Decrease strength to compensate for increased search range.
6. For animation or low motion, experiment by adding more temporal frames. Decrease range and strength to compensate.
7. When everything looks pretty great, tune the origin weight and make a final overall strength adjustment (decrease slightly for lower origin weight).
8. If the source is somewhat troublesome and the output still shows some areas where noise isn't removed efficiently, try enabling the prefilter and lowering strength somewhat.

Custom parameters examples

Here are some settings I've had success with and some comparison to hqdn3d. Video only, RF 18.


Baseball (cartoon), 720x480, heavy grain:
   Off:    -                             5.58 Mbit/s
   Medium: hqdn3d  7:7:7:5:5:5           4.07 Mbit/s  hqdn3d strong preset is entirely inadequate here
   Medium: hqdn3d  25:9:9:5:5:5          1.92 Mbit/s  smooth but muddy, edges are smeared/streaked
*  Medium: nlmeans 5.25:0.15:5:25:1:0    2.13 Mbit/s  maximum useful range, excellent but very slow
*  Medium: nlmeans 5.25:0.15:3:17:6:0    2.40 Mbit/s  smaller patch size and 6 frames, minor ghosting, good but slow
*  Strong: nlmeans 10.0:0.15:5:7:10:0    1.16 Mbit/s  zero noise, very fine texture is lost but edges are excellent


Oven (film), 1920x1040, moderate grain with bad chroma:
   Off:    -                            36.40 Mbit/s
   Medium: hqdn3d  4:19:19:4:4:4        24.82 Mbit/s  sacrifices detail and still compresses poorly
   Medium: nlmeans 8.00:1.00:7:3:2:0    13.75 Mbit/s  default looks good
*  Medium: nlmeans 4.65:0.85:7:3:3:0    14.39 Mbit/s  slightly better fine detail
*  Medium: nlmeans 4.65:0.85:7:3:3:1    11.87 Mbit/s  prefiltering dramatically improves chroma
   Medium: nlmeans 4.65:0.85:7:3:3:2    11.44 Mbit/s  only slightly more efficient than prefilter=1


Remote (film), 1920x1040, moderate grain with bad chroma:
   Off:    -                            24.51 Mbit/s
   Medium: hqdn3d  4:19:19:4:4:4         7.26 Mbit/s  chroma bleed and artifacts, luma not too bad
   Medium: nlmeans 8.00:1.00:7:3:2:0     7.14 Mbit/s  default looks good
*  Medium: nlmeans 4.65:0.85:7:3:3:0     6.79 Mbit/s  looks pretty great
*  Medium: nlmeans 3.00:0.85:5:3:3:0:
                   3.25:0.85:7:3:3:0     7.32 Mbit/s  discrete channel tuning yields best result


Whip (film), 1920x804, over-compressed heavy grain, mixed motion:
   Off:    -                            29.10 Mbit/s
   Light:  hqdn3d  4.5:3:3:2:3:3        18.49 Mbit/s  light, many noticeable artifacts
*  Light:  nlmeans 2.25:0.60:7:3:2:0    18.69 Mbit/s  retains a pleasing, even fine grain while removing artifacts
   Medium: hqdn3d  11:8:8:4:5:5         10.45 Mbit/s  strong, smooth with ugly artifacts
   Medium: nlmeans 8.00:1.00:7:3:2:0    10.30 Mbit/s  default pretty good, far fewer artifacts
*  Medium: nlmeans 6.00:0.60:7:3:2:0    10.83 Mbit/s  quite amazing and very fast, still some artifacts
*  Medium: nlmeans 3.00:0.60:7:3:2:1    10.65 Mbit/s  prefiltering eliminates all artifacts


Dale (mobile), 1920x1080, blotchy chroma noise:
   Off:    -                            21.31 Mbit/s
   Light:  hqdn3d  3:2:2:2:3:3          17.43 Mbit/s  source lacks high frequencies, lowpass not too bad
*  Light:  nlmeans 2.75:0.90:5:5:2:0    13.52 Mbit/s  better


Basement (film), 1920x800, clean, very light grain, low motion:
   Off:    -                            14.77 Mbit/s
   Light:  hqdn3d  2:1:1:2:3:3          10.58 Mbit/s  hqdn3d weak preset, slightly smoother than source
   Light:  nlmeans 1.50:1.00:3:7:2:0     8.39 Mbit/s  visually similar to hqdn3d, 43% better compression
*  Light:  nlmeans 1.15:0.80:3:7:2:0    10.28 Mbit/s  nearly identical to source, 30% better compression


Franklin (dslr), 1920x1080, clean:
   Off:    -                            10.65 Mbit/s
   Medium: hqdn3d  3:2:2:2:3:3           5.05 Mbit/s  less pleasing but still good
*  Medium: nlmeans 8.00:1.00:7:3:2:0     4.64 Mbit/s  better at similar bitrate


Jon (dslr), 1920x1080, sensitivity and compression noise:
   Off:    -                            10.93 Mbit/s
   Medium: hqdn3d  3:2:2:2:3:3           5.88 Mbit/s  looks good but leaves artifacts around edges
*  Medium: hqdn3d  2:1:1:2:3:3
         + nlmeans 1.50:0.80:7:3:1:0     5.65 Mbit/s  nlmeans cleans up hqdn3d artifacts


The default settings yield roughly 3 frames per second for 1920x1080p24 and 15 frames per second for 720x480p24 on a 3.33 GHz Intel Xeon processor. The quick-and-dirty parameter string "12:1:7:3:1:0" yields 5 fps for 1920x1080p24 and 28 fps for 720x480p24.

NLMeans is now threaded thanks to JohnAStebbins. Performance increases of 3-9x over the above figures seem typical.


As you can see, in addition to removing unpleasant artifacts, intelligent removal of noise can have very positive effects on bitrate.
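
The "better compression" percentages quoted in the Basement example are bitrate reductions relative to the unfiltered encode. A short sketch for working out the same figure for any row, using bitrates from the tables above (the function name is made up):

```python
def bitrate_reduction(source_mbps, filtered_mbps):
    """Percentage by which the filtered encode is smaller than the unfiltered one."""
    return 100.0 * (1.0 - filtered_mbps / source_mbps)


examples = [
    ("Basement, nlmeans 1.50:1.00:3:7:2:0", 14.77, 8.39),
    ("Basement, nlmeans 1.15:0.80:3:7:2:0", 14.77, 10.28),
    ("Oven, nlmeans 4.65:0.85:7:3:3:1", 36.40, 11.87),
]
for label, source, filtered in examples:
    print(f"{label}: {bitrate_reduction(source, filtered):.0f}% lower bitrate")
```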

Going forward, hqdn3d is mostly useful for sources where the noise is restricted to high frequency data, such as the output of certain mobile devices and DSLRs. In such cases, a simple lowpass does the job with only a slight loss in overall fidelity. In almost every other common case, hqdn3d will not retain enough detail to be truly pleasing. NLMeans is far superior in this regard and every other, except speed.

Future improvement

While quite advanced, the algorithm has minor quirks that provide some room for improvement.
  • Replacing the current strength metric with one that takes patch size, search range, and temporal depth into account would create more consistency between settings and improve usability.
  • Optimizing to not recalculate patch difference weight(b,a) where weight(a,b) has already been calculated could provide up to 2x speed improvement.
  • Maintaining a covariance matrix would provide a method for disposing of calculations deviating far enough from the mean to lower quality, solving the mild problem of cumulative error associated with the algorithm's weight system. Prefiltering partially solves this problem for noisier sources.
  • Replacing the semi-exhaustive search window with a more intelligent, predictive search would potentially increase performance and quality.
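
To illustrate the second item, this sketch counts how many patch-difference evaluations a single frame needs with and without reusing weight(a,b) for weight(b,a). It is a counting exercise under simplifying assumptions (one frame, comparisons outside the frame skipped, hypothetical function name), not a patch against the filter.

```python
def count_distance_evaluations(width, height, search_range, reuse_symmetric):
    """Count patch-difference evaluations for one frame.

    With reuse_symmetric=True the pair (a, b) is evaluated once and the
    result is assumed to be reused for (b, a).
    """
    half = search_range // 2
    seen = set()
    evaluations = 0
    for y in range(height):
        for x in range(width):
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    ny, nx = y + dy, x + dx
                    if dy == 0 and dx == 0:
                        continue  # the origin patch needs no comparison
                    if not (0 <= ny < height and 0 <= nx < width):
                        continue  # skip comparisons outside the frame
                    a, b = (y, x), (ny, nx)
                    key = (min(a, b), max(a, b)) if reuse_symmetric else (a, b)
                    if key in seen:
                        continue
                    seen.add(key)
                    evaluations += 1
    return evaluations


full = count_distance_evaluations(16, 16, 3, reuse_symmetric=False)
shared = count_distance_evaluations(16, 16, 3, reuse_symmetric=True)
print(full, shared, full / shared)
```

Because the search window is symmetric, every comparison has a mirror image, which is where the "up to 2x" estimate comes from; a real implementation would also have to pay for storing the shared results.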

Update 2014-05-24: Add more example sources and settings, update list of potential improvements, clarify some sections, fix some typos.
Update 2014-05-26: Add example images plus additional sources and settings, update patch to latest version, fix some typos.
Update 2014-05-27: Add more example images, update patch to latest version.
Update 2014-05-31: Add prefilter examples, update patch to latest version.
Update 2014-06-01: Add example of discrete channel parameters, update patch to latest version, improve example parameters readability, fix some typos.
Update 2014-06-08: Add preset and tune info, update patch to latest version.
Update 2014-06-10: Update patch to latest version (cosmetics only).
Update 2014-06-12: Update patch to latest version (minor bug fixes).
Update 2014-06-14: Update patch to latest version (minor bug fixes).
Update 2014-06-19: Update patch to commit version.
Update 2014-09-17: Official name is now NLMeans (nlmeans). Update and clarify some sections and add notes about threading committed as revision 6397.
Update 2015-01-24: Replaced remote images with inline attachments, and attached all available examples (including YUV and hqdn3d comparisons) as a tarball.
Update 2015-08-17: Add notes about --nlmeans with no parameters as of revision 7402.
Update 2015-08-17: Add current presets and tunes parameters mapping.
Update 2017-06-16: Update to reflect parameter settings mapping for HandBrake 1.0.0 and later.

Re: NL-means denoise filter

Posted: Tue May 27, 2014 10:00 am
by mod16
This looks very impressive. :)

The current denoise filter is certainly one of the very few weaknesses of HandBrake (for my requirements the only one). I'm really looking forward to this new solution becoming stable and usable for everyone.

Thank you for your effort!

Re: NL-means denoise filter

Posted: Tue May 27, 2014 12:44 pm
by Rodeo
mod16 wrote: and usable for everyone.
Given its speed, even with threading this will most likely never be the case :P

Re: NL-means denoise filter

Posted: Tue May 27, 2014 1:37 pm
by BradleyS
Indeed, it is very slow. However, I'm hopeful that with threading I may achieve 2x real time fps on my CPU, possibly more with improvements to the algorithm. So maybe 0.5x to 1x real time fps for typical machines, eventually. Will be fun to compare performance against HEVC!

Re: NL-means denoise filter

Posted: Tue May 27, 2014 4:21 pm
by mod16
Rodeo wrote:
mod16 wrote: and usable for everyone.
Given its speed, even with threading this will most likely never be the case :P
Nevertheless, it would be really nice to have an alternative to the current solution. Of course, it's a matter of personal preference. I wouldn't care if encoding took twice as long if I got better picture quality and smaller file sizes (given a source where a denoise filter makes sense).

Re: NL-means denoise filter

Posted: Tue May 27, 2014 6:41 pm
by BradleyS
I just added a couple more animation examples. Anime fans will love this filter; it does a great job preserving edges.

Re: NL-means denoise filter

Posted: Sat May 31, 2014 8:17 pm
by BradleyS
Added prefilter example. Prefiltering greatly improves weight analysis in most scenarios. Have a look at Oven 01a Comparison.

Re: NL-means denoise filter

Posted: Sun Jun 01, 2014 2:33 am
by BradleyS
Added another prefilter example.

The source "Whip" was difficult in that the algorithm would not fully remove the noise, forcing a choice between leftover artifacts (still fewer than hqdn3d) or sacrificing detail with very strong settings. Prefiltering completely solves the problem.

Flip back and forth between the original, noisy source and the nlmeans prefilter versions. The results are simply stunning.

Re: NL-means denoise filter

Posted: Mon Jun 02, 2014 8:04 am
by Smithcraft
I have a question for you BradleyS.

Personally, I love grain, and I think that using the grain tune gives me better results in terms of encoded image quality.

Would I be able to preserve grain, but do away with unwanted noise with this filter?


Re: NL-means denoise filter

Posted: Mon Jun 02, 2014 1:49 pm
by BradleyS
Smithcraft wrote: Personally, I love grain, and I think that using the grain tune gives me better results in terms of encoded image quality.
I'm right there with you.
Smithcraft wrote: Would I be able to preserve grain, but do away with unwanted noise with this filter?
Yes. In the most recent version of the patch (uploaded yesterday), I've added discrete channel parameters. To process only the chroma (color) channels and leave the luminance noise (grain), use something like --nlmeans="0:0:0:0:0:0:8:1:7:3:2:0", which is less redundant than and equivalent to --nlmeans="0:0:0:0:0:0:8:1:7:3:2:0:8:1:7:3:2:0".

If your source has heavy grain you can reduce it slightly by using a lesser strength for luminance with something like --nlmeans="2.25:1:7:3:2:0:8", which specifies a strength of 2.25 for luma and 8 for chroma. See the "Whip" example, "Light" settings (no image for this, sorry).

Re: NL-means denoise filter

Posted: Mon Jun 02, 2014 4:31 pm
by BradleyS
On that note, I'm working on some presets for nlmeans like we have for hqdn3d (old --denoise). One of them will skip the luma and act kind of like a chroma smoother, only better. So you shouldn't have to think much about settings for grain preservation, just set it and forget it. Will pair well with the appropriate x264 tune.

Re: NL-means denoise filter

Posted: Sat Jun 07, 2014 7:08 am
by Smithcraft
Glad to hear it!

Thanks BradleyS!


Re: NL-means denoise filter

Posted: Sun Jun 08, 2014 8:07 am
by BradleyS
The latest update introduces presets and tunes.

For most content --nlmeans="medium" --nlmeans-tune="film" will work well. Adjust medium downward/upward to light/strong as necessary.

When coupling very strong settings with very high motion, try a strength preset with --nlmeans-tune="highmotion" to avoid color trails.

For cel animation such as cartoons or anime, try a strength preset with --nlmeans-tune="animation". The animation tune is necessarily slower than tunes for other content types due to increased search range and temporal depth.

For grain preservation (@Smithcraft), try a strength preset with --nlmeans-tune="grain" and use x264 tune:grain. The filter won't touch the luma and it's faster than film since there's less processing involved.

Alternatively, --nlmeans="ultralight" --nlmeans-tune="film" will very slightly affect the luma, if a minor reduction in grain is desired. I've found this to be especially useful in the case of slightly overcompressed grainy sources; it has the potential to turn nasty grain into pleasing grain.

Testing and feedback are appreciated. I recommend 10-30 second segments to get started. 10 seconds of 1080p24 takes about 90 seconds to encode on my CPU; 10 seconds of 480p24 takes about 15 seconds. The results can be excellent, with bitrates reduced by 40-70%.

Re: NL-means denoise filter

Posted: Sun Jun 08, 2014 1:26 pm
by fervid
Cool, looks like the Noise Shampoo Photoshop filter. I don't care how slow it is, I'd love that to be an option in WinGUI. Young Frankenstein really could use it.

Re: NL-means denoise filter

Posted: Thu Jun 19, 2014 10:07 pm
by BradleyS
Committed as revision 6216.

Re: NL-means denoise filter

Posted: Mon Jul 07, 2014 12:23 am
by Djfe
Is someone working on a threaded version of nl-means at the moment? How likely is it that someone will take care of this in the future?

Re: NL-means denoise filter

Posted: Mon Jul 07, 2014 12:57 am
by BradleyS
I haven't spent time on threading because JohnAStebbins is planning to introduce frame-based threading at some point in the near future.

Re: NL-means denoise filter

Posted: Tue Jul 08, 2014 3:55 pm
by Djfe
As general threading for several filters, or just for this one?

Re: NL-means denoise filter

Posted: Tue Jul 08, 2014 7:35 pm
by BradleyS
It will be available to most filters, with minor modifications.

Re: NLMeans denoise filter

Posted: Wed Sep 17, 2014 6:10 pm
by BradleyS
NLMeans is now threaded thanks to JohnAStebbins! Performance is dependent on hardware, of course, but 3-9x speed up seems typical. This is a huge improvement and it certainly goes a long way toward making state of the art denoising accessible to all. Edited to add: NLMeans is available in all GUIs! Use a preset.

I've made some small revisions to the original topic to better reflect the current state of NLMeans. I do not plan to make any further changes to the topic except to add an archive of the external images linked.

Enjoy your squeaky clean videos.

Re: NLMeans denoise filter

Posted: Thu Sep 18, 2014 7:45 pm
by DasMurkel

Are there any plans to bring this filter to OpenCL? From the looks of it, it seems pretty well suited for that. I've looked into doing it myself, but the build process of HB puzzles me a little, and since I'm using Linux, I can't even get the OpenCL-enabled version of HB to compile.

Do you guys have any more (code) documentation about OCL than the flow chart on the Wiki or can somebody tell me how to link libopencl to libhb so I can try and build myself a prototype? Best regards!

Re: NLMeans denoise filter

Posted: Fri Sep 19, 2014 7:56 am
by JohnAStebbins
We don't have any opencl knowledgeable devs. The opencl code in hb was contributed, and that contributor has not stuck around to support or maintain it. IMO it should be removed because it fails so often.

I'm all for trying to get an opencl implementation of nlmeans, but you won't find much help here, and it would be *really* helpful if you stuck around to support it at least enough to teach others.

Re: NLMeans denoise filter

Posted: Fri Sep 19, 2014 8:07 pm
by DasMurkel

I would at least like to try, so I can contribute something to this great piece of software. If you could tell me, or point me to some docs (via PM if you want), what I have to do to add libopencl to libhb's list of linked libraries, I would try to write a prototype to see what the performance is like and whether it's worth pursuing further. If this leads somewhere, I'm happy to stick around.

Re: NLMeans denoise filter

Posted: Sat Sep 20, 2014 2:01 pm
by Rodeo
We link against OpenCL at runtime, so the build requirements are nil.

If you need to use a function that we don't currently support, you can:

- make sure it's part of the OpenCL 1.1 API and works on AMD/NVIDIA/Intel*

- declare it with HB_OCL_API and add it to the hb_opencl_library_s struct with HB_OCL_FUNC_DECL (file: libhb/opencl.h)

- make sure it gets loaded by hb_opencl_library_init with HB_OCL_LOAD (file: libhb/opencl.c)

Then, include "opencl.h" and use hb_ocl->function_name instead of calling the function directly.

The current OpenCL scaling code doesn't run on NVIDIA (which sucks enough as it is), but we won't accept new contributions that don't :P

Re: NLMeans denoise filter

Posted: Thu Oct 30, 2014 11:04 pm
by Kytael
Have any steps been taken towards an OpenCL implementation? I can't find anything other than a small mention in the locked 0.10.0 beta thread: viewtopic.php?f=6&t=30613&hilit=nlmeans#p142091
There is an OpenCL version that seems to work quite well; more info and source code can be found here: NLMeansCL: GPU based Non Local Means Denoising
Following that link, there is also a paper describing a GPU-accelerated real-time NLMeans algorithm: A GPU-accelerated real-time NLMeans algorithm.