IBA::Unsharp_mask() speed and memory optimization #4513

ssh4net · 2024-10-29T04:05:54Z

Description

Replaces the three IBA calls plus a helper function, which together generated four full-size image buffers, with a single unsharp_mask_impl() that uses parallel_image() to compute the unsharp mask:
src + contr * (((src - blur) < threshold) ? 0.0 : (src - blur))
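
The expression above can be sketched per value as follows. This is a minimal, hypothetical illustration and not the code in this PR: `unsharp_value` and `unsharp_buffer` are made-up names, plain `std::vector<float>` stands in for ImageBuf, and the loop is serial where the PR relies on parallel_image(). It follows the expression exactly as written, comparing the signed difference with the threshold.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: apply the expression above to a single value.
inline float unsharp_value(float src, float blur, float contrast, float threshold)
{
    const float diff = src - blur;
    // Differences below the threshold contribute nothing.
    const float kept = (diff < threshold) ? 0.0f : diff;
    return src + contrast * kept;
}

// Hypothetical helper: one pass over two same-sized buffers, writing the
// result directly, with no intermediate "diff" or "thresholded" images.
inline std::vector<float> unsharp_buffer(const std::vector<float>& src,
                                         const std::vector<float>& blur,
                                         float contrast, float threshold)
{
    assert(src.size() == blur.size());
    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = unsharp_value(src[i], blur[i], contrast, threshold);
    return dst;
}
```

Fusing subtract, threshold, multiply and add into one loop is what removes the extra full-size buffers: only the blurred image and the destination need to exist.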

Added a two-pass 1D convolution for kernels larger than 3x3.
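
For readers unfamiliar with the two-pass idea, here is a rough sketch of a separable blur on a single-channel image. It is not the PR's implementation: the `Image` struct, `convolve_1d` and `blur_two_pass` are hypothetical names, edge pixels are clamped purely for simplicity, and the 1D kernel is assumed to be already normalized. Each pass writes to a fresh buffer, so neither pass reads values it has already overwritten.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-channel image, row-major.
struct Image {
    int width;
    int height;
    std::vector<float> px;

    // Read a pixel, clamping coordinates to the image bounds (an assumption
    // of this sketch, not necessarily what the library does at edges).
    float at(int x, int y) const
    {
        x = std::clamp(x, 0, width - 1);
        y = std::clamp(y, 0, height - 1);
        return px[static_cast<std::size_t>(y) * width + x];
    }
};

// Apply an odd-length 1D kernel along one axis into a new image.
inline Image convolve_1d(const Image& in, const std::vector<float>& kernel, bool vertical)
{
    assert(kernel.size() % 2 == 1);
    const int radius = static_cast<int>(kernel.size()) / 2;
    Image out{in.width, in.height, std::vector<float>(in.px.size())};
    for (int y = 0; y < in.height; ++y) {
        for (int x = 0; x < in.width; ++x) {
            float sum = 0.0f;
            for (int i = -radius; i <= radius; ++i) {
                const int sx = vertical ? x : x + i;
                const int sy = vertical ? y + i : y;
                sum += kernel[i + radius] * in.at(sx, sy);
            }
            out.px[static_cast<std::size_t>(y) * in.width + x] = sum;
        }
    }
    return out;
}

// Two-pass blur: vertical pass first, then horizontal.
inline Image blur_two_pass(const Image& in, const std::vector<float>& kernel)
{
    return convolve_1d(convolve_1d(in, kernel, true), kernel, false);
}
```

With the 5-tap binomial kernel {1, 4, 6, 4, 1} / 16, a single bright pixel spreads into the same 5x5 footprint that the equivalent 2D kernel would produce.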

Tests

	// `input` is an ImageBuf prepared earlier (not shown).
	ImageBuf sharped(input.spec());
	const int repeats = 50;
	bool ok = true;

	std::cout << "Start sharpening\n";
	auto start = std::chrono::high_resolution_clock::now();

	for (int i = 0; i < repeats; i++)
	{
		// Alternative, heavier settings:
		//ok = ImageBufAlgo::unsharp_mask(sharped, input, "gaussian", 15.0f, 10.0f, 0.01f);
		// Arguments after the kernel name: width, contrast, threshold.
		ok = ImageBufAlgo::unsharp_mask(sharped, input, "gaussian", 5.0f, 2.0f, 0.05f);
		std::cout << ".";
	}

	std::cout << "\n";

	// Total wall-clock time for all repeats.
	auto part1 = std::chrono::high_resolution_clock::now();
	std::chrono::duration<double> elapsed_part1 = part1 - start;
	std::cout << "Elapsed time: " << elapsed_part1.count() << " s\n";

Both single-threaded (one ImageBuf at a time) and multithreaded (multiple ImageBufs at a time) runs show a good speedup:
roughly 30-40%, with lower memory use.

For 5x5 Gaussian kernels, the two-pass mode should add at least a 20% speedup.
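
As a rough guide to why the two-pass mode helps, assume a separable K x K kernel and count only multiply-adds per output value, ignoring memory traffic and the intermediate buffer the two-pass form needs:

```latex
C_{\text{2D}} = K^2, \qquad
C_{\text{two-pass}} = 2K, \qquad
\frac{C_{\text{2D}}}{C_{\text{two-pass}}} = \frac{K}{2}
\quad\Longrightarrow\quad
K = 5:\ 25 \text{ vs. } 10, \qquad
K = 3:\ 9 \text{ vs. } 6
```

The blur is only one part of the total work, so measured gains need not match this ratio; it only indicates why the benefit grows with kernel size and is small at 3x3.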

(It would be great if someone could run an independent benchmark, since my own results differed a lot depending on whether the workload was real or synthetic.)

Checklist:

I have read the contribution guidelines.
I have updated the documentation, if applicable. (Check if there is no
need to update the documentation, for example if this is a bug fix that
doesn't change the API.)
I have ensured that the change is tested somewhere in the testsuite
(adding new test cases if necessary).
If I added or modified a C++ API call, I have also amended the
corresponding Python bindings (and if altering ImageBufAlgo functions, also
exposed the new functionality as oiiotool options).
My code follows the prevailing code style of this project. If I haven't
already run clang-format before submitting, I definitely will look at the CI
test that runs clang-format and fix anything that it highlights as being
nonconforming.


ssh4net · 2024-10-31T04:03:50Z

ssh4net added 4 commits October 29, 2024 12:54

Optimisation of unsharp mask function

468abcc

Use a single unsharp_mask_impl() function built on parallel_image() instead of 3x IBA calls plus helper functions.

Signed-off-by: Vlad (Kuzmin) Erium <[email protected]>

some cleanup

3ec0c8e

Signed-off-by: Vlad (Kuzmin) Erium <[email protected]>

auto-tests errors fix

e44b08d

Signed-off-by: Vlad (Kuzmin) Erium <[email protected]>

two pass blur for kernels bigger than 3x3

6b28ec0

For kernels of 5x5 or larger, this can give a speedup of 40% or more, increasing with kernel size.

Signed-off-by: Vlad (Kuzmin) Erium <[email protected]>

ssh4net changed the title ~~IBA::Unsharp_mask() speed and memory optimization (after convolution code)~~ IBA::Unsharp_mask() speed and memory optimization Oct 29, 2024

ssh4net added 4 commits October 29, 2024 20:47

clang fix

71d052c

Signed-off-by: Vlad (Kuzmin) Erium <[email protected]>

missed second width

ee6b6fc

Signed-off-by: Vlad (Kuzmin) Erium <[email protected]>

fixing issue from convolve_() in-place

77c5299

Signed-off-by: Vlad (Kuzmin) Erium <[email protected]>

vertical first

76ce825

Signed-off-by: Vlad (Kuzmin) Erium <[email protected]>
