Skip to content

Tags: agnivade/levenshtein

Tags

v1.2.1

Toggle v1.2.1's commit message

Unverified

This commit is not signed, but one or more authors requires that any commit attributed to them is signed.
Add additional cases to corpus

v1.2.0

Toggle v1.2.0's commit message

Unverified

This commit is not signed, but one or more authors requires that any commit attributed to them is signed.
Add tests on long strings with few different characters

Benchmarks are run before and after optimization "Remove leading and trailing identical runes".
Long strings with differences at the beginning (long_lead), in the middle (long_middle) or at the end (long_trail) show significant improvements in processing time and memory allocations. When the optimization is ineffective due to different leading and trailing characters (long_diff) there is no change in processing time or memory allocation.

goos: linux
goarch: amd64
pkg: github.com/agnivade/levenshtein
cpu: AMD Ryzen 7 7840U w/ Radeon  780M Graphics
                      │  before.txt  │              after.txt              │
                      │    sec/op    │   sec/op     vs base                │
Simple/ASCII-16         134.20n ± 0%   79.03n ± 0%  -41.11% (p=0.000 n=20)
Simple/French-16         254.8n ± 0%   129.7n ± 0%  -49.09% (p=0.000 n=20)
Simple/Nordic-16         500.6n ± 1%   208.0n ± 0%  -58.45% (p=0.000 n=20)
Simple/Long_lead-16     1862.0n ± 0%   209.6n ± 1%  -88.75% (p=0.000 n=20)
Simple/Long_middle-16   3613.0n ± 0%   325.0n ± 0%  -91.00% (p=0.000 n=20)
Simple/Long_trail-16    3911.0n ± 0%   399.0n ± 1%  -89.80% (p=0.000 n=20)
Simple/Long_diff-16      4.030µ ± 0%   4.029µ ± 1%        ~ (p=0.899 n=20)
Simple/Tibetan-16        413.0n ± 0%   277.3n ± 0%  -32.86% (p=0.000 n=20)
geomean                  964.6n        299.5n       -68.95%

                      │  before.txt  │              after.txt               │
                      │     B/op     │    B/op     vs base                  │
Simple/ASCII-16         0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=20) ¹
Simple/French-16        0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=20) ¹
Simple/Nordic-16        0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=20) ¹
Simple/Long_lead-16     464.0 ± 0%     368.0 ± 0%  -20.69% (p=0.000 n=20)
Simple/Long_middle-16   672.0 ± 0%     544.0 ± 0%  -19.05% (p=0.000 n=20)
Simple/Long_trail-16    720.0 ± 0%     576.0 ± 0%  -20.00% (p=0.000 n=20)
Simple/Long_diff-16     720.0 ± 0%     720.0 ± 0%        ~ (p=1.000 n=20) ¹
Simple/Tibetan-16       0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=20) ¹
geomean                            ²                -7.99%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                      │  before.txt  │              after.txt               │
                      │  allocs/op   │ allocs/op   vs base                  │
Simple/ASCII-16         0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=20) ¹
Simple/French-16        0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=20) ¹
Simple/Nordic-16        0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=20) ¹
Simple/Long_lead-16     3.000 ± 0%     2.000 ± 0%  -33.33% (p=0.000 n=20)
Simple/Long_middle-16   3.000 ± 0%     2.000 ± 0%  -33.33% (p=0.000 n=20)
Simple/Long_trail-16    3.000 ± 0%     2.000 ± 0%  -33.33% (p=0.000 n=20)
Simple/Long_diff-16     3.000 ± 0%     3.000 ± 0%        ~ (p=1.000 n=20) ¹
Simple/Tibetan-16       0.000 ± 0%     0.000 ± 0%        ~ (p=1.000 n=20) ¹
geomean                            ²               -14.11%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

v1.1.1

Toggle v1.1.1's commit message

Unverified

This commit is not signed, but one or more authors requires that any commit attributed to them is signed.
Refactor

v1.1.0

Toggle v1.1.0's commit message

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
Improving performance (#15)

* Improve performance by removing the "else" part of the matching condition

Benchmark :
name              old time/op    new time/op    delta
Simple/ASCII-4       330ns ± 3%     294ns ± 4%  -10.97%  (p=0.000 n=19+19)
Simple/French-4      609ns ± 4%     537ns ± 5%  -11.85%  (p=0.000 n=19+20)
Simple/Nordic-4     1.26µs ± 5%    1.02µs ± 3%  -18.72%  (p=0.000 n=20+19)
Simple/Tibetan-4    1.01µs ± 3%    0.89µs ± 3%  -12.71%  (p=0.000 n=18+19)
[Geo mean]           712ns          615ns       -13.62%

* Declare "current" variable in the inner loop

* Remove unnecessary initialisation of the cache (variable "x")

The initialisation seems to be done in the loop with "x[lenS1] = prev"

Benchmark :
name              old time/op    new time/op    delta
Simple/ASCII-4       288ns ± 3%     244ns ± 5%  -15.42%  (p=0.000 n=19+20)
Simple/French-4      527ns ± 3%     448ns ± 3%  -15.10%  (p=0.000 n=20+20)
Simple/Nordic-4      989ns ± 3%     843ns ± 3%  -14.75%  (p=0.000 n=20+19)
Simple/Tibetan-4     899ns ± 6%     803ns ± 2%  -10.68%  (p=0.000 n=19+20)
[Geo mean]           606ns          521ns       -14.01%

* Update go.mod and go.sum

* Use range in for loops

name              old time/op    new time/op    delta
Simple/ASCII-4       260ns ± 3%     265ns ± 4%  +2.14%  (p=0.001 n=20+20)
Simple/French-4      472ns ± 3%     488ns ± 3%  +3.41%  (p=0.000 n=20+20)
Simple/Nordic-4      868ns ± 3%     930ns ± 5%  +7.18%  (p=0.000 n=19+19)
Simple/Tibetan-4     823ns ± 3%     878ns ± 3%  +6.70%  (p=0.000 n=19+20)
[Geo mean]           544ns          570ns       +4.84%

* Revert "Use range in for loops"

This reverts commit 58032d8

* Make go.mod reference local module

* Cache initialization is necessary

see TestWithoutCacheInit

* Revert "Remove unnecessary initialisation of the cache (variable "x")"

This reverts commit 670e997

* Add test case to TestSanity

The ABBA case makes the test fail when the cache isn't initialized

* Revert "Make go.mod reference local module"

This reverts commit 518107d

* Revert "Cache initialization is necessary"

This reverts commit 741625d

v1.0.3

Toggle v1.0.3's commit message

Unverified

This commit is not signed, but one or more authors requires that any commit attributed to them is signed.

v1.0.2

Toggle v1.0.2's commit message

Unverified

This commit is not signed, but one or more authors requires that any commit attributed to them is signed.
[Added]: benchmark sink

Add a sink to prevent the compiler from optimizing away the benchmark.
Although this doesn't happen, this is a safety check and best practice.

v1.0.1

Toggle v1.0.1's commit message

Unverified

This commit is not signed, but one or more authors requires that any commit attributed to them is signed.
Making function ComputeDistance() return early on edge cases

v1.0.0

Toggle v1.0.0's commit message

Unverified

This commit is not signed, but one or more authors requires that any commit attributed to them is signed.
Added go.mod