Skip to content

Conversation

@Noratrieb
Copy link
Contributor

Unstable sorting has significantly better code size (and often better performance too). With this change (rebased on top of v0.25.0, which is what std currently uses) the .text size of Rust hello world (with -Clto=fat -Copt-level=s) goes from 297120 to 290351 (-2.23%).

I do not know that much about addr2line, so I cannot guarantee that this code is fine with unstable sorting. But I think it should be fine since it always sorts on addresses, which are either unique or nondeterministic anyways (as mentioned in one comment even). So it should not make a significant difference.

Unstable sorting has significantly better code size (and often better
performance too). With this change (rebased on top of v0.25.0, which is
what std currently uses) the .text size of Rust hello world (with `-Clto=fat
-Copt-level=s`) goes from 297120 to 290351 (-2.23%).

I do not know *that* much about addr2line, so I cannot guarantee that
this code is fine with unstable sorting. But I think it should be fine
since it always sorts on addresses, which are either unique or
nondeterministic anyways (as mentioned in one comment even). So it
should not make a significant difference.
@Noratrieb
Copy link
Contributor Author

before:
image

after:
image

@philipc
Copy link
Contributor

philipc commented Sep 24, 2025

For correctness, I agree that this should only affect parts that were nondeterministic anyway. However, running addr2line --all -af on a libxul.so that I had lying around shows less addresses, so I'll try to figure out what's going on there before merging.

The performance difference when running scripts/benchmark-addr2line-docker.sh is noise.

cargo bench shows mixed results. These benchmarks aren't very stable but there does appear to be a difference here:

context_new             time:   [515.98 µs 516.26 µs 516.56 µs]
                        change: [+4.7649% +5.3754% +6.0487%] (p = 0.00 < 0.05)
                        Performance has regressed.

context_new_parse_lines time:   [5.0581 ms 5.0624 ms 5.0670 ms]
                        change: [+0.1764% +0.2994% +0.4055%] (p = 0.00 < 0.05)
                        Change within noise threshold.

context_new_parse_functions
                        time:   [7.6034 ms 7.6128 ms 7.6234 ms]
                        change: [−5.9501% −5.7841% −5.6126%] (p = 0.00 < 0.05)
                        Performance has improved.

context_new_parse_inlined_functions
                        time:   [27.333 ms 27.420 ms 27.518 ms]
                        change: [+1.7455% +2.1007% +2.4944%] (p = 0.00 < 0.05)
                        Performance has regressed.

context_query_location  time:   [832.62 µs 834.19 µs 836.49 µs]
                        change: [−3.2239% −2.8645% −2.5365%] (p = 0.00 < 0.05)
                        Performance has improved.

context_query_with_functions
                        time:   [1.9885 ms 1.9901 ms 1.9917 ms]
                        change: [+0.8561% +1.0993% +1.2944%] (p = 0.00 < 0.05)
                        Change within noise threshold.

context_new_and_query_location
                        time:   [623.46 µs 623.77 µs 624.11 µs]
                        change: [+2.3716% +3.0202% +3.5973%] (p = 0.00 < 0.05)
                        Performance has regressed.

context_new_and_query_with_functions
                        time:   [750.20 µs 750.53 µs 750.90 µs]
                        change: [−0.2144% +0.3569% +0.8736%] (p = 0.22 > 0.05)
                        No change in performance detected.

Not necessary for correctness, but helps compare results.
@Noratrieb
Copy link
Contributor Author

I ran the benchmarks locally and also saw a few small performance slowdowns, but I'm not sure whether it's really significant. Stable vs unstable sorting performance varies a bunch, where either can be faster depending on the circumstances.

Copy link
Contributor

@philipc philipc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@philipc philipc merged commit cad80f5 into gimli-rs:master Sep 27, 2025
12 checks passed
@philipc
Copy link
Contributor

philipc commented Sep 27, 2025

Does rust have an equivalent of C's qsort that can be used for any type given just the size and the comparison function (instead of having 5 copies of the unstable sort algorithm, one for each type we call it for)?

@philipc
Copy link
Contributor

philipc commented Sep 27, 2025

However, running addr2line --all -af on a libxul.so that I had lying around shows less addresses, so I'll try to figure out what's going on there before merging.

The unit_ranges sorting was one cause of this, and #359 fixes another.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants