Use unstable instead of stable sorts #358

Noratrieb · 2025-09-23T19:59:21Z

Unstable sorting has significantly better code size (and often better performance too). With this change (rebased on top of v0.25.0, which is what std currently uses) the .text size of Rust hello world (with -Clto=fat -Copt-level=s) goes from 297120 to 290351 (-2.23%).

I do not know that much about addr2line, so I cannot guarantee that this code is fine with unstable sorting. But I think it should be fine since it always sorts on addresses, which are either unique or nondeterministic anyways (as mentioned in one comment even). So it should not make a significant difference.

Unstable sorting has significantly better code size (and often better performance too). With this change (rebased on top of v0.25.0, which is what std currently uses) the .text size of Rust hello world (with `-Clto=fat -Copt-level=s`) goes from 297120 to 290351 (-2.23%). I do not know *that* much about addr2line, so I cannot guarantee that this code is fine with unstable sorting. But I think it should be fine since it always sorts on addresses, which are either unique or nondeterministic anyways (as mentioned in one comment even). So it should not make a significant difference.

Noratrieb · 2025-09-23T20:01:07Z

before:

after:

philipc · 2025-09-24T05:56:40Z

For correctness, I agree that this should only affect parts that were nondeterministic anyway. However, running addr2line --all -af on a libxul.so that I had lying around shows less addresses, so I'll try to figure out what's going on there before merging.

The performance difference when running scripts/benchmark-addr2line-docker.sh is noise.

cargo bench shows mixed results. These benchmarks aren't very stable but there does appear to be a difference here:

context_new             time:   [515.98 µs 516.26 µs 516.56 µs]
                        change: [+4.7649% +5.3754% +6.0487%] (p = 0.00 < 0.05)
                        Performance has regressed.

context_new_parse_lines time:   [5.0581 ms 5.0624 ms 5.0670 ms]
                        change: [+0.1764% +0.2994% +0.4055%] (p = 0.00 < 0.05)
                        Change within noise threshold.

context_new_parse_functions
                        time:   [7.6034 ms 7.6128 ms 7.6234 ms]
                        change: [−5.9501% −5.7841% −5.6126%] (p = 0.00 < 0.05)
                        Performance has improved.

context_new_parse_inlined_functions
                        time:   [27.333 ms 27.420 ms 27.518 ms]
                        change: [+1.7455% +2.1007% +2.4944%] (p = 0.00 < 0.05)
                        Performance has regressed.

context_query_location  time:   [832.62 µs 834.19 µs 836.49 µs]
                        change: [−3.2239% −2.8645% −2.5365%] (p = 0.00 < 0.05)
                        Performance has improved.

context_query_with_functions
                        time:   [1.9885 ms 1.9901 ms 1.9917 ms]
                        change: [+0.8561% +1.0993% +1.2944%] (p = 0.00 < 0.05)
                        Change within noise threshold.

context_new_and_query_location
                        time:   [623.46 µs 623.77 µs 624.11 µs]
                        change: [+2.3716% +3.0202% +3.5973%] (p = 0.00 < 0.05)
                        Performance has regressed.

context_new_and_query_with_functions
                        time:   [750.20 µs 750.53 µs 750.90 µs]
                        change: [−0.2144% +0.3569% +0.8736%] (p = 0.22 > 0.05)
                        No change in performance detected.

src/unit.rs

Not necessary for correctness, but helps compare results.

Noratrieb · 2025-09-26T18:15:13Z

I ran the benchmarks locally and also saw a few small performance slowdowns, but I'm not sure whether it's really significant. Stable vs unstable sorting performance varies a bunch, where either can be faster depending on the circumstances.

philipc

Thanks!

philipc · 2025-09-27T06:04:05Z

Does rust have an equivalent of C's qsort that can be used for any type given just the size and the comparison function (instead of having 5 copies of the unstable sort algorithm, one for each type we call it for)?

philipc · 2025-09-27T06:08:34Z

However, running addr2line --all -af on a libxul.so that I had lying around shows less addresses, so I'll try to figure out what's going on there before merging.

The unit_ranges sorting was one cause of this, and #359 fixes another.

philipc reviewed Sep 25, 2025

View reviewed changes

src/unit.rs Outdated Show resolved Hide resolved

Sort parsed unit ranges by unit_id as well

2be7d3e

Not necessary for correctness, but helps compare results.

philipc approved these changes Sep 27, 2025

View reviewed changes

philipc merged commit cad80f5 into gimli-rs:master Sep 27, 2025
12 checks passed

philipc mentioned this pull request Sep 27, 2025

Ignore empty line sequences #359

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Use unstable instead of stable sorts #358

Use unstable instead of stable sorts #358

Uh oh!

Noratrieb commented Sep 23, 2025

Uh oh!

Noratrieb commented Sep 23, 2025

Uh oh!

philipc commented Sep 24, 2025

Uh oh!

Uh oh!

Noratrieb commented Sep 26, 2025

Uh oh!

philipc left a comment

Uh oh!

Uh oh!

philipc commented Sep 27, 2025

Uh oh!

philipc commented Sep 27, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Use unstable instead of stable sorts #358

Use unstable instead of stable sorts #358

Uh oh!

Conversation

Noratrieb commented Sep 23, 2025

Uh oh!

Noratrieb commented Sep 23, 2025

Uh oh!

philipc commented Sep 24, 2025

Uh oh!

Uh oh!

Noratrieb commented Sep 26, 2025

Uh oh!

philipc left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

philipc commented Sep 27, 2025

Uh oh!

philipc commented Sep 27, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants