Skip to content

Conversation

@heshpdx
Copy link
Contributor

@heshpdx heshpdx commented Sep 9, 2024

I was requested to dig further in the spirit of !852 so I spent time inspecting other parts of the code. This PR unlocks more FDIV->FMUL opportunities. I inspected the assembly and confirmed that each of the changes did result in a transformation of FDIV to FMUL.

Here are the perf outliers, with the majority of other tests in the noise (since likely they do not exercise the changed code).

Grid path cells shows 4.3% upside, as measured on a Ampere Altra in the OCI Cloud, using gcc-10 out of the box. Regular h3 cmake build.
mainline:

$ ./build/bin/benchmarkGridPathCells 
        -- gridPathCellsNear: 39.136827 microseconds per iteration (10000 iterations)
        -- gridPathCellsFar: 1799.592192 microseconds per iteration (1000 iterations)

branch:

$ ./build_fast/bin/benchmarkGridPathCells 
        -- gridPathCellsNear: 37.556445 microseconds per iteration (10000 iterations)
        -- gridPathCellsFar: 1724.603344 microseconds per iteration (1000 iterations)

The experimental polygon to cells overlapping is also an outlier, 1-2%:

mainline:

        -- polygonToCellsSF_Overlapping: 5817.474534 microseconds per iteration (500 iterations)
        -- polygonToCellsAlameda_Overlapping: 13169.533630 microseconds per iteration (500 iterations)
        -- polygonToCellsSouthernExpansion_Overlapping: 381815.775600 microseconds per iteration (10 iterations)

branch:

        -- polygonToCellsSF_Overlapping: 5773.028512 microseconds per iteration (500 iterations)
        -- polygonToCellsAlameda_Overlapping: 12886.969474 microseconds per iteration (500 iterations)
        -- polygonToCellsSouthernExpansion_Overlapping: 377545.843300 microseconds per iteration (10 iterations)

More FDIV->FMUL opportunities unlocked, following in the
spirit of #852
@coveralls
Copy link

coveralls commented Sep 9, 2024

Coverage Status

coverage: 98.824% (-0.002%) from 98.826%
when pulling b7542c3 on heshpdx:master
into a1037eb on uber:master.

Copy link
Collaborator

@nrabinowitz nrabinowitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these optimizations!

Co-authored-by: Nick Rabinowitz <[email protected]>
@heshpdx
Copy link
Contributor Author

heshpdx commented Sep 18, 2024

Looks like the CIFuzz is broken. Do we need to update that runner?

@isaacbrodsky
Copy link
Collaborator

Looks like the CIFuzz is broken. Do we need to update that runner?

I'm addressing it in #907. You may need to rebase or merge that change in too

@isaacbrodsky isaacbrodsky merged commit 4899d29 into uber:master Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants