Pull requests: jax-ml/jax
[Mosaic GPU] Optimize int4->int8 casts following relayouts
#31950
by copybara-service
bot
was merged Sep 19, 2025
[Mosaic GPU] Add an optimized conversion routine for int4 to int8 casts
#31949
by copybara-service
bot
was merged Sep 19, 2025
Strengthen the tests for type conversions
#31948
by copybara-service
bot
was merged Sep 19, 2025
[Pallas:MGPU] Add support for clusters and TMA multicast in the Hopper matmul
#31782
by copybara-service
bot
was merged Sep 19, 2025
[Pallas:MGPU] Add support for collective copies in pipelining helpers
#31781
by copybara-service
bot
was merged Sep 19, 2025
[Pallas:MGPU] Allow changing the number of arrivals on cluster barriers
#31780
by copybara-service
bot
was merged Sep 18, 2025
[Mosaic GPU] Allow varying the arrival count on cluster barriers
#31779
by copybara-service
bot
was merged Sep 18, 2025
[Pallas:MGPU] Add tests for the newly added tuning options
#31776
by copybara-service
bot
was merged Sep 18, 2025
[Pallas:MGPU] Use the planar_snake grid iteration order to improve L2 hit rate in the Blackwell matmul
#31773
by copybara-service
bot
was merged Sep 12, 2025
[Pallas:MGPU] Add support for tuning epilogue_tile_n in Blackwell matmul
#31749
by copybara-service
bot
was merged Sep 11, 2025
[Pallas:MGPU] Use a single barrier to wait for A and B transfers in Blackwell matmul
#31748
by copybara-service
bot
was merged Sep 11, 2025
[Pallas:MGPU] Allow tuning the non-contracting dimension split over compute WGs in Hopper matmul
#31746
by copybara-service
bot
was merged Sep 12, 2025
[Mosaic GPU] Fix the incorrect type annotations in the Mosaic GPU profiler
#31743
by copybara-service
bot
was merged Sep 11, 2025
[Pallas:MGPU] Avoid bubbles between steps of the MN loop in the Hopper matmul
#31737
by copybara-service
bot
was merged Sep 12, 2025
[Pallas:MGPU] Don't use delay_release in Hopper MGPU matmul example
#31736
by copybara-service
bot
was merged Sep 11, 2025
[Pallas:MGPU] Use the planar_snake CTA order to improve L2 cache hits in Hopper matmul
#31735
by copybara-service
bot
was merged Sep 11, 2025
[Pallas:MGPU] Add a helper for a 2D grid traversal with better locality
#31705
by copybara-service
bot
was merged Sep 11, 2025
[Pallas:MGPU] Add support for tiling the TMA epilogue
#31699
by copybara-service
bot
was merged Sep 11, 2025
[Mosaic GPU] Add support for performing multiple measurements in a single cupti session
#31691
by copybara-service
bot
was closed Sep 11, 2025
[Pallas:MGPU] Use multiple iterations when benchmarking the Hopper matmul
#31690
by copybara-service
bot
was closed Sep 11, 2025
[Pallas:MGPU] Add async_prefetch support in warp thread semantics
#31629
by copybara-service
bot
was closed Sep 8, 2025
[Mosaic GPU] Use all SMs to send the data
#31624
by copybara-service
bot
was merged Sep 8, 2025
[Mosaic GPU] Only send from one SM in each group of SMs that share the same M coordinate
#31623
by copybara-service
bot
was closed Sep 8, 2025
[Mosaic GPU] Improve the async_prefetch test
#31622
by copybara-service
bot
was merged Sep 8, 2025
[Pallas:MGPU] Support async_prefetch in warpgroup lowering
#31621
by copybara-service
bot
was merged Sep 8, 2025