Release v3.8.0

@hanhanW

IREE Release v3.8.0

1. Compiler

1.1 Data Tiling & Scaled Matmul

Introduced DataTiledScaledMMAAttr and implemented scaled matmul data tiling materialization using new scaled intrinsic attributes for improved codegen flexibility. (#22176, #22189)
Added ping-pong ukernel support for FP8 and FP16 data tiling, tuned for LLaMA workloads, reducing latency by up to 30–40% compared with non-data-tiled paths. (#21919)
Added ROCm encoding specialization via UKernelProviderInterface for data-tiled ukernels. (#21914)
Introduced intentional padded configurations for (I)GEMM to improve convolution performance by ~8% with no degradation in backward paths. (#21931)
Disabled data-tiling by default for CPU backends due to memory and backend inconsistencies; it’s now opt-in via --iree-opt-data-tiling, with updated CPU docs and tests reflecting the change. (#21935)
Published a detailed blog on Data Tiling introducing how operand layouts are transformed to match hardware-preferred formats for better locality and cache efficiency. (https://iree.dev/community/blog/2025-08-25-data-tiling-walkthrough/)
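
As a rough illustration of the layout transformation the blog describes, the sketch below packs a row-major matrix into contiguous tiles. It is a minimal sketch, not IREE's implementation: the names `pack_tiles` and `flatten`, the tile sizes, and the zero padding are hypothetical choices made only for this example, and the compiler selects real tile shapes per target.

```python
from typing import List

Matrix = List[List[float]]


def pack_tiles(matrix: Matrix, tile_rows: int, tile_cols: int,
               pad: float = 0.0) -> List[List[Matrix]]:
    """Rearrange a row-major matrix into a grid of equally sized tiles.

    The result is indexed as packed[outer_row][outer_col][inner_row][inner_col].
    Edge tiles are filled with `pad` so that every tile has the same shape.
    (Hypothetical helper for illustration; not an IREE API.)
    """
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    outer_rows = -(-rows // tile_rows)  # ceiling division
    outer_cols = -(-cols // tile_cols)
    packed = []
    for outer_row in range(outer_rows):
        tile_row = []
        for outer_col in range(outer_cols):
            tile = []
            for inner_row in range(tile_rows):
                r = outer_row * tile_rows + inner_row
                first_col = outer_col * tile_cols
                tile.append([
                    matrix[r][c] if r < rows and c < cols else pad
                    for c in range(first_col, first_col + tile_cols)
                ])
            tile_row.append(tile)
        packed.append(tile_row)
    return packed


def flatten(packed: List[List[Matrix]]) -> List[float]:
    """Return the packed elements in the order they would sit in memory."""
    return [v for tile_row in packed for tile in tile_row
            for row in tile for v in row]


# A 2x4 matrix packed into 2x2 tiles: each tile becomes one contiguous run.
example = [[0, 1, 2, 3],
           [4, 5, 6, 7]]
print(flatten(pack_tiles(example, 2, 2)))  # [0, 1, 4, 5, 2, 3, 6, 7]
```

After packing, the elements a kernel consumes together sit next to each other in memory, which is the locality benefit the blog post refers to.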

1.2 Convolution

Transposed the input backward convolution filter layout from CHWF to FHWC, aligning it with matmul_transpose_b and improving performance. (#22100)
Reordered iterator dimensions for input backward convolutions to match forward NHWC-FHWC conv layout, simplifying autotuning and shape handling. (#22208)
Enabled extract slice propagation during convolution padding to improve fusion opportunities. (#21948)
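
The CHWF to FHWC filter change above is an axis permutation. The sketch below shows only the index arithmetic on a flat row-major buffer; the function name `transpose_chwf_to_fhwc` and the tiny shape are hypothetical, and it does not model the preprocessing pass itself.

```python
from itertools import product
from typing import List, Sequence, Tuple


def transpose_chwf_to_fhwc(filt: Sequence[float],
                           shape: Tuple[int, int, int, int]) -> List[float]:
    """Permute a flat row-major CHWF filter into FHWC order.

    `shape` is (C, H, W, F). Hypothetical helper for illustration only.
    """
    C, H, W, F = shape
    if len(filt) != C * H * W * F:
        raise ValueError("buffer length does not match shape")
    out = [0.0] * len(filt)
    for c, h, w, f in product(range(C), range(H), range(W), range(F)):
        src = ((c * H + h) * W + w) * F + f   # offset in CHWF
        dst = ((f * H + h) * W + w) * C + c   # offset in FHWC
        out[dst] = filt[src]
    return out


# C=2, H=W=1, F=3: element value is c*3 + f in the source layout.
print(transpose_chwf_to_fhwc([0, 1, 2, 3, 4, 5], (2, 1, 1, 3)))
# [0, 3, 1, 4, 2, 5]
```

After the permutation, C is the fastest-varying dimension and F the slowest.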

1.3 Matmul & Vector Distribute

Removed virtual MMAs from vector distribute matmul/conv pipelines to fix regressions and restore original performance on Punet configurations. (#22202)
Added support for distributing subgroups across multiple M dimensions in vector distribute pipelines, improving parallel utilization. (#22000)

1.4 Others

Added encoding propagation and fusion passes in the default dispatch creation path, improving layout-based fusion. (#22063)
Introduced optional split-reduction size inference for batch normalization. (#21731)
Fused broadcasts with attention consumers instead of producers, improving dimension inference and downstream fusion. (#22008)
Updated ConvertAccGEMMToGEMM to support scaled GEMMs. (#22093)
Reordered memref reshapes above empty tensor elimination to ensure correct dominance in bufferization. (#22045)
Fixes and Refinements (#22222, #22106, #22179, #22041, #22143, #22095, #22233, #22197, #22195, #22033, #22031, #21997, #21910, #21970, #21952, #21900, #21890, #21665, #22100, #22208, #22045, #22202)
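
For readers unfamiliar with split reduction (see the batch normalization item above), the general idea can be sketched as follows. This is a generic illustration under stated assumptions: `split_reduce_sum` is a hypothetical name, the split size is picked by hand, and the size inference added in #21731 is not modeled here.

```python
from typing import List, Sequence, Tuple


def split_reduce_sum(values: Sequence[float],
                     split_size: int) -> Tuple[float, List[float]]:
    """Sum `values` in two stages: per-chunk partial sums, then a final sum.

    The partial sums are independent of each other, so they could run in
    parallel. Hypothetical helper for illustration only.
    """
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    partials = [sum(values[i:i + split_size])
                for i in range(0, len(values), split_size)]
    return sum(partials), partials


total, partials = split_reduce_sum(list(range(10)), 4)
print(partials, total)  # [6, 22, 17] 45
```

The choice of split size trades the number of parallel chunks against the cost of the final combine step, which is why inferring it automatically is useful.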

2. Runtime

Split hoisted async constant lifetimes to drastically reduce retained memory (e.g., 9 GB → 500 KB in large tiled workloads). (#21995)
Added per-entry-point flags and workgroup size emission, preparing for new HAL APIs and better runtime introspection.
- ⚠️ Breaking change: local executable library format bumped to v0.6. (#21754, #22078, #21950)
Updated GPU executable headers for versioning and added a new infer-format call to safely infer executable data format and size.
- ⚠️ Breaking change: requires GPU executable recompilation. (#21763)
Switched the CPU matmul configuration to the linalg::LinalgOp interface for better op fusion and flexibility. (#21954)
General Enhancements and Fixes (#22101, #22110, #22102, #22048, #21921, #22075)

Change Log

Git History

What's Changed

[DT] Fuse encoding ops more aggressively for multi-use, gather, and slices ops. by @hanhanW in #21830
[Codegen][Tuner]: improve python binding to query target info by @bangtianliu in #21812
[Codegen][Tuner] retire the C/Python binding for querying mma intrinsic. NFC. by @bangtianliu in #21816
[Integrate] Drop llvm/llvm-project@b4c31dc revert. by @hanhanW in #21851
[Encoding] Support SetEncoding on scaled contraction ops by @Max191 in #21825
[Test] Add onnx_ops test suites with O2/O3 optimization level. by @hanhanW in #21838
[CodeGen] Do not fuse parallel ops if they directly write to destination. by @hanhanW in #21837
[GPU] Add pattern to fold fill into pad ops by @nirvedhmeshram in #21864
[Codegen][IGEMM] Do not pre-pad convs with CHW layout or small input channel size by @yzhang93 in #21839
[GPU] Remove reshape by expansion in workgroup scope of combine layout pass by @nirvedhmeshram in #21869
[CPU] Remove passing tests from expected_compile_failures list. by @hanhanW in #21871
[GPU] Use Affine map for size calculations of alloca's in fission pass by @nirvedhmeshram in #21870
[Codegen][AMDGPU] Fix matmul miscompile on RDNA4 by @kuhar in #21873
[NFC] Code Quality changes by @Muzammiluddin-Syed-ECE in #21876
Avoid needless isa checks. NFC. by @kuhar in #21885
[VectorDistribute] Refactor layout configuration to a simpler logic by @Groverkss in #21883
[StableHLO][CHLO]Refactor CHLO decompositions to follow upstream StableHLO by @LekkalaSravya3 in #21682
Revert "[VectorDistribute] Refactor layout configuration to a simpler logic" by @Groverkss in #21887
[docs] Clarify compiler coding standards by @kuhar in #21886
Upgrade Preprocessing and Modules to free create functions. NFC. by @kuhar in #21877
[Codegen] Upgrade Common, SPIRV, VMVX to free create functions. NFC. by @kuhar in #21879
[Codegen] Upgrade LLVMCPU and LLVMGPU to free create functions. NFC. by @kuhar in #21880
[Codegen] Upgrade Dialect and Interfaces to free create functions. NFC. by @kuhar in #21881
Add gfx950 ukernel patterns by @sebvince in #21856
Bump version to 3.8.0 after 3.7.0 release. by @sa-faizal in #21852
[docs] Update the file config file for running ONNX operator tests on CPU. by @hanhanW in #21892
Upgrade GlobalOpt, InputConversion, ExternalInterfaces to free create function. NFC. by @kuhar in #21878
[Codegen] Upgrade Transforms and Utils to free create functions. NFC. by @kuhar in #21882
[ROCM] Update Ukernel infra to handle InnerTiledOp/Multi_MMA_MFMA by @Abhishek-Varma in #21759
Reland "[VectorDistribute] Refactor layout configuration to a simpler logic" by @Groverkss in #21895
Upgrade IREE plugins to free create functions. NFC. by @kuhar in #21896
[GPU] Remove MMAScheduleAttr by @Groverkss in #21884
[LLVMCPU] Respect dominance when doing replacement of tile and fused values by @MaheshRavishankar in #21901
[Codegen] Upgrade iree dialects to free create functions. NFC. by @kuhar in #21898
Integrate LLVM at llvm-project/llvm@daf8f9fc1ccc6c5679bc89058fd66d8ea4da9d59 by @rkayaith in #21893
Upgrade all remaining code to free create functions. NFC. by @kuhar in #21902
[LLVMGPU] Move LLVMGPUVectorLowering after OptimizeIntArithmetic by @Max191 in #21597
[Codegen] Promote scales to LDS by @Muzammiluddin-Syed-ECE in #21767
Bump the github-actions group with 2 updates by @dependabot[bot] in #21897
Integrate llvm/llvm-project@31bee3421ba4 by @rkayaith in #21905
[CPU] Tile all the ops to target vector sizes before vectorization. by @hanhanW in #21900
[LinalgExt] Fold subview ops into map_scatter output before decomposing by @Max191 in #21891
[GPU] Do not do c promotion for unaligned (I)GEMMs by @nirvedhmeshram in #21823
[Codegen][ROCm] Add repro instructions for .rocmasm files by @kuhar in #21874
[LinalgExt] Fix FoldWithProducerReshapeByExpansion for >1 dyn dim by @IanWood1 in #21894
At the beginning of emulate narrow type, flatten incoming memrefs by @lialan in #21910
Revert "Disable failing ARM-SME tests. (#21715)" by @banach-space in #21860
[Codegen][AMDGPU] Drop backend reverts, emergency RDNA4 lowering fix by @krzysz00 in #21906
[codegen] more consumer fusion by @jtuyls in #21848
[CPU][DT] Add codegen support for broadcast/dequant -> matmul dispatch. by @hanhanW in #21911
[Codegen][IGEMM] Set convolution pre-padding as default by @yzhang93 in #21899
[Codegen][GenericVectorization] Fix incorrect usage of std::accumulation that led to overflow by @mshockwave in #21920
Integrate llvm/llvm-project@e92cbfbe3087 by @rkayaith in #21917
[Codegen][Cleanup] Always enable vectorization for padding and gather. by @hanhanW in #21924
[Test] Disable AMDGPU onnx_ops test suite (O0) job. by @hanhanW in #21929
Integrate llvm/torch-mlir@7000187b by @rkayaith in #21918
Bump nanobind version by @Hardcode84 in #21926
[iree][codegen] Add #iree_codegen.denormal_fp_math to set denormals behavior by @fabianmcg in #21840
[ROCM] Add back specialization pattern tests by @jtuyls in #21939
Fix --iree-hip-target validation by @bjacob in #21909
Integrate LLVM at llvm/llvm-project@b22f94dcc58e by @rkayaith in #21943
[GPU] Propagate extract slice when doing convolution padding by @nirvedhmeshram in #21948
[CPU] Adjust tile sizes for mmt4d dispatches that have relayout ops. by @hanhanW in #21934
Fixing CSE of hoisted encoding ops. by @benvanik in #21921
Adding util.list.construct pseudo-op. by @benvanik in #21950
[Dispatch Creation] Don't fuse no input producer with reduction by @IanWood1 in #21930
Revert "[LinalgExt] Fix FoldWithProducerReshapeByExpansion for >1 dyn dim" by @IanWood1 in #21947
[DispatchCreation]: Add FormSplitReductionDispatchesPass support for ArgCompare op by @bangtianliu in #21903
[CPU] Add an experimental flag to disable linalg.conv generalization. by @hanhanW in #21953
[GPU][DT] Add pingpong ukernels for data tiling (f8 and f16) by @Yu-Zhewen in #21919
[ROCM][DT] Add encoding specialization infra for data-tiled ukernels by @jtuyls in #21914
[GPU] Use UkernelDescriptor and deprecate UkernelConfigAttr and GPULowerToUkernelsPass by @Abhishek-Varma in #21766
[docs] Update docs on sdxl golden output by @efric in #21936
Fix Dispatch Creation TransformOptions by @IanWood1 in #21964
Integrate LLVM at llvm/llvm-project@ed1f1b8 by @rkayaith in #21963
[docs] Add a blog post for data-tiling introduction. by @hanhanW in #21774
Avoid needless isa checks. NFC. by @bangtianliu in #21968
using Base::Base in tablegen passes. by @benvanik in #21969
[iree][codegen] Set #iree_codegen.denormal_fp_math in attention dispatches by @fabianmcg in #21940
[compiler][NFC] Update remaining code to free create functions. by @hanhanW in #21972
[plugins][NFC] Upgrade plugins/ to free create functions. by @hanhanW in #21973
[GPU][DT] Update data layout strategy for pingpong ukernels by @Yu-Zhewen in #21957
Using explicit operation types in passes. by @benvanik in #21971
Converting compiler/Bindings/ to tablegen Passes.td. by @benvanik in #21974
[Codegen] Unroll instead of linearize vector.to_elements. by @amd-eochoalo in #21959
[Codegen] Added erf ; FastMath rewrite for vector types. by @keshavvinayak01 in #21849
Adding hal.executable lazy flag. by @benvanik in #21966
Don't inline immutable globals with non-util dialect attrs. by @benvanik in #21986
[Codegen][RISCV] Do not lower vector.gather to branches in the presence of RVV by @mshockwave in #21927
[GPU] Only combine complex relayout chains in GPUCombineLayoutTransformation by @Max191 in #21985
[LLVMGPU] Move masked load optimizations after vector lowering by @Max191 in #21962
[iree-test-suites] Update golden benchmark numbers by @Max191 in #21980
[Encoding] Deprecate MatmulKAttr encoding attribute. by @hanhanW in #21976
[Codegen] Make collapse_shape hoisting pattern work with store_to_buffer by @Max191 in #21999
[LinalgExt] Add canonicalization to convert identity map_scatter to copy by @Max191 in #21998
[GPU][DT] Add benchmark files for llama_8b_f16 with data-tiling. by @hanhanW in #21975
[Encoding] set default option for scaled matmul encodings to false by @Muzammiluddin-Syed-ECE in #21994
Marking stream.async.dispatch as pure. by @benvanik in #21989
[Dispatch Creation] Allow fusing pad with split reduction dispatch by @IanWood1 in #21987
Fix data race in GPU C ukernels caching of shared memory size by @bjacob in #22004
[LinalgExt][NFC] Remove unused code in TransposeFusion by @IanWood1 in #22006
[Dispatch Creation] Rework dispatch formation logic by @IanWood1 in #21854
[TensorExt] Fix dynamic dim canonicalization in bitcast folder by @jtuyls in #21997
[CPU] Switch matmul config to use linalg::LinalgOp interface. by @hanhanW in #21954
Bump the github-actions group with 2 updates by @dependabot[bot] in #21992
Integrate LLVM at llvm/llvm-project@0648c5183f32 by @qedawkins in #22003
[VectorDistribute] Use subgroup_basis instead of subgroup_m/n_count by @Groverkss in #21912
Splitting hoisted async constant lifetime. by @benvanik in #21995
Adding iree_hal_executable_export_info_t and queries. by @benvanik in #21754
Respect user FILECHECK_OPTS/LIT_OPTS environment variables when running through ctest by @rkayaith in #22019
[PassUtils] Allow passing overload constructors to addPredicatedPass by @rkayaith in #22021
[NFC] remove unused header files by @bangtianliu in #21977
[Codegen][GPU] Enable TileAndFuse for matmul by default by @jerryyin in #21834
[CPU] Populate to_elements unrolling patterns in LLVM conversion. by @hanhanW in #22010
Fix mi308 Pkgci failures by @IanWood1 in #22028
[Dispatch Creation] Fuse bcast with attention instead of producer by @IanWood1 in #22008
[CPU] Add precondition to kernel dispatch method selection for gemm. by @hanhanW in #22031
[Codegen][Tuner] update lowering config binding for subgroup basis by @bangtianliu in #22027
Fix indices in scaled matmul rank assert by @jtuyls in #22016
[GPU] Introduce Intentional Padded Configurations for (I)GEMM by @nirvedhmeshram in #21931
[CI] Disabling WebGPU build due to CI failures. by @MaheshRavishankar in #22030
[DispatchCreation] Add option to infer split-reduction sizes for batchnorm by @rkayaith in #21731
Implement iree_gpu.coalesced_gather_dma op by @lialan in #21846
[LinalgExt] Support map_scatter decomposition with strided memrefs by @Max191 in #21952
[Codegen] Tile map_scatter op for large vector sizes by @Max191 in #22035
[DispatchCreation] Fix iree-compile split-reduction flag name by @rkayaith in #22038
Integrate LLVM at llvm/llvm-project@dffd7f3d9a3 by @qedawkins in #22023
[NFC][ROCM] Refactor bitcode ukernel to a separate file by @Abhishek-Varma in #21983
[VectorDistribute] Allow distributing subgroups on multiple m dimensions by @Groverkss in #22000
[LLVMGPU] Vectorize map_scatter in LLVMGPUTileAndFuse pipeline by @Max191 in #21890
[Codegen] Push up memref reshapes before empty tensor elimination by @Max191 in #22045
[LLVMGPU] Add support for direct convolution in tile and fuse pipeline by @yzhang93 in #22033
LLVM-Integrate: Drop revert for f645d209d by @qedawkins in #22044
Integrate llvm/llvm-project@50ef746a12 by @qedawkins in #22046
[Encoding] Propagate layout encodings through tensor.cast ops by @Max191 in #21970
[python] Expose python bindings for nvvm in iree.compiler.dialects by @saladpalad in #21993
Revert "[Dispatch Creation] Rework dispatch formation logic (#21854)" by @IanWood1 in #22058
[Codegen] Add transform ops for matching contraction ops by @bangtianliu in #21981
Integrate llvm/llvm-project@1ee18959bcdf by @efric in #22062
Disable data-tiling flag by default and refresh the CPU docs. by @hanhanW in #21935
[DispatchCreation] Propagate and fuse encodings in default path by @Max191 in #22063
[LinalgExt][NFC] Remove unused VectorOps include by @hanhanW in #22066
Integrate llvm/llvm-project@9d48df7a92e7 by @efric in #22064
[LLVMGPU] Enable iree-llvmgpu-test-combine-layout-transformation by default by @Max191 in #21979
Adding interface support for stream.async.transfer result placement. by @benvanik in #22048
Marking stream.tensor.dispatch pure. by @benvanik in #22075
[GPU][DT] Add data-tiling resolver by default. by @hanhanW in #22074
[Util] Allow varying types in optimization barrier by @qedawkins in #22076
[ROCm] Add an experimental target for gfx1250 by @kuhar in #22077
Remove e2e matmul tests with explicit compilation-info by @bjacob in #22085
[NFC] Improving consistency of Util/Transforms/Passes.h. by @benvanik in #22078
e2e matmul tests covering vector-distribution by @bjacob in #22086
Reapply "[LinalgExt] Fix FoldWithProducerReshapeByExpansion for >1 … by @IanWood1 in #22088
[Test] Trim data-tiling compile flags from tests. by @hanhanW in #22092
[Codegen] Support scaled matmul in ConvertAccGEMMToGEMM by @Max191 in #22093
[GPU] Fix bug in shared memory computation for scaled intrinsics by @Max191 in #22095
Integrate llvm/llvm-project@876296e9b7f0 by @efric in #22097
[Codegen] Add transform op for matching dimension sizes. by @bangtianliu in #22040
Revert e2e matmul tests changes by @bjacob in #22111
[mlir][amdgpu] Replaced nullopt with target arch chipset in populateGpuPromoteShuffleToAMDGPUPatterns pass by @xintin in #21799
Display a warning when we spill SGPRs or VGPRs by @sebvince in #21863
[Codegen][GPU] Fix MMA Intrinsics Sorting by @bangtianliu in #22090
[Codegen][GPU][NFC] Fix mma sort follow up by @bangtianliu in #22122
[DT] Add support for materializing func.func args with encodings. by @hanhanW in #22115
Break generate_e2e_matmul_test.py into multiple files by @bjacob in #22120
NFC: Simplify generation of e2e matmul test functions. by @bjacob in #22123
[ROCm] Fix up gfx1250 definitions by @kuhar in #22131
Clean up KnownTargets.cpp. NFC. by @kuhar in #22133
Fix the Windows build: portably set environment variable PYTHONPATH. by @bjacob in #22136
[Codegen][ROCm] Attempt to fix MMA sorting CI failures by @kuhar in #22141
[NFC][GlobalOpt] Update function names in LIT by @AGindinson in #22083
[VectorDistribute] Flush denormals for attention reduction config by @Groverkss in #22041
Simplify op conversion pattern inheriting constructor definitions. NFC. by @kuhar in #22143
Simplify op rewrite pattern inheriting constructor definitions. NFC. by @kuhar in #22142
[LLVMGPU] Don't use DMA for scaled matmul by @Max191 in #22094
[Codegen] Add bufferization support for new iree_gpu.coalesced_gather_dma op by @lialan in #22049
[GPU] Support iree_codegen.load_from_buffer in GPUBubbleResourceCasts by @Max191 in #22140
[Preprocessing] Add pass to sink transpose through pad by @IanWood1 in #22106
[LLVMGPU] Unroll elementwise operations by @Groverkss in #21665
Bump actions/cache from 4.2.4 to 4.3.0 in the github-actions group by @dependabot[bot] in #22152
Increase golden time. by @amd-eochoalo in #22159
[Codegen][AMDGPU] Fix incorrect canonical map for MXFP RHS scales by @krzysz00 in #22162
[Preprocessing] Transpose conv filter layout from CHWF to FHWC by @yzhang93 in #22100
Integrate llvm/llvm-project@7af31bf by @amd-eochoalo in #22148
[LLVMGPU][Codegen] Increase parallel rows read for matvec by @efric in #22163
[Codegen] support matching any values for dims_equal transform op by @bangtianliu in #22149
Integrate llvm/llvm-project@a33544b by @amd-eochoalo in #22167
E2E MXFP4 matmul tests by @bjacob in #22170
[Test][NFC] Drop input_type from e2e tests because IREE can infer the input type. by @hanhanW in #22014
[DispatchCreation] infer split-reduction sizes for ArgCompare by @bangtianliu in #22154
[Codegen][AMDGPU] Tile and convert gather to coalesced DMA by @lialan in #22157
Revert "[LLVMGPU] Unroll elementwise operations (#21665)" by @MaheshRavishankar in #22186
Integrate llvm/llvm-project@0cb9d40 by @amd-eochoalo in #22182
Port e2e matmul tests from gfx942 to gfx950 by @bjacob in #22191
[E2E-Matmul] Remove redundant flag from scaled matmul e2e test by @Max191 in #22190
[Codegen][AMDGPU] Enable gpu.printf patterns by @krzysz00 in #22192
[GPU] Add thread tile size inference for map_scatter op by @Abhishek-Varma in #22179
[DT][ROCM] Fix inner_tiled bitcode ukernel lowering with intrinsicsM(N) = 1 by @Yu-Zhewen in #22184
[Codegen] Add ResolveShapedTypeResultDimsPass pass to GPU vector distribute by @fabianmcg in #22196
[LinalgExt] Introduce linalg_ext.exp_reduction by @hhkit in #21761
Integrate llvm/llvm-project@4845b3e by @amd-eochoalo in #22200
[GPU] Allow multi result and indexing compute generic ops in TileAndFuse pipeline by @nirvedhmeshram in #22195
[Codegen][GPU] Fix IGEMM pre-padding and fusion patterns by @yzhang93 in #22197
[DataTiling] Introduce DataTiledMMAInterfaceAttr by @Max191 in #22098
[Codegen] Follow-up Fix for MatchContractionOp by @bangtianliu in #22201
Removing virtual MMAs from vector distribute matmul/conv pipeline by @jerryyin in #22202
[Codegen] Add transform op for matching convolution ops by @bangtianliu in #22194
Revert "[GPU] Allow multi result and indexing compute generic ops in TileAndFuse pipeline" by @IanWood1 in #22205
[Codegen] Fix premature return in iree_codegen.inner_tiled verifier by @Max191 in #22183
[Codegen][LLVMGPU] Later scf-to-cf to support math.erf by @newling in #21817
[CI] Add dummy torch pkgci by @Groverkss in #22203
[DataTiling][GPU] Introduce DataTiledScaledMMAAttr by @Max191 in #22176
[Preprocessing] Reorder the iterator dims to match forward NHWC-FHWC convs by @yzhang93 in #22208
Decrease llama 8b_f16_decode golden time by @efric in #22220
[DataTiling][GPU] Implement scaled matmul data tiling materialization by @Max191 in #22189
Integrate llvm/llvm-project@95e0ae9f by @newling in #22214
Control const expr hoisting in Dispatch Creation by @IanWood1 in #22164
[codegen] Fix test after PR 22196 by @fabianmcg in #22218
[codegen][gpu] GPUApplyPaddingLevel: fold case where no padding by @newling in #22193
[GlobalOpt] Use Option<> for TransformOptions by @IanWood1 in #22222
[ROCm] Enable e2e stablehlo tests by @kuhar in #22224
Adding a LiftCFGToSCFPass. by @benvanik in #22101
Improving support for unreachable control flow in both CFG and SCF. by @benvanik in #22102
Adding VerifyStructuredControlFlowPass. by @benvanik in #22110
Fix g++ warning -Werror=parentheses by @IanWood1 in #22225
Update CODEOWNERS to include new tests and dialect owners by @Groverkss in #22213
[CI] Add clip and llama torch_models tests by @Groverkss in #22212
[samples] Update PyTorch JIT notebook for Python 3.12 by @HeatCrab in #22209
[Codegen] add transform op for matching attention op by @bangtianliu in #22199
Fix linking MSVC error from forward declaration used as templated type by @Max191 in #22233
[PkgCI] Use urllib instead of github cli in pkgci artifact_run by @Groverkss in #22211
Transposed Workgroup Reordering for large rectangular matmuls by @sebvince in #22165
Fix typo ConditionalTranspose attribute description by @sebvince in #22238
[docs] Add LLVM debugging and some AMDGPU-specific tips by @krzysz00 in #22146
Integrate llvm/llvm-project@7546bd3 by @newling in #22234
[CI][iree-test-suites] Add random weight 8b_fp8 and 8b_fp16 benchmarks by @Groverkss in #22239
Integrate llvm/llvm-project@327a89c by @newling in #22255
[CI][iree-test-suites] Upload json summary for torch_models CI by @Groverkss in #22253
[CI][iree-test-suites] Update ref for iree-test-suites by @Groverkss in #22263
[build flags] prepare to enable more warnings in compile flags (#21996) by @schuermans-roofline in #22252
[Codegen] Update the assembly formats and corresponding tests for matcher ops by @bangtianliu in #22270

New Contributors

@LekkalaSravya3 made their first contribution in #21682
@saladpalad made their first contribution in #21993
@xintin made their first contribution in #21799
@hhkit made their first contribution in #21761
@HeatCrab made their first contribution in #22209
@schuermans-roofline made their first contribution in #22252

Full Changelog: v3.7.0...v3.8.0
