Conversation

@SamuelBarryCS (Contributor) commented Sep 1, 2025:

What:

  • Implement a fast image processor for PromptDepthAnything, following the request in [Contributions Welcome] Add Fast Image Processors #36978 (see the usage sketch after this list)
  • Add one additional test to tests/models/prompt_depth_anything to check numerical values
  • Add a temporary file (to be deleted before merging) to run additional tests and get speed benchmarks of the classic vs. fast implementation
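
For context, a minimal usage sketch (the checkpoint id is an assumption, and use_fast=True follows the usual fast-image-processor convention rather than anything specific to this diff):

```python
from transformers import AutoImageProcessor

# use_fast=True opts into the torch-backed fast processor when one is available;
# the checkpoint id is illustrative (assumed: depth-anything/prompt-depth-anything-vits-hf)
processor = AutoImageProcessor.from_pretrained(
    "depth-anything/prompt-depth-anything-vits-hf", use_fast=True
)
```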

Tests performed:

  • Ran RUN_SLOW=1 python -m pytest tests/models/prompt_depth_anything/test_image_processing_prompt_depth_anything.py -v, all passing:
(screenshot: passing test run)
  • Ran the temporary file to gain additional confidence in the fast processor output and to collect speed benchmarks:
📱 Testing on device: cpu
------------------------------

🔧 Config: batch_size=1, image_size=(384, 384)
⏳ Benchmarking slow processor...
⚡ Benchmarking fast processor...
📊 Results:
   Slow: 0.0032s ± 0.0004s
   Fast: 0.0006s ± 0.0000s
   Speedup: 5.83x
✅ Output verification: PASSED
   (Shape checked ✓, pixel value equality checked ✓, depth value equality checked ✓)

🔧 Config: batch_size=1, image_size=(512, 512)
⏳ Benchmarking slow processor...
⚡ Benchmarking fast processor...
📊 Results:
   Slow: 0.0049s ± 0.0003s
   Fast: 0.0014s ± 0.0001s
   Speedup: 3.42x
✅ Output verification: PASSED
   (Shape checked ✓, pixel value equality checked ✓, depth value equality checked ✓)
...
...
(all 12 cases passing)
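
For reference, a minimal sketch of the kind of timing loop the temporary benchmark file runs (hypothetical helper; the images/prompt_depth call signature mirrors the processor outputs logged later in this thread and is an assumption here):

```python
import time

def bench(processor, images, prompt_depth, n=50):
    # One warm-up call, then the mean over n preprocessing calls
    processor(images=images, prompt_depth=prompt_depth, return_tensors="pt")
    start = time.perf_counter()
    for _ in range(n):
        processor(images=images, prompt_depth=prompt_depth, return_tensors="pt")
    return (time.perf_counter() - start) / n
```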

Impact metrics:

  • 10x speedup on a single-H100 setup

Speed benchmark (slow vs. fast) on CPU:
(screenshot: benchmark plot)

Speed benchmark (slow vs. fast) on a single H100:
(screenshot: benchmark plot)

How to review:

  • Read the diff
  • Run tests with RUN_SLOW=1 python -m pytest tests/models/prompt_depth_anything/test_image_processing_prompt_depth_anything.py -v

TODO / Next:

  • N/A

@SamuelBarryCS marked this pull request as ready for review on September 2, 2025 06:52
@SamuelBarryCS changed the title from "[WIP] Add Fast PromptDepthAnything Processor" to "Add Fast PromptDepthAnything Processor" on Sep 2, 2025
@SamuelBarryCS (Contributor, Author) commented Sep 2, 2025:

cc @yonigozlan, ready for review 🤗!

@Rocketknight1 (Member) commented:

cc @yonigozlan

@SamuelBarryCS changed the title from "Add Fast PromptDepthAnything Processor" to "[WIP] Add Fast PromptDepthAnything Processor" on Sep 8, 2025
@SamuelBarryCS marked this pull request as draft on September 8, 2025 17:35
@SamuelBarryCS (Contributor, Author) commented Sep 8, 2025:

@yonigozlan I actually have to fix a few things, so I turned the PR back into draft/WIP.
Sorry about that; let me ping you once it's reviewable.

@yonigozlan (Member) left a review comment:

Hey @SamuelBarryCS, thanks a lot for contributing this! I pointed out a few changes to make before merging 🤗

@SamuelBarryCS changed the title from "[WIP] Add Fast PromptDepthAnything Processor" to "Add Fast PromptDepthAnything Processor" on Sep 11, 2025
@SamuelBarryCS marked this pull request as ready for review on September 11, 2025 05:02
Inline review thread on this diff hunk:

    processed_images = reorder_images(processed_images_grouped, grouped_images_index)

    # Only stack tensors if they all have the same shape and return_tensors is specified
    if return_tensors == "pt" and processed_images:
@SamuelBarryCS (Contributor, Author) commented Sep 12, 2025:
FYI Yoni: we can't stack tensors of different shapes, and trying to stack without this check makes the tests fail.

@yonigozlan (Member) replied:

Yes, that's expected indeed, but it's better to get an error when attempting to stack than to silently not stack, as the user will be expecting a tensor in the output. For the batch tests, we can just set keep_aspect_ratio to False to have them pass.
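
For context, a minimal sketch of the failure being discussed (shapes are illustrative): with keep_aspect_ratio=True, two images in a batch can resize to different sizes, and torch.stack requires equal sizes:

```python
import torch

a = torch.zeros(3, 192, 256)  # (C, H, W) of one processed image
b = torch.zeros(3, 182, 266)  # a second image with a different aspect ratio

torch.stack([a, b])
# RuntimeError: stack expects each tensor to be equal size,
# but got [3, 192, 256] at entry 0 and [3, 182, 266] at entry 1
```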

@SamuelBarryCS (Contributor, Author) commented Sep 12, 2025:

@yonigozlan it looks like I found a bug in the behavior of the slow processor.
See below, where I'm logging encoding_slow.prompt_depth.shape and encoding_fast.prompt_depth.shape for torchify = True/False in images = self.image_processor_tester.prepare_image_inputs(equal_resolution=True, torchify=True) in test_slow_fast_equivalence_batched:

=== TORCHIFY=FALSE ===
Slow processor: torch.Size([7, 1, 192, 256])
Fast processor: torch.Size([7, 1, 192, 256])

=== TORCHIFY=TRUE ===
Slow processor: torch.Size([7, 192, 256, 1])
Fast processor: torch.Size([7, 1, 192, 256])

The issue comes from applying to_channel_dimension_format with the wrong input channel dim. I fixed it in 6aec2cf. After the fix, the logging looks as follows:

=== TORCHIFY=FALSE ===
Slow processor: torch.Size([7, 1, 192, 256])
Fast processor: torch.Size([7, 1, 192, 256])

=== TORCHIFY=TRUE ===
Slow processor: torch.Size([7, 1, 192, 256])
Fast processor: torch.Size([7, 1, 192, 256])

which is way better :) !
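
For intuition, a minimal sketch of the failure mode using transformers' to_channel_dimension_format (shapes taken from the log above; a single-channel map makes layout inference ambiguous, so a wrongly declared input layout turns the conversion into a no-op):

```python
import numpy as np
from transformers.image_transforms import to_channel_dimension_format
from transformers.image_utils import ChannelDimension

depth = np.zeros((192, 256, 1))  # one channels-last depth map, (H, W, C)

# Input layout declared correctly: converted to channels-first, (1, 192, 256)
ok = to_channel_dimension_format(
    depth, ChannelDimension.FIRST, input_channel_dim=ChannelDimension.LAST
)

# Input layout declared wrongly: treated as already channels-first, stays (192, 256, 1)
bad = to_channel_dimension_format(
    depth, ChannelDimension.FIRST, input_channel_dim=ChannelDimension.FIRST
)

print(ok.shape, bad.shape)  # (1, 192, 256) (192, 256, 1)
```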

@SamuelBarryCS (Contributor, Author) commented:

Should be ready for another (hopefully final) round of review @yonigozlan 🤗

@yonigozlan (Member) commented, quoting @SamuelBarryCS:

> Actually @yonigozlan, isn't there a bug in the slow processor?
>
> pad_size_left, pad_size_right = _get_pad(height, size_divisor)
>
> Why are we getting left/right pad using height and not width? ... I am doing pad_size_left, pad_size_right = _get_pad(width, size_divisor) in the fast processor, which makes much more sense in my opinion

Indeed, thanks for catching that! It looks like it was functionally correct, because it's also reversed in the call to pad, but it's definitely misleading.
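
A sketch of the logic in question (the helper body below is paraphrased from the DPT-style slow processor and may differ from the exact code): the "left/right" pair is computed from the height, but the subsequent pad() call swaps the pairs back, so the height-derived values do pad the height axis and the output was correct, just confusingly labeled:

```python
import math

def _get_pad(size, size_divisor):
    # Symmetric padding that rounds `size` up to the next multiple of `size_divisor`
    new_size = math.ceil(size / size_divisor) * size_divisor
    pad_total = new_size - size
    return pad_total // 2, pad_total - pad_total // 2

height, width, size_divisor = 190, 250, 32

# Slow processor (misleading names): "left/right" from height, "top/bottom" from width...
pad_size_left, pad_size_right = _get_pad(height, size_divisor)
pad_size_top, pad_size_bottom = _get_pad(width, size_divisor)

# ...but the pad() call reverses them again, applying the height-derived pair
# to the height axis:
#   pad(image, ((pad_size_left, pad_size_right), (pad_size_top, pad_size_bottom)))
print((pad_size_left, pad_size_right), (pad_size_top, pad_size_bottom))  # (1, 1) (3, 3)
```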

@yonigozlan (Member) commented, quoting @SamuelBarryCS's bug report above:

Niice, thanks for fixing!

@yonigozlan (Member) left a review comment:

Thanks for iterating again, just pushed some last very minor changes, but everything looks good now :) Waiting for the CI to pass, then I'll merge!

@yonigozlan enabled auto-merge (squash) on September 15, 2025 14:48

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, prompt_depth_anything

@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yonigozlan merged commit ff26fe8 into huggingface:main on Sep 15, 2025
23 checks passed
@SamuelBarryCS (Contributor, Author) commented:

Perfect, thanks a lot for the last round of edits & merging! 🤗

ErfanBaghaei pushed a commit to ErfanBaghaei/transformers that referenced this pull request Sep 25, 2025
* Test & import setup

* First version passing tests

* Ruff

* Dummy post processing

* Add numerical test

* Adjust

* Doc

* Ruff

* remove unused arg

* Refine interpolation method and push test script

* update bench

* Comments

* Update src/transformers/models/auto/image_processing_auto.py

Co-authored-by: Yoni Gozlan <[email protected]>

* Remove benchmrk script

* Update docstrings

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything_fast.py

Co-authored-by: Yoni Gozlan <[email protected]>

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything_fast.py

Co-authored-by: Yoni Gozlan <[email protected]>

* doc

* further process kwargs

* remove it

* remove

* Remove to dict

* remove crop middle

* Remove param specific handling

* Update testing logic

* remove ensure multiple of as kwargs

* fix formatting

* Remove none default and get image size

* Move stuff to _preprocess_image_like_inputs and refacto

* Clean

* ruff

* End of file & comments

* ruff again

* Padding fixed

* Remove comments to pass tests

* Remove prompt depth from kwargs

* Adjust output_size logic

* Docstring for preprocess

* auto_docstring for preprocess

* pass as an arg

* update test batched

* stack images

* remove prompt scale to meter

* return tensors back in preprocess

* remove copying of images

* Update behavior to match old processoer

* Fix batch size of tests

* fix test and fast

* Fix slow processor

* Put tests back to pytorch

* remove check and modify batched tests

* test do_pad + slow processor fix

---------

Co-authored-by: Yoni Gozlan <[email protected]>
Co-authored-by: yonigozlan <[email protected]>
vijayabhaskar-ev pushed a commit to vijayabhaskar-ev/transformers that referenced this pull request Oct 2, 2025 (same squashed commit as above)
yuchenxie4645 pushed a commit to yuchenxie4645/transformers that referenced this pull request Oct 4, 2025 (same squashed commit as above)
gante added a commit to gante/transformers that referenced this pull request Oct 8, 2025
… text generation (huggingface#40837)

* init

* added TopH

* Update TopH logits_process.py

* Update logits_process.py

* Update test_logits_process.py

* Update test_logits_process.py

* added test No. 4

* Resolving __init__.py issues

* Resolving configuration_utils.py Issues

* Resolving logits_process.py Issues

* Resolving utils.py Issues

* Resolving test_logits_process.py Issues

* Resolving __init__.py issues

* Resolving logits_process.py Issues

* Resolving __init__.py issues

* Updated Docs

* Updated Docstring

* style: autoformat with make fixup

* Fixing Docstring

* Update logits_process.py removed defaults

* Variable H name -> cumulative_entropy

* Using torch.distributions.Categorical

* Improve torch_dtype checks (#40808)

* Improve torch_dtype checks

Signed-off-by: Yuanyuan Chen <[email protected]>

* Apply suggestions from code review

---------

Signed-off-by: Yuanyuan Chen <[email protected]>
Co-authored-by: Joao Gante <[email protected]>

* Add VideoProcessors to auto-backend requirements (#40843)

* add it

* fix existing ones

* add perception to auto_mapping...

* Adds Causal Conv 1D kernel for mamba models (#40765)

* add kernel

* make style

* keep causal-conv1d

* small fix

* small fix

* fix modular converter

* modular fix + lazy loading

* revert changes modular

* nit

* hub kernels update

* update

* small nit

* Update no split modules in T5Gemma model (#40810)

* Update no split modules in T5Gemma model

* Update no_split_modules also for T5Gemma modular

* Remove model_split_percents from test cases

---------

Co-authored-by: Anton Vlasjuk <[email protected]>

* Replace image classification loss functions to `self.loss_function` (#40764)

* Fix the misalignment between the l2norm in GDN of Qwen3-Next and the implementation in the FLA library. (#40842)

* align torch implementation of gdn with fla.

* fix fla import.

* fix

* remove unused attr

* fixes

* strictly align l2norm in Qwen3-Next with FLA implementation.

---------

Co-authored-by: bozheng-hit <[email protected]>
Co-authored-by: Cyril Vallez <[email protected]>

* Fixes for continuous batching (#40828)

* Fix for CB attn mask and refactor

* Tests for CB (not all passing)

* Passing tests and a logger fix

* Fixed the KV metrics that were broken when we moved to hybrid alloc

* Fix circular import and style

* Added tests for FA

* Unfolded test to have device expectations

* Fixes for H100

* more fixes for h100

* H100 are good

* Style

* Adding some comments from #40831

* Rename test

* Avoid 1 letter variables

* Dictonnary is only removed during kwargs

* Test for supported sample

* Fix a unvoluntary slice

* Fixes for non-sliced inputs and small example improvments

* Slice inputs is more understandabe

* Style

* [tests] re-enable aria fast tests (#40846)

* rise from the dead

* test

* [SAM2] Fix inconsistent results with original implementation with input boxes (#40800)

* Fix inconsistencies with box input inference with original repo

* remove print

* always pad

* fix modular

* [Sam2Video] Fix video inference with batched boxes and add test (#40797)

fix video inference with batched boxes and add test

* add: differential privacy research model (#40851)

* VaultGemma

* Removing Sequence and Token classification models. Removing integration tests for now

* Remove pass-only modular code. style fixes

* Update vaultgemma.md

* Update docs/source/en/model_doc/vaultgemma.md

Co-authored-by: Anton Vlasjuk <[email protected]>

* Update docs/source/en/model_doc/vaultgemma.md

Co-authored-by: Anton Vlasjuk <[email protected]>

* Add links to model doc

* Correct model doc usage examples

* Updating model doc to describe differences from Gemma 2

* Update model_doc links

* Adding integration tests

* style fixes

* repo consistency

* attribute exception

---------

Co-authored-by: Amer <[email protected]>
Co-authored-by: Anton Vlasjuk <[email protected]>

* [test] Fix test_eager_matches_sdpa incorrectly skipped (#40852)

* ouput_attentions in typed kwargs

* correct typing in GenericForTokenClassification

* improve

* [tests] move generative tests away from `test_modeling_common.py` (#40854)

move tests

* [generate] Always use decoder config to init cache (#40772)

* mega derp

* fix

* always use the decoder

* Use checkpoint in auto_class_docstring (#40844)

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix TrainingArguments.parallelism_config NameError with accelerate<1.10.1 (#40818)

Fix ParallelismConfig type for accelerate < 1.10.1

Co-authored-by: Marc Sun <[email protected]>

* Redirect MI355 CI results to dummy dataset (#40862)

* [Bug fix #40813] Fix base_model_tp_plan of Starcoder2 model. (#40814)

Signed-off-by: greg-kwasniewski1 <[email protected]>

* [docstrings / type hints] Update outdated annotations for `past_key_values`  (#40803)

* some fixes

* nits

* indentation

* indentation

* a bunch of type hints

* bulk changes

* fix florence kwargs  (#40826)

* fix: XIELU act parameters not being casted to correct dtype (#40812)

* Update model tags and integration references in bug report (#40881)

* [Qwen3 Next] Use numerically stable `rsqrt` (#40848)

use numerically stable inverse

* Adding Support for Qwen3-VL Series (#40795)

* add qwen3vl series

* make fixup

* fix import

* re-protect import

* fix it finally (need to merge main into the branch)

* skip processor test (need the checkpoint)

* oups typo

* simplify modular

* remove unecesary attr

* fix layer

* remove unused rope_deltas args

* reuse image def

* remove unnesesary imports

---------

Co-authored-by: Cyril Vallez <[email protected]>
Co-authored-by: Cyril Vallez <[email protected]>

* [`VaultGemma`] Update expectations in integration tests (#40855)

* fix tests

* style

* Fix modular consistency (#40883)

* reapply modular

* add missing one

* 🔴 Move variable output controls to `_prepare_generation_config ` (#40715)

* move checks to validate steps where possible

* fix csm and other models that override _sample

* ops dia you again

* opsie

* joao review

* Move variable output controls to `prepare_inputs_for_generation`

* fix a bunch of models

* back to basics

* final touches

* Clarify passing is_causal in sdpa_attention_paged_forward (#40838)

* Correctly pass is_causal in sdpa_attention_paged_forward

Signed-off-by: Yuanyuan Chen <[email protected]>

* Improve typing

Signed-off-by: Yuanyuan Chen <[email protected]>

* Add comment

Signed-off-by: Yuanyuan Chen <[email protected]>

* Improve comments

Signed-off-by: Yuanyuan Chen <[email protected]>

* Revert typing

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Use torch.expm1 and torch.log1p for better numerical results (#40860)

Signed-off-by: Yuanyuan Chen <[email protected]>

* Add Fast PromptDepthAnything Processor (#40602)

* Test & import setup

* First version passing tests

* Ruff

* Dummy post processing

* Add numerical test

* Adjust

* Doc

* Ruff

* remove unused arg

* Refine interpolation method and push test script

* update bench

* Comments

* Update src/transformers/models/auto/image_processing_auto.py

Co-authored-by: Yoni Gozlan <[email protected]>

* Remove benchmrk script

* Update docstrings

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything_fast.py

Co-authored-by: Yoni Gozlan <[email protected]>

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything_fast.py

Co-authored-by: Yoni Gozlan <[email protected]>

* doc

* further process kwargs

* remove it

* remove

* Remove to dict

* remove crop middle

* Remove param specific handling

* Update testing logic

* remove ensure multiple of as kwargs

* fix formatting

* Remove none default and get image size

* Move stuff to _preprocess_image_like_inputs and refacto

* Clean

* ruff

* End of file & comments

* ruff again

* Padding fixed

* Remove comments to pass tests

* Remove prompt depth from kwargs

* Adjust output_size logic

* Docstring for preprocess

* auto_docstring for preprocess

* pass as an arg

* update test batched

* stack images

* remove prompt scale to meter

* return tensors back in preprocess

* remove copying of images

* Update behavior to match old processoer

* Fix batch size of tests

* fix test and fast

* Fix slow processor

* Put tests back to pytorch

* remove check and modify batched tests

* test do_pad + slow processor fix

---------

Co-authored-by: Yoni Gozlan <[email protected]>
Co-authored-by: yonigozlan <[email protected]>

* Fix deta loading & dataclass (#40878)

* fix

* fix 2

* Remove dict branch of attention_mask in sdpa_attention_paged_forward (#40882)

Remove dict branch of attention_mask

Signed-off-by: Yuanyuan Chen <[email protected]>

* 🌐 [i18n-KO] Translated smolvlm.md to Korean (#40414)

* fix: manual edits

* Apply suggestions from code review

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <[email protected]>

---------

Co-authored-by: Steven Liu <[email protected]>

* 🌐 [i18n-KO] Translated `imageprocessor.md` to Korean (#39557)

* feat: manual translation

* docs: fix ko/_toctree.yml

* Apply suggestions from code review

Co-authored-by: YONGSANG <[email protected]>
Co-authored-by: Yijun Lee <[email protected]>

* Update docs/source/ko/image_processors.md

Co-authored-by: Steven Liu <[email protected]>

---------

Co-authored-by: YONGSANG <[email protected]>
Co-authored-by: Yijun Lee <[email protected]>
Co-authored-by: Steven Liu <[email protected]>

* [generate] remove docs of a feature that no longer exists (#40895)

* Make debugging failing tests (check and update expect output values) easier 🔥  (#40727)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Fixing the call to kernelize (#40628)

* fix

* style

* overload train and eval

* add getter and setter

* Fix getter  regression (#40824)

* test things

* style

* move tests to a sane place

* Fix flaky `Gemma3nAudioFeatureExtractionTest::test_dither` (#40902)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* [cache] Merge static sliding and static chunked layer (#40893)

* merge

* get rid of tensors in get_mask_sizes!!

* remove branch

* add comment explanation

* re-add the class with deprecation cycle

* Harmonize CacheLayer names (#40892)

* unify naming

* style

* doc as well

* post rebase fix

* style

* style

* revert

* [cache] Only use scalars in `get_mask_sizes` (#40907)

* remove tensor ops

* style

* style

* Set seed for `Glm4vIntegrationTest` (#40905)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Add Olmo3 model (#40778)

* transformers add-new-model-like for Olmo3

* Implement modular Olmo3

* Update Olmo3 tests

* Copy Olmo2 weight converter to Olmo3

* Implement Olmo3 weight converter

* Fix code quality errors

* Remove unused import

* Address rope-related PR comments

* Update Olmo3 model doc with minimal details

* Fix Olmo3 rope test failure

* Fix 7B integration test

* remove dummy EncodingFast (#40864)

Signed-off-by: Yuanyuan Chen <[email protected]>

* Improve module name handling for local custom code (#40809)

* Improve module name handling for local custom code

* Use `%lazy` in logging messages

* Revert "Use `%lazy` in logging messages"

This reverts commit 5848755d5805e67177c5218f351c0ac852df9340.

* Add notes for sanitization rule in docstring

* Remove too many underscores

* Update src/transformers/dynamic_module_utils.py

* Update src/transformers/dynamic_module_utils.py

---------

Co-authored-by: Matt <[email protected]>

* Remove `runner_map` (#40880)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* disable `test_fast_is_faster_than_slow` (#40909)

fix

Co-authored-by: ydshieh <[email protected]>

* [gemma3] `Gemma3ForConditionalGeneration` compatible with assisted generation (#40791)

* gemma3vision compatible with assisted generation

* docstring

* BC

* docstring

* failing checks

* make fixup

* apply changes to modular

* misc fixes

* is_initialized

* fix poor rebase

* [generate] misc fixes (#40906)

misc fixes

* 🔴Make `center_crop` fast equivalent to slow (#40856)

make center_crop fast equivalent to slow

* Fix dtype in Paligemma (#40912)

* fix dtypes

* fix copies

* delete unused attr

* [Docs] Adding documentation of MXFP4 Quantization (#40885)

* adding mxfp4 quantization docs

* review suggestions

* Apply suggestions from code review

Co-authored-by: vb <[email protected]>
Co-authored-by: Steven Liu <[email protected]>

---------

Co-authored-by: vb <[email protected]>
Co-authored-by: Steven Liu <[email protected]>

* Processor load with multi-processing (#40786)

push

* [Llama4] Remove `image_sizes` arg and deprecate `vision_feature_layer` (#40832)

* Remove unused arg

* deprecate

* revrt one change

* get set go

* version correction

* fix

* make style

* comment

* Fix #40067: Add dedicated UMT5 support to GGUF loader (config, tokenizer, test) (#40218)

* Fix #40067 : add UMT5 support in GGUF loader (config, tokenizer, test)

* chore: fix code formatting and linting issues

* refactor: move UMT5 GGUF test to quantization directory and clean up comments

* chore: trigger CI pipeline

* refactor(tests): Move UMT5 Encoder GGUF test to GgufModelTests. This consolidates the new test into the main class for consistency.

* Add regression check to UMT5 encoder GGUF test

Verify encoder output against reference tensor values with appropriate tolerances for stability.

* Update tests/quantization/ggml/test_ggml.py

Co-authored-by: Mohamed Mekkouri <[email protected]>

* Update tests/quantization/ggml/test_ggml.py

remove comments

Co-authored-by: Mohamed Mekkouri <[email protected]>

---------

Co-authored-by: Mohamed Mekkouri <[email protected]>

* [torchao safetensors] renaming get_state_dict function (#40774)

renaming get_state_dict function

Co-authored-by: Mohamed Mekkouri <[email protected]>

* Adding activation kernels (#40890)

* first commit

* add mode

* revert modeling

* add compile

* rm print

* Minor fix for #40727 (#40929)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Add support for Florence-2 training (#40914)

* Support training florence2

* update doc and testing model to florence-community

* fix florence-2 test, use head dim 16 instead of 8 for fa2

* skip test_sdpa_can_dispatch_on_flash

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Add LongCat-Flash (#40730)

* working draft for LongCat

* BC changes to deepseek_v3 for modular

* format

* various modularities

* better tp plan

* better init

* minor changes

* make modular better

* clean up patterns

* Revert a couple of modular commits, because we won't convert in the end

* make things explicit.

* draft test

* toctree, tests and imports

* drop

* woops

* make better things

* update test

* update

* fixes

* style and CI

* convert stuff

* up

* ah, yes, that

* enable gen tests

* fix cache shape in test (sum of 2 things)

* fix tests

* comments

* re-Identitise

* minimize changes

* better defaults

* modular betterment

* fix configuration, add documentation

* fix init

* add integration tests

* add info

* simplify

* update slow tests

* fix

* style

* some additional long tests

* cpu-only long test

* fix last tests?

* urg

* cleaner tests why not

* fix

* improve slow tests, no skip

* style

* don't upcast

* one skip

* finally fix parallelism

* [DOC] Add missing dates in model cards (#40922)

add missing dates

* [models] remove unused `import torch.utils.checkpoint`  (#40934)

* Intel CPU dockerfile (#40806)

* upload intel cpu dockerfile

Signed-off-by: jiqing-feng <[email protected]>

* update cpu dockerfile

Signed-off-by: jiqing-feng <[email protected]>

* update label name

Signed-off-by: jiqing-feng <[email protected]>

---------

Signed-off-by: jiqing-feng <[email protected]>

* docs(i18n): Correct the descriptive text in the README_zh-hans.md (#40941)

* Fix trainer tests (#40823)

* fix liger

* fix

* more

* fix

* fix hp

* fix

---------

Co-authored-by: Matej Sirovatka <[email protected]>

* Fix `Glm4vMoeIntegrationTest` (#40930)

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Raise error instead of warning when using meta device in from_pretrained (#40942)

* raise instead of warning

* add timm

* remove

* Consistent naming for images kwargs (#40834)

* use consistent naming for padding

* no validation on pad size

* add warnings

* fix

* fox copies

* another fix

* fix some tests

* fix more tests

* fix lasts tests

* fix copies

* better docstring

* delete print

* Remove nested import logic for torchvision (#40940)

* remove nested import logic for torchvision

* remove unnecessary protected imports

* remove unnecessarry protected import in modular (and modeling)

* fix wrongly remove protected imports

* Fix `Glm4vModelTest::test_eager_matches_fa2_generate` (#40947)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Update expected values for some `test_speculative_generation` (#40949)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Standardize audio embedding function name for audio multimodal models (#40919)

* Standardize audio embedding function name for audio multimodal models

* PR review

* Add FlexOlmo model (#40921)

* transformers add-new-model-like

* Add FlexOlmo implementation

* Update FlexOlmo docs

* Set default tokenization for flex olmo

* Update FlexOlmo tests

* Update attention comment

* Remove unneeded use of `sliding_window`

* Don't list dropout in eager_paged_attention_forward (#40924)

Remove dropout argument

Signed-off-by: Yuanyuan Chen <[email protected]>

* Update expected values for one more `test_speculative_generation` after #40949 (#40967)

fix

Co-authored-by: ydshieh <[email protected]>

* FIX(trainer): ensure final checkpoint is saved when resuming training (#40347)

* fix(trainer): ensure final checkpoint is saved when resuming training

* add test

* make style && slight fix of test

* make style again

* move test code to test_trainer

* remove outdated test file

* Apply style fixes

---------

Co-authored-by: rangehow <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Marc Sun <[email protected]>

* Add new model LFM2-VL (#40624)

* Add LFM2-VL support

* add tests

* linting, formatting, misc review changes

* add siglip2 to auto config and instantiate it in lfm2-vl configuration

* decouple image processor from processor

* remove torch import from configuration

* replace | with Optional

* remove layer truncation from modeling file

* fix copies

* update everything

* fix test case to use tiny model

* update the test cases

* fix finally the image processor and add slow tests

* fixup

* typo in docs

* fix tests

* the doc name uses underscore

* address comments from Yoni

* delete tests and unsuffling

* relative import

* do we really handle imports better now?

* fix test

* slow tests

* found a bug in ordering + slow tests

* fix copies

* dont run compile test

---------

Co-authored-by: Anna <[email protected]>
Co-authored-by: Anna Banaszak <[email protected]>

* Fix outdated version checks of accelerator (#40969)

* Fix outdated version checks of accelerator

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix outdated version checks of accelerator

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Use `skip_predictor=True` in vjepa2 `get_vision_features` (#40966)

use skip_predictor in vjepa2 `get_vision_features`

* [Trainer] Fix DP loss (#40799)

* fix

* style

* Fix fp16

* style

---------

Co-authored-by: Matej Sirovatka <[email protected]>

* [timm_wrapper] better handling of "Unknown model" exception in timm (#40951)

* fix(timm): Add exception handling for unknown Gemma3n model

* nit: Let’s cater to this specific issue

* nit: Simplify error handling

* Fix Issue #39030: AutoTokenizer.from_pretrained does not propagate token (#40956)

* fix merge conflicts

* change token typing

---------

Co-authored-by: Ubuntu <[email protected]>

* [tests] Really use small models in all fast tests (#40945)

* start

* xcodec

* chameleon

* start

* layoutlm2

* layoutlm

* remove skip

* oups

* timm_wrapper

* add default

* doc

* consistency

* Add captured actual outputs to CI artifacts (#40965)

* fix

* fix

* Remove `# TODO: ???` as it make me `???`

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Revert change in `compile_friendly_resize` (#40645)

fix

* Track the CI (model) jobs that don't produce test output files (process being killed etc.) (#40981)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Using torch.distributions.Categorical

* Remove `set_model_tester_for_less_flaky_tests` (#40982)

remove

* Benchmarking v2 GH workflows (#40716)

* WIP benchmark v2 workflow

* Container was missing

* Change to sandbox branch name

* Wrong place for image name

* Variable declarations

* Remove references to file logging

* Remove unnecessary step

* Fix deps install

* Syntax

* Add workdir

* Add upload feature

* typo

* No need for hf_transfer

* Pass in runner

* Runner config

* Runner config

* Runner config

* Runner config

* Runner config

* mi325 caller

* Name workflow runs properly

* Copy-paste error

* Add final repo IDs and schedule

* Review comments

* Remove wf params

* Remove parametrization from worfkflow files

* Fix callers

* Change push trigger to pull_request + label

* Add back schedule event

* Push to the same dataset

* Simplify parameter description

* 🔴[`Attention`] Bert-based Models Attention Refactor (#38301)

* clean start to bert refactor

* some test fixes

* style

* fix last tests

* be strict on positional embeddings, fixup according tests

* cache support

* more cache fixes, new causal API

* simplify masks, fix tests for gen

* flex attn, static cache support, round of fixes

* ?

* this time

* style

* fix flash attention tests, flex attention requires torch 2.7.x to work with multiple classes (as recompile strats force a size call which is wrongly interpreted before)

* roberta

* fixup sdpa remains

* attention split, simplify args and kwargs, better typing

* fix encoder decoder

* fix test

* modular roberta

* albert

* data2vectext, making it modular tomorrow

* modular data2vec text

* tmp disable

* xmod + cache position fixes

* whoops

* electra + markuplm, small fixes

* remove wrong copy

* xlm_roberta + some embedding fixes

* roberta prelayernorm

* RemBert: remove copy, maybe doing it later

* ernie

* fix roberta offloading

* camembert

* copy fixes

* bert generation + fixes on eager

* xlm roberta xl

* bridgetower (text) + seamlessv2 copy fixes

* rocbert + small fixes

* whoops

* small round of fixups

* NOTE: kernels didnt load with an earlier version, some fixup (needs another look bc cross deps)

* the end of the tunnel?

* fixup nllbmoe + style

* we dont need this anymore

* megatron bert is barely used, low prio skip for now

* Modernize bert (template for others)

NOTE: trying to push this through, might be overdue if not in time possible

* check inputs for all others (if checkmarked)

* fix bridgetower

* style

* fix encoder decoder (partially but cause found and fix also, just needs to be done for everything else)

* proper fix for bert to force intermediate dict outputs

* propagate to others

* style

* xlm roberta xl investigation, its the layernorm...

* mobile bert

* revert this, might cause issues with composed models

* review

* style

* Remove [[autodoc]] refs to TF/Flax objects (#40996)

* remove refs

* more

* ENH: Enable readline support for transformers chat (#40911)

ENH Enable readline support for chat

This small change enables GNU readline support for the transformers chat
command. This includes, among others:

- advanced navigation and editing: ctrl + a ctrl + e alt + b alt + f
  ctrl + k alt + d etc.
- navigate and search history: arrow up/down ctrl + p ctrl + n  ctrl + r
- undo: ctrl + _
- clear screen: ctrl + l

Implementation

Although it may look strange, just importing readline is enough to
enable it in Python, see:

https://docs.python.org/3/library/functions.html#input

As readline is not available on some
platforms (https://docs.python.org/3/library/readline.html), the import
is guarded.

Readline should work on Linux, MacOS, and with WSL, I'm not sure about
Windows though. Ideally, someone can give it a try. It's possible that
Windows users would have to install
pyreadline (https://pypi.org/project/pyreadline3/).

* [testing] test `num_hidden_layers` being small in model tester (#40992)

fix

Co-authored-by: ydshieh <[email protected]>

* blt wip (#38579)

* blt wip

* cpu version

* cpu friendly with full entropy model (real time patching)

* adding config file instead of args file

* enable MPS

* refactoring unused code

* single config class in config file

* inherit from PreTrainedModel

* refactor LMTransformer --> BLTPatcher

* add conversion script

* load from new checkpoing with form_pretrained

* fixed demo from_pretrained

* clean up

* clean a few comments

* cleanup folder

* clean up dir

* cleaned up modeling further

* rename classes

* adding transformers Attention class and RotaryEmbedding class

* exchanged blt modules for transformers modules: attention, rotary_emb, create_causal_mask, etc

* seperate out patcher config, update modeling and conversion script

* rename vars to be more transformers-like

* rm unused functions

* adding cross attention from transformers

* pass arg

* rename weights

* updated conversion script

* overwritten commit! fixing PR

* apply feedback

* adding BLTRMSNorm like Llama

* add repeat_kv and eager_attention_forward copied from

* BLTMLP identical to MllamTextMLP

* clean up some args'

* more like mllama, but busier inits

* BLTTransformerLayer config

* decoder, encoder, global configs

* wip working on modular file

* cleaning up patch and configs

* clean up patcher helpers

* clean up patcher helpers further

* clean up

* some config renaming

* clean up unused configs

* clean up configs

* clean up configs

* update modular

* clean

* update demo

* config more like mllama, seperated subconfigs from subdicts

* read from config instead of self args

* update demo file

* model weights to causal lm weights

* missed file

* added tied weights keys

* BLTForCausalLM

* adding files after add-new-model-like

* update demo

* working on tests

* first running integration tests

* added integration tests

* adding tokenization tests, integration tests, and cleaned up tokenization file, + ruff

* tokenizer clean up

* modular file

* fixing rebase

* ruff

* adding correct basemodel output and updating config with checkpoint vals (for testing)

* BLTModelTests git status

* enabling inputs_embeds, although won't be equal to input_ids since need ids for patching logic

* fix sdpa == causal tests

* fix small model test and some gradient checkpointing

* skip training GC tests

* fix test

* updated modular

* update modular

* ruff

* adding modular + modeling

* modular

* more modern is_casual check

* cleaning up modular

* more modular reduction

* ruff

* modular fix

* fix styling

* return 2

* return 2

* fix some tests

* fix bltcrossattention after modular break

* some fixes / feedback

* try cache generate fix

* try cache generate fix

* fix generate tests

* attn_impl workaround

* refactoring to use recent TransformersKwargs changes

* fix hidden_states shape test

* refactor to new outputs

* simplify outputs a bit

* rm unneeded decoderlayer overwriting

* rename blt

* forgot tokenizer test renamed

* Reorder

* Reorder

* working on modular

* updates from modular

* new modular

* ruff and such

* update pretrainedmodel modular

* using cohere2 apply_rotary_pos_emb

* small changes

* apply feedback r2

* fix cross_attention

* apply more feedback

* update modeling fix

* load submodules from pretrainedmodel

* set initializer_range to subconfigs

* rm cross_attnetion_states pass when not needed

* add 7b projection layer support

* check repo

* make copies

* lost cohere2 rotate_half

* ruff

* copies?

* don't tie weights for submodules

* tie weights setting

* check docstrings

* apply feedback

* rebase

* rebased modeling

* update docs

* applying feedback

* few more fixes

* fix can_record_outputs

* fast tokenizer

* no more modulelist

* tok auto

* rm tokenizersss

* fix docs

* ruff

* fix after rebase

* fix test, configs are not subscriptable

---------

Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: Lysandre <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>

* [docs] rm stray tf/flax autodocs references (#40999)

rm tf references

* [`RMSNorm`] Fix rms norm init for models that center around 1 (#40796)

* fix

* fixup inits

* oops

* fixup gemma

* fixup modular order

* how does this keep happen lol

* vaultgemma is new i forgot

* remove init check

* Make `EfficientLoFTRModelTest` faster (#41000)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Fix typoes in src and tests (#40845)

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix more dates in model cards and wrong modalities in _toctree.yml (#40955)

* Fix model cards and modalities in toctree

* fix new models

* RUFF fix on CI scripts (#40805)

Signed-off-by: Yuanyuan Chen <[email protected]>

* fix dict like init for ModelOutput (#41002)

* fix dict like init

* style

* 🚨 [v5] remove generate output retrocompatibility aliases (#40998)

remove old type aliases

* [tests] update `test_left_padding_compatibility` (and minimize overwrites) (#40980)

* update test (and overwrites)

* better test comment

* 0 as a default for

* Patch more `unittest.case.TestCase.assertXXX` methods (#41008)

fix

Co-authored-by: ydshieh <[email protected]>

* 🚨 [v5] remove deprecated entry point (#40997)

* remove old entry point

* update references to transformers-cli

* 🚨 [lightglue] fix: matches order changed because of early stopped indices (#40859)

* fix: bug that made early stop change order of matches

* fix: applied code suggestion

Co-authored-by: Pavel Iakubovskii <[email protected]>

* fix: applied code suggestion to modular

* fix: integration tests

---------

Co-authored-by: Pavel Iakubovskii <[email protected]>

* Fix `PhimoeIntegrationTest` (#41007)

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Fix Glm4v test (#41011)

fix

* Update after #41007 (#41014)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Fix benchmark runner argument name (#41012)

* Adding support for Qwen3Omni (#41025)

* Add Qwen3Omni

* make fix-copies, import properly

* nit

* fix wrong setup. Why was audio_token_id renamed ?

* upds

* more processing fixes

* yup

* fix more generation tests

* down to 1?

* fix import issue

* style, update check repo

* up

* fix quality at my best

* final quality?

* fix doc building

* FINAL COMMIT: SKIP IMPORTANT BUT FAILING TESTS FOR MERGE

* SKIP THE TEMPLATE ONE

---------

Co-authored-by: lvyuanjun.lyj <[email protected]>
Co-authored-by: Arthur <[email protected]>

* Making compute_loss_func always take priority in Trainer (#40632)

* logger warn, if-else logic improved

* redundant if condition fix

* Modify Qwen3Omni parameter name since VL changed it (#41045)

Modify parameter name since VL changed it

Co-authored-by: lvyuanjun.lyj <[email protected]>

* Fix Qwen video tests (#41049)

fix test

* [testing] Fix `qwen2_audio` (#41018)

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Fix typing of tuples (#41028)

* Fix tuple typing

Signed-off-by: Yuanyuan Chen <[email protected]>

* More fixes

Signed-off-by: Yuanyuan Chen <[email protected]>

* More fixes

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Remove optax (#41030)

Remove optax dep

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix typos in English/Chinese documentation (#41031)

* Fix typos and formatting in English docs

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix typos and formatting in Chinese docs

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Use torch.autocast (#40975)

* Use torch.autocast

Signed-off-by: Yuanyuan Chen <[email protected]>

* Format code

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* docs: improved RoPE function Docstrings (#41004)

* docs: improved RoPE functuon docstrings

* Update src/transformers/modeling_rope_utils.py

Co-authored-by: Joao Gante <[email protected]>

---------

Co-authored-by: Joao Gante <[email protected]>

* Fix condition for emitting warning when generation exceeds max model length (#40775)

correct warning when generation exceeds max model length

Signed-off-by: Yannick Schnider <[email protected]>

* Fix outdated torch version check (#40925)

Update torch minimum version check to 2.2

Signed-off-by: Yuanyuan Chen <[email protected]>

* Remove doc of tf and flax (#41029)

Signed-off-by: Yuanyuan Chen <[email protected]>

* Add Whole Word Masking and Padding Strategy to DataCollatorForLanguageModeling (#39485)

* Add whole word masking

* Vectorize whole word masking functions

* Unit test whole word masking

* Remove support for TF in whole word masking

* [testing] Fix `seed_oss` (#41052)

* fix

* fix

* fix

* fix

* fix

* fix

* Update tests/models/seed_oss/test_modeling_seed_oss.py

Co-authored-by: Anton Vlasjuk <[email protected]>

* fix

---------

Co-authored-by: ydshieh <[email protected]>
Co-authored-by: Anton Vlasjuk <[email protected]>

* Remove repeated import (#40937)

* Remove repeated import

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix conflict

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Simplify unnecessary Optional typing (#40839)

Remove Optional

Signed-off-by: Yuanyuan Chen <[email protected]>

* Add write token for uploading benchmark results to the Hub (#41047)

* Separate write token for Hub upload

* Address review comments

* Address review comments

* Ci utils (#40978)

* Add CI reports dir to gitignore

* Add utils to run local CI

* Review compliance

* Style

* License

* Remove <frameworkcontent> and <pt> tags from documentation (#41055)

* Remove <frameworkcontent> and <pt> tags

Signed-off-by: Yuanyuan Chen <[email protected]>

* Revert changes

Signed-off-by: Yuanyuan Chen <[email protected]>

* Update docs/source/en/model_doc/madlad-400.md

---------

Signed-off-by: Yuanyuan Chen <[email protected]>
Co-authored-by: Joao Gante <[email protected]>

* Fix CI jobs being all red 🔴 (false positive) (#41059)

fix

Co-authored-by: ydshieh <[email protected]>

* Update quantization CI (#41068)

* fix

* new everything

* fix

* [i18n-bn] Add Bengali language README file (#40935)

* [i18n-bn] Add Bengali language README file and update links in existing language files

* Update Bengali README for clarity and consistency in model descriptions

* Improve documentation and errors in Mamba2-based models (#41063)

* fix bug in Mamba2 docs

* correct 'because on of' issue

* link to other Mamba2 model types

* github URL is not changed

* update error message in generated files

* Update team member list for some CI workflows (#41094)

* update list

* update list

---------

Co-authored-by: ydshieh <[email protected]>

* fix crash when using chat to send 2+ request to gptoss (#40536)

Signed-off-by: Wang, Yi <[email protected]>

* Minor addition, no split modules for VideoMAEE (#41051)

* added no split modules

* fixed typo

---------

Co-authored-by: Raushan Turganbay <[email protected]>

* Switch to `python:3.10-slim` for CircleCI docker images (#41067)

fix

Co-authored-by: ydshieh <[email protected]>

* Fix argument name in benchmarking script (#41086)

* Fix argument name in benchmarking script

* Adjust vars

* Remove mention of TensorFlow/Flax/JAX from English documentation (#41058)

Remove mention of TensorFlow from English documentation

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix typos in documentation (#41087)

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix typing (#40788)

* Fix optional typing

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix optional typing

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix schema typing

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix typing

* Fix typing

* Fix typing

* Fix typing

* Use np.ndarray

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix typing

Signed-off-by: Yuanyuan Chen <[email protected]>

* Format code

Signed-off-by: Yuanyuan Chen <[email protected]>

* Use np.ndarray

Signed-off-by: Yuanyuan Chen <[email protected]>

* Improve typing

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix quote string of np.ndarray

Signed-off-by: Yuanyuan Chen <[email protected]>

* More fixes

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix code

* Format

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Remove unused arguments (#40916)

* Fix unused arguments

Signed-off-by: Yuanyuan Chen <[email protected]>

* More fixes

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Remove tf and flax from Chinese documentation (#41057)

Signed-off-by: Yuanyuan Chen <[email protected]>

* fix wrong height and width when read video use torchvision (#41091)

* docs: Fix Tool Use links and remove dead RAG links (#41104)

docs: Fix tool use links. Remove dead RAG links. Fix style

* 🚨 [generate] update paligemma mask updates (and other assisted generation-related fixes) (#40917)

* tmp

* fix modular inheritance

* nit

* paligemma 1 doesn't have swa

* use same pattern as in models with hybrid layers

* PR comments

* helium also needs layer_typed (bc it relies on gemma)

* paligemma/gemma3: same mask creation fn in fwd and generate

* propagate changes to helium (gemma-based)

* tmp commit

* slow paligemma tests passing, let's see what breaks

* fix test_left_padding_compatibility

* tmp commit

* tmp commit

* rebase error

* docs

* reduce diff

* like this?

* t5gemma

* better comment

* shorter diff

* exception

* ffs type

* optional

* shorter modular_gemma.py

* helium model actually needs no changes -- the tester is the issue

* t5gemma modular config

* a few more modular; paligemma BC

* fix processor issues?

* rm config exception

* lift warning in gemma

* [tests] gpt2 + `CausalLMModelTester` (#41003)

* tmp commit

* tmp commit

* tmp commit

* rm old GPT2ModelTester

* nit bug

* add facilities for encoder-decoder tests; add comments on ALL overwrites/extra fns

* vision_encoder_decoder

* Fix `_get_test_info` for inherited tests (#41106)

* fix _get_test_info

* fix patched

* add comment

* ruff

---------

Co-authored-by: ydshieh <[email protected]>

* Remove bad test skips (#41109)

* remove bad skips

* remove more

* fix inits

* Format empty lines and white space in markdown files. (#41100)

* Remove additional white space and empty lines from markdown files

Signed-off-by: Yuanyuan Chen <[email protected]>

* Add empty lines around code

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Update ruff to 0.13.1 + target Python 3.10 + apply fixes (#37809)

Update ruff to 0.13.1 target it to Python 3.10 and apply its fixes

Signed-off-by: Yuanyuan Chen <[email protected]>
Co-authored-by: Yih-Dar <[email protected]>

* 🚨 [V5] Remove deprecated training arguments  (#41017)

* Remove deprecated training arguments from V5

Signed-off-by: Yuanyuan Chen <[email protected]>

* Remove deprecated training arguments from V5

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix comments

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix code

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Support loading LFM2 GGUF (#41111)

* add gguf config mapping for lfm2

* add lfm2 tensor process to unsqueeze conv weights

* adjust values from gguf config to HF config

* add test for lfm2 gguf

* ruff

---------

Co-authored-by: Marc Sun <[email protected]>

* [torchao safetensors] integrate torchao safetensors support with transformers  (#40735)

* enable torchao safetensors

* enable torchao safetensors support

* add more version checking

* [Qwen3-next] Fix dimension mismatch in torch_chunk_gated_delta_rule and torch_recurrent_gated_delta_rule (#40963) (#41036)

* fix mismatched dims for qwen3 next

* propagate changes

* chore: renamed tot_heads to total_sequence_length

* Apply suggestion from @vasqu

Co-authored-by: Anton Vlasjuk <[email protected]>

* minor fix to modular qwen3 next file

---------

Co-authored-by: Anton Vlasjuk <[email protected]>

* Fix the error where a keyword argument appearing before *args (#41099)

Signed-off-by: Yuanyuan Chen <[email protected]>
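
Likely the call pattern in question (the helper below is made up for illustration; linters flag this pattern as B026):

```python
def log(*values, sep=" "):
    print(sep.join(map(str, values)))

parts = ["a", "b"]
log(sep="-", *parts)  # discouraged: keyword argument before *-unpacking in the call
log(*parts, sep="-")  # clearer equivalent; both print "a-b"
```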

* Fix broken `` expressions in markdown files (#41113)

Fix broken expressions in markdown files

Signed-off-by: Yuanyuan Chen <[email protected]>

* Remove self-assignment (#41062)

* Remove self-assignment

Signed-off-by: Yuanyuan Chen <[email protected]>

* Update src/transformers/integrations/flash_paged.py

Co-authored-by: Matt <[email protected]>

* Clear pass

Signed-off-by: Yuanyuan Chen <[email protected]>

* Clear pass

Signed-off-by: Yuanyuan Chen <[email protected]>

* Clear pass

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>
Co-authored-by: Matt <[email protected]>

* 🚨Refactor: Update text2text generation pipelines to use max_new_tokens… (#40928)

* Refactor: Update text2text generation pipelines to use max_new_tokens and resolve max_length warning

* docs(text2text_generation): Update parameter docstrings to reflect modern generation practice

Updated the max_length parameter documentation to max_new_tokens, in line with the standard modern practice of specifying the number of new tokens to generate

* refactor(text2text_generation): Remove outdated input validation logic

* docs(text2text_generation): Revert incorrectly modified comment

* docs(text2text_generation): Revert incorrectly modified comment
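
A minimal sketch of the recommended call pattern (checkpoint chosen only for illustration):

```python
from transformers import pipeline

pipe = pipeline("text2text-generation", model="google/flan-t5-small")

# Prefer max_new_tokens (counts only generated tokens) over max_length
# (prompt + generated tokens), which is what triggered the warning.
out = pipe("Translate to German: Hello, how are you?", max_new_tokens=32)
print(out[0]["generated_text"])
```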

* Fixed MXFP4 model storage issue (#41118)

* Fixed loading LongT5 from legacy checkpoints (#40724)

* Fixed loading LongT5 from legacy checkpoints

* Adapted the fix to work with missing lm_head

* dummy commit (#41133)

* dummy commit, nothing interesting

* dummy commit, nothing interesting

* dummy commit, nothing interesting

* dummy commit, nothing interesting

---------

Co-authored-by: ydshieh <[email protected]>

* Fix loading logic flaw with regards to unexpected and missing keys (#40850)

* Unexpected keys should be ignored at load with device map

* remove them all

* fix logic flaw

* fix

* simplify

* style

* fix

* revert caching allocator change

* add other test

* add nice doc

---------

Co-authored-by: Cyril Vallez <[email protected]>

* Using torch.distributions.Categorical (see the sketch after this list)

* Resolving logits_process.py Issues

* style: autoformat with make fixup

* Update logits_process.py removed defaults

* Variable H name -> cumulative_entropy

* Resolving format error

* Correction of the loop variables in logit processor

* Vectorized the loop in logits_process

* formatted  logits_process

* paper reference and stopping rule comment logits_process

* Trigger CI rerun

* Update logits_process.py

* added test_TopH_example_integration

* added test_TopH_example_integration

* Update README.md

* Restore CI config to match main (remove accidental changes)

* Restore CI config to match upstream main (no diffs)
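
A minimal sketch of the sampling step named above, assuming made-up logits (the processor's entropy-based stopping rule would filter the logits before this point):

```python
import torch

# Hypothetical next-token logits for a batch of one sequence.
logits = torch.tensor([[2.0, 0.5, -1.0, 0.1]])

# Categorical normalizes logits internally, so no explicit softmax is needed.
dist = torch.distributions.Categorical(logits=logits)
next_token = dist.sample()            # one sampled token id per batch row
log_prob = dist.log_prob(next_token)  # log-probability of that draw
```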

---------

Signed-off-by: Yuanyuan Chen <[email protected]>
Signed-off-by: greg-kwasniewski1 <[email protected]>
Signed-off-by: jiqing-feng <[email protected]>
Signed-off-by: Yannick Schnider <[email protected]>
Signed-off-by: Wang, Yi <[email protected]>
Co-authored-by: ArminAzizi98 <[email protected]>
Co-authored-by: Yuanyuan Chen <[email protected]>
Co-authored-by: Joao Gante <[email protected]>
Co-authored-by: Cyril Vallez <[email protected]>
Co-authored-by: Mohamed Mekkouri <[email protected]>
Co-authored-by: Yuchao Zhang <[email protected]>
Co-authored-by: Anton Vlasjuk <[email protected]>
Co-authored-by: Pavel Iakubovskii <[email protected]>
Co-authored-by: Bo Zheng <[email protected]>
Co-authored-by: bozheng-hit <[email protected]>
Co-authored-by: Cyril Vallez <[email protected]>
Co-authored-by: Rémi Ouazan <[email protected]>
Co-authored-by: Yoni Gozlan <[email protected]>
Co-authored-by: Ryan Mullins <[email protected]>
Co-authored-by: Amer <[email protected]>
Co-authored-by: eustlb <[email protected]>
Co-authored-by: Albert Villanova del Moral <[email protected]>
Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Ákos Hadnagy <[email protected]>
Co-authored-by: Grzegorz Kwasniewski <[email protected]>
Co-authored-by: NanoCode012 <[email protected]>
Co-authored-by: Arthur <[email protected]>
Co-authored-by: 艾力可 <[email protected]>
Co-authored-by: JJJYmmm <[email protected]>
Co-authored-by: Manuel de Prada Corral <[email protected]>
Co-authored-by: Samuel Barry <[email protected]>
Co-authored-by: yonigozlan <[email protected]>
Co-authored-by: HyunZ118 <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: YONGSANG <[email protected]>
Co-authored-by: Yijun Lee <[email protected]>
Co-authored-by: Yih-Dar <[email protected]>
Co-authored-by: ydshieh <[email protected]>
Co-authored-by: Pablo Montalvo <[email protected]>
Co-authored-by: Shane A <[email protected]>
Co-authored-by: Xuehai Pan <[email protected]>
Co-authored-by: Matt <[email protected]>
Co-authored-by: Raushan Turganbay <[email protected]>
Co-authored-by: Aritra Roy Gosthipaty <[email protected]>
Co-authored-by: vb <[email protected]>
Co-authored-by: Yaswanth Gali <[email protected]>
Co-authored-by: Akshay Babbar <[email protected]>
Co-authored-by: liangel-02 <[email protected]>
Co-authored-by: Duc-Viet Hoang <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: jiqing-feng <[email protected]>
Co-authored-by: lilin-1 <[email protected]>
Co-authored-by: Matej Sirovatka <[email protected]>
Co-authored-by: Jack <[email protected]>
Co-authored-by: Rangehow <[email protected]>
Co-authored-by: rangehow <[email protected]>
Co-authored-by: Anna <[email protected]>
Co-authored-by: Anna Banaszak <[email protected]>
Co-authored-by: Hamish Scott <[email protected]>
Co-authored-by: Harshal Janjani <[email protected]>
Co-authored-by: Branden <[email protected]>
Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Benjamin Bossan <[email protected]>
Co-authored-by: Ita Zaporozhets <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: Lysandre <[email protected]>
Co-authored-by: StevenBucaille <[email protected]>
Co-authored-by: BakerBunker <[email protected]>
Co-authored-by: lvyuanjun.lyj <[email protected]>
Co-authored-by: Arthur <[email protected]>
Co-authored-by: Ayush <[email protected]>
Co-authored-by: Ryan Mullins <[email protected]>
Co-authored-by: Yannick Schnider <[email protected]>
Co-authored-by: Ralph Gleaton <[email protected]>
Co-authored-by: Saidur Rahman Pulok <[email protected]>
Co-authored-by: Nick Doiron <[email protected]>
Co-authored-by: Wang, Yi <[email protected]>
Co-authored-by: Duygu Altinok <[email protected]>
Co-authored-by: Jinde.Song <[email protected]>
Co-authored-by: hbenoit <[email protected]>
Co-authored-by: nnul <[email protected]>
Co-authored-by: YangKai0616 <[email protected]>
Co-authored-by: Karol Szustakowski <[email protected]>
Co-authored-by: souvikku <[email protected]>
omsherikar pushed a commit to omsherikar/transformers that referenced this pull request Oct 8, 2025
… text generation (huggingface#40837)

Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: StevenBucaille <[email protected]>
Co-authored-by: BakerBunker <[email protected]>
Co-authored-by: lvyuanjun.lyj <[email protected]>
Co-authored-by: Arthur <[email protected]>
Co-authored-by: Ayush <[email protected]>
Co-authored-by: Ryan Mullins <[email protected]>
Co-authored-by: Yannick Schnider <[email protected]>
Co-authored-by: Ralph Gleaton <[email protected]>
Co-authored-by: Saidur Rahman Pulok <[email protected]>
Co-authored-by: Nick Doiron <[email protected]>
Co-authored-by: Wang, Yi <[email protected]>
Co-authored-by: Duygu Altinok <[email protected]>
Co-authored-by: Jinde.Song <[email protected]>
Co-authored-by: hbenoit <[email protected]>
Co-authored-by: nnul <[email protected]>
Co-authored-by: YangKai0616 <[email protected]>
Co-authored-by: Karol Szustakowski <[email protected]>
Co-authored-by: souvikku <[email protected]>
AhnJoonSung pushed a commit to AhnJoonSung/transformers that referenced this pull request Oct 12, 2025
… text generation (huggingface#40837)

* init

* added TopH

* Update TopH logits_process.py

* Update logits_process.py

* Update test_logits_process.py

* Update test_logits_process.py

* added test No. 4

* Resolving __init__.py issues

* Resolving configuration_utils.py Issues

* Resolving logits_process.py Issues

* Resolving utils.py Issues

* Resolving test_logits_process.py Issues

* Resolving __init__.py issues

* Resolving logits_process.py Issues

* Resolving __init__.py issues

* Updated Docs

* Updated Docstring

* style: autoformat with make fixup

* Fixing Docstring

* Update logits_process.py removed defaults

* Variable H name -> cumulative_entropy

* Using torch.distributions.Categorical
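As a rough sketch of what this swap buys: `torch.distributions.Categorical` normalizes raw logits internally, so the sampler needs no manual probability computation (the values below are illustrative):

```python
import torch
from torch.distributions import Categorical

logits = torch.tensor([2.0, 0.5, -1.0])

# Categorical applies the softmax internally, so sampling from raw logits
# requires no explicit normalization step.
dist = Categorical(logits=logits)
print(dist.sample())  # index drawn in proportion to softmax(logits)
print(dist.probs)     # the normalized probabilities it sampled from
```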

* Improve torch_dtype checks (#40808)

* Improve torch_dtype checks

Signed-off-by: Yuanyuan Chen <[email protected]>

* Apply suggestions from code review

---------

Signed-off-by: Yuanyuan Chen <[email protected]>
Co-authored-by: Joao Gante <[email protected]>

* Add VideoProcessors to auto-backend requirements (#40843)

* add it

* fix existing ones

* add perception to auto_mapping...

* Adds Causal Conv 1D kernel for mamba models (#40765)

* add kernel

* make style

* keep causal-conv1d

* small fix

* small fix

* fix modular converter

* modular fix + lazy loading

* revert changes modular

* nit

* hub kernels update

* update

* small nit

* Update no split modules in T5Gemma model (#40810)

* Update no split modules in T5Gemma model

* Update no_split_modules also for T5Gemma modular

* Remove model_split_percents from test cases

---------

Co-authored-by: Anton Vlasjuk <[email protected]>

* Replace image classification loss functions with `self.loss_function` (#40764)

* Fix the misalignment between the l2norm in GDN of Qwen3-Next and the implementation in the FLA library. (#40842)

* align torch implementation of gdn with fla.

* fix fla import.

* fix

* remove unused attr

* fixes

* strictly align l2norm in Qwen3-Next with FLA implementation.

---------

Co-authored-by: bozheng-hit <[email protected]>
Co-authored-by: Cyril Vallez <[email protected]>

* Fixes for continuous batching (#40828)

* Fix for CB attn mask and refactor

* Tests for CB (not all passing)

* Passing tests and a logger fix

* Fixed the KV metrics that were broken when we moved to hybrid alloc

* Fix circular import and style

* Added tests for FA

* Unfolded test to have device expectations

* Fixes for H100

* more fixes for h100

* H100 are good

* Style

* Adding some comments from #40831

* Rename test

* Avoid 1 letter variables

* Dictionary is only removed during kwargs

* Test for supported sample

* Fix an involuntary slice

* Fixes for non-sliced inputs and small example improvements

* Slicing inputs is more understandable

* Style

* [tests] re-enable aria fast tests (#40846)

* rise from the dead

* test

* [SAM2] Fix inconsistent results with original implementation with input boxes (#40800)

* Fix inconsistencies with box input inference with original repo

* remove print

* always pad

* fix modular

* [Sam2Video] Fix video inference with batched boxes and add test (#40797)

fix video inference with batched boxes and add test

* add: differential privacy research model (#40851)

* VaultGemma

* Removing Sequence and Token classification models. Removing integration tests for now

* Remove pass-only modular code. style fixes

* Update vaultgemma.md

* Update docs/source/en/model_doc/vaultgemma.md

Co-authored-by: Anton Vlasjuk <[email protected]>

* Update docs/source/en/model_doc/vaultgemma.md

Co-authored-by: Anton Vlasjuk <[email protected]>

* Add links to model doc

* Correct model doc usage examples

* Updating model doc to describe differences from Gemma 2

* Update model_doc links

* Adding integration tests

* style fixes

* repo consistency

* attribute exception

---------

Co-authored-by: Amer <[email protected]>
Co-authored-by: Anton Vlasjuk <[email protected]>

* [test] Fix test_eager_matches_sdpa incorrectly skipped (#40852)

* output_attentions in typed kwargs

* correct typing in GenericForTokenClassification

* improve

* [tests] move generative tests away from `test_modeling_common.py` (#40854)

move tests

* [generate] Always use decoder config to init cache (#40772)

* mega derp

* fix

* always use the decoder

* Use checkpoint in auto_class_docstring (#40844)

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix TrainingArguments.parallelism_config NameError with accelerate<1.10.1 (#40818)

Fix ParallelismConfig type for accelerate < 1.10.1

Co-authored-by: Marc Sun <[email protected]>

* Redirect MI355 CI results to dummy dataset (#40862)

* [Bug fix #40813] Fix base_model_tp_plan of Starcoder2 model. (#40814)

Signed-off-by: greg-kwasniewski1 <[email protected]>

* [docstrings / type hints] Update outdated annotations for `past_key_values`  (#40803)

* some fixes

* nits

* indentation

* indentation

* a bunch of type hints

* bulk changes

* fix florence kwargs  (#40826)

* fix: XIELU act parameters not being casted to correct dtype (#40812)

* Update model tags and integration references in bug report (#40881)

* [Qwen3 Next] Use numerically stable `rsqrt` (#40848)

use numerically stable inverse
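A small illustration of the idea (not the model code itself): `torch.rsqrt` computes the inverse square root directly instead of chaining a square root and a division:

```python
import torch

x = torch.tensor([1e-12, 1.0, 4.0])

# Single fused op: avoids the intermediate sqrt result that the naive
# 1 / torch.sqrt(x) form has to divide through.
print(torch.rsqrt(x))
print(1.0 / torch.sqrt(x))  # naive two-step equivalent
```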

* Adding Support for Qwen3-VL Series (#40795)

* add qwen3vl series

* make fixup

* fix import

* re-protect import

* fix it finally (need to merge main into the branch)

* skip processor test (need the checkpoint)

* oups typo

* simplify modular

* remove unnecessary attr

* fix layer

* remove unused rope_deltas args

* reuse image def

* remove unnecessary imports

---------

Co-authored-by: Cyril Vallez <[email protected]>
Co-authored-by: Cyril Vallez <[email protected]>

* [`VaultGemma`] Update expectations in integration tests (#40855)

* fix tests

* style

* Fix modular consistency (#40883)

* reapply modular

* add missing one

* 🔴 Move variable output controls to `_prepare_generation_config` (#40715)

* move checks to validate steps where possible

* fix csm and other models that override _sample

* ops dia you again

* opsie

* joao review

* Move variable output controls to `prepare_inputs_for_generation`

* fix a bunch of models

* back to basics

* final touches

* Clarify passing is_causal in sdpa_attention_paged_forward (#40838)

* Correctly pass is_causal in sdpa_attention_paged_forward

Signed-off-by: Yuanyuan Chen <[email protected]>

* Improve typing

Signed-off-by: Yuanyuan Chen <[email protected]>

* Add comment

Signed-off-by: Yuanyuan Chen <[email protected]>

* Improve comments

Signed-off-by: Yuanyuan Chen <[email protected]>

* Revert typing

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>
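For reference, a generic SDPA sketch (not the paged helper itself) showing what passing `is_causal` does:

```python
import torch
import torch.nn.functional as F

q = k = v = torch.randn(1, 4, 8, 16)  # (batch, heads, seq_len, head_dim)

# With is_causal=True, SDPA builds the lower-triangular mask internally,
# so no attn_mask tensor has to be materialized and passed alongside it.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([1, 4, 8, 16])
```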

* Use torch.expm1 and torch.log1p for better numerical results (#40860)

Signed-off-by: Yuanyuan Chen <[email protected]>
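A quick numerical illustration of why these helpers matter for small arguments:

```python
import torch

x = torch.tensor([1e-10], dtype=torch.float32)

# exp(x) rounds to exactly 1.0 in float32, so exp(x) - 1 loses everything;
# expm1 keeps the leading digits, and log1p is the analogous fix for log(1 + x).
print(torch.exp(x) - 1)  # tensor([0.])
print(torch.expm1(x))    # tensor([1.0000e-10])
print(torch.log1p(x))    # tensor([1.0000e-10])
```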

* Add Fast PromptDepthAnything Processor (#40602)

* Test & import setup

* First version passing tests

* Ruff

* Dummy post processing

* Add numerical test

* Adjust

* Doc

* Ruff

* remove unused arg

* Refine interpolation method and push test script

* update bench

* Comments

* Update src/transformers/models/auto/image_processing_auto.py

Co-authored-by: Yoni Gozlan <[email protected]>

* Remove benchmark script

* Update docstrings

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything_fast.py

Co-authored-by: Yoni Gozlan <[email protected]>

* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything_fast.py

Co-authored-by: Yoni Gozlan <[email protected]>

* doc

* further process kwargs

* remove it

* remove

* Remove to dict

* remove crop middle

* Remove param specific handling

* Update testing logic

* remove ensure multiple of as kwargs

* fix formatting

* Remove none default and get image size

* Move stuff to _preprocess_image_like_inputs and refactor

* Clean

* ruff

* End of file & comments

* ruff again

* Padding fixed

* Remove comments to pass tests

* Remove prompt depth from kwargs

* Adjust output_size logic

* Docstring for preprocess

* auto_docstring for preprocess

* pass as an arg

* update test batched

* stack images

* remove prompt scale to meter

* return tensors back in preprocess

* remove copying of images

* Update behavior to match old processor

* Fix batch size of tests

* fix test and fast

* Fix slow processor

* Put tests back to pytorch

* remove check and modify batched tests

* test do_pad + slow processor fix

---------

Co-authored-by: Yoni Gozlan <[email protected]>
Co-authored-by: yonigozlan <[email protected]>

* Fix deta loading & dataclass (#40878)

* fix

* fix 2

* Remove dict branch of attention_mask in sdpa_attention_paged_forward (#40882)

Remove dict branch of attention_mask

Signed-off-by: Yuanyuan Chen <[email protected]>

* 🌐 [i18n-KO] Translated smolvlm.md to Korean (#40414)

* fix: manual edits

* Apply suggestions from code review

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/model_doc/smolvlm.md

* Update docs/source/ko/_toctree.yml

Co-authored-by: Steven Liu <[email protected]>

---------

Co-authored-by: Steven Liu <[email protected]>

* 🌐 [i18n-KO] Translated `imageprocessor.md` to Korean (#39557)

* feat: manual translation

* docs: fix ko/_toctree.yml

* Apply suggestions from code review

Co-authored-by: YONGSANG <[email protected]>
Co-authored-by: Yijun Lee <[email protected]>

* Update docs/source/ko/image_processors.md

Co-authored-by: Steven Liu <[email protected]>

---------

Co-authored-by: YONGSANG <[email protected]>
Co-authored-by: Yijun Lee <[email protected]>
Co-authored-by: Steven Liu <[email protected]>

* [generate] remove docs of a feature that no longer exists (#40895)

* Make debugging failing tests (check and update expect output values) easier 🔥  (#40727)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Fixing the call to kernelize (#40628)

* fix

* style

* overload train and eval

* add getter and setter

* Fix getter  regression (#40824)

* test things

* style

* move tests to a sane place

* Fix flaky `Gemma3nAudioFeatureExtractionTest::test_dither` (#40902)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* [cache] Merge static sliding and static chunked layer (#40893)

* merge

* get rid of tensors in get_mask_sizes!!

* remove branch

* add comment explanation

* re-add the class with deprecation cycle

* Harmonize CacheLayer names (#40892)

* unify naming

* style

* doc as well

* post rebase fix

* style

* style

* revert

* [cache] Only use scalars in `get_mask_sizes` (#40907)

* remove tensor ops

* style

* style

* Set seed for `Glm4vIntegrationTest` (#40905)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Add Olmo3 model (#40778)

* transformers add-new-model-like for Olmo3

* Implement modular Olmo3

* Update Olmo3 tests

* Copy Olmo2 weight converter to Olmo3

* Implement Olmo3 weight converter

* Fix code quality errors

* Remove unused import

* Address rope-related PR comments

* Update Olmo3 model doc with minimal details

* Fix Olmo3 rope test failure

* Fix 7B integration test

* remove dummy EncodingFast (#40864)

Signed-off-by: Yuanyuan Chen <[email protected]>

* Improve module name handling for local custom code (#40809)

* Improve module name handling for local custom code

* Use `%lazy` in logging messages

* Revert "Use `%lazy` in logging messages"

This reverts commit 5848755d5805e67177c5218f351c0ac852df9340.

* Add notes for sanitization rule in docstring

* Remove too many underscores

* Update src/transformers/dynamic_module_utils.py

* Update src/transformers/dynamic_module_utils.py

---------

Co-authored-by: Matt <[email protected]>

* Remove `runner_map` (#40880)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* disable `test_fast_is_faster_than_slow` (#40909)

fix

Co-authored-by: ydshieh <[email protected]>

* [gemma3] `Gemma3ForConditionalGeneration` compatible with assisted generation (#40791)

* gemma3vision compatible with assisted generation

* docstring

* BC

* docstring

* failing checks

* make fixup

* apply changes to modular

* misc fixes

* is_initialized

* fix poor rebase

* [generate] misc fixes (#40906)

misc fixes

* 🔴Make `center_crop` fast equivalent to slow (#40856)

make center_crop fast equivalent to slow

* Fix dtype in Paligemma (#40912)

* fix dtypes

* fix copies

* delete unused attr

* [Docs] Adding documentation of MXFP4 Quantization (#40885)

* adding mxfp4 quantization docs

* review suggestions

* Apply suggestions from code review

Co-authored-by: vb <[email protected]>
Co-authored-by: Steven Liu <[email protected]>

---------

Co-authored-by: vb <[email protected]>
Co-authored-by: Steven Liu <[email protected]>

* Processor load with multi-processing (#40786)

push

* [Llama4] Remove `image_sizes` arg and deprecate `vision_feature_layer` (#40832)

* Remove unused arg

* deprecate

* revrt one change

* get set go

* version correction

* fix

* make style

* comment

* Fix #40067: Add dedicated UMT5 support to GGUF loader (config, tokenizer, test) (#40218)

* Fix #40067 : add UMT5 support in GGUF loader (config, tokenizer, test)

* chore: fix code formatting and linting issues

* refactor: move UMT5 GGUF test to quantization directory and clean up comments

* chore: trigger CI pipeline

* refactor(tests): Move UMT5 Encoder GGUF test to GgufModelTests. This consolidates the new test into the main class for consistency.

* Add regression check to UMT5 encoder GGUF test

Verify encoder output against reference tensor values with appropriate tolerances for stability.

* Update tests/quantization/ggml/test_ggml.py

Co-authored-by: Mohamed Mekkouri <[email protected]>

* Update tests/quantization/ggml/test_ggml.py

remove comments

Co-authored-by: Mohamed Mekkouri <[email protected]>

---------

Co-authored-by: Mohamed Mekkouri <[email protected]>

* [torchao safetensors] renaming get_state_dict function (#40774)

renaming get_state_dict function

Co-authored-by: Mohamed Mekkouri <[email protected]>

* Adding activation kernels (#40890)

* first commit

* add mode

* revert modeling

* add compile

* rm print

* Minor fix for #40727 (#40929)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Add support for Florence-2 training (#40914)

* Support training florence2

* update doc and testing model to florence-community

* fix florence-2 test, use head dim 16 instead of 8 for fa2

* skip test_sdpa_can_dispatch_on_flash

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* Add LongCat-Flash (#40730)

* working draft for LongCat

* BC changes to deepseek_v3 for modular

* format

* various modularities

* better tp plan

* better init

* minor changes

* make modular better

* clean up patterns

* Revert a couple of modular commits, because we won't convert in the end

* make things explicit.

* draft test

* toctree, tests and imports

* drop

* woops

* make better things

* update test

* update

* fixes

* style and CI

* convert stuff

* up

* ah, yes, that

* enable gen tests

* fix cache shape in test (sum of 2 things)

* fix tests

* comments

* re-Identitise

* minimize changes

* better defaults

* modular betterment

* fix configuration, add documentation

* fix init

* add integration tests

* add info

* simplify

* update slow tests

* fix

* style

* some additional long tests

* cpu-only long test

* fix last tests?

* urg

* cleaner tests why not

* fix

* improve slow tests, no skip

* style

* don't upcast

* one skip

* finally fix parallelism

* [DOC] Add missing dates in model cards (#40922)

add missing dates

* [models] remove unused `import torch.utils.checkpoint`  (#40934)

* Intel CPU dockerfile (#40806)

* upload intel cpu dockerfile

Signed-off-by: jiqing-feng <[email protected]>

* update cpu dockerfile

Signed-off-by: jiqing-feng <[email protected]>

* update label name

Signed-off-by: jiqing-feng <[email protected]>

---------

Signed-off-by: jiqing-feng <[email protected]>

* docs(i18n): Correct the descriptive text in the README_zh-hans.md (#40941)

* Fix trainer tests (#40823)

* fix liger

* fix

* more

* fix

* fix hp

* fix

---------

Co-authored-by: Matej Sirovatka <[email protected]>

* Fix `Glm4vMoeIntegrationTest` (#40930)

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Raise error instead of warning when using meta device in from_pretrained (#40942)

* raise instead of warning

* add timm

* remove

* Consistent naming for images kwargs (#40834)

* use consistent naming for padding

* no validation on pad size

* add warnings

* fix

* fix copies

* another fix

* fix some tests

* fix more tests

* fix lasts tests

* fix copies

* better docstring

* delete print

* Remove nested import logic for torchvision (#40940)

* remove nested import logic for torchvision

* remove unnecessary protected imports

* remove unnecessary protected import in modular (and modeling)

* fix wrongly removed protected imports

* Fix `Glm4vModelTest::test_eager_matches_fa2_generate` (#40947)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Update expected values for some `test_speculative_generation` (#40949)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Standardize audio embedding function name for audio multimodal models (#40919)

* Standardize audio embedding function name for audio multimodal models

* PR review

* Add FlexOlmo model (#40921)

* transformers add-new-model-like

* Add FlexOlmo implementation

* Update FlexOlmo docs

* Set default tokenization for flex olmo

* Update FlexOlmo tests

* Update attention comment

* Remove unneeded use of `sliding_window`

* Don't list dropout in eager_paged_attention_forward (#40924)

Remove dropout argument

Signed-off-by: Yuanyuan Chen <[email protected]>

* Update expected values for one more `test_speculative_generation` after #40949 (#40967)

fix

Co-authored-by: ydshieh <[email protected]>

* FIX(trainer): ensure final checkpoint is saved when resuming training (#40347)

* fix(trainer): ensure final checkpoint is saved when resuming training

* add test

* make style && slight fix of test

* make style again

* move test code to test_trainer

* remove outdated test file

* Apply style fixes

---------

Co-authored-by: rangehow <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Marc Sun <[email protected]>

* Add new model LFM2-VL (#40624)

* Add LFM2-VL support

* add tests

* linting, formatting, misc review changes

* add siglip2 to auto config and instantiate it in lfm2-vl configuration

* decouple image processor from processor

* remove torch import from configuration

* replace | with Optional

* remove layer truncation from modeling file

* fix copies

* update everything

* fix test case to use tiny model

* update the test cases

* finally fix the image processor and add slow tests

* fixup

* typo in docs

* fix tests

* the doc name uses underscore

* address comments from Yoni

* delete tests and unshuffling

* relative import

* do we really handle imports better now?

* fix test

* slow tests

* found a bug in ordering + slow tests

* fix copies

* dont run compile test

---------

Co-authored-by: Anna <[email protected]>
Co-authored-by: Anna Banaszak <[email protected]>

* Fix outdated version checks of accelerator (#40969)

* Fix outdated version checks of accelerator

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix outdated version checks of accelerator

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Use `skip_predictor=True` in vjepa2 `get_vision_features` (#40966)

use skip_predictor in vjepa2 `get_vision_features`

* [Trainer] Fix DP loss (#40799)

* fix

* style

* Fix fp16

* style

---------

Co-authored-by: Matej Sirovatka <[email protected]>

* [timm_wrapper] better handling of "Unknown model" exception in timm (#40951)

* fix(timm): Add exception handling for unknown Gemma3n model

* nit: Let’s cater to this specific issue

* nit: Simplify error handling

* Fix Issue #39030: AutoTokenizer.from_pretrained does not propagate token (#40956)

* fix merge conflicts

* change token typing

---------

Co-authored-by: Ubuntu <[email protected]>
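A minimal sketch of the fixed behavior; the repo id and token below are placeholders:

```python
from transformers import AutoTokenizer

# The token passed here is now forwarded to the follow-up requests the
# tokenizer makes for its config and vocabulary files as well.
tok = AutoTokenizer.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # a gated repo, used purely as an illustration
    token="hf_xxx",              # placeholder, not a real token
)
```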

* [tests] Really use small models in all fast tests (#40945)

* start

* xcodec

* chameleon

* start

* layoutlm2

* layoutlm

* remove skip

* oups

* timm_wrapper

* add default

* doc

* consistency

* Add captured actual outputs to CI artifacts (#40965)

* fix

* fix

* Remove `# TODO: ???` as it makes me `???`

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Revert change in `compile_friendly_resize` (#40645)

fix

* Track the CI (model) jobs that don't produce test output files (process being killed etc.) (#40981)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Using torch.distributions.Categorical

* Remove `set_model_tester_for_less_flaky_tests` (#40982)

remove

* Benchmarking v2 GH workflows (#40716)

* WIP benchmark v2 workflow

* Container was missing

* Change to sandbox branch name

* Wrong place for image name

* Variable declarations

* Remove references to file logging

* Remove unnecessary step

* Fix deps install

* Syntax

* Add workdir

* Add upload feature

* typo

* No need for hf_transfer

* Pass in runner

* Runner config

* Runner config

* Runner config

* Runner config

* Runner config

* mi325 caller

* Name workflow runs properly

* Copy-paste error

* Add final repo IDs and schedule

* Review comments

* Remove wf params

* Remove parametrization from worfkflow files

* Fix callers

* Change push trigger to pull_request + label

* Add back schedule event

* Push to the same dataset

* Simplify parameter description

* 🔴[`Attention`] Bert-based Models Attention Refactor (#38301)

* clean start to bert refactor

* some test fixes

* style

* fix last tests

* be strict on positional embeddings, fixup according tests

* cache support

* more cache fixes, new causal API

* simplify masks, fix tests for gen

* flex attn, static cache support, round of fixes

* ?

* this time

* style

* fix flash attention tests, flex attention requires torch 2.7.x to work with multiple classes (as recompile strats force a size call which is wrongly interpreted before)

* roberta

* fixup sdpa remains

* attention split, simplify args and kwargs, better typing

* fix encoder decoder

* fix test

* modular roberta

* albert

* data2vectext, making it modular tomorrow

* modular data2vec text

* tmp disable

* xmod + cache position fixes

* whoops

* electra + markuplm, small fixes

* remove wrong copy

* xlm_roberta + some embedding fixes

* roberta prelayernorm

* RemBert: remove copy, maybe doing it later

* ernie

* fix roberta offloading

* camembert

* copy fixes

* bert generation + fixes on eager

* xlm roberta xl

* bridgetower (text) + seamlessv2 copy fixes

* rocbert + small fixes

* whoops

* small round of fixups

* NOTE: kernels didn't load with an earlier version, some fixup (needs another look bc cross deps)

* the end of the tunnel?

* fixup nllbmoe + style

* we dont need this anymore

* megatron bert is barely used, low prio skip for now

* Modernize bert (template for others)

NOTE: trying to push this through, might be overdue if not in time possible

* check inputs for all others (if checkmarked)

* fix bridgetower

* style

* fix encoder decoder (partially, but the cause was found and fixed too; it just needs to be done for everything else)

* proper fix for bert to force intermediate dict outputs

* propagate to others

* style

* xlm roberta xl investigation, it's the layernorm...

* mobile bert

* revert this, might cause issues with composed models

* review

* style

* Remove [[autodoc]] refs to TF/Flax objects (#40996)

* remove refs

* more

* ENH: Enable readline support for transformers chat (#40911)

ENH Enable readline support for chat

This small change enables GNU readline support for the transformers chat
command. This includes, among others:

- advanced navigation and editing: ctrl + a ctrl + e alt + b alt + f
  ctrl + k alt + d etc.
- navigate and search history: arrow up/down ctrl + p ctrl + n  ctrl + r
- undo: ctrl + _
- clear screen: ctrl + l

Implementation

Although it may look strange, just importing readline is enough to
enable it in Python, see:

https://docs.python.org/3/library/functions.html#input

As readline is not available on some
platforms (https://docs.python.org/3/library/readline.html), the import
is guarded.

Readline should work on Linux, MacOS, and with WSL, I'm not sure about
Windows though. Ideally, someone can give it a try. It's possible that
Windows users would have to install
pyreadline (https://pypi.org/project/pyreadline3/).
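A minimal sketch of the guarded import described above:

```python
# Importing readline is enough: it hooks input() as a side effect.
# The guard covers platforms where the module is not available.
try:
    import readline  # noqa: F401  (imported only for its side effect)
except ImportError:
    pass

user_input = input(">>> ")  # now supports ctrl+a/e, history search, undo, ...
```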

* [testing] test `num_hidden_layers` being small in model tester (#40992)

fix

Co-authored-by: ydshieh <[email protected]>

* blt wip (#38579)

* blt wip

* cpu version

* cpu friendly with full entropy model (real time patching)

* adding config file instead of args file

* enable MPS

* refactoring unused code

* single config class in config file

* inherit from PreTrainedModel

* refactor LMTransformer --> BLTPatcher

* add conversion script

* load from new checkpoint with from_pretrained

* fixed demo from_pretrained

* clean up

* clean a few comments

* cleanup folder

* clean up dir

* cleaned up modeling further

* rename classes

* adding transformers Attention class and RotaryEmbedding class

* exchanged blt modules for transformers modules: attention, rotary_emb, create_causal_mask, etc

* separate out patcher config, update modeling and conversion script

* rename vars to be more transformers-like

* rm unused functions

* adding cross attention from transformers

* pass arg

* rename weights

* updated conversion script

* overwritten commit! fixing PR

* apply feedback

* adding BLTRMSNorm like Llama

* add repeat_kv and eager_attention_forward copied from

* BLTMLP identical to MllamaTextMLP

* clean up some args'

* more like mllama, but busier inits

* BLTTransformerLayer config

* decoder, encoder, global configs

* wip working on modular file

* cleaning up patch and configs

* clean up patcher helpers

* clean up patcher helpers further

* clean up

* some config renaming

* clean up unused configs

* clean up configs

* clean up configs

* update modular

* clean

* update demo

* config more like mllama, separated subconfigs from subdicts

* read from config instead of self args

* update demo file

* model weights to causal lm weights

* missed file

* added tied weights keys

* BLTForCausalLM

* adding files after add-new-model-like

* update demo

* working on tests

* first running integration tests

* added integration tests

* adding tokenization tests, integration tests, and cleaned up tokenization file, + ruff

* tokenizer clean up

* modular file

* fixing rebase

* ruff

* adding correct basemodel output and updating config with checkpoint vals (for testing)

* BLTModelTests git status

* enabling inputs_embeds, although won't be equal to input_ids since need ids for patching logic

* fix sdpa == causal tests

* fix small model test and some gradient checkpointing

* skip training GC tests

* fix test

* updated modular

* update modular

* ruff

* adding modular + modeling

* modular

* more modern is_causal check

* cleaning up modular

* more modular reduction

* ruff

* modular fix

* fix styling

* return 2

* return 2

* fix some tests

* fix bltcrossattention after modular break

* some fixes / feedback

* try cache generate fix

* try cache generate fix

* fix generate tests

* attn_impl workaround

* refactoring to use recent TransformersKwargs changes

* fix hidden_states shape test

* refactor to new outputs

* simplify outputs a bit

* rm unneeded decoderlayer overwriting

* rename blt

* forgot tokenizer test renamed

* Reorder

* Reorder

* working on modular

* updates from modular

* new modular

* ruff and such

* update pretrainedmodel modular

* using cohere2 apply_rotary_pos_emb

* small changes

* apply feedback r2

* fix cross_attention

* apply more feedback

* update modeling fix

* load submodules from pretrainedmodel

* set initializer_range to subconfigs

* rm cross_attention_states pass when not needed

* add 7b projection layer support

* check repo

* make copies

* lost cohere2 rotate_half

* ruff

* copies?

* don't tie weights for submodules

* tie weights setting

* check docstrings

* apply feedback

* rebase

* rebased modeling

* update docs

* applying feedback

* few more fixes

* fix can_record_outputs

* fast tokenizer

* no more modulelist

* tok auto

* rm tokenizersss

* fix docs

* ruff

* fix after rebase

* fix test, configs are not subscriptable

---------

Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: Lysandre <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>

* [docs] rm stray tf/flax autodocs references (#40999)

rm tf references

* [`RMSNorm`] Fix rms norm init for models that center around 1 (#40796)

* fix

* fixup inits

* oops

* fixup gemma

* fixup modular order

* how does this keep happen lol

* vaultgemma is new i forgot

* remove init check

* Make `EfficientLoFTRModelTest` faster (#41000)

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Fix typos in src and tests (#40845)

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix more dates in model cards and wrong modalities in _toctree.yml (#40955)

* Fix model cards and modalities in toctree

* fix new models

* RUFF fix on CI scripts (#40805)

Signed-off-by: Yuanyuan Chen <[email protected]>

* fix dict like init for ModelOutput (#41002)

* fix dict like init

* style

* 🚨 [v5] remove generate output retrocompatibility aliases (#40998)

remove old type aliases

* [tests] update `test_left_padding_compatibility` (and minimize overwrites) (#40980)

* update test (and overwrites)

* better test comment

* 0 as a default for

* Patch more `unittest.case.TestCase.assertXXX` methods (#41008)

fix

Co-authored-by: ydshieh <[email protected]>

* 🚨 [v5] remove deprecated entry point (#40997)

* remove old entry point

* update references to transformers-cli

* 🚨 [lightglue] fix: matches order changed because of early stopped indices (#40859)

* fix: bug that made early stop change order of matches

* fix: applied code suggestion

Co-authored-by: Pavel Iakubovskii <[email protected]>

* fix: applied code suggestion to modular

* fix: integration tests

---------

Co-authored-by: Pavel Iakubovskii <[email protected]>

* Fix `PhimoeIntegrationTest` (#41007)

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Fix Glm4v test (#41011)

fix

* Update after #41007 (#41014)

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Fix benchmark runner argument name (#41012)

* Adding support for Qwen3Omni (#41025)

* Add Qwen3Omni

* make fix-copies, import properly

* nit

* fix wrong setup. Why was audio_token_id renamed?

* upds

* more processing fixes

* yup

* fix more generation tests

* down to 1?

* fix import issue

* style, update check repo

* up

* fix quality as best I can

* final quality?

* fix doc building

* FINAL COMMIT: SKIP IMPORTANT BUT FAILING TESTS FOR MERGE

* SKIP THE TEMPLATE ONE

---------

Co-authored-by: lvyuanjun.lyj <[email protected]>
Co-authored-by: Arthur <[email protected]>

* Making compute_loss_func always take priority in Trainer (#40632)

* logger warn, if-else logic improved

* redundant if condition fix

* Modify Qwen3Omni parameter name since VL changed it (#41045)

Modify parameter name since VL changed it

Co-authored-by: lvyuanjun.lyj <[email protected]>

* Fix Qwen video tests (#41049)

fix test

* [testing] Fix `qwen2_audio` (#41018)

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>

* Fix typing of tuples (#41028)

* Fix tuple typing

Signed-off-by: Yuanyuan Chen <[email protected]>

* More fixes

Signed-off-by: Yuanyuan Chen <[email protected]>

* More fixes

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Remove optax (#41030)

Remove optax dep

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix typos in English/Chinese documentation (#41031)

* Fix typos and formatting in English docs

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix typos and formatting in Chinese docs

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Use torch.autocast (#40975)

* Use torch.autocast

Signed-off-by: Yuanyuan Chen <[email protected]>

* Format code

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>
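For context, a minimal sketch of the target API (assuming a CUDA device is available):

```python
import torch

model = torch.nn.Linear(8, 8).cuda()
x = torch.randn(4, 8, device="cuda")

# torch.autocast is the device-agnostic successor to the older
# torch.cuda.amp.autocast context manager.
with torch.autocast(device_type="cuda", dtype=torch.float16):
    y = model(x)
print(y.dtype)  # torch.float16
```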

* docs: improved RoPE function Docstrings (#41004)

* docs: improved RoPE function docstrings

* Update src/transformers/modeling_rope_utils.py

Co-authored-by: Joao Gante <[email protected]>

---------

Co-authored-by: Joao Gante <[email protected]>

* Fix condition for emitting warning when generation exceeds max model length (#40775)

correct warning when generation exceeds max model length

Signed-off-by: Yannick Schnider <[email protected]>

* Fix outdated torch version check (#40925)

Update torch minimum version check to 2.2

Signed-off-by: Yuanyuan Chen <[email protected]>

* Remove doc of tf and flax (#41029)

Signed-off-by: Yuanyuan Chen <[email protected]>

* Add Whole Word Masking and Padding Strategy to DataCollatorForLanguageModeling (#39485)

* Add whole word masking

* Vectorize whole word masking functions

* Unit test whole word masking

* Remove support for TF in whole word masking
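A sketch of how such a collator might be configured; the `whole_word_mask` flag name is an assumption on my part, not confirmed from this log:

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tok = AutoTokenizer.from_pretrained("bert-base-uncased")

# whole_word_mask=True (assumed flag name) masks all subword pieces of a
# selected word together instead of masking pieces independently.
collator = DataCollatorForLanguageModeling(
    tokenizer=tok, mlm=True, mlm_probability=0.15, whole_word_mask=True
)
```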

* [testing] Fix `seed_oss` (#41052)

* fix

* fix

* fix

* fix

* fix

* fix

* Update tests/models/seed_oss/test_modeling_seed_oss.py

Co-authored-by: Anton Vlasjuk <[email protected]>

* fix

---------

Co-authored-by: ydshieh <[email protected]>
Co-authored-by: Anton Vlasjuk <[email protected]>

* Remove repeated import (#40937)

* Remove repeated import

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix conflict

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Simplify unnecessary Optional typing (#40839)

Remove Optional

Signed-off-by: Yuanyuan Chen <[email protected]>

* Add write token for uploading benchmark results to the Hub (#41047)

* Separate write token for Hub upload

* Address review comments

* Address review comments

* Ci utils (#40978)

* Add CI reports dir to gitignore

* Add utils to run local CI

* Review compliance

* Style

* License

* Remove <frameworkcontent> and <pt> tags from documentation (#41055)

* Remove <frameworkcontent> and <pt> tags

Signed-off-by: Yuanyuan Chen <[email protected]>

* Revert changes

Signed-off-by: Yuanyuan Chen <[email protected]>

* Update docs/source/en/model_doc/madlad-400.md

---------

Signed-off-by: Yuanyuan Chen <[email protected]>
Co-authored-by: Joao Gante <[email protected]>

* Fix CI jobs being all red 🔴 (false positive) (#41059)

fix

Co-authored-by: ydshieh <[email protected]>

* Update quantization CI (#41068)

* fix

* new everything

* fix

* [i18n-bn] Add Bengali language README file (#40935)

* [i18n-bn] Add Bengali language README file and update links in existing language files

* Update Bengali README for clarity and consistency in model descriptions

* Improve documentation and errors in Mamba2-based models (#41063)

* fix bug in Mamba2 docs

* correct 'because on of' issue

* link to other Mamba2 model types

* github URL is not changed

* update error message in generated files

* Update team member list for some CI workflows (#41094)

* update list

* update list

---------

Co-authored-by: ydshieh <[email protected]>

* fix crash when using chat to send 2+ requests to gptoss (#40536)

Signed-off-by: Wang, Yi <[email protected]>

* Minor addition, no split modules for VideoMAE (#41051)

* added no split modules

* fixed typo

---------

Co-authored-by: Raushan Turganbay <[email protected]>

* Switch to `python:3.10-slim` for CircleCI docker images (#41067)

fix

Co-authored-by: ydshieh <[email protected]>

* Fix argument name in benchmarking script (#41086)

* Fix argument name in benchmarking script

* Adjust vars

* Remove mention of TensorFlow/Flax/JAX from English documentation (#41058)

Remove mention of TensorFlow from English documentation

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix typos in documentation (#41087)

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix typing (#40788)

* Fix optional typing

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix optional typing

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix schema typing

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix typing

* Fix typing

* Fix typing

* Fix typing

* Use np.ndarray

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix typing

Signed-off-by: Yuanyuan Chen <[email protected]>

* Format code

Signed-off-by: Yuanyuan Chen <[email protected]>

* Use np.ndarray

Signed-off-by: Yuanyuan Chen <[email protected]>

* Improve typing

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix quote string of np.ndarray

Signed-off-by: Yuanyuan Chen <[email protected]>

* More fixes

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix code

* Format

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Remove unused arguments (#40916)

* Fix unused arguments

Signed-off-by: Yuanyuan Chen <[email protected]>

* More fixes

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Remove tf and flax from Chinese documentation (#41057)

Signed-off-by: Yuanyuan Chen <[email protected]>

* fix wrong height and width when reading videos with torchvision (#41091)

* docs: Fix Tool Use links and remove dead RAG links (#41104)

docs: Fix tool use links. Remove dead RAG links. Fix style

* 🚨 [generate] update paligemma mask updates (and other assisted generation-related fixes) (#40917)

* tmp

* fix modular inheritance

* nit

* paligemma 1 doesn't have swa

* use same pattern as in models with hybrid layers

* PR comments

* helium also needs layer_typed (bc it relies on gemma)

* paligemma/gemma3: same mask creation fn in fwd and generate

* propagate changes to helium (gemma-based)

* tmp commit

* slow paligemma tests passing, let's see what breaks

* fix test_left_padding_compatibility

* tmp commit

* tmp commit

* rebase error

* docs

* reduce diff

* like this?

* t5gemma

* better comment

* shorter diff

* exception

* ffs type

* optional

* shorter modular_gemma.py

* helium model actually needs no changes -- the tester is the issue

* t5gemma modular config

* a few more modular; paligemma BC

* fix processor issues?

* rm config exception

* lift warning in gemma

* [tests] gpt2 + `CausalLMModelTester` (#41003)

* tmp commit

* tmp commit

* tmp commit

* rm old GPT2ModelTester

* nit bug

* add facilities for encoder-decoder tests; add comments on ALL overwrites/extra fns

* vision_encoder_decoder

* Fix `_get_test_info` for inherited tests (#41106)

* fix _get_test_info

* fix patched

* add comment

* ruff

---------

Co-authored-by: ydshieh <[email protected]>

* Remove bad test skips (#41109)

* remove bad skips

* remove more

* fix inits

* Format empty lines and white space in markdown files. (#41100)

* Remove additional white space and empty lines from markdown files

Signed-off-by: Yuanyuan Chen <[email protected]>

* Add empty lines around code

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Update ruff to 0.13.1 + target Python 3.10 + apply fixes (#37809)

Update ruff to 0.13.1, target it at Python 3.10, and apply its fixes

Signed-off-by: Yuanyuan Chen <[email protected]>
Co-authored-by: Yih-Dar <[email protected]>

* 🚨 [V5] Remove deprecated training arguments  (#41017)

* Remove deprecated training arguments from V5

Signed-off-by: Yuanyuan Chen <[email protected]>

* Remove deprecated training arguments from V5

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix comments

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix code

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>

* Support loading LFM2 GGUF (#41111)

* add gguf config mapping for lfm2

* add lfm2 tensor process to unsqueeze conv weights

* adjust values from gguf config to HF config

* add test for lfm2 gguf

* ruff

---------

Co-authored-by: Marc Sun <[email protected]>

* [torchao safetensors] integrate torchao safetensors support with transformers  (#40735)

* enable torchao safetensors

* enable torchao safetensors support

* add more version checking

* [Qwen3-next] Fix dimension mismatch in torch_chunk_gated_delta_rule and torch_recurrent_gated_delta_rule (#40963) (#41036)

* fix mismatched dims for qwen3 next

* propagate changes

* chore: renamed tot_heads to total_sequence_length

* Apply suggestion from @vasqu

Co-authored-by: Anton Vlasjuk <[email protected]>

* minor fix to modular qwen3 next file

---------

Co-authored-by: Anton Vlasjuk <[email protected]>

* Fix the error where a keyword argument appearing before *args (#41099)

Signed-off-by: Yuanyuan Chen <[email protected]>

* Fix broken `` expressions in markdown files (#41113)

Fix broken expressions in markdown files

Signed-off-by: Yuanyuan Chen <[email protected]>

* Remove self-assignment (#41062)

* Remove self-assignment

Signed-off-by: Yuanyuan Chen <[email protected]>

* Update src/transformers/integrations/flash_paged.py

Co-authored-by: Matt <[email protected]>

* Clear pass

Signed-off-by: Yuanyuan Chen <[email protected]>

* Clear pass

Signed-off-by: Yuanyuan Chen <[email protected]>

* Clear pass

Signed-off-by: Yuanyuan Chen <[email protected]>

---------

Signed-off-by: Yuanyuan Chen <[email protected]>
Co-authored-by: Matt <[email protected]>

* 🚨Refactor: Update text2text generation pipelines to use max_new_tokens… (#40928)

* Refactor: Update text2text generation pipelines to use max_new_tokens and resolve max_length warning

* docs(text2text_generation): update the parameter comment to reflect modern generation practice

Updated the max_length parameter comment to max_new_tokens, in line with the modern practice of specifying the number of newly generated tokens

* refactor(text2text_generation): Remove outdated input validation logic

* docs(text2text_generation): Revert incorrectly modified comment

* docs(text2text_generation): Revert incorrectly modified comment

* Fixed MXFP4 model storage issue (#41118)

* Fixed loading LongT5 from legacy checkpoints (#40724)

* Fixed loading LongT5 from legacy checkpoints

* Adapted the fix to work with missing lm_head

* dummy commit (#41133)

* dummy commit, nothing interesting

* dummy commit, nothing interesting

* dummy commit, nothing interesting

* dummy commit, nothing interesting

---------

Co-authored-by: ydshieh <[email protected]>

* Fix loading logic flaw with regards to unexpected and missing keys (#40850)

* Unexpected keys should be ignored at load with device map

* remove them all

* fix logic flaw

* fix

* simplify

* style

* fix

* revert caching allocator change

* add other test

* add nice doc

---------

Co-authored-by: Cyril Vallez <[email protected]>

* Using torch.distributions.Categorical

* Resolving logits_process.py Issues

* style: autoformat with make fixup

* Update logits_process.py removed defaults

* Variable H name -> cumulative_entropy

* Resolving format error

* Correction of the loop variables in logit processor

* Vectorized the loop in logits_process

* formatted  logits_process

* paper reference and stopping rule comment logits_process

* Trigger CI rerun

* Update logits_process.py

* added test_TopH_example_integration

* added test_TopH_example_integration

* Update README.md

* Restore CI config to match main (remove accidental changes)

* Restore CI config to match upstream main (no diffs)

---------

Signed-off-by: Yuanyuan Chen <[email protected]>
Signed-off-by: greg-kwasniewski1 <[email protected]>
Signed-off-by: jiqing-feng <[email protected]>
Signed-off-by: Yannick Schnider <[email protected]>
Signed-off-by: Wang, Yi <[email protected]>
Co-authored-by: ArminAzizi98 <[email protected]>
Co-authored-by: Yuanyuan Chen <[email protected]>
Co-authored-by: Joao Gante <[email protected]>
Co-authored-by: Cyril Vallez <[email protected]>
Co-authored-by: Mohamed Mekkouri <[email protected]>
Co-authored-by: Yuchao Zhang <[email protected]>
Co-authored-by: Anton Vlasjuk <[email protected]>
Co-authored-by: Pavel Iakubovskii <[email protected]>
Co-authored-by: Bo Zheng <[email protected]>
Co-authored-by: bozheng-hit <[email protected]>
Co-authored-by: Cyril Vallez <[email protected]>
Co-authored-by: Rémi Ouazan <[email protected]>
Co-authored-by: Yoni Gozlan <[email protected]>
Co-authored-by: Ryan Mullins <[email protected]>
Co-authored-by: Amer <[email protected]>
Co-authored-by: eustlb <[email protected]>
Co-authored-by: Albert Villanova del Moral <[email protected]>
Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Ákos Hadnagy <[email protected]>
Co-authored-by: Grzegorz Kwasniewski <[email protected]>
Co-authored-by: NanoCode012 <[email protected]>
Co-authored-by: Arthur <[email protected]>
Co-authored-by: 艾力可 <[email protected]>
Co-authored-by: JJJYmmm <[email protected]>
Co-authored-by: Manuel de Prada Corral <[email protected]>
Co-authored-by: Samuel Barry <[email protected]>
Co-authored-by: yonigozlan <[email protected]>
Co-authored-by: HyunZ118 <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
Co-authored-by: YONGSANG <[email protected]>
Co-authored-by: Yijun Lee <[email protected]>
Co-authored-by: Yih-Dar <[email protected]>
Co-authored-by: ydshieh <[email protected]>
Co-authored-by: Pablo Montalvo <[email protected]>
Co-authored-by: Shane A <[email protected]>
Co-authored-by: Xuehai Pan <[email protected]>
Co-authored-by: Matt <[email protected]>
Co-authored-by: Raushan Turganbay <[email protected]>
Co-authored-by: Aritra Roy Gosthipaty <[email protected]>
Co-authored-by: vb <[email protected]>
Co-authored-by: Yaswanth Gali <[email protected]>
Co-authored-by: Akshay Babbar <[email protected]>
Co-authored-by: liangel-02 <[email protected]>
Co-authored-by: Duc-Viet Hoang <[email protected]>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: jiqing-feng <[email protected]>
Co-authored-by: lilin-1 <[email protected]>
Co-authored-by: Matej Sirovatka <[email protected]>
Co-authored-by: Jack <[email protected]>
Co-authored-by: Rangehow <[email protected]>
Co-authored-by: rangehow <[email protected]>
Co-authored-by: Anna <[email protected]>
Co-authored-by: Anna Banaszak <[email protected]>
Co-authored-by: Hamish Scott <[email protected]>
Co-authored-by: Harshal Janjani <[email protected]>
Co-authored-by: Branden <[email protected]>
Co-authored-by: Ubuntu <[email protected]>
Co-authored-by: Benjamin Bossan <[email protected]>
Co-authored-by: Ita Zaporozhets <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: Lysandre <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: [email protected] <[email protected]>
Co-authored-by: StevenBucaille <[email protected]>
Co-authored-by: BakerBunker <[email protected]>
Co-authored-by: lvyuanjun.lyj <[email protected]>
Co-authored-by: Arthur <[email protected]>
Co-authored-by: Ayush <[email protected]>
Co-authored-by: Ryan Mullins <[email protected]>
Co-authored-by: Yannick Schnider <[email protected]>
Co-authored-by: Ralph Gleaton <[email protected]>
Co-authored-by: Saidur Rahman Pulok <[email protected]>
Co-authored-by: Nick Doiron <[email protected]>
Co-authored-by: Wang, Yi <[email protected]>
Co-authored-by: Duygu Altinok <[email protected]>
Co-authored-by: Jinde.Song <[email protected]>
Co-authored-by: hbenoit <[email protected]>
Co-authored-by: nnul <[email protected]>
Co-authored-by: YangKai0616 <[email protected]>
Co-authored-by: Karol Szustakowski <[email protected]>
Co-authored-by: souvikku <[email protected]>