@miguelscarv miguelscarv commented Apr 2, 2025

What does this PR do?

This PR implements a fast image processor for the VideoMAE model using BaseImageProcessorFast. Related: #36978

Note:

  1. I was not able to run test_can_compile_fast_image_processor because I'm on an M1 and don't have access to an NVIDIA GPU.
  2. test_slow_fast_equivalence_batched passes reliably only when every video in the batch has the same width and height (set with equal_resolution=True in self.image_processor_tester.prepare_image_inputs). When width and height vary across videos, the test usually fails, though not on every run.

Other than that, all tests passed.
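
For readers unfamiliar with this kind of test, here is a minimal sketch of what a slow/fast equivalence check does in principle: it compares the two processors' outputs element by element and accepts small numerical differences. It is written in plain Python on nested lists so it runs without any dependencies. The helper names (`flatten`, `shape_of`, `outputs_match`) and both tolerance values are hypothetical and chosen for illustration; they are not taken from the transformers test suite.

```python
from itertools import chain


def flatten(x):
    """Recursively flatten nested lists/tuples of numbers into a flat list."""
    if isinstance(x, (list, tuple)):
        return list(chain.from_iterable(flatten(item) for item in x))
    return [float(x)]


def shape_of(x):
    """Return the shape of a regular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(x, (list, tuple)):
        shape.append(len(x))
        x = x[0] if x else None
    return tuple(shape)


def outputs_match(slow, fast, max_tol=1e-1, mean_tol=1e-3):
    """Hypothetical equivalence check between two processor outputs.

    Returns True when both outputs have the same shape, the largest
    absolute difference is at most max_tol, and the mean absolute
    difference is at most mean_tol. Both tolerances are assumed values.
    """
    if shape_of(slow) != shape_of(fast):
        return False
    a, b = flatten(slow), flatten(fast)
    if len(a) != len(b):
        return False
    diffs = [abs(p - q) for p, q in zip(a, b)]
    if not diffs:
        return True
    return max(diffs) <= max_tol and sum(diffs) / len(diffs) <= mean_tol


# Stand-in pixel values for one tiny 2x2 "frame" from each processor.
slow_out = [[0.5, 0.25], [0.125, 1.0]]
fast_close = [[0.5001, 0.2501], [0.1251, 1.0001]]  # tiny numerical drift
fast_off = [[0.5, 0.25], [0.125, 1.5]]             # one value clearly wrong

print(outputs_match(slow_out, fast_close))
print(outputs_match(slow_out, fast_off))
```

A check of this shape fails either when one value is far off or when many values drift slightly, which is one way an equivalence test can pass on some randomly generated inputs and fail on others.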

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@yonigozlan

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@github-actions github-actions bot marked this pull request as draft April 2, 2025 02:21

github-actions bot commented Apr 2, 2025

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@miguelscarv miguelscarv marked this pull request as ready for review April 2, 2025 02:25
@github-actions github-actions bot requested review from ydshieh and yonigozlan April 2, 2025 02:25
@miguelscarv miguelscarv force-pushed the add-videomae-fast-processor branch from 3d56506 to 828804b Compare April 23, 2025 00:03
@miguelscarv
Author

Hey @yonigozlan

Saw in #37611 that fast processors for video models aren't being merged, so guessing this PR won’t go through either.

That said, would a fast image processor make sense for SmolVLM? It handles high-res images, so it could help.
