
Conversation

tomaarsen
Member

Resolves #2327

Hello!

Pull Request overview

  • Allow library_name in ORTModel....from_pretrained("...", library_name=...)

Details

As described in #2327, this allows exporting and loading models with an explicit library name rather than relying on Optimum's automatic inference of the library. For Sentence Transformers, this lets me add ONNX exporting to SparseEncoder models with:

from pprint import pprint

import torch
from sentence_transformers import SparseEncoder
from optimum.onnxruntime import ORTModelForMaskedLM

# 1. Load a pretrained SparseEncoder model
model = SparseEncoder("naver/splade-cocondenser-ensembledistil")

# Very hackishly override the model to use ORTModelForMaskedLM
class ORTModelForMaskedLMModule(ORTModelForMaskedLM, torch.nn.Module):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        torch.nn.Module.__init__(self)

model[0].auto_model = ORTModelForMaskedLMModule.from_pretrained("naver/splade-cocondenser-ensembledistil", library_name="transformers")

# Encode some texts into sparse embeddings
embeddings = model.encode(["I'm travelling to the grocery store to buy some milk."])

# Decode the embeddings again, the beauty of sparse embeddings
decoded = model.decode(embeddings, top_k=5)
pprint(decoded)
"""
[[('milk', 2.486112356185913),
  ('grocery', 1.7925951480865479),
  ('dairy', 1.6981983184814453),
  ('traveling', 1.5185797214508057),
  ('buy', 1.3063507080078125)]]
"""

As Sentence Transformers only exports models in the "transformers" way, I can set this internally in Sentence Transformers, so the eventual usage for the end user is simply model = SparseEncoder("naver/splade-cocondenser-ensembledistil", backend="onnx"). Note also that I can export with OpenVINO just fine without library_name; only ONNX has this issue.
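For context, the automatic detection works by inspecting the repository contents. A toy sketch of why a SparseEncoder repository trips it up, with a hypothetical `infer_library_name` function that only illustrates the idea and is not Optimum's real implementation:

```python
def infer_library_name(repo_files: list[str]) -> str:
    # Toy stand-in for library detection: a repository that ships a
    # Sentence Transformers config file is classified as
    # "sentence_transformers", even if the user wants to load the
    # underlying transformer with e.g. ORTModelForMaskedLM.
    if "config_sentence_transformers.json" in repo_files:
        return "sentence_transformers"
    return "transformers"

# A SPLADE repository contains both configs, so inference picks
# "sentence_transformers" -- hence the need for an explicit override.
files = ["config.json", "config_sentence_transformers.json", "model.safetensors"]
print(infer_library_name(files))
```

With an explicit `library_name="transformers"` (or an equivalent class-level default), this inference step is skipped entirely.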

Please let me know if you'd rather go in a different direction here.

  • Tom Aarsen

Required to avoid automatic inference of the library_name
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@echarlaix echarlaix left a comment


Thanks for the PR @tomaarsen! It would also be nice to add a test for this (to ensure SparseEncoder export is not broken in the future).

use_io_binding: Optional[bool] = None,
# other arguments
model_save_dir: Optional[Union[str, Path, TemporaryDirectory]] = None,
library_name: Optional[str] = None,
Collaborator


I would prefer to have it as a class attribute instead, which can be set to None by default (the library will be inferred in that case); for ORTModelForMaskedLM it can be set to "transformers" directly, as it should never be anything else, no?

like done in optimum-intel https://github.com/huggingface/optimum-intel/blob/d93fd59aebd24ac887deb1eaab7ff6af41b09946/optimum/intel/openvino/modeling_base.py#L647
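In rough Python, the class-attribute resolution being suggested could look like the sketch below. The names and the resolution order (explicit argument, then class attribute, then inference) are illustrative, not Optimum's actual internals:

```python
from typing import Optional


def infer_library_name(model_id: str) -> str:
    # Stand-in for repository inspection; a SparseEncoder repo would
    # normally be detected as "sentence_transformers".
    return "sentence_transformers"


class ORTModel:
    # None means "infer the library from the repository contents"
    library_name: Optional[str] = None

    def __init__(self, model_id: str, library_name: str):
        self.model_id = model_id
        self.resolved_library_name = library_name

    @classmethod
    def from_pretrained(cls, model_id: str, library_name: Optional[str] = None):
        # Explicit argument > class attribute > automatic inference
        resolved = library_name or cls.library_name or infer_library_name(model_id)
        return cls(model_id, resolved)


class ORTModelForMaskedLM(ORTModel):
    # A masked-LM head is always exported "the transformers way",
    # so the subclass pins the attribute and inference is skipped.
    library_name = "transformers"
```

With this shape, `ORTModelForMaskedLM.from_pretrained(...)` resolves to "transformers" without any user-facing keyword argument, while the base class still falls back to inference.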

@tomaarsen
Member Author

Will address your comments on Monday (most likely), thank you!

  • Tom Aarsen

Under modeling_diffusion it looks like ORTModel isn't used
@tomaarsen
Member Author

I've used the class attribute approach that you proposed, and also added a test using https://huggingface.co/sparse-encoder-testing/splade-bert-tiny-nq. I'm also using this model in my own tests, so it'll stay. It's a good example of a model that would automatically be detected as Sentence Transformers, but that we might want to load with ...ModelForMaskedLM.

  • Tom Aarsen

@tomaarsen tomaarsen requested a review from echarlaix July 28, 2025 10:59
Collaborator

@echarlaix echarlaix left a comment


LGTM, thanks @tomaarsen

@echarlaix echarlaix merged commit 689c0b5 into huggingface:main Jul 29, 2025
40 of 42 checks passed
@echarlaix
Collaborator

Also, the Optimum ONNX / ORT integration has moved to https://github.com/huggingface/optimum-onnx, @tomaarsen. I'll take care of opening a PR to port these changes there as well.

@tomaarsen
Member Author

Oh, good to know! Thank you.



Development

Successfully merging this pull request may close these issues.

Cannot export with ORTModelForMaskedLM on Sentence Transformers repositories
