
Conversation

ArthurZucker
Collaborator

What does this PR do?

Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, eurobert

@tomaarsen
Member

@ArthurZucker What's the status here? Would love to get this fully out, also so we can add ONNX support for EuroBERT: huggingface/optimum#2321

  • Tom Aarsen

@ArthurZucker
Collaborator Author

Got caught up in the release. I just needed to add integration tests to make sure it works well!

# This variable is used to determine which CUDA device we are using for our runners (A10 or T4).
# Depending on the hardware, we get different logits / generations.
cuda_compute_capability_major_version = None
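A minimal sketch of how a test suite could branch its expected outputs on that variable. The mapping and the expected strings below are hypothetical placeholders, not real model outputs; in practice the variable would be populated from the GPU (e.g. via `torch.cuda.get_device_capability`, which returns a `(major, minor)` tuple):

```python
# Hypothetical sketch: keying per-hardware expected generations on the
# CUDA compute capability major version (T4 -> 7, A10 -> 8).
# The values are made-up placeholders, not real EuroBERT outputs.
EXPECTED_GENERATIONS = {
    7: "placeholder generation for T4 runners",
    8: "placeholder generation for A10 runners",
}

def pick_expectation(cuda_compute_capability_major_version):
    # On a real runner this would come from
    # torch.cuda.get_device_capability(0)[0]; fall back to None
    # (e.g. a CPU-only runner) when the version is unknown.
    return EXPECTED_GENERATIONS.get(cuda_compute_capability_major_version)

print(pick_expectation(8))  # -> placeholder generation for A10 runners
```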

Collaborator Author


from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "EuroBERT/EuroBERT-2.1B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id, trust_remote_code=True)

text = "The capital of France is <|mask|>."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# To get predictions for the mask:
masked_index = inputs["input_ids"][0].tolist().index(tokenizer.mask_token_id)
predicted_token_id = outputs.logits[0, masked_index].argmax(dim=-1)
predicted_token = tokenizer.decode(predicted_token_id)
print("Predicted token:", predicted_token)
# Predicted token:  Paris
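The two decoding steps above (find the masked position, then take the argmax over the vocabulary at that position) can be sketched on toy data without downloading the model. The vocabulary, ids, and logits below are hypothetical stand-ins for the tokenizer and model outputs:

```python
# Minimal sketch of the mask-filling selection logic on toy data.
# vocab, mask_token_id, input_ids, and logits are made-up placeholders.
vocab = ["Paris", "London", "Berlin", "<|mask|>"]
mask_token_id = 3

input_ids = [0, 3, 2]  # position 1 holds the mask token
logits = [
    [0.1, 0.2, 0.3, 0.0],
    [2.5, 1.0, 0.5, 0.0],  # per-vocab logits at the masked position
    [0.0, 0.1, 0.2, 0.0],
]

# Step 1: locate the masked position in the input ids.
masked_index = input_ids.index(mask_token_id)
# Step 2: argmax over the vocabulary at that position.
row = logits[masked_index]
predicted_token_id = max(range(len(row)), key=row.__getitem__)
print(vocab[predicted_token_id])  # -> Paris
```

The real snippet does the same thing, except the argmax runs over the model's logits tensor and the id is decoded back to text by the tokenizer.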

Will run this with `trust_remote_code`.

