
Conversation

ArthurZucker
Collaborator

What does this PR do?

Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, eurobert

@tomaarsen
Member

@ArthurZucker What's the status here? Would love to get this fully out, also so we can add ONNX support for EuroBERT: huggingface/optimum#2321

  • Tom Aarsen

@ArthurZucker
Collaborator Author

Got caught up in the release. I just needed to add integration tests to make sure it works well!

# This variable is used to determine which CUDA device we are using for our runners (A10 or T4).
# Depending on the hardware, we get different logits / generations.
cuda_compute_capability_major_version = None
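A minimal sketch of how a test suite could branch its expected outputs on that variable. The mapping and the expected strings below are hypothetical placeholders, not real model outputs; in practice the variable would be populated from the GPU (e.g. via `torch.cuda.get_device_capability`, which returns a `(major, minor)` tuple):

```python
# Hypothetical sketch: keying per-hardware expected generations on the
# CUDA compute capability major version (T4 -> 7, A10 -> 8).
# The values are made-up placeholders, not real EuroBERT outputs.
EXPECTED_GENERATIONS = {
    7: "placeholder generation for T4 runners",
    8: "placeholder generation for A10 runners",
}

def pick_expectation(cuda_compute_capability_major_version):
    # On a real runner this would come from
    # torch.cuda.get_device_capability(0)[0]; fall back to None
    # (e.g. a CPU-only runner) when the version is unknown.
    return EXPECTED_GENERATIONS.get(cuda_compute_capability_major_version)

print(pick_expectation(8))  # -> placeholder generation for A10 runners
```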

Collaborator Author


from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "EuroBERT/EuroBERT-2.1B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id, trust_remote_code=True)

text = "The capital of France is <|mask|>."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# To get predictions for the mask:
masked_index = inputs["input_ids"][0].tolist().index(tokenizer.mask_token_id)
predicted_token_id = outputs.logits[0, masked_index].argmax(dim=-1)
predicted_token = tokenizer.decode(predicted_token_id)
print("Predicted token:", predicted_token)
# Predicted token:  Paris
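The two decoding steps above (find the masked position, then take the argmax over the vocabulary at that position) can be sketched on toy data without downloading the model. The vocabulary, ids, and logits below are hypothetical stand-ins for the tokenizer and model outputs:

```python
# Minimal sketch of the mask-filling selection logic on toy data.
# vocab, mask_token_id, input_ids, and logits are made-up placeholders.
vocab = ["Paris", "London", "Berlin", "<|mask|>"]
mask_token_id = 3

input_ids = [0, 3, 2]  # position 1 holds the mask token
logits = [
    [0.1, 0.2, 0.3, 0.0],
    [2.5, 1.0, 0.5, 0.0],  # per-vocab logits at the masked position
    [0.0, 0.1, 0.2, 0.0],
]

# Step 1: locate the masked position in the input ids.
masked_index = input_ids.index(mask_token_id)
# Step 2: argmax over the vocabulary at that position.
row = logits[masked_index]
predicted_token_id = max(range(len(row)), key=row.__getitem__)
print(vocab[predicted_token_id])  # -> Paris
```

The real snippet does the same thing, except the argmax runs over the model's logits tensor and the id is decoded back to text by the tokenizer.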

Will run this with `trust_remote_code`.

