upload_to_hf_hub model path mismatch with model.save #3925

@osma

Describe the bug

There seems to be a mismatch between how model.save() and LudwigModel.upload_to_hf_hub() handle model paths. A model saved into a directory with a custom name using model.save cannot be uploaded to HF Hub using the same directory name.

To Reproduce
Steps to reproduce the behavior:

  1. Fine-tune an LLM (I used the Zephyr fine-tuning example from Predibase)
  2. Save the fine-tuned model into a directory using model.save("finetuned-model")
  3. Try to upload the model to HF Hub using LudwigModel.upload_to_hf_hub(MY_HF_MODEL_NAME, "finetuned-model")
  4. See error:
File .../ludwig-finetune-llm/venv/lib/python3.11/site-packages/ludwig/utils/upload_utils.py:101, in BaseModelUpload._validate_upload_parameters(repo_id, model_path, repo_type, private, commit_message, commit_description)
     99 trained_model_artifacts_path = os.path.join(model_path, "model", "model_weights")
    100 if not os.path.exists(trained_model_artifacts_path):
--> 101     raise Exception(
    102         f"Model artifacts not found at {trained_model_artifacts_path}. "
    103         f"It is possible that model at '{model_path}' hasn't been trained yet, or something went"
    104         "wrong during training where the model's weights were not saved."
    105     )

Exception: Model artifacts not found at finetuned-model/model/model_weights. It is possible that model at 'finetuned-model' hasn't been trained yet, or something wentwrong during training where the model's weights were not saved.

Expected behavior

Expected the upload to succeed, since I gave the same path name, "finetuned-model", as a parameter both to model.save and to LudwigModel.upload_to_hf_hub.
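Judging only from the check shown in the traceback, one possible workaround is to place the saved files under an extra "model" subdirectory before uploading. The sketch below does this with a hypothetical helper, stage_for_upload (not part of Ludwig), and simulates the saved directory in a temporary location. I have not verified that the rest of the upload succeeds with this layout; it only satisfies the path check.

```python
import os
import shutil
import tempfile


def stage_for_upload(saved_dir: str, staging_dir: str) -> str:
    """Hypothetical helper (not a Ludwig API): copy saved_dir to staging_dir/model."""
    shutil.copytree(saved_dir, os.path.join(staging_dir, "model"))
    return staging_dir


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the directory written by model.save("finetuned-model")
    saved_dir = os.path.join(tmp, "finetuned-model")
    os.makedirs(os.path.join(saved_dir, "model_weights"))

    staging_dir = stage_for_upload(saved_dir, os.path.join(tmp, "upload-staging"))

    # This is the path that _validate_upload_parameters checks for
    check_passes = os.path.exists(os.path.join(staging_dir, "model", "model_weights"))
    print("validation path exists:", check_passes)
```

The staging directory, rather than the original one, would then be passed as the second argument to LudwigModel.upload_to_hf_hub.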

Screenshots

n/a

Environment (please complete the following information):

  • OS: Linux
  • Version: AlmaLinux release 8.7 (Stone Smilodon)
  • Python version 3.11.5
  • Ludwig version 0.9.2

Additional context

After model.save, this is the file and directory structure created under the finetuned-model directory:

model_hyperparameters.json
training_set_metadata.json
model_weights/adapter_config.json
model_weights/adapter_model.safetensors
model_weights/README.md

However, according to the error message, upload_to_hf_hub checks for the existence of finetuned-model/model/model_weights. That path doesn't exist (model.save creates no intermediate directory called model), so the upload fails with the above error. Below is the relevant code from ludwig/utils/upload_utils.py. Note that "model" is always added to the path (line 99 in the traceback).

# Make sure the model is actually trained
trained_model_artifacts_path = os.path.join(model_path, "model", "model_weights")
if not os.path.exists(trained_model_artifacts_path):
    raise Exception(
        f"Model artifacts not found at {trained_model_artifacts_path}. "
        f"It is possible that model at '{model_path}' hasn't been trained yet, or something went"
        "wrong during training where the model's weights were not saved."
    )
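The mismatch can be reproduced without Ludwig or a trained model. The following minimal sketch recreates the directory layout listed above in a temporary directory and applies the same path construction as the check. The function name expected_artifacts_path is my own, used for illustration only.

```python
import os
import tempfile


def expected_artifacts_path(model_path: str) -> str:
    # Same path construction as line 99 of upload_utils.py in the traceback
    return os.path.join(model_path, "model", "model_weights")


with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "finetuned-model")

    # Recreate the layout that model.save("finetuned-model") produced
    os.makedirs(os.path.join(model_path, "model_weights"))
    for name in ("model_hyperparameters.json", "training_set_metadata.json"):
        open(os.path.join(model_path, name), "w").close()

    checked = expected_artifacts_path(model_path)
    saved = os.path.join(model_path, "model_weights")

    check_passes = os.path.exists(checked)   # what the upload validation looks for
    weights_saved = os.path.exists(saved)    # where the weights actually are

    print(os.path.relpath(checked, tmp), "exists:", check_passes)
    print(os.path.relpath(saved, tmp), "exists:", weights_saved)
```

The weights directory exists directly under finetuned-model, while the path the validation looks for has the extra "model" component and is therefore missing.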
