Description
Describe the bug
There seems to be a mismatch between how model.save() and LudwigModel.upload_to_hf_hub() handle model paths. A model saved into a directory with a custom name using model.save() cannot be uploaded to HF Hub using the same directory name.
To Reproduce
Steps to reproduce the behavior:
- Fine-tune an LLM (I used the Zephyr fine-tuning example from Predibase)
- Save the fine-tuned model into a directory using model.save("finetuned-model")
- Try to upload the model to HF Hub using LudwigModel.upload_to_hf_hub(MY_HF_MODEL_NAME, "finetuned-model")
- See error:
File .../ludwig-finetune-llm/venv/lib/python3.11/site-packages/ludwig/utils/upload_utils.py:101, in BaseModelUpload._validate_upload_parameters(repo_id, model_path, repo_type, private, commit_message, commit_description)
99 trained_model_artifacts_path = os.path.join(model_path, "model", "model_weights")
100 if not os.path.exists(trained_model_artifacts_path):
--> 101 raise Exception(
102 f"Model artifacts not found at {trained_model_artifacts_path}. "
103 f"It is possible that model at '{model_path}' hasn't been trained yet, or something went"
104 "wrong during training where the model's weights were not saved."
105 )
Exception: Model artifacts not found at finetuned-model/model/model_weights. It is possible that model at 'finetuned-model' hasn't been trained yet, or something wentwrong during training where the model's weights were not saved.
Expected behavior
I expected the upload to succeed, since I passed the same path, "finetuned-model", to both model.save and LudwigModel.upload_to_hf_hub.
Screenshots
n/a
Environment (please complete the following information):
- OS: Linux
- Version: AlmaLinux release 8.7 (Stone Smilodon)
- Python version 3.11.5
- Ludwig version 0.9.2
Additional context
After model.save, this is the file and directory structure created under the finetuned-model directory:
model_hyperparameters.json
training_set_metadata.json
model_weights/adapter_config.json
model_weights/adapter_model.safetensors
model_weights/README.md
But according to the error message, upload_to_hf_hub is checking for the existence of finetuned-model/model/model_weights. It doesn't exist (there is no intermediate directory called "model"), so this fails with the above error. Below is the relevant code. Note that "model" is always added to the path on line 99.
ludwig/ludwig/utils/upload_utils.py
Lines 98 to 105 in 51f38c5
# Make sure the model is actually trained
trained_model_artifacts_path = os.path.join(model_path, "model", "model_weights")
if not os.path.exists(trained_model_artifacts_path):
    raise Exception(
        f"Model artifacts not found at {trained_model_artifacts_path}. "
        f"It is possible that model at '{model_path}' hasn't been trained yet, or something went"
        "wrong during training where the model's weights were not saved."
    )
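One possible direction for a fix is to have the check accept both layouts: the model/model_weights path it looks for today and the model_weights directory that model.save() produced in this report. The following is only a sketch under that assumption. The helper name resolve_model_weights_dir is hypothetical and is not part of Ludwig, and the rest of the upload code may need matching changes that are not shown here.

```python
import os
import tempfile


def resolve_model_weights_dir(model_path: str) -> str:
    """Return the first existing weights directory under model_path.

    Hypothetical helper, not part of Ludwig. It tries the path the current
    check builds first, then the layout observed after model.save().
    """
    candidates = [
        # Path built by the existing check in upload_utils.py
        os.path.join(model_path, "model", "model_weights"),
        # Layout seen in this report after model.save("finetuned-model")
        os.path.join(model_path, "model_weights"),
    ]
    for candidate in candidates:
        if os.path.isdir(candidate):
            return candidate
    raise FileNotFoundError(
        f"Model artifacts not found under '{model_path}'. Tried: {', '.join(candidates)}"
    )


if __name__ == "__main__":
    # Recreate the directory shape from this report in a temporary location.
    with tempfile.TemporaryDirectory() as tmp:
        saved = os.path.join(tmp, "finetuned-model")
        os.makedirs(os.path.join(saved, "model_weights"))
        print(resolve_model_weights_dir(saved))
```

With a lookup like this, the directory name passed to model.save() could also be passed to the upload call. The order of the candidates keeps the existing behavior for directories that already contain model/model_weights.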