-
Notifications
You must be signed in to change notification settings - Fork 25.3k
Support non-ASCII characters in model file paths #99453
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/99453
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 7b096fb: This comment was automatically generated by Dr. CI and updates every 15 minutes. |
|
First time contributing to pytorch after playing around with it recently. If there's anything I missed in the PR process, please feel free to let me know! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me!
Suggesting below a small comment to clarify why we do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the fix, but this workaround would require to store entire file in memory before it can be saved on disk. Wouldn't it be better to use FIleIO
instead of BytesIO
I think the test file encoding is not correct and so the string is not parsed properly. |
It turned out to be a quirk of the Updated PR and verified working locally using Python 3.8.16. 🤞 Thanks for the quick insight and feedback all! |
Thanks for the tip! Will update. I initially went with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
Thanks for the quick update
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Merge failedReason: 3 jobs have failed, first few of them are: trunk / macos-12-py3-arm64 / test (default, 1, 3, macos-m1-12), trunk / macos-12-py3-arm64 / test (default, 2, 3, macos-m1-12), trunk / macos-12-py3-arm64-mps / test (default, 1, 1) Details for Dev Infra teamRaised by workflow job |
@pytorchbot merge -r I think the CI issues are not related to your change. |
@pytorchbot successfully started a rebase job. Check the current status here |
Co-authored-by: albanD <[email protected]>
Successfully rebased |
e890003
to
7b096fb
Compare
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
# PyTorchFileWriter only supports ascii filename. | ||
# For filenames with non-ascii characters, we rely on Python | ||
# for writing out the file. | ||
self.file_stream = io.FileIO(self.name, mode='w') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mode=w
is text mode, it may corrupt binary files on Windows due to newline code conversion.
I think this should be mode='wb'
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the mode
option is confusing here but should be doing the right thing:
>>> import io
>>> f = io.FileIO("tmp", 'w')
>>> f
<_io.FileIO name='tmp' mode='wb' closefd=True>
Note that the internal mode
of the FileIO object is wb
, different from the user-specified mode of w
.
Separately, The writelines()
method that FileIO inherits from BaseIO does not add any newlines per the linked documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I was confusing it with open
arguments. You are right.
I was about to open a separate issue about saving with non-ascii characters in the path, but does your fix include the Tested with torch 2.0.1,
|
Hi, Yes this covers torch.save but didn't make it to 2.0.1 I'm afraid as it was finalized before we did this. |
Fixes #98918