@lyuwenyu (Contributor) commented Jan 2, 2025

PR types

Bug fixes


When AutoTokenizer initializes automatically, it looks up the tokenizer class in the wrong module location.
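
The patch itself is not shown in this conversation, so the following is only a minimal sketch of the failure mode, not the actual change. It assumes the tokenizer class is defined in a `tokenizer` submodule that the model package does not re-export, which is consistent with the `paddlenlp.transformers.t5.tokenizer.T5Tokenizer` path printed in the fixed output. The package `demo_pkg` and the helpers `lookup_on_package` and `lookup_in_submodule` are hypothetical names made up for illustration; nothing here calls PaddleNLP.

```python
import importlib
import sys
import types

# Hypothetical in-memory package layout (not PaddleNLP's real one):
#   demo_pkg.t5            -> model package that does NOT re-export the tokenizer
#   demo_pkg.t5.tokenizer  -> submodule that actually defines T5Tokenizer
top = types.ModuleType("demo_pkg")
model_pkg = types.ModuleType("demo_pkg.t5")
tok_mod = types.ModuleType("demo_pkg.t5.tokenizer")


class T5Tokenizer:
    """Stand-in class; only its location matters for this sketch."""


T5Tokenizer.__module__ = "demo_pkg.t5.tokenizer"
tok_mod.T5Tokenizer = T5Tokenizer

# Register the fake modules so importlib can resolve them by dotted name.
for mod in (top, model_pkg, tok_mod):
    sys.modules[mod.__name__] = mod


def lookup_on_package(module, attr):
    """Look for attr only on the given module object (the failing pattern)."""
    if hasattr(module, attr):
        return getattr(module, attr)
    raise ValueError(f"Could not find {attr} in {module.__name__}!")


def lookup_in_submodule(package_name, submodule, attr):
    """Resolve the defining submodule first, then read attr from it."""
    module = importlib.import_module(f"{package_name}.{submodule}")
    return getattr(module, attr)


if __name__ == "__main__":
    try:
        lookup_on_package(model_pkg, "T5Tokenizer")
    except ValueError as err:
        print("package-level lookup failed:", err)

    cls = lookup_in_submodule("demo_pkg.t5", "tokenizer", "T5Tokenizer")
    print("submodule lookup found:", cls.__module__, cls.__name__)
```

The package-level lookup raises the same kind of `ValueError` seen in the traceback under "original:", while pointing the lookup at the defining submodule succeeds. Whether the real fix changes the module path, the package exports, or the mapping itself cannot be determined from this page.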


original:

>>> from paddlenlp.transformers import AutoTokenizer
/usr/local/lib/python3.10/dist-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils. Support for replacing an already imported distutils is deprecated. In the future, this condition will fail. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
  warnings.warn(
>>> tokenizer = AutoTokenizer.from_pretrained('DeepFloyd/t5-v1_1-xxl')
[2025-01-02 16:33:31,091] [    INFO] - Loading configuration file /root/.paddlenlp/models/DeepFloyd/t5-v1_1-xxl/config.json
Traceback (most recent call last):
  File "/root/paddlejob/workspace/env_run/lvwenyu01/PaddleNLP/paddlenlp/transformers/auto/factory.py", line 35, in getattribute_from_module
    return getattribute_from_module(paddlenlp_module, attr)
  File "/root/paddlejob/workspace/env_run/lvwenyu01/PaddleNLP/paddlenlp/transformers/auto/factory.py", line 39, in getattribute_from_module
    raise ValueError(f"Could not find {attr} in {paddlenlp_module}!")
ValueError: Could not find T5Tokenizer in <module 'paddlenlp' from '/root/paddlejob/workspace/env_run/lvwenyu01/PaddleNLP/paddlenlp/__init__.py'>!

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/paddlejob/workspace/env_run/lvwenyu01/PaddleNLP/paddlenlp/transformers/auto/tokenizer.py", line 454, in from_pretrained
    tokenizer_class_py = TOKENIZER_MAPPING[type(config)]
  File "/root/paddlejob/workspace/env_run/lvwenyu01/PaddleNLP/paddlenlp/transformers/auto/factory.py", line 69, in __getitem__
    return self._load_attr_from_module(model_type, model_name)
  File "/root/paddlejob/workspace/env_run/lvwenyu01/PaddleNLP/paddlenlp/transformers/auto/factory.py", line 100, in _load_attr_from_module
    return getattribute_from_module(self._modules[module_name], attr)
  File "/root/paddlejob/workspace/env_run/lvwenyu01/PaddleNLP/paddlenlp/transformers/auto/factory.py", line 37, in getattribute_from_module
    raise ValueError(f"Could not find {attr} neither in {module} nor in {paddlenlp_module}!")
ValueError: Could not find T5Tokenizer neither in <module 'paddlenlp.transformers.t5' from '/root/paddlejob/workspace/env_run/lvwenyu01/PaddleNLP/paddlenlp/transformers/t5/__init__.py'> nor in <module 'paddlenlp' from '/root/paddlejob/workspace/env_run/lvwenyu01/PaddleNLP/paddlenlp/__init__.py'>!

fixed:

>>> from paddlenlp.transformers import AutoTokenizer
/usr/local/lib/python3.10/dist-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils. Support for replacing an already imported distutils is deprecated. In the future, this condition will fail. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
  warnings.warn(
>>> tokenizer = AutoTokenizer.from_pretrained('DeepFloyd/t5-v1_1-xxl')
[2025-01-02 16:29:06,221] [    INFO] - Loading configuration file /root/.paddlenlp/models/DeepFloyd/t5-v1_1-xxl/config.json
>>> print(type(tokenizer))
<class 'paddlenlp.transformers.t5.tokenizer.T5Tokenizer'>

codecov bot commented Jan 2, 2025

Codecov Report

Attention: Patch coverage is 80.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 53.07%. Comparing base (fa3fd39) to head (eb21452).
Report is 376 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
| paddlenlp/transformers/auto/tokenizer.py | 66.66% | 1 Missing ⚠️ |

❌ Your project check has failed because the head coverage (53.07%) is below the target coverage (58.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9726      +/-   ##
===========================================
+ Coverage    51.11%   53.07%   +1.96%     
===========================================
  Files          730      718      -12     
  Lines       122587   112495   -10092     
===========================================
- Hits         62657    59710    -2947     
+ Misses       59930    52785    -7145     

@DrownFish19 (Collaborator) left a comment

LGTM

github-actions bot commented Mar 5, 2025

This Pull Request is stale because it has been open for 60 days with no activity.

@github-actions github-actions bot added stale and removed stale labels Mar 5, 2025

github-actions bot commented May 5, 2025

This Pull Request is stale because it has been open for 60 days with no activity.

@github-actions github-actions bot added the stale label May 5, 2025