Member

@JunnYu JunnYu commented Aug 14, 2021

Updated the method for computing the offset mapping, mainly based on https://github.com/bojone/bert4keras/blob/master/bert4keras/tokenizers.py#L372.
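For readers unfamiliar with the approach, here is a minimal sketch of how a rematch-style offset mapping can be computed. It is not the code merged in this PR, only an illustration of the general idea: normalize the text character by character while remembering which original index each normalized character came from, then locate each token in the normalized text. The name `rematch` comes from the PR; the standalone signature, the `do_lower_case` and `special_tokens` parameters, the `##` subword prefix, and the `(0, 0)` placeholder for unmatched tokens are assumptions made for this example.

```python
import unicodedata


def rematch(text, tokens, do_lower_case=True,
            special_tokens=("[CLS]", "[SEP]", "[PAD]", "[UNK]")):
    """Return one (start, end) character span in `text` for each token.

    Illustrative sketch only: assumes WordPiece-style tokens with a "##"
    continuation prefix and uses (0, 0) for tokens that cannot be located.
    """
    normalized_chars = []
    char_index = []  # char_index[k] = index in `text` of normalized char k
    for i, ch in enumerate(text):
        piece = ch
        if do_lower_case:
            # Lowercase and strip combining accents, as BERT-style
            # basic tokenizers typically do.
            piece = unicodedata.normalize("NFD", piece.lower())
            piece = "".join(
                c for c in piece if unicodedata.category(c) != "Mn"
            )
        normalized_chars.append(piece)
        char_index.extend([i] * len(piece))
    normalized = "".join(normalized_chars)

    offsets = []
    cursor = 0  # search position in the normalized text
    for token in tokens:
        stem = token[2:] if token.startswith("##") else token
        if token in special_tokens or not stem:
            offsets.append((0, 0))
            continue
        start = normalized.find(stem, cursor)
        if start == -1:
            offsets.append((0, 0))
            continue
        end = start + len(stem)
        # Map the normalized span back to original character indices.
        offsets.append((char_index[start], char_index[end - 1] + 1))
        cursor = end
    return offsets


example_text = "H\u00e9llo w\u00f6rld"
example_tokens = ["[CLS]", "hello", "wo", "##rld", "[SEP]"]
example_offsets = rematch(example_text, example_tokens)
```

Each span indexes into the original string, so `example_text[start:end]` recovers the accented, original-case surface form of a token even though the token itself was lowercased and stripped of accents.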

@yingyibiao
Contributor

mark

@ZeyuChen ZeyuChen changed the title from "fix #880" to "Update Offset mapping" Aug 19, 2021
Contributor

@smallv0221 smallv0221 left a comment

Excellent work! Only a few small problems.


return batch_encode_inputs

def rematch(self, text):
Contributor

Would `get_offset_mapping` be a better name?


def rematch(self, text):
"""
changed from https://github.com/bojone/bert4keras/blob/master/bert4keras/tokenizers.py#L372
Contributor

@smallv0221
Contributor

LGTM! Thanks for your contribution.

@smallv0221 smallv0221 merged commit 582b4ca into PaddlePaddle:develop Aug 25, 2021
@JunnYu JunnYu deleted the fix#880 branch August 30, 2021 04:14
3 participants