Add BQCorpus Dataset #534
…nto my-new-dataset
class bq_corpus(DatasetBuilder):
    """
    bq_corpus
It would be terrific if you could provide more information about your dataset, such as the task and author.
seqeval
multiprocess
multiprocess
pre-commit
Why add this?
Sorry, I actually hadn't seen these two lines in my code file until now. Maybe I added them while I was using Git Bash commands. I'm new to GitHub, so I made mistakes.
if not head:
    head = data
else:
    texta, textb, label = data
Better to use "sentence1" and "sentence2" here.
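A minimal sketch of what the suggested rename could look like, assuming the data file is tab-separated with a header row followed by three-column rows. The function name `read_examples` and the sample data are hypothetical and are not taken from this PR or from the library's API:

```python
import io


def read_examples(lines):
    """Yield one dict per data row; the first non-blank row is treated as the header."""
    head = None
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        data = line.rstrip("\n").split("\t")
        if not head:
            head = data  # first row holds column names, not an example
        else:
            # Names follow the reviewer's suggestion instead of texta/textb.
            sentence1, sentence2, label = data
            yield {"sentence1": sentence1, "sentence2": sentence2, "label": label}


# Hypothetical in-memory sample standing in for a real data file.
sample = io.StringIO("text_a\ttext_b\tlabel\nhello\thi there\t1\n")
examples = list(read_examples(sample))
```

Using "sentence1" and "sentence2" as the dictionary keys keeps the field names consistent for downstream code that consumes sentence-pair examples.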
Duplicate of PR #562; closing this one.
PR types
New features
PR changes
Database
Description
Issue 447: integrate the Qianyan dataset, BQ Corpus (text similarity).