Add BQCorpus Dataset #562

dyan-dy · 2021-06-14T04:43:17Z

PR types

database

PR changes

Modified bq_corpus.py according to the review feedback.

Description

ZeyuChen

It seems an __init__.py is missing in paddlenlp/dataset/.

smallv0221

Actually, there is no need to modify __init__.py to implement new datasets. @ZeyuChen @frozenfish123
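One common reason a package file needs no edits for a new dataset is that the loader looks up the builder class by module name at call time. The sketch below illustrates that general pattern only; it is an assumption about the mechanism, not a copy of PaddleNLP's code. The `DatasetBuilder` stand-in, the `find_builder` helper, and the `demo_bq_corpus` module name are all hypothetical.

```python
import importlib
import inspect
import sys
import types


class DatasetBuilder:
    """Stand-in base class for this sketch (not the PaddleNLP class)."""


def find_builder(module_name):
    """Import a module by name and return a DatasetBuilder subclass found among its attributes."""
    module = importlib.import_module(module_name)
    for _, obj in inspect.getmembers(module, inspect.isclass):
        if issubclass(obj, DatasetBuilder) and obj is not DatasetBuilder:
            return obj
    raise LookupError(f'no DatasetBuilder subclass found in {module_name}')


# Simulate a newly added dataset file by registering an in-memory module.
demo_module = types.ModuleType('demo_bq_corpus')


class BQCorpus(DatasetBuilder):
    """Hypothetical builder living in the new module."""


demo_module.BQCorpus = BQCorpus
sys.modules['demo_bq_corpus'] = demo_module

# The class is discovered without any central registry being edited.
print(find_builder('demo_bq_corpus').__name__)
```

With this kind of lookup, adding a file such as bq_corpus.py is enough for the loader to find the class, which would match the reviewer's remark.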

smallv0221 · 2021-06-15T04:36:17Z

paddlenlp/datasets/bq_corpus.py

+
+class BQCorpus(DatasetBuilder):
+    """
+    BQCorpus: the largest dataset available for for the banking and finance sector


smallv0221 · 2021-06-15T04:37:17Z

paddlenlp/datasets/bq_corpus.py

+    META_INFO = collections.namedtuple('META_INFO', ('file', 'md5'))
+    SPLITS = {
+        'train': META_INFO(
+            os.path.join('BQCorpus', 'train.tsv'),


The file path seems wrong. Have you checked your dataset using load_dataset()?

dyan-dy · 2021-06-17

Sorry, I have actually been stuck on this problem for a long time and I don't know how to fix it... I've been working on it for the past two days and still can't solve it. Sorry for slowing down your progress :(
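A quick local check can help narrow down a path problem like the one discussed in this thread. The following is a minimal sketch, assuming each split entry is a relative path plus an MD5 checksum as in the diff above; `md5_of` and `check_split` are hypothetical helpers, and the lowercase `bq_corpus` directory is only an example of a mismatched path, not a statement about what was wrong in this PR.

```python
import collections
import hashlib
import os
import tempfile

# Same shape as the META_INFO tuple in the diff above.
META_INFO = collections.namedtuple('META_INFO', ('file', 'md5'))


def md5_of(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(8192), b''):
            digest.update(chunk)
    return digest.hexdigest()


def check_split(data_dir, info):
    """Hypothetical helper: report whether a split file exists and matches its md5."""
    full_path = os.path.join(data_dir, info.file)
    if not os.path.isfile(full_path):
        return 'missing'
    if md5_of(full_path) != info.md5:
        return 'md5 mismatch'
    return 'ok'


# Demo on a throwaway directory that mimics an extracted archive.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'BQCorpus'))
    train_path = os.path.join(root, 'BQCorpus', 'train.tsv')
    with open(train_path, 'w', encoding='utf-8') as f:
        f.write('a\tb\t1\n')
    good = META_INFO(os.path.join('BQCorpus', 'train.tsv'), md5_of(train_path))
    wrong_dir = META_INFO(os.path.join('bq_corpus', 'train.tsv'), good.md5)
    print(check_split(root, good))
    print(check_split(root, wrong_dir))
```

Running a check like this against the real extracted archive shows whether the directory name inside the archive matches the one used in `SPLITS`.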

smallv0221 · 2021-06-15T06:20:09Z

paddlenlp/datasets/bq_corpus.py

+    """
+    BQCorpus: the largest dataset available for for the banking and finance sector
+
+    by frozenfish123@Wuhan University


Please also give the original author information.

dyan-dy · 2021-06-17

"""
BQ Corpus (Bank Question Corpus), a large-scale domain-specific Chinese corpus for sentence semantic equivalence identification is constructed from online Webank custom service logs. The BQ corpus contains 120,000 question pairs with manual annotation.

The following two lines are examples extracted from training data:
'''
"微粒贷开通" 你好，我的微粒贷怎么没有开通呢 0
为什么借款后一直没有给我回拨电话怎么申请借款后没有打电话过来呢！ 1
'''

Provider: Intelligent Computing Research Center, Harbin Institute of Technology(Shenzhen)
Contacts: Qingcai Chen (email: [email protected]; Fax: +86-755-26033182)
More Info: https://www.luge.ai/
"""

Sorry, I have been preparing for my final exams these days and didn't reply in time. Here are my changes in response to this comment.

smallv0221 · 2021-06-17

Great description! Good luck with your finals.
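The docstring above describes each training line as a question pair followed by a 0/1 label. The sketch below shows how such lines could be parsed, assuming the three fields are tab-separated (the split files are named `.tsv`). The `read_pairs` function and the field names `sentence1`, `sentence2`, and `label` are hypothetical, chosen for illustration rather than taken from the merged code.

```python
import io


def read_pairs(lines):
    """Hypothetical reader: yield one dict per 'sentence1<TAB>sentence2<TAB>label' line."""
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip('\r\n')
        if not line:
            continue  # skip blank lines
        fields = line.split('\t')
        if len(fields) != 3:
            raise ValueError(
                f'line {line_no}: expected 3 tab-separated fields, got {len(fields)}'
            )
        sentence1, sentence2, label = fields
        yield {'sentence1': sentence1, 'sentence2': sentence2, 'label': int(label)}


# First example from the docstring, with tabs assumed between the fields.
sample = io.StringIO('"微粒贷开通"\t你好，我的微粒贷怎么没有开通呢\t0\n')
examples = list(read_pairs(sample))
print(examples[0]['label'])
```

Raising on a malformed line, rather than skipping it silently, makes a wrong delimiter or a truncated file visible as soon as the dataset is loaded.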

ZeyuChen · 2021-06-16T05:52:11Z

Actually, there is no need to modify __init__.py to implement new datasets. @ZeyuChen @frozenfish123

Got it.

smallv0221

We have decided to merge your dataset and fix it for you due to our release schedule. Excellent work! Thanks for your contribution.

dyan-dy added 2 commits June 14, 2021 12:36

modified bq_corpus.py

bb95f8b

Merge branch 'PaddlePaddle:develop' into develop

b7104f1

ZeyuChen assigned smallv0221 Jun 14, 2021

ZeyuChen added the data (Issues about data pipeline and dataset) and SIG labels Jun 14, 2021

ZeyuChen mentioned this pull request Jun 14, 2021

Add BQCorpus Dataset #534

Closed

ZeyuChen changed the title ~~modified bq_corpus.py~~ Add BQCorpus Dataset Jun 14, 2021

ZeyuChen requested a review from smallv0221 June 14, 2021 08:00

ZeyuChen reviewed Jun 14, 2021

dyan-dy mentioned this pull request Jun 14, 2021

Add BQCorpus Dataset #563

Closed

smallv0221 requested changes Jun 15, 2021

smallv0221 reviewed Jun 15, 2021

Merge branch 'develop' into develop

c8c5eea

smallv0221 approved these changes Jun 17, 2021

Merge branch 'develop' into develop

b9c2910

smallv0221 merged commit 6b1fa66 into PaddlePaddle:develop Jun 17, 2021
