Add squeezebert #872
Conversation
Model weights
@renmada Thanks for your contribution, please sign the CLA agreement, thanks.
The main problem is the use of `config` in layers. Please also provide examples with a README. You can refer to https://github.com/PaddlePaddle/PaddleNLP/blob/develop/examples/language_model/bert/run_glue.py
Examples:
    .. code-block:: python

        from paddlenlp.transformers import SqueezeBertTokenizer
        tokenizer = SqueezeBertTokenizer.from_pretrained('SqueezeBert-small-discriminator')
Didn't see `SqueezeBert-small-discriminator`.
class SqueezeBertEmbeddings(nn.Layer):
    """Construct the embeddings from word, position and token_type embeddings."""
"""
Construct the embeddings from word, position and token_type embeddings.
"""
self.position_embeddings = nn.Embedding(config.max_position_embeddings, config.embedding_size)
self.token_type_embeddings = nn.Embedding(config.type_vocab_size, config.embedding_size)

# self.LayerNorm is not snake-cased to stick with TensorFlow model variable name and be able to load
No need for an explanation like this.
class SqueezeBertEmbeddings(nn.Layer):
    """Construct the embeddings from word, position and token_type embeddings."""

    def __init__(self, config):
Do not pass `config` down to every layer. Please refer to https://github.com/PaddlePaddle/PaddleNLP/blob/develop/paddlenlp/transformers/bert/modeling.py
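For reference, a minimal sketch of that convention, with every hyperparameter declared explicitly in `__init__`; the parameter names and defaults below mirror `bert/modeling.py` and are assumptions for SqueezeBert, not the final signature:

    import paddle.nn as nn

    class SqueezeBertEmbeddings(nn.Layer):
        """Construct the embeddings from word, position and token_type embeddings."""

        def __init__(self,
                     vocab_size,
                     embedding_size=768,
                     hidden_dropout_prob=0.1,
                     max_position_embeddings=512,
                     type_vocab_size=2):
            super().__init__()
            # Each hyperparameter is an explicit argument, not a config lookup.
            self.word_embeddings = nn.Embedding(vocab_size, embedding_size)
            self.position_embeddings = nn.Embedding(max_position_embeddings, embedding_size)
            self.token_type_embeddings = nn.Embedding(type_vocab_size, embedding_size)
            self.layer_norm = nn.LayerNorm(embedding_size)
            self.dropout = nn.Dropout(hidden_dropout_prob)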
# position_ids (1, len position emb) is contiguous in memory and exported when serialized
self.register_buffer("position_ids", paddle.arange(config.max_position_embeddings).expand((1, -1)))

def forward(self, input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None):
In PaddleNLP, only `input_ids` is allowed in the embedding layer. Please refer to https://github.com/PaddlePaddle/PaddleNLP/blob/develop/paddlenlp/transformers/bert/modeling.py
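As a hedged sketch, a `forward` for the `SqueezeBertEmbeddings` above that takes ids only (no `inputs_embeds`); the position-id derivation follows `bert/modeling.py`, and the optional arguments are assumptions:

        def forward(self, input_ids, token_type_ids=None, position_ids=None):
            # Requires `import paddle` at module level.
            if position_ids is None:
                # Positions 0..seq_len-1, derived from input_ids as in bert/modeling.py.
                ones = paddle.ones_like(input_ids, dtype="int64")
                position_ids = paddle.cumsum(ones, axis=-1) - ones
                position_ids.stop_gradient = True
            if token_type_ids is None:
                token_type_ids = paddle.zeros_like(input_ids, dtype="int64")
            embeddings = (self.word_embeddings(input_ids)
                          + self.position_embeddings(position_ids)
                          + self.token_type_embeddings(token_type_ids))
            embeddings = self.layer_norm(embeddings)
            return self.dropout(embeddings)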
class SqueezeBertSelfAttention(nn.Layer):
    def __init__(self, config, cin, q_groups=1, k_groups=1, v_groups=1):
Please declare all params here instead of using `config`. Please refer to https://github.com/PaddlePaddle/PaddleNLP/blob/develop/paddlenlp/transformers/gpt/modeling.py
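A hedged sketch of what that could look like here, with the hyperparameters spelled out (`hidden_size` stands in for `cin`; the grouped pointwise 1-D convolutions follow the SqueezeBert design, but the exact names are assumptions):

    import paddle.nn as nn

    class SqueezeBertSelfAttention(nn.Layer):
        def __init__(self, hidden_size, num_attention_heads,
                     attention_probs_dropout_prob=0.1,
                     q_groups=1, k_groups=1, v_groups=1):
            super().__init__()
            if hidden_size % num_attention_heads != 0:
                raise ValueError(
                    "hidden_size (%d) is not a multiple of num_attention_heads (%d)"
                    % (hidden_size, num_attention_heads))
            self.num_attention_heads = num_attention_heads
            self.attention_head_size = hidden_size // num_attention_heads
            # SqueezeBert computes Q/K/V with grouped pointwise 1-D convolutions.
            self.query = nn.Conv1D(hidden_size, hidden_size, 1, groups=q_groups)
            self.key = nn.Conv1D(hidden_size, hidden_size, 1, groups=k_groups)
            self.value = nn.Conv1D(hidden_size, hidden_size, 1, groups=v_groups)
            self.dropout = nn.Dropout(attention_probs_dropout_prob)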
class SqueezeBertLayer(nn.Layer):
    def __init__(self, config):
The same config problem as above.
class SqueezeBertPreTrainedModel(PretrainedModel):
    base_model_prefix = "squeezebert"
Need more documentation here.
@register_base_model
class SqueezeBertModel(SqueezeBertPreTrainedModel):
    def __init__(self, **kwargs):
Need more documentation here. Please refer to https://github.com/PaddlePaddle/PaddleNLP/blob/develop/paddlenlp/transformers/bert/modeling.py
# Tie weights if needed
self.tie_weights()

def tie_weights(self):
Why do we need `tie_weights`? I didn't see any layer that has a `get_output_embeddings` function.
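For context, a sketch of the pattern `tie_weights` normally serves (the method names here follow the HuggingFace-style convention, not this PR's code): a head exposes `get_output_embeddings()`, and `tie_weights` shares the input embedding weight with it. Without such a layer the call has nothing to tie:

    def tie_weights(self):
        # No-op unless some layer actually exposes get_output_embeddings().
        get_output = getattr(self, "get_output_embeddings", None)
        output_embeddings = get_output() if get_output is not None else None
        if output_embeddings is not None:
            # Share one matrix between the input lookup and the output projection.
            output_embeddings.weight = self.get_input_embeddings().weight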
PR types