Documentation additions
LysandreJik committed Aug 28, 2019
1 parent 912a377 commit 1dc43e5
Showing 4 changed files with 56 additions and 4 deletions.
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -48,3 +48,4 @@ The library currently contains PyTorch implementations, pre-trained model weight
model_doc/xlm
model_doc/xlnet
model_doc/roberta
model_doc/distilbert
43 changes: 43 additions & 0 deletions docs/source/model_doc/distilbert.rst
@@ -0,0 +1,43 @@
DistilBERT
----------------------------------------------------

``DistilBertConfig``
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertConfig
:members:


``DistilBertTokenizer``
~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertTokenizer
:members:


``DistilBertModel``
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertModel
:members:


``DistilBertForMaskedLM``
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertForMaskedLM
:members:


``DistilBertForSequenceClassification``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertForSequenceClassification
:members:


``DistilBertForQuestionAnswering``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: pytorch_transformers.DistilBertForQuestionAnswering
:members:
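
A quick orientation for the classes documented above: the sketch below is illustrative only and is not part of this commit. It assumes the ``distilbert-base-uncased`` checkpoint listed in ``pretrained_models.rst`` and uses ``BertTokenizer`` with the ``bert-base-uncased`` vocabulary, which the inputs docstring further down names as the supported tokenizer for now.

.. code-block:: python

    import torch
    from pytorch_transformers import BertTokenizer, DistilBertModel

    # Checkpoint names follow pretrained_models.rst; BertTokenizer('bert-base-uncased')
    # is the tokenizer the DistilBERT inputs docstring currently recommends.
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = DistilBertModel.from_pretrained('distilbert-base-uncased')

    # Sequences start with [CLS] and end with [SEP]; DistilBERT takes no token_type_ids.
    tokens = ['[CLS]'] + tokenizer.tokenize("Hello, my dog is cute") + ['[SEP]']
    input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

    outputs = model(input_ids)        # models in this library return tuples
    last_hidden_state = outputs[0]    # (batch_size, sequence_length, hidden_size)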
8 changes: 8 additions & 0 deletions docs/source/pretrained_models.rst
@@ -111,5 +111,13 @@ Here is the full list of the currently provided pretrained models together with
| | | | ``roberta-large`` fine-tuned on `MNLI <http://www.nyu.edu/projects/bowman/multinli/>`__. |
| | | (see `details <https://github.com/pytorch/fairseq/tree/master/examples/roberta>`__) |
+-------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
| DistilBERT | ``distilbert-base-uncased`` | | 6-layer, 768-hidden, 12-heads, 66M parameters |
| | | | The DistilBERT model distilled from the BERT model `bert-base-uncased` checkpoint |
| | | (see `details <https://medium.com/@victorsanh/8cf3380435b5>`__) |
| +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
| | ``distilbert-base-uncased-distilled-squad`` | | 6-layer, 768-hidden, 12-heads, 66M parameters |
| | | | The DistilBERT model distilled from the BERT model `bert-base-uncased` checkpoint, with an additional linear layer. |
| | | (see `details <https://medium.com/@victorsanh/8cf3380435b5>`__) |
+-------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+

.. <https://huggingface.co/pytorch-transformers/examples.html>`__
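
To hint at how the SQuAD-distilled checkpoint in the table would be consumed, here is a hedged sketch (also not part of this commit). The question/context packing and the tuple unpacking follow the library's usual conventions and should be read as assumptions rather than a fixed recipe:

.. code-block:: python

    import torch
    from pytorch_transformers import BertTokenizer, DistilBertForQuestionAnswering

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = DistilBertForQuestionAnswering.from_pretrained('distilbert-base-uncased-distilled-squad')

    # DistilBERT has no token_type_ids, so question and context are only separated by [SEP].
    question, context = "Who wrote the report?", "The report was written by the documentation team."
    tokens = (['[CLS]'] + tokenizer.tokenize(question) + ['[SEP]']
              + tokenizer.tokenize(context) + ['[SEP]'])
    input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

    start_logits, end_logits = model(input_ids)[:2]
    start, end = torch.argmax(start_logits), torch.argmax(end_logits)  # most likely answer span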
8 changes: 4 additions & 4 deletions pytorch_transformers/modeling_distilbert.py
@@ -433,7 +433,7 @@ def init_weights(self, module):
     Here are the differences between the interface of Bert and DistilBert:
-    - DistilBert doesn't have `token_type_ids`, you don't need to indicate which token belong to which segment. Just separate your segments with the separation token `tokenizer.sep_token` (or `[SEP]`)
+    - DistilBert doesn't have `token_type_ids`, you don't need to indicate which token belongs to which segment. Just separate your segments with the separation token `tokenizer.sep_token` (or `[SEP]`)
     - DistilBert doesn't have options to select the input positions (`position_ids` input). This could be added if necessary though, just let us know if you need this option.
     For more information on DistilBERT, please refer to our
@@ -450,9 +450,9 @@ def init_weights(self, module):

 DISTILBERT_INPUTS_DOCSTRING = r"""
     Inputs:
-        **input_ids**L ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
-            Indices oof input sequence tokens in the vocabulary.
-            The input sequences should start with `[CLS]` and `[SEP]` tokens.
+        **input_ids** ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
+            Indices of input sequence tokens in the vocabulary.
+            The input sequences should start with `[CLS]` and end with `[SEP]` tokens.
             For now, ONLY BertTokenizer(`bert-base-uncased`) is supported and you should use this tokenizer when using DistilBERT.
         **attention_mask**: (`optional`) ``torch.LongTensor`` of shape ``(batch_size, sequence_length)``:
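
To make the documented input contract concrete (sequences bounded by `[CLS]` and `[SEP]`, an optional `attention_mask`, and no `token_type_ids`), here is a minimal batching sketch. It is illustrative only, is not part of this diff, and assumes the checkpoint names listed in pretrained_models.rst:

    import torch
    from pytorch_transformers import BertTokenizer, DistilBertModel

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')   # the tokenizer the docstring recommends for now
    model = DistilBertModel.from_pretrained('distilbert-base-uncased')

    sentences = ["Hello, my dog is cute", "A shorter sentence"]
    encoded = [tokenizer.convert_tokens_to_ids(['[CLS]'] + tokenizer.tokenize(s) + ['[SEP]'])
               for s in sentences]

    # Pad to the longest sequence; attention_mask is 1 for real tokens, 0 for padding.
    pad_id = tokenizer.convert_tokens_to_ids('[PAD]')
    max_len = max(len(ids) for ids in encoded)
    input_ids = torch.tensor([ids + [pad_id] * (max_len - len(ids)) for ids in encoded])
    attention_mask = torch.tensor([[1] * len(ids) + [0] * (max_len - len(ids)) for ids in encoded])

    hidden_states = model(input_ids, attention_mask=attention_mask)[0]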
