fix M2M100 example (huggingface#10745)
patil-suraj authored Mar 16, 2021
1 parent b549258 commit d3d388b
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion docs/source/model_doc/m2m_100.rst
@@ -43,6 +43,9 @@ multilingual it expects the sequences in a certain format: A special language id
source and target text. The source text format is :obj:`[lang_code] X [eos]`, where :obj:`lang_code` is source language
id for source text and target language id for target text, with :obj:`X` being the source or target text.

+ The :class:`~transformers.M2M100Tokenizer` depends on :obj:`sentencepiece` so be sure to install it before running the
+ examples. To install :obj:`sentencepiece` run ``pip install sentencepiece``.
+
- Supervised Training

.. code-block::
@@ -87,7 +90,7 @@ id for source text and target language id for target text, with :obj:`X` being t
"La vie est comme une boîte de chocolat."
>>> # translate Chinese to English
- >>> tokenizer.src_lang = "ar_AR"
+ >>> tokenizer.src_lang = "zh"
>>> encoded_zh = tokenizer(chinese_text, return_tensors="pt")
>>> generated_tokens = model.generate(**encoded_zh, forced_bos_token_id=tokenizer.get_lang_id("en"))
>>> tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
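
The :obj:`[lang_code] X [eos]` layout described in the first hunk can be illustrated with a minimal plain-Python sketch. This is not the tokenizer's implementation: ``lang_token``, ``format_sequence`` and the token spellings (``__zh__``, ``</s>``) are assumptions made for illustration, and the real :class:`~transformers.M2M100Tokenizer` operates on token ids rather than strings.

```python
from typing import List

# Illustrative only: mimics the "[lang_code] X [eos]" layout from the docs.
# The token spellings below are assumptions made for this sketch.
EOS = "</s>"  # assumed end-of-sequence marker


def lang_token(lang_code: str) -> str:
    """Return a hypothetical language-id token for ``lang_code``."""
    return f"__{lang_code}__"


def format_sequence(lang_code: str, pieces: List[str]) -> List[str]:
    """Wrap already-tokenized text X as [lang_code] X [eos]."""
    return [lang_token(lang_code), *pieces, EOS]


# Source text carries the source language id, target text the target language id.
source = format_sequence("zh", ["生活", "就像", "一盒", "巧克力"])
target = format_sequence("en", ["Life", "is", "like", "a", "box", "of", "chocolate"])
print(source)
print(target)
```

Both sequences share the same shape; only the leading language id differs. This is why the second hunk's change of ``src_lang`` from ``"ar_AR"`` to ``"zh"`` matters for the Chinese input.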