Fix the broken links (huggingface#628)
complexbrains authored Nov 10, 2022
1 parent 6e4314b commit 0734a34
Showing 1 changed file with 3 additions and 3 deletions.
how-to-train.md (6 changes: 3 additions & 3 deletions)
@@ -116,7 +116,7 @@

What is great is that our tokenizer is optimized for Esperanto. Compared to a generic tokenizer trained for English, more native words are represented by a single, unsplit token. Diacritics, i.e. the accented characters used in Esperanto – `ĉ`, `ĝ`, `ĥ`, `ĵ`, `ŝ`, and `ŭ` – are encoded natively. We also represent sequences more efficiently: on this corpus, the average length of encoded sequences is ~30% smaller than when using the pretrained GPT-2 tokenizer.
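
As a quick sanity check of that ~30% figure, a sketch along these lines could compare the two tokenizers on a sample sentence (the file paths and the sentence are illustrative assumptions based on the EsperBERTo setup, not part of the original post):

```python
from tokenizers.implementations import ByteLevelBPETokenizer
from transformers import GPT2TokenizerFast

# Our Esperanto tokenizer (paths assume the EsperBERTo example layout)
esperanto_tokenizer = ByteLevelBPETokenizer(
    "./models/EsperBERTo-small/vocab.json",
    "./models/EsperBERTo-small/merges.txt",
)
# A generic tokenizer trained on English, for comparison
gpt2_tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

sentence = "La ĉevalo ĝojas, ĉar la suno brilas."  # illustrative sentence
print(len(esperanto_tokenizer.encode(sentence).ids))  # fewer tokens: native words stay whole
print(len(gpt2_tokenizer.encode(sentence)))           # more tokens: accented characters get byte-split
```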

Here’s how you can use it in `tokenizers`, including handling the RoBERTa special tokens – of course, you’ll also be able to use it directly from `transformers`.

```python
from tokenizers.implementations import ByteLevelBPETokenizer
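
# The diff truncates the rest of this block; the continuation below is a
# sketch based on the EsperBERTo example (file paths are assumptions).
from tokenizers.processors import BertProcessing

tokenizer = ByteLevelBPETokenizer(
    "./models/EsperBERTo-small/vocab.json",
    "./models/EsperBERTo-small/merges.txt",
)
# Wrap each encoding in RoBERTa's special tokens: <s> ... </s>
tokenizer._tokenizer.post_processor = BertProcessing(
    ("</s>", tokenizer.token_to_id("</s>")),
    ("<s>", tokenizer.token_to_id("<s>")),
)
tokenizer.enable_truncation(max_length=512)

print(tokenizer.encode("Mi estas Julien."))
```
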
@@ -144,7 +144,7 @@ print(

**Update:** The associated Colab notebook uses our new [`Trainer`](https://github.com/huggingface/transformers/blob/master/src/transformers/trainer.py) directly, instead of through a script. Feel free to pick the approach you like best.

- We will now train our language model using the [`run_language_modeling.py`](https://github.com/huggingface/transformers/blob/master/examples/language-modeling/run_language_modeling.py) script from `transformers` (newly renamed from `run_lm_finetuning.py`, as it now supports training from scratch more seamlessly). Just remember to leave `--model_name_or_path` unset (`None`) to train from scratch rather than from an existing model or checkpoint.
+ We will now train our language model using the [`run_language_modeling.py`](https://github.com/huggingface/transformers/blob/main/examples/legacy/run_language_modeling.py) script from `transformers` (newly renamed from `run_lm_finetuning.py`, as it now supports training from scratch more seamlessly). Just remember to leave `--model_name_or_path` unset (`None`) to train from scratch rather than from an existing model or checkpoint.

> We’ll train a RoBERTa-like model, which is BERT-like with a couple of changes (check the [documentation](https://huggingface.co/transformers/model_doc/roberta.html) for more details).
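
For the `Trainer` route mentioned in the update above, a minimal from-scratch training sketch might look like the following; the model size, file paths, and training arguments are illustrative assumptions, not the script's defaults:

```python
from transformers import (
    RobertaConfig,
    RobertaForMaskedLM,
    RobertaTokenizerFast,
    DataCollatorForLanguageModeling,
    LineByLineTextDataset,
    Trainer,
    TrainingArguments,
)

# A RoBERTa-like config, sized down for a small experiment (illustrative values)
config = RobertaConfig(
    vocab_size=52_000,
    max_position_embeddings=514,
    num_attention_heads=12,
    num_hidden_layers=6,
    type_vocab_size=1,
)
tokenizer = RobertaTokenizerFast.from_pretrained(
    "./models/EsperBERTo-small", model_max_length=512
)

# No pretrained weights passed: the model is initialized from the config alone,
# which is the Trainer equivalent of leaving --model_name_or_path unset
model = RobertaForMaskedLM(config=config)

dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="./oscar.eo.txt",  # assumed path to the Esperanto corpus
    block_size=128,
)
# Masked-LM collator: randomly masks 15% of tokens, as in BERT/RoBERTa
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

training_args = TrainingArguments(
    output_dir="./models/EsperBERTo-small",
    num_train_epochs=1,
    per_device_train_batch_size=64,
    save_steps=10_000,
)
trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=dataset,
)
trainer.train()
```
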
@@ -292,7 +292,7 @@ With more complex prompts, you can probe whether your language model captured more semantic knowledge

We can now fine-tune our new Esperanto language model on a downstream task of **part-of-speech tagging**.

- As mentioned before, Esperanto is a highly regular language where word endings typically condition the grammatical part of speech. Using a dataset of annotated Esperanto POS tags in the CoNLL-2003 format (see example below), we can use the [`run_ner.py`](https://github.com/huggingface/transformers/blob/master/examples/token-classification/run_ner.py) script from `transformers`.
+ As mentioned before, Esperanto is a highly regular language where word endings typically condition the grammatical part of speech. Using a dataset of annotated Esperanto POS tags in the CoNLL-2003 format (see example below), we can use the [`run_ner.py`](https://github.com/huggingface/transformers/blob/main/examples/pytorch/token-classification/run_ner.py) script from `transformers`.

> POS tagging is a token classification task, just like NER, so we can use the exact same script.
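
Once fine-tuned, the resulting model can be queried through the same token-classification pipeline used for NER. A minimal sketch, assuming the fine-tuned checkpoint was saved to a local directory (the path is hypothetical):

```python
from transformers import pipeline

MODEL_PATH = "./models/EsperBERTo-small-pos"  # assumed output dir of run_ner.py

pos_tagger = pipeline(
    "token-classification",
    model=MODEL_PATH,
    tokenizer=MODEL_PATH,
)
# Each token comes back with its predicted POS tag and a confidence score
print(pos_tagger("Mi estas viro kej estas tago varma."))
```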
