Comparing changes

base repository: UKPLab/EasyNMT
base: v1.0.2
head repository: UKPLab/EasyNMT
compare: main

Commits on Jan 29, 2021

  1. 98fdbcd
  2. 6e30bba

Commits on Jan 31, 2021

  1. docker image
     nreimers committed Jan 31, 2021 (c091d17)
  2. 2c626a7

Commits on Feb 1, 2021

  1. Move docker files
     nreimers committed Feb 1, 2021 (d32b9be)
  2. Merge pull request #4 from hugoabonizio/main
     Add sampling arguments
     nreimers authored Feb 1, 2021 (37ec20f)
  3. commit
     nreimers committed Feb 1, 2021 (3c9f959)
  4. fe52d2d
  5. 33f289b
  6. Version 1.1.0
     nreimers committed Feb 1, 2021 (50d859b)

Commits on Feb 5, 2021

  1. dea2a35

Commits on Feb 7, 2021

  1. add max len parameter
     nreimers committed Feb 7, 2021 (92d6742)
  2. update docker
     nreimers committed Feb 7, 2021 (f9ad534)
  3. update docker
     nreimers committed Feb 7, 2021 (db9a17d)

Commits on Mar 10, 2021

  1. Update OpusMT.py
     Fix #19
     nreimers authored Mar 10, 2021 (27bef67)

Commits on Mar 15, 2021

  1. update docker
     nreimers committed Mar 15, 2021 (2825036)

Commits on Mar 16, 2021

  1. update docker
     nreimers committed Mar 16, 2021 (674213b)
  2. update docker
     nreimers committed Mar 16, 2021 (f55578f)
  3. update docker
     nreimers committed Mar 16, 2021 (35275e6)
  4. add colab rest API example
     nreimers committed Mar 16, 2021 (36e257e)

Commits on Mar 17, 2021

  1. Update README.md
     nreimers authored Mar 17, 2021 (1eaa248)

Commits on Mar 31, 2021

  1. 582e80e

Commits on Apr 9, 2021

  1. Update README.md
     nreimers authored Apr 9, 2021 (f19e928)
  2. 7ab3731
  3. 94428e5

Commits on Apr 16, 2021

  1. Update EasyNMT.py
     Bugfix #24
     nreimers authored Apr 16, 2021 (46cf8e4)

Commits on Apr 21, 2021

  1. update lang pairs
     nreimers committed Apr 21, 2021 (63fc82a)
  2. 61fcf71

Commits on Apr 26, 2021

  1. 9b69703
  2. Update
     nreimers committed Apr 26, 2021 (f3af26a)
  3. new cachefolder path
     nreimers committed Apr 26, 2021 (1b562d0)
  4. update docker files
     nreimers committed Apr 26, 2021 (aaf0acd)
  5. ce8061f
  6. rename folder names
     nreimers committed Apr 26, 2021 (33bd680)
  7. update readme
     nreimers committed Apr 26, 2021 (1ac9876)
  8. 2fa1564
  9. acad45c

Commits on May 2, 2021

  1. Update README.md
     Update performance numbers with new version 2 (models run via huggingface transformers vs. fairseq)
     nreimers authored May 2, 2021 (2c2533d)
  2. Update README.md
     nreimers authored May 2, 2021 (9c69105)

Commits on Aug 5, 2021

  1. 5ea48f5

Commits on May 27, 2022

  1. Add protobuf as dependency
     nreimers committed May 27, 2022 (c4ed343)
  2. Update docker
     nreimers committed May 27, 2022 (3037171)

Commits on Aug 15, 2022

  1. AutoModel: added the possibility to set the max_length param of the tokenizer
     It is important to have the possibility to tune this parameter to avoid OOM.
     g.racic committed Aug 15, 2022 (6cd3b64)
  2. Merge pull request #75 from nateagr/main
     AutoModel: added the possibility to set the max_length param of the tokenizer
     nreimers authored Aug 15, 2022 (7c11ae8)

Commits on Dec 19, 2023

  1. Fixed typo
     Changed mBERT to mBART in one of the headers
     FilipRank authored Dec 19, 2023 (78db9da)

Commits on Dec 21, 2023

  1. Merge pull request #98 from Roosterington/patch-1
     Fixed typo
     nreimers authored Dec 21, 2023 (d9db97f)

Showing with 1,386 additions and 1,823 deletions.
  1. 0 LICENSE
  2. 0 NOTICE.txt
  3. +33 −12 README.md
  4. +83 −0 docker/README.md
  5. +50 −0 docker/api/cpu.dockerfile
  6. +49 −0 docker/api/cuda10.1.dockerfile
  7. +48 −0 docker/api/cuda11.0.dockerfile
  8. +50 −0 docker/api/cuda11.1.dockerfile
  9. +50 −0 docker/api/cuda11.3.dockerfile
  10. +50 −0 docker/api/gunicorn_conf_backend.py
  11. +50 −0 docker/api/gunicorn_conf_frontend.py
  12. +2 −0 docker/api/requirements.txt
  13. +162 −0 docker/api/src/main.py
  14. +35 −0 docker/api/start.sh
  15. +28 −0 docker/api/start_backend.sh
  16. +19 −0 docker/api/start_frontend.sh
  17. +19 −0 docker/build-docker-hub.sh
  18. +29 −0 docker/examples/php_query_api.php
  19. +58 −0 docker/examples/python_query_api.py
  20. +350 −0 docker/examples/vue_js_frontend.html
  21. +91 −19 easynmt/EasyNMT.py
  22. +2 −2 easynmt/__init__.py
  23. +67 −0 easynmt/models/AutoModel.py
  24. +0 −236 easynmt/models/Fairseq.py
  25. +12 −15 easynmt/models/OpusMT.py
  26. +2 −0 easynmt/models/__init__.py
  27. +13 −1 easynmt/util.py
  28. 0 examples/test_all_models.py
  29. +5 −2 examples/test_mutli_process_translation.py
  30. 0 examples/test_translation_speed.py
  31. 0 examples/translation_document.py
  32. 0 examples/translation_multi_gpu.py
  33. 0 examples/translation_sentences.py
  34. 0 examples/translation_streaming.py
  35. +0 −525 models/m2m_100_1.2B/config.yaml
  36. +0 −10 models/m2m_100_1.2B/easynmt.json
  37. +8 −0 models/m2m_100_1.2b/easynmt.json
  38. +0 −525 models/m2m_100_418M/config.yaml
  39. +0 −10 models/m2m_100_418M/easynmt.json
  40. +8 −0 models/m2m_100_418m/easynmt.json
  41. +0 −449 models/mbart50_m2m/config.yaml
  42. +8 −9 models/mbart50_m2m/easynmt.json
  43. +1 −1 models/opus-mt/easynmt.json
  44. 0 setup.cfg
  45. +4 −7 setup.py
LICENSE (file mode changed: 100755 → 100644, no content changes)
NOTICE.txt (file mode changed: 100755 → 100644, no content changes)
45 changes: 33 additions & 12 deletions README.md
100755 → 100644
@@ -15,10 +15,28 @@ At the moment, we provide the following models:


**Examples:**
- [EasyNMT Google Colab Example](https://colab.research.google.com/drive/1X47vgSiOphpxS5w_LPtjQgJmiSTNfRNC?usp=sharing)
- [EasyNMT Opus-MT Online Demo](http://easynmt.net/demo)
- [EasyNMT Google Colab Example](https://colab.research.google.com/drive/1X47vgSiOphpxS5w_LPtjQgJmiSTNfRNC?usp=sharing) - Step-by-step example of how to use EasyNMT with Python.
- [EasyNMT Opus-MT Online Demo](http://easynmt.net/demo) - Demo to test the translation quality of the Opus-MT model.
- [EasyNMT Google Colab REST API Hosting](https://colab.research.google.com/drive/1kAh_Vt1ipA5-BuoaPX39rCIHFrhpcRpW?usp=sharing) - Example of how to host a translation REST API on Google Colab using the free GPU.

## Installation

## Docker & REST-API

We provide ready-to-use Docker images that wrap EasyNMT in a REST API:
```
docker run -p 24080:80 easynmt/api:2.0-cpu
```

Calling the REST API:
```
http://localhost:24080/translate?target_lang=en&text=Hallo%20Welt
```

See [docker/](docker/) for more information on the different Docker images and the REST API endpoints.

Also check our [EasyNMT Google Colab REST API Hosting](https://colab.research.google.com/drive/1kAh_Vt1ipA5-BuoaPX39rCIHFrhpcRpW?usp=sharing) example on how to use Google Colab and the free GPU to host a translation API.

## Installation for Python
You can install the package via:

@@ -80,16 +98,19 @@ print(model.translate(sentences, target_lang='en'))
# Available Models
The following models are currently available. They provide translations between 150+ languages.
| Model | Reference | #Languages | Size | Speed GPU (Sentences/Sec on V100) | Speed CPU (Sentences/Sec) | Comment |
| --- | --- | :---: | :---: | :---: | :---: | --- |
| opus-mt | [Helsinki-NLP](https://github.com/Helsinki-NLP/Opus-MT) | 186 | 300 MB | 53 | 6 | Individual models (~300 MB) per translation direction
| mbart50_m2m | [Facebook Research](https://github.com/pytorch/fairseq/tree/master/examples/multilingual) | 52 | 1.2 GB | 35 | 0.9|
| m2m_100_418M | [Facebook Research](https://github.com/pytorch/fairseq/tree/master/examples/m2m_100) | 100 | 0.9 GB | 39 | 1.1 |
| m2m_100_1.2B | [Facebook Research](https://github.com/pytorch/fairseq/tree/master/examples/m2m_100) | 100 | 2.4 GB | 23 |0.5 |
| opus-mt | [Helsinki-NLP](https://github.com/Helsinki-NLP/Opus-MT) | 186 | 300 MB | 50 | 6 | Individual models (~300 MB) per translation direction
| mbart50_m2m | [Facebook Research](https://github.com/pytorch/fairseq/tree/master/examples/multilingual) | 52 | 2.3 GB | 25 | - |
| mbart50_m2en | [Facebook Research](https://github.com/pytorch/fairseq/tree/master/examples/multilingual) | 52 | 2.3 GB | 25 | - | Can only translate from the other languages to English.
| mbart50_en2m | [Facebook Research](https://github.com/pytorch/fairseq/tree/master/examples/multilingual) | 52 | 2.3 GB | 25 | - | Can only translate from English to the other languages.
| m2m_100_418M | [Facebook Research](https://github.com/pytorch/fairseq/tree/master/examples/m2m_100) | 100 | 1.8 GB | 22 | - |
| m2m_100_1.2B | [Facebook Research](https://github.com/pytorch/fairseq/tree/master/examples/m2m_100) | 100 | 5.0 GB | 13 | - |
## Translation Quality
@@ -110,9 +131,9 @@ model = EasyNMT('opus-mt', max_loaded_models=10)
The system will automatically detect the suitable Opus-MT model and load it. With the optional parameter `max_loaded_models` you can specify the maximum number of models that are simultaneously loaded. If you then translate with an unseen language direction, the oldest model is unloaded and the new model is loaded.
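For illustration, a minimal sketch of this behaviour, following the call shown in the hunk header above (the example sentences and language pairs are made up):

```python
from easynmt import EasyNMT

# Keep at most 10 direction-specific Opus-MT models loaded at the same time.
# When an unseen language direction is requested, the oldest model is unloaded.
model = EasyNMT('opus-mt', max_loaded_models=10)

print(model.translate('Hallo Welt', target_lang='en'))        # German -> English
print(model.translate('Bonjour le monde', target_lang='de'))  # French -> German
```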
## mBERT_50
## mBART_50
We provide a wrapper for the [mBART50](https://arxiv.org/abs/2008.00401) model from Facebook, that is able to translate between any pair of 50+ languages.
We provide a wrapper for the [mBART50](https://arxiv.org/abs/2008.00401) model from Facebook, that is able to translate between any pair of 50+ languages. There are also models available to translate from English to these languages or vice versa.
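As an illustrative sketch, loading the multilingual model follows the same pattern as above; the sentence and language codes here are made up:

```python
from easynmt import EasyNMT

model = EasyNMT('mbart50_m2m')

# source_lang is optional; without it the language is detected automatically
print(model.translate('Maschinelles Lernen ist spannend.', source_lang='de', target_lang='fr'))
```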
@@ -135,8 +156,8 @@ We provide a wrapper for the [M2M 100](https://arxiv.org/abs/2010.11125) model f
At the moment, we provide wrappers for two M2M 100 models:
- **m2m_100_418M**: M2M model with 418 million parameters (0.9 GB)
- **m2m_100_1.2B**: M2M model with 1.2 billion parameters (2.4 GB)
- **m2m_100_418M**: M2M model with 418 million parameters (1.8 GB)
- **m2m_100_1.2B**: M2M model with 1.2 billion parameters (5.0 GB)
**Usage:**
@@ -151,7 +172,7 @@ As soon as you call `EasyNMT('m2m_100_418M')` / `EasyNMT('m2m_100_1.2B')`, the r
## Author
Contact person: [Nils Reimers](https://www.nils-reimers.de); [reimers@ukp.informatik.tu-darmstadt.de](mailto:reimers@ukp.informatik.tu-darmstadt.de)
Contact person: [Nils Reimers](https://www.nils-reimers.de); [info@nils-reimers.de](mailto:info@nils-reimers.de)
https://www.ukp.tu-darmstadt.de/
83 changes: 83 additions & 0 deletions docker/README.md
@@ -0,0 +1,83 @@
# Docker

We provide a [Docker](https://www.docker.com/)-based REST API for EasyNMT: send a query with your source text, and the API returns the translated text.

## Setup

To start the EasyNMT REST API on port `24080`, run the following Docker command:
```
docker run -p 24080:80 easynmt/api:2.0-cpu
```

This uses the CPU image. If you have a **GPU (CUDA)**, there are various GPU images available; have a look at our [Docker Hub page](https://hub.docker.com/r/easynmt/api/tags?page=1&ordering=last_updated).


## Usage

After you have started the Docker image, you can visit: [http://localhost:24080/translate?target_lang=en&text=Hallo%20Welt](http://localhost:24080/translate?target_lang=en&text=Hallo%20Welt)

This should yield the following JSON:
```
{
"target_lang": "en",
"source_lang": null,
"detected_langs": [
"de"
],
"translated": [
"Hello world"
],
"translation_time": 0.163145542144775
}
```
If you have started it with a different port, replace `24080` with the port you chose.

Note: for the first translation, the respective models are downloaded, which might take some time. Consecutive calls will be faster.
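For a quick scripted check, a small sketch using Python's `requests` package against the endpoint shown above (the URL parameters and response fields mirror the example; adjust the port if you changed it):

```python
import requests

# Query the running EasyNMT REST API (GET /translate)
resp = requests.get(
    'http://localhost:24080/translate',
    params={'target_lang': 'en', 'text': 'Hallo Welt'},
)
resp.raise_for_status()
data = resp.json()

print(data['detected_langs'])  # e.g. ['de']
print(data['translated'])      # e.g. ['Hello world']
```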

## Programmatic Usage
- **Python:** [python_query_api.py](examples/python_query_api.py) - Sending requests with Python to the EasyNMT Docker API.
- **Vue.js:** [vue_js_frontend.html](examples/vue_js_frontend.html) - Vue.js code for our [demo](http://easynmt.net/demo/).

## Documentation

To get an overview of all REST API endpoints, with all possible parameters and their descriptions, open the following URL: [http://localhost:24080/docs](http://localhost:24080/docs)

### Endpoints
The following endpoints are defined for the GET method (i.e., you can call them like `http://localhost:24080/name?param1=val1&param2=val2`):

```
/translate
Translates the text to the given target language.
:param text: Text that should be translated
:param target_lang: Target language
:param source_lang: Language of text. Optional, if empty: Automatic language detection
:param beam_size: Beam size. Optional
:param perform_sentence_splitting: Split longer documents into individual sentences for translation. Optional
:return: Returns a json with the translated text
/language_detection
Detects the language for the provided text
:param text: A single text for which we want to know the language
:return: The detected language
/get_languages
Returns the languages the model supports
:param source_lang: Optional. Only return languages with this language as source
:param target_lang: Optional. Only return languages with this language as target
:return:
```

You can also call `/translate` and `/language_detection` with a POST request, which lets you pass a list of multiple texts. All texts are then translated and returned at once.
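A rough sketch of such a batched POST call is below. The exact request body schema is not spelled out in this README, so treating the GET parameters as a JSON body with `text` as a list is an assumption; check [http://localhost:24080/docs](http://localhost:24080/docs) for the authoritative format:

```python
import requests

# Assumed body format: same fields as the GET parameters, with "text" as a list.
# Verify the exact schema at http://localhost:24080/docs before relying on this.
payload = {
    'text': ['Hallo Welt', 'Guten Morgen'],
    'target_lang': 'en',
}
resp = requests.post('http://localhost:24080/translate', json=payload)
resp.raise_for_status()
print(resp.json()['translated'])
```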

### Environment Variables
You can control the Docker image using various environment variables:
- *MAX_WORKERS_BACKEND*: Number of worker processes for the translation. Default: 1
- *MAX_WORKERS_FRONTEND*: Number of worker processes for language detection & model info. Default: 2
- *EASYNMT_MODEL*: Which EasyNMT Model to load. Default: opus-mt
- *EASYNMT_MODEL_ARGS*: JSON-encoded string with parameters used when loading the EasyNMT model. Default: {}
- *EASYNMT_MAX_TEXT_LEN*: Maximal text length for translation. Default: Not set
- *EASYNMT_MAX_BEAM_SIZE*: Maximal beam size for translation. Default: Not set
- *EASYNMT_BATCH_SIZE*: Batch size for translation. Default: 16
- *TIMEOUT*: [Gunicorn timeout](https://docs.gunicorn.org/en/stable/settings.html#timeout). Default: 120

All model files are stored at `/cache/`. You can mount this path to your host machine if you want to re-use previously downloaded models.
50 changes: 50 additions & 0 deletions docker/api/cpu.dockerfile
@@ -0,0 +1,50 @@
FROM python:3.8-slim
LABEL maintainer="Nils Reimers <info@nils-reimers>"

RUN apt-get update && apt-get install -y procps
RUN pip install --no-cache-dir torch==1.8.0+cpu -f https://download.pytorch.org/whl/torch_stable.html

###################################### Same code for all docker files ###############

## Install dependencies
RUN apt-get update && apt-get -y install build-essential
RUN pip install --no-cache-dir "uvicorn[standard]" gunicorn fastapi
COPY ./requirements.txt /requirements.txt
RUN pip install --no-cache-dir -r /requirements.txt
RUN python -m nltk.downloader 'punkt'

#### Scripts to start front- and backend worker

COPY ./start_backend.sh /start_backend.sh
RUN chmod +x /start_backend.sh

COPY ./start_frontend.sh /start_frontend.sh
RUN chmod +x /start_frontend.sh

COPY ./start.sh /start.sh
RUN chmod +x /start.sh

COPY ./gunicorn_conf_backend.py /gunicorn_conf_backend.py
COPY ./gunicorn_conf_frontend.py /gunicorn_conf_frontend.py

#### Working dir

COPY ./src /app
WORKDIR /app/
ENV PYTHONPATH=/app
EXPOSE 80

####

# Create cache folders
RUN mkdir /cache
RUN mkdir /cache/easynmt
RUN mkdir /cache/transformers
RUN mkdir /cache/torch

ENV EASYNMT_CACHE=/cache/easynmt
ENV TRANSFORMERS_CACHE=/cache/transformers
ENV TORCH_CACHE=/cache/torch

# Run start script
CMD ["/start.sh"]
49 changes: 49 additions & 0 deletions docker/api/cuda10.1.dockerfile
@@ -0,0 +1,49 @@
FROM pytorch/pytorch:1.6.0-cuda10.1-cudnn7-runtime
LABEL maintainer="Nils Reimers <info@nils-reimers>"

###################################### Same code for all docker files ###############

## Install dependencies
RUN apt-get update && apt-get -y install build-essential
RUN pip install --no-cache-dir "uvicorn[standard]" gunicorn fastapi
COPY ./requirements.txt /requirements.txt
RUN pip install --no-cache-dir -r /requirements.txt
RUN python -m nltk.downloader 'punkt'

#### Scripts to start front- and backend worker

COPY ./start_backend.sh /start_backend.sh
RUN chmod +x /start_backend.sh

COPY ./start_frontend.sh /start_frontend.sh
RUN chmod +x /start_frontend.sh

COPY ./start.sh /start.sh
RUN chmod +x /start.sh

COPY ./gunicorn_conf_backend.py /gunicorn_conf_backend.py
COPY ./gunicorn_conf_frontend.py /gunicorn_conf_frontend.py

#### Working dir

COPY ./src /app
WORKDIR /app/
ENV PYTHONPATH=/app
EXPOSE 80

####

# Create cache folders
RUN mkdir /cache/
RUN mkdir /cache/easynmt
RUN mkdir /cache/transformers
RUN mkdir /cache/torch

ENV EASYNMT_CACHE=/cache/easynmt
ENV TRANSFORMERS_CACHE=/cache/transformers
ENV TORCH_CACHE=/cache/torch

# Run start script
CMD ["/start.sh"]


48 changes: 48 additions & 0 deletions docker/api/cuda11.0.dockerfile
@@ -0,0 +1,48 @@
FROM pytorch/pytorch:1.7.1-cuda11.0-cudnn8-runtime
LABEL maintainer="Nils Reimers <info@nils-reimers>"

###################################### Same code for all docker files ###############

## Install dependencies
RUN apt-get update && apt-get -y install build-essential
RUN pip install --no-cache-dir "uvicorn[standard]" gunicorn fastapi
COPY ./requirements.txt /requirements.txt
RUN pip install --no-cache-dir -r /requirements.txt
RUN python -m nltk.downloader 'punkt'

#### Scripts to start front- and backend worker

COPY ./start_backend.sh /start_backend.sh
RUN chmod +x /start_backend.sh

COPY ./start_frontend.sh /start_frontend.sh
RUN chmod +x /start_frontend.sh

COPY ./start.sh /start.sh
RUN chmod +x /start.sh

COPY ./gunicorn_conf_backend.py /gunicorn_conf_backend.py
COPY ./gunicorn_conf_frontend.py /gunicorn_conf_frontend.py

#### Working dir

COPY ./src /app
WORKDIR /app/
ENV PYTHONPATH=/app
EXPOSE 80

####

# Create cache folders
RUN mkdir /cache/
RUN mkdir /cache/easynmt
RUN mkdir /cache/transformers
RUN mkdir /cache/torch

ENV EASYNMT_CACHE=/cache/easynmt
ENV TRANSFORMERS_CACHE=/cache/transformers
ENV TORCH_CACHE=/cache/torch

# Run start script
CMD ["/start.sh"]

50 changes: 50 additions & 0 deletions docker/api/cuda11.1.dockerfile
@@ -0,0 +1,50 @@
FROM pytorch/pytorch:1.8.0-cuda11.1-cudnn8-runtime
LABEL maintainer="Nils Reimers <info@nils-reimers>"

###################################### Same code for all docker files ###############

## Install dependencies
RUN apt-get update && apt-get -y install build-essential
RUN pip install --no-cache-dir "uvicorn[standard]" gunicorn fastapi
COPY ./requirements.txt /requirements.txt
RUN pip install --no-cache-dir -r /requirements.txt
RUN python -m nltk.downloader 'punkt'

#### Scripts to start front- and backend worker

COPY ./start_backend.sh /start_backend.sh
RUN chmod +x /start_backend.sh

COPY ./start_frontend.sh /start_frontend.sh
RUN chmod +x /start_frontend.sh

COPY ./start.sh /start.sh
RUN chmod +x /start.sh

COPY ./gunicorn_conf_backend.py /gunicorn_conf_backend.py
COPY ./gunicorn_conf_frontend.py /gunicorn_conf_frontend.py

#### Working dir

COPY ./src /app
WORKDIR /app/
ENV PYTHONPATH=/app
EXPOSE 80

####

# Create cache folders
RUN mkdir /cache/
RUN mkdir /cache/easynmt
RUN mkdir /cache/transformers
RUN mkdir /cache/torch

ENV EASYNMT_CACHE=/cache/easynmt
ENV TRANSFORMERS_CACHE=/cache/transformers
ENV TORCH_CACHE=/cache/torch

# Run start script
CMD ["/start.sh"]


