Pure OpenCV comic translation tool
MomoTranslator, a comic translation assistant tool, currently has the following capabilities:
- Locating comic panels.
- Identifying speech bubbles within the panels.
- Ordering speech bubbles based on their panels; this order can be manually adjusted as needed.
- Recognizing text within the speech bubbles.
- Translating text using Google and ChatGPT web.
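To illustrate the panel-based ordering step listed above, here is a minimal sketch in plain Python. It is not MomoTranslator's actual implementation: the `(x, y, width, height)` box layout, the `order_bubbles` function, the `right_to_left` flag, and the `row_tolerance` bucketing are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical box layout: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


def center(box: Box) -> Tuple[float, float]:
    """Return the center point of a box."""
    x, y, w, h = box
    return (x + w / 2, y + h / 2)


def contains(panel: Box, point: Tuple[float, float]) -> bool:
    """Return True if the point lies inside the panel (edges included)."""
    x, y, w, h = panel
    px, py = point
    return x <= px <= x + w and y <= py <= y + h


def reading_key(box: Box, right_to_left: bool, row_tolerance: int):
    """Sort key: coarse row first (top to bottom), then horizontal position."""
    cx, cy = center(box)
    row = int(cy // row_tolerance)  # crude bucketing of boxes into rows
    return (row, -cx if right_to_left else cx)


def order_bubbles(
    panels: List[Box],
    bubbles: List[Box],
    right_to_left: bool = True,
    row_tolerance: int = 100,
) -> List[Box]:
    """Order bubbles panel by panel, then by position inside each panel."""
    ordered_panels = sorted(
        panels, key=lambda p: reading_key(p, right_to_left, row_tolerance)
    )
    ordered: List[Box] = []
    unassigned = list(bubbles)
    for panel in ordered_panels:
        # A bubble belongs to the first panel that contains its center.
        inside = [b for b in unassigned if contains(panel, center(b))]
        inside.sort(key=lambda b: reading_key(b, right_to_left, row_tolerance))
        ordered.extend(inside)
        unassigned = [b for b in unassigned if b not in inside]
    # Bubbles outside every panel are appended last, still in reading order.
    unassigned.sort(key=lambda b: reading_key(b, right_to_left, row_tolerance))
    return ordered + unassigned
```

Bucketing by `row_tolerance` is a deliberately simple heuristic and will misorder boxes that straddle a bucket boundary, which is one reason a manual adjustment step remains useful.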
The text-filling feature is not publicly available at the moment. However, translators interested in this feature can contact me directly to run the text-filling script.
For a demonstration of the tool's previous features, please visit this video.
One of the key characteristics of the software is its reliance on OpenCV for most functionalities. While PyTorch is not necessary, integrating it could enhance accuracy.
The code was not released earlier because of plagiarism by several comic narrators on video platforms, as well as incidents of sexual harassment and cyberbullying. There was a concern that making the source code public could exacerbate these issues.
However, with the recent advancements in ChatGPT, there might be new ways to avoid the worst-case scenarios, so open-sourcing the tool is worth exploring.
First, ensure that you have Python 3.10. You can check your Python version by entering the following command in the command line:
python --version
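The same check can be done from inside Python, which is handy when several interpreters are installed. This is a small sketch, not part of the project; the helper name `is_supported` is hypothetical, and it assumes that "Python 3.10" means exactly the 3.10 series rather than a minimum version.

```python
import sys

# Assumption: the project targets exactly the 3.10 series.
REQUIRED = (3, 10)


def is_supported(version_info=sys.version_info, required=REQUIRED) -> bool:
    """Return True if the major and minor version match the required pair."""
    return tuple(version_info[:2]) == required


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {found} matches the required 3.10 series.")
    else:
        print(f"Python {found} found, but 3.10 is expected.")
```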
Clone this repository or download the ZIP file and extract it.
Install the necessary Python libraries with the following command:
pip install -r requirements.txt
To download the spaCy model en_core_web_sm, run the command for your platform in a terminal:
python3 -m spacy download en_core_web_sm #Mac
python.exe -m spacy download en_core_web_sm #Win
To download the required NLTK data, run the following code in Python:
import nltk
nltk.download('words')
nltk.download('names')
nltk.download('wordnet')
nltk.download('omw-1.4')
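Before launching the program, it can help to confirm that the main dependencies are importable. The sketch below uses only the standard library. The module list is an assumption inferred from this README (OpenCV as `cv2`, spaCy, NLTK, PyQt5, and the `en_core_web_sm` model package); the actual contents of requirements.txt may differ, and `missing_modules` is a hypothetical helper.

```python
import importlib.util
from typing import Iterable, List

# Assumed import names; adjust to match requirements.txt.
MODULES = ["cv2", "spacy", "nltk", "PyQt5", "en_core_web_sm"]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```

This only checks that each package can be located; it does not verify versions or that the NLTK corpora were downloaded.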
In the command line, navigate to the project folder and execute the file pyqt5_momotranslator.py:
python pyqt5_momotranslator.py
Upon running the program, a graphical user interface will appear.
This project is licensed under the terms of the MIT license. For more information, please refer to the LICENSE file.