The purpose of this project was to build an Elasticsearch-based search engine that retrieves AI-generated images by their prompts.
In addition, we used machine learning models to perform sentiment analysis on the prompts, allowing the user to tune their search results based on emotion labels.
We compared the performance of CNN-, BERT- and pQRNN-based models for this task.
For details and references, see report/IR_report.pdf
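To give a rough idea of the kind of query the search side issues, here is a minimal sketch using the official elasticsearch Python client (8.x-style API). The index name (lexica_prompts) and the field names (prompt, emotion) are assumptions made for illustration only; the real mapping is defined by the code in src.

```python
from elasticsearch import Elasticsearch

# Connect to a local Elasticsearch instance (address is an assumption).
es = Elasticsearch("http://localhost:9200")

# Hypothetical index and field names: full-text match on the prompt,
# filtered by the emotion label produced by the sentiment classifier.
response = es.search(
    index="lexica_prompts",
    query={
        "bool": {
            "must": {"match": {"prompt": "castle in the clouds"}},
            "filter": {"term": {"emotion": "joy"}},
        }
    },
    size=10,
)

for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["prompt"])
```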
Students:
- Mohamed Darkaoui
- Viktor Hura
- Mounir Madmar
Requirements:
- python
- pip
- CUDA installation of PyTorch and TensorFlow (optional)
All the required libraries are listed in requirements.txt and can be installed with pip install -r requirements.txt.
Project structure:
- report/IR_report.pdf holds the project report
- data holds all the external data that we used; some data is missing from this repo because it is too big, but the README.md and .gitignore files should point you to where to get it
- results holds data and models generated by our code
- src holds all the source code, split into two parts: classification, and search + retrieval + interface
Usage:
- Preprocess the data by running the script in the src/preprocessing directory; more info in the corresponding README
- (Optional) Generate word vectors using the scripts in src/generate_wordvectors; more info in the corresponding README
- Train models, validate models, and label our lexica data in src/classification; more info in the corresponding README (an illustrative classification call is sketched below)
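For a rough idea of what emotion labelling of a prompt looks like, the sketch below runs an off-the-shelf emotion classifier with the Hugging Face transformers pipeline. This is only an illustration under assumed names: the CNN, BERT and pQRNN models we actually trained, and the real labelling scripts, live in src/classification, and the model named below is not the one used in the project.

```python
from transformers import pipeline

# Off-the-shelf emotion classifier, used purely for illustration; the project's
# own models are trained and applied by the scripts in src/classification.
classifier = pipeline(
    "text-classification",
    model="j-hartmann/emotion-english-distilroberta-base",  # assumed model
)

prompt = "a lonely lighthouse in a storm, dramatic lighting, oil painting"
result = classifier(prompt)

# The pipeline returns one {'label': ..., 'score': ...} dict per input text.
print(result[0]["label"], round(result[0]["score"], 3))
```

A label produced this way can then be stored alongside the prompt in the search index so that the interface can filter results on it.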