Search similar texts using Topic Modelling

For a given text, retrieve its associated topic and the top N most similar texts using topic modelling with Latent Dirichlet Allocation (LDA).

Approach

For a given set of documents:
- Find the best model parameters for topic modelling (LDA), i.e. the number of topics and the learning decay.
- Generate a document-word matrix holding the weight of each word.
- Generate a topic-word matrix, limited to a fixed number of words per topic.
Predict:
- For a given text, retrieve the best topic.
- Get the dominant word in the predicted topic.
- The dominant word ultimately serves as the topic tag.
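
As a rough sketch of the prediction steps, assume a fitted model has already produced a topic distribution for the text and a topic-word weight matrix. The function name `predict_topic_tag` and all values below are hypothetical and invented for illustration.

```python
# Hypothetical outputs of a fitted LDA model (values invented for illustration).
vocabulary = ["cat", "dog", "market", "stock"]
topic_word = [
    [4.1, 3.2, 0.1, 0.2],  # topic 0: word weights aligned with vocabulary
    [0.2, 0.1, 5.0, 3.7],  # topic 1
]


def predict_topic_tag(topic_distribution, topic_word, vocabulary):
    """Return (best topic index, dominant word of that topic)."""
    # Best topic: the one with the highest probability for this text.
    best_topic = max(
        range(len(topic_distribution)), key=topic_distribution.__getitem__
    )
    # Dominant word: the highest-weighted word in that topic's row.
    weights = topic_word[best_topic]
    dominant = max(range(len(weights)), key=weights.__getitem__)
    return best_topic, vocabulary[dominant]


# Topic distribution of a new text, as a fitted model might return it.
topic, tag = predict_topic_tag([0.15, 0.85], topic_word, vocabulary)
print(topic, tag)
```
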
Get similar documents:
- For a given text, compute its distance to every document.
- Return the top N documents with the smallest distance.
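
A minimal sketch of the similarity lookup, assuming each document and the query text are represented by their topic distributions and that distance means Euclidean distance. Both are assumptions, since the distance measure is not specified here. The helper `top_n_similar` and the vectors are hypothetical.

```python
import math


def top_n_similar(query_vector, doc_vectors, n=3):
    """Return (document index, distance) pairs for the n closest documents."""
    distances = [
        (math.dist(query_vector, vector), index)
        for index, vector in enumerate(doc_vectors)
    ]
    distances.sort()  # smallest distance first
    return [(index, distance) for distance, index in distances[:n]]


# Hypothetical topic distributions, one row per document.
doc_topics = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
    [0.1, 0.9],
]

# Topic distribution of the query text.
similar = top_n_similar([0.25, 0.75], doc_topics, n=2)
print(similar)
```

Other measures, such as cosine distance, could be swapped in by replacing `math.dist`.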
