Here is a relatively quick attempt to build TextRank using networkx PageRank. I compare the results with the proper implementation here.
As with most of the code throughout this repo, the code is not meant to be production-ready, but readable, so one can see what is happening. You might find some of the helper functions useful for your tasks.
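To give a feel for the core idea, here is a minimal sketch of ranking sentences by running networkx PageRank over a sentence-similarity graph. It is not the code in this repo: the function names `textrank_scores` and `top_sentences` are made up for illustration, and the use of cosine similarity between sentence vectors as edge weights is an assumption.

```python
import networkx as nx
import numpy as np


def textrank_scores(sentence_vectors):
    """Score sentences with PageRank over a cosine-similarity graph.

    Hypothetical helper: cosine similarity as edge weight is an assumption.
    """
    vectors = np.asarray(sentence_vectors, dtype=float)
    # Normalise rows so that the dot product equals cosine similarity.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid dividing by zero for all-zero vectors
    unit = vectors / norms
    similarity = unit @ unit.T
    # Drop negative similarities and self-loops before building the graph.
    similarity = np.clip(similarity, 0.0, None)
    np.fill_diagonal(similarity, 0.0)
    graph = nx.from_numpy_array(similarity)
    # Returns a dict {sentence_index: score}; the scores sum to 1.
    return nx.pagerank(graph, weight="weight")


def top_sentences(sentences, sentence_vectors, n=2):
    """Return the n best-scoring sentences, kept in their original order."""
    scores = textrank_scores(sentence_vectors)
    best = sorted(scores, key=scores.get, reverse=True)[:n]
    return [sentences[i] for i in sorted(best)]
```

For example, `top_sentences(sentences, vectors, n=3)` would return the three sentences that are most central in the similarity graph, which serves as a simple extractive summary.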
The order of the .py scripts is:
1. prepare_data.py: simple manipulation and sentence tokenization
2. sentence_vectors.py: build sentence vectors averaging word vectors
3. reviews_summary.py: summarize reviews using the class Summarizer at summarize.py
Easy.
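The second step, building a sentence vector by averaging word vectors, could look roughly like the sketch below. The function name `sentence_vector`, the dict-based `word_vectors` lookup and the zero-vector fallback for sentences with no known words are all assumptions made for illustration; sentence_vectors.py may do this differently.

```python
import numpy as np


def sentence_vector(tokens, word_vectors, dim):
    """Average the vectors of the tokens that appear in `word_vectors`.

    Hypothetical helper: `word_vectors` is assumed to map token -> 1-D array.
    """
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        # No known words: fall back to a zero vector of the right size.
        return np.zeros(dim)
    return np.mean(known, axis=0)


# Toy 2-dimensional "embeddings", made up for the example.
toy_vectors = {
    "good": np.array([1.0, 0.0]),
    "food": np.array([0.0, 1.0]),
}
vec = sentence_vector(["good", "food", "here"], toy_vectors, dim=2)
```

Tokens missing from the lookup are simply skipped, so out-of-vocabulary words do not pull the average towards zero.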
As one might expect, the SummaNLP implementation works better than mine.
There are explanatory notebooks in the notebooks dir.