Using the Elbow, Silhouette, and Gap Statistic Methods for Text Summarization
The amount of information on the internet is growing rapidly, and huge volumes of openly accessible data in many different formats are becoming widespread. As a result, it is increasingly difficult for users to extract the information they need efficiently.
Text summarization is one way to tackle this challenge. It removes redundant information and retrieves the useful, relevant content of a text document to form a shorter, compressed version that is easy to understand and saves time while still reflecting the main idea of the topic discussed.
Automatic text summarization has attracted keen interest in the text mining and natural language processing (NLP) communities because summarizing a text document manually is laborious.
There are two main types of text summarization: extractive and abstractive. Extractive methods select existing sentences from the document, whereas abstractive methods generate new sentences.
Here, extractive summarization is performed using K-Means clustering on TF-IDF (Term Frequency-Inverse Document Frequency) representations of the sentences. The method also estimates the true value of K and uses it to divide the sentences of the input document into clusters, from which the final summary is produced.
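The pipeline described above might be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the authors' implementation. All function names (`tfidf_vectors`, `kmeans`, `wcss`, `silhouette`, `summarize`) are hypothetical. Several choices are assumptions made for the sketch: the tokenizer, the smoothed IDF formula, the evenly spaced centroid seeding, and the rule of picking the sentence nearest each centroid. Only the elbow quantity (within-cluster sum of squares) and the silhouette score are shown; the gap statistic is omitted for brevity.

```python
import math
import re
from collections import Counter


def tfidf_vectors(sentences):
    """Return one L2-normalised TF-IDF vector (list of floats) per sentence."""
    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    vocab = sorted({t for toks in tokenized for t in toks})
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(sentences)
    # Assumption: IDF = log(n / df) + 1, so terms in every sentence keep weight.
    idf = {t: math.log(n / df[t]) + 1.0 for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def kmeans(vectors, k, max_iter=100):
    """Lloyd's algorithm with k <= len(vectors).

    Assumption: centroids are seeded from evenly spaced sentences so that the
    sketch is deterministic; real code would typically use random restarts.
    """
    centroids = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def wcss(vectors, labels, centroids):
    """Within-cluster sum of squares, the quantity inspected by the elbow method."""
    return sum(math.dist(v, centroids[lab]) ** 2 for v, lab in zip(vectors, labels))


def silhouette(vectors, labels):
    """Mean silhouette coefficient using Euclidean distance."""
    scores = []
    for i, v in enumerate(vectors):
        by_cluster = {}
        for j, w in enumerate(vectors):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(math.dist(v, w))
        mean = {c: sum(d) / len(d) for c, d in by_cluster.items()}
        others = [m for c, m in mean.items() if c != labels[i]]
        if labels[i] not in mean or not others:
            scores.append(0.0)  # singleton cluster, or only one cluster in use
            continue
        a, b = mean[labels[i]], min(others)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def summarize(sentences, k=None):
    """Cluster sentences and keep the one nearest each centroid, in document order.

    If k is None, K is chosen by the best silhouette over 2..n-1, which
    assumes the document has at least three sentences.
    """
    vectors = tfidf_vectors(sentences)
    if k is None:
        k = max(range(2, len(sentences)),
                key=lambda c: silhouette(vectors, kmeans(vectors, c)[0]))
    labels, centroids = kmeans(vectors, k)
    chosen = set()
    for c in set(labels):
        members = [i for i, lab in enumerate(labels) if lab == c]
        chosen.add(min(members, key=lambda i: math.dist(vectors[i], centroids[c])))
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(sentences)` on a list of sentence strings returns at most K of them. To apply the elbow method instead, one could compute `wcss` for a range of K values and look for the point where the decrease levels off. That inspection is usually done visually, which is why this sketch automates only the silhouette criterion.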