A natural language processing app that predicts most likely next words

msinjin/nlp_next_word

nlp_next_word

This project produced a predictive text algorithm and a demonstration web interface as part of the Data Science Capstone offered by Johns Hopkins University on Coursera (view certificate).

Background

Because language is complex, subtle and ever-changing, the most successful predictive text algorithms tend to train models on large bodies of text collected "in the wild" rather than rely on alternatives such as hand-written grammatical rules (although combining the two has the potential to do even better).

To this end we use a large body of text (corpus) provided by SwiftKey as the training source for our predictive text models. Here we report on the nature of the data and look for insight into effective strategies for building predictive text algorithms.
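As a rough illustration of what "training on a corpus" can mean, here is a minimal sketch of a bigram (word-pair) frequency model. It rests on several assumptions: this README does not say which model the project uses, the project itself was written in R rather than Python, and the function names and toy corpus below are hypothetical, not taken from the SwiftKey data.

```python
from collections import Counter, defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_bigrams(lines):
    """Count, for each word, how often each following word occurs."""
    following = defaultdict(Counter)
    for line in lines:
        tokens = tokenize(line)
        for prev, nxt in zip(tokens, tokens[1:]):
            following[prev][nxt] += 1
    return following


def predict_next(following, phrase, k=3):
    """Return up to k of the most frequent followers of the phrase's last word."""
    tokens = tokenize(phrase)
    if not tokens or tokens[-1] not in following:
        return []  # unseen word: a fuller model would back off to something else
    return [word for word, _ in following[tokens[-1]].most_common(k)]


# Hypothetical toy corpus, standing in for the real training text.
toy_corpus = [
    "I love data science",
    "I love text mining",
    "I love data analysis",
    "we love data",
]
model = train_bigrams(toy_corpus)
print(predict_next(model, "they love"))  # ['data', 'text']
```

A sketch like this only looks at the single preceding word and returns nothing for words it has never seen; practical predictors usually consider longer contexts and need some fallback for unseen input.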

Data

The data for this project was kindly provided by SwiftKey (large zip archive).

Code

Analyses were performed using R. Reports were written in R Markdown and rendered to HTML using knitr.

Usage

A brief explanation of this project and how to use the app can be found on RPubs.

The predictive text web app is hosted on shinyapps.io.
