This project produced a predictive text algorithm and a demonstration web interface as part of the Data Science Capstone offered by Johns Hopkins University on Coursera (view certificate).
Because language is complex, subtle and ever-changing, the most successful predictive text algorithms tend to be trained on large bodies of text collected "in the wild" rather than built on alternatives such as grammatical rules (although combining both approaches has the potential to be even better).
To this end, we use a large body of text (a corpus) provided by SwiftKey as the training source for our predictive text models. Here we report on the nature of the data and look for insights into effective strategies for building predictive text algorithms.
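As a rough illustration of what "training on a corpus" can mean, the sketch below counts which word follows which in a few sentences and suggests the most frequent followers. It is not the project's implementation (the analyses were done in R); it is a minimal bigram example in Python, and the function names `train_bigrams` and `predict_next` and the toy sentences are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows each preceding word."""
    following = defaultdict(Counter)
    for sentence in sentences:
        # Naive tokenization for illustration: lowercase and split on whitespace.
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            following[prev][nxt] += 1
    return following


def predict_next(following, word, k=3):
    """Return up to k of the most frequent followers of `word`."""
    counts = following.get(word.lower(), Counter())
    return [w for w, _ in counts.most_common(k)]


# Toy corpus (hypothetical); a real corpus would be far larger.
toy_corpus = [
    "I love data science",
    "I love predictive text",
    "I like data",
]
model = train_bigrams(toy_corpus)
suggestions = predict_next(model, "I")
unseen = predict_next(model, "zebra")
```

With this toy corpus, "love" follows "i" twice and "like" once, so `suggestions` ranks "love" first; a word never seen in training yields an empty list. A practical model would also need longer n-grams, smoothing or back-off for unseen contexts, and more careful tokenization.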
The data for this project was kindly provided by SwiftKey (large zip archive).
Analyses were performed using R. The report was written in R Markdown and rendered to HTML using knitr.
A brief explanation of this project and how to use the app can be found on RPubs.
The predictive text web app is hosted on shinyapps.io.