In 2015, Lauren Tilton (@nolauren) and I published the book Humanities Data in R. The goal was to create a self-contained introduction to R for humanities scholars looking to apply computational methods to various kinds of data (namely networks, images, maps, and text). The core ideas in the book are all still valid, but there have been significant improvements in the methods available for text and image analysis. In this repository we hope to provide updated code that uses more modern methods to accomplish tasks similar to those in the original Chapters 7 (Images), 9 (NLP), and 10 (Text Analysis).
Currently, this repository contains RMarkdown files and updated code for the two text-processing chapters. Because of the copyright protections on the original text, we cannot publicly release large portions of the original exposition here. Instead, a limited set of notes is included in the RMarkdown files to make them independently understandable. Please see the original text for additional notes and commentary.
We plan to release a fully ported, open-access version of the text here over the next year. Please contact us if you are interested in testing this text as it is released.