This repository contains a default Apache Tika configuration for integrating Apache cTAKES (clinical Text Analysis and Knowledge Extraction System). The provided tika-config enables the cTAKES parser and maps it to PDF and IsaTab documents. The included cTAKESParser.properties file configures how cTAKES is run and stores the login credentials for the Unified Medical Language System (UMLS), which cTAKES requires.
Sample IsaTab data from FP001RO-all-samples.txt is also included.
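Because cTAKES requires UMLS credentials, it can help to check that they are filled in before running the parser. The following is a minimal sketch, not part of this repository: the key names `UMLSUser` and `UMLSPass` are assumptions about what cTAKESParser.properties contains, and `load_properties` and `missing_credentials` are hypothetical helpers written for illustration.

```python
import io


def load_properties(stream):
    """Parse simple Java-style key=value lines, skipping blanks and comments."""
    props = {}
    for raw in stream:
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_credentials(props, required=("UMLSUser", "UMLSPass")):
    """Return the required keys that are absent or empty.

    The default key names are assumed, not taken from the repository.
    """
    return [key for key in required if not props.get(key)]


# Hypothetical file contents; in practice, open cTAKESParser.properties instead.
sample = io.StringIO("# example only\nUMLSUser=alice\nUMLSPass=\n")
props = load_properties(sample)
print(missing_credentials(props))
```

To run the check against the real file, replace the `io.StringIO` sample with `open("cTAKESParser.properties")` and adjust the key names to match the ones the file actually uses.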
Send questions or comments to Chris A. Mattmann.
- Giuseppe Totaro, JPL
- Selina Chu, JPL
- Chris A. Mattmann, JPL