EDH Ctagged Inscriptions

What is this dataset?

This dataset contains XML files for inscriptions, most of them in Latin. All inscription data comes from the Epigraphic Database Heidelberg (EDH), one of the best-known databases in Roman studies; we downloaded all the XML files that it provides under the Creative Commons BY-SA licence. We then made a few additional edits to enable more sophisticated and granular use of the epigraphic text data. The modifications are as follows:

wrap every single character in the text in a <c> element
use <sourceDoc> as the text container instead of the original <body>
replace the empty <lb> tag with <line>

Because all modifications are applied automatically with Python, the files will contain some small problems (and possibly major ones too).
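The three modifications can be illustrated with a minimal sketch. This is not the script used to build the dataset; it assumes a flat, un-namespaced <ab> element that contains only text and <lb/> milestones, skips whitespace, and uses a hypothetical xml:id pattern (`prefix_cN`). The function name `ctag_flat_text` is also made up for this example.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is bound to this namespace by the XML specification.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def ctag_flat_text(ab, prefix):
    """Turn a flat <ab> (text plus <lb/> milestones) into a <sourceDoc>."""
    source_doc = ET.Element("sourceDoc")
    state = {"count": 0, "line": None}

    def add_chars(text):
        for ch in text or "":
            if ch.isspace():
                continue  # assumption: whitespace is not tagged
            if state["line"] is None:
                # Text that appears before the first <lb/> still needs a container.
                state["line"] = ET.SubElement(source_doc, "line")
            state["count"] += 1
            c = ET.SubElement(state["line"], "c")
            c.set(XML_ID, f"{prefix}_c{state['count']}")  # hypothetical id pattern
            c.text = ch

    add_chars(ab.text)
    for child in ab:
        if child.tag == "lb":
            # Each <lb/> milestone opens a new <line> container.
            state["line"] = ET.SubElement(source_doc, "line")
            if child.get("n"):
                state["line"].set("n", child.get("n"))
        else:
            # Assumption: any other markup is flattened to its text content.
            add_chars("".join(child.itertext()))
        add_chars(child.tail)
    return source_doc


ab = ET.fromstring('<ab><lb n="1"/>D M<lb n="2"/>AB</ab>')
doc = ctag_flat_text(ab, "HD000001")
print(ET.tostring(doc, encoding="unicode"))
```

Real EDH files carry nested editorial markup and a TEI namespace, which this sketch does not handle; that kind of gap is one way automatic processing can introduce the problems mentioned above.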

What is the advantage?

Because each character in the texts is marked with <c> and given an xml:id, we can refer to it as a web data resource with its own URI. This makes more granular encoding of textual data possible: for example, connecting each character resource to a corresponding image, attaching calligraphic information to it, or giving 3D spatial coordinates at the character level.
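As a rough illustration of character-level addressing, the sketch below indexes every <c> element by a URI built from a document address and the element's xml:id. The base URI, the sample markup, the id values, and the image-region annotation are all hypothetical; they only show the shape such a link could take.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def index_characters(root, base_uri):
    """Map a URI of the form base_uri#xml:id to each <c> element."""
    index = {}
    for el in root.iter():
        if not isinstance(el.tag, str):
            continue  # skip comments and processing instructions
        # Compare the local name so the sketch works with or without a namespace.
        local_name = el.tag.rsplit("}", 1)[-1]
        char_id = el.get(XML_ID)
        if local_name == "c" and char_id:
            index[f"{base_uri}#{char_id}"] = el
    return index


# Hypothetical sample in the shape described in this README.
sample = (
    '<sourceDoc><line n="1">'
    '<c xml:id="HD000001_c1">D</c><c xml:id="HD000001_c2">M</c>'
    "</line></sourceDoc>"
)
root = ET.fromstring(sample)
chars = index_characters(root, "https://example.org/EDH_Ctagged/HD000001.xml")

# A character URI can then be the target of an external annotation,
# for instance a (made-up) region of a photograph of the stone.
target = "https://example.org/EDH_Ctagged/HD000001.xml#HD000001_c2"
annotation = {"target": target, "image_region": "xywh=10,20,30,40"}
print(chars[target].text, annotation)
```

Keeping annotations outside the XML, keyed by character URI, means images or coordinates can be attached without editing the inscription files themselves.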

Licence

All data provided in this repository can be reused under the CC BY-SA 4.0 licence.

Homepage:

https://researchmap.jp/jo-fil-ho

Contact:

Jun Ogawa [email protected] (https://github.com/junjun7613)
