Accepted for presentation at Balisage 2017 (late-breaking)
Authors: Ronald Haentjens Dekker, David J. Birnbaum
Clone the repo and open <reponame>/Balisage-1-3-xsl/Bal2017dekk0505.xml in a browser other than Chrome. The browser will apply a stylesheet and render the file in the Balisage-preview interface.
The XML tree paradigm has several well-known limitations for document modeling and processing. Some of these have received a lot of attention (especially overlap); others have received less (e.g., discontinuity, simultaneity, transposition, white space as crypto-overlap). Many have work-arounds, also well known, but, as the term “work-around” implies, these come with disadvantages. Because the work-arounds get the job done, however, and because XML has a large user community with diverse levels of technological expertise, it is difficult to overcome inertia and move to a technology that might offer a more comprehensive fit with the full range of document structures with which researchers need to interact both intellectually and programmatically. A high-level analysis of why XML has the limitations it has can enable us to explore how an alternative model of Text as Graph (TAG) might address these types of structures and tasks in a more natural and idiomatic way than is available within an XML paradigm.
- Main page: https://www.balisage.net/
- Program: https://www.balisage.net/2017/Program.html
- General author instructions: http://www.balisage.net/authorinstructions.html
- Schema, documentation, XSLT for transformation: http://www.balisage.net/tagset.html
- Tag set documentation: http://www.balisage.net/DocumentModels/BalisageTL/index.html (also in the repo in the <reponame>/BalisageTL subdirectory)