Four Reasons to Hate TEI
(and Then Fall in Love)

Todd R. Hanneken
St. Mary’s University

September 16, 2025
University of North Carolina at Charlotte

1. XML Requires a Hierarchical Structure

  • Identify the essential divisions of meaning and the smallest unit of meaning
  • Use milestones and other empty elements for accidents of representation that are secondary to the essential meaning
  • Example: a palimpsested fifth-century codex manuscript

Biblical Literature:
book > chapter > verse

1:31 God saw everything that he had made, and indeed, it was very good. And there was evening and there was morning, the sixth day. 2:1 Thus the heavens and the earth were finished, and all their multitude. 2 And on the seventh day God finished the work that he had done, and he rested on the seventh day from all the work that he had done. 3 So God blessed the seventh day and hallowed it, because on it God rested from all the work that he had done in creation. 4 These are the generations of the heavens and the earth when they were created.

In the day that the LORD God made the earth and the heavens, 5 when no plant of the field was yet in the earth and no herb of the field had yet sprung up…

<milestone type="ch" n="1"/>
<milestone type="vs" n="31"/>
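A sketch of how the passage above might be encoded, assuming paragraphs are chosen as the containing elements and chapter and verse are marked with empty milestones. The paragraph break falls in the middle of verse 2:4, so verse and paragraph cannot both be containers. The `unit` attribute is an addition not shown on the slide (the TEI schema expects it on `<milestone>`); the `type` values follow the slide, and the `<div>` attributes are illustrative.

```xml
<div type="book" n="Genesis" xmlns="http://www.tei-c.org/ns/1.0">
  <p>
    <milestone unit="chapter" type="ch" n="1"/>
    <milestone unit="verse" type="vs" n="31"/>God saw everything that he had made,
    <!-- rest of 1:31 through 2:3 omitted here -->
    <milestone unit="chapter" type="ch" n="2"/>
    <milestone unit="verse" type="vs" n="4"/>These are the generations of the
    heavens and the earth when they were created.
  </p>
  <!-- The new paragraph begins inside verse 4; no element has to be split. -->
  <p>In the day that the LORD God made the earth and the heavens,
    <milestone unit="verse" type="vs" n="5"/>when no plant of the field was yet
    in the earth and no herb of the field had yet sprung up…
  </p>
</div>
```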

Hypertext Edition:
page > column > line

<pb/>
<cb/>
<lb/>
<pb n="52" ana="93" xml:id="palimpsest_page_052" 
	change="#palimpsest"/>
<cb n="a"/>
<lb xml:id="o083a1" facs="#a083a1"/>

Facsimile Viewer:
page > layer > pixel

<surface xml:id="a093" n="Jub 15:20-26, page 93 (renumbered 52)" 
	start="#t093" ulx="0" uly="0" lrx="6132" lry="8176">
	<graphic width="6132px" height="8176px" n="Accurate Color" 
		url="https://jubilees.stmarytx.edu/staticiiif/Ambrosiana_C73inf_052_Ac_00/full/max/0/default.jpg" />
	<graphic width="6132px" height="8176px" n="Trace" 
		url="https://jubilees.stmarytx.edu/staticiiif/Ambrosiana_C73inf_052_Trace/full/max/0/default.png" />
	<graphic width="6132px" height="8176px" n="Raking NE" 
		url="https://jubilees.stmarytx.edu/staticiiif/Ambrosiana_C73inf_052_Ac_07/full/max/0/default.jpg" />
	<media xml:id="a093r" type="Relight" mimeType="text/html"
		url="https://jubilees.stmarytx.edu/staticrelight/Ambrosiana_C73inf_052.html"/>
	<line xml:id="a093a1" corresp="#o093a1" ulx="600" 
		lrx="2800" uly="1156" lry="1358" />
</surface>

Print Edition:
book > paragraph > sentence > reading > word

<div xml:id="LatinJubilees">
	<head resp="#Ceriani1861">Fragmenta Parvae Genesis</head>
	<p>
		<s>
			<w>et</w>
			<app>
				<lem source="#hr"><w>auro</w></lem>
				<rdg wit="#A"><w>aro</w></rdg>
			</app>
		</s>
	</p>
</div>

2. TEI is Slow

  • Close reading, attention to detail
  • Decide the smallest unit of meaning on a per-project basis
  • Granularity can be added later
  • <persName> if one output should capitalize abraham
  • <s> if one output should capitalize and punctuate
  • <w> if one output should have spaces between words
  • Interpretation required to decide between <persName>israel</persName>, <orgName>israel</orgName>, and <placeName>israel</placeName>
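The layering described above can be sketched on a single invented sentence (the wording is illustrative, not taken from the edition). Each step adds markup only where some output needs it, so a project can stop at any level and return later.

```xml
<div xmlns="http://www.tei-c.org/ns/1.0">
  <!-- Level 0: paragraph only -->
  <p>and abraham rejoiced</p>

  <!-- Level 1: a name is tagged so that one output can capitalize it -->
  <p>and <persName>abraham</persName> rejoiced</p>

  <!-- Level 2: the sentence is tagged so that one output can capitalize and punctuate it -->
  <p><s>and <persName>abraham</persName> rejoiced</s></p>

  <!-- Level 3: words are tagged so that one output can control spacing between them -->
  <p><s><w>and</w> <persName><w>abraham</w></persName> <w>rejoiced</w></s></p>
</div>
```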

3. TEI is Not Pretty

  • Book of Jubilees vs. <title level="m">Book of Jubilees</title>
  • WYSIWYG = What You See Is What You Get
  • WYSIWYM = What You See Is What You Mean
  • TEI is first and foremost a semantic coding system
    • Supports describing the rendering of the source
    • Not intended for prescribing rendering of the output
  • TEI XML for meaning, XSLT for appearance
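A minimal XSLT 1.0 sketch of this division of labor, assuming HTML output and an italic rendering for monograph titles (both are assumptions for illustration; the project's own stylesheets may differ). The source records only that the string is a title; the stylesheet alone decides how a title looks.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">

  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Meaning lives in the source: <title level="m">Book of Jubilees</title>.
       Appearance lives here: this output chooses italics. -->
  <xsl:template match="tei:title[@level='m']">
    <i><xsl:apply-templates/></i>
  </xsl:template>

</xsl:stylesheet>
```

Changing the house style later (for example, to small caps) means editing this one template, not the encoded text.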

XSLT Examples

  • Print edition capitalizes proper nouns,
    hypertext edition ignores those tags
  • Hypertext edition renders new lines as found in the manuscript,
    print edition ignores them
  • Can output a spreadsheet of all words with context and attributes in columns
  • Other output styles for ePub, Kindle, etc. from the same source
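The first two bullets can be sketched in one XSLT 1.0 stylesheet, assuming a hypothetical `edition` parameter that selects the output. This is an illustration of the idea, not the tei2edition implementation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">

  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Hypothetical switch: 'print' (default) or 'hypertext' -->
  <xsl:param name="edition" select="'print'"/>

  <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

  <!-- Print capitalizes proper nouns; hypertext passes the content through.
       The print branch flattens any markup nested inside the name. -->
  <xsl:template match="tei:persName | tei:placeName | tei:orgName">
    <xsl:choose>
      <xsl:when test="$edition = 'print'">
        <xsl:variable name="name" select="normalize-space(.)"/>
        <xsl:value-of select="translate(substring($name, 1, 1), $lower, $upper)"/>
        <xsl:value-of select="substring($name, 2)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Hypertext starts a new line where the manuscript does; print ignores it. -->
  <xsl:template match="tei:lb">
    <xsl:if test="$edition = 'hypertext'">
      <br/>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

With xsltproc, for instance, the parameter could be set as `xsltproc --stringparam edition hypertext edition.xsl source.xml`, where both file names are placeholders.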

4. TEI is Too Permissive to Be Interoperable

  • E.g., bibliography can tag every field or only those necessary for hypertext rendering
  • Follow the Guidelines; passing schema validation is not enough
  • Avoid <hi rend="italic"> in favor of more semantic <emph>, <foreign>, and <title level="m">
  • <l> is for lines of poetry, not wrapped lines of prose; <lb/> is line beginning, not line break
  • If TEI doesn't do what you want it to do, there is probably a reason
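A short contrast, using an invented sentence, between encoding that only validates and encoding that follows the intent of the Guidelines. The Latin name and the placeholder line text are illustrative assumptions.

```xml
<body xmlns="http://www.tei-c.org/ns/1.0">
  <!-- Valid but presentational: three different meanings collapse into "italic" -->
  <p>The <hi rend="italic">Book of Jubilees</hi>, or
     <hi rend="italic">Parva Genesis</hi>, is <hi rend="italic">not</hi> short.</p>

  <!-- Semantic: a title, a foreign phrase, and emphasis stay distinct -->
  <p>The <title level="m">Book of Jubilees</title>, or
     <foreign xml:lang="la">Parva Genesis</foreign>, is <emph>not</emph> short.</p>

  <!-- Prose: <lb/> marks where each manuscript line begins -->
  <p><lb n="1"/>first manuscript line of a sentence
     <lb n="2"/>second manuscript line of the same sentence</p>

  <!-- Verse: <l> contains a line of poetry -->
  <lg>
    <l>a line of verse</l>
    <l>another line of verse</l>
  </lg>
</body>
```

Another project's stylesheet can then find every title or every foreign phrase without guessing what italics meant.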

Lessons from Genre Theory

  • Familiarity is enough; a universal formula is not always desirable
  • Start by imitating a similar project and customize as necessary

Further Reading

  • Streamlining Text Encoding Initiative (TEI), International Image Interoperability Framework (IIIF), and Mirador: Theoretical and Practical Considerations for Interactive Critical Editions, DH 2024 (paper)
  • tei2edition XSLT for rendering TEI XML as print edition, hypertext edition, facsimile viewer, and tools (GitHub)
  • Open-source textbook (OER) in TEI, rendered in HTML, PDF, ePub, and Mobi (GitHub)
  • Minimal Computing and the Liberal Arts, Association for Computers and the Humanities 2023 (paper)
  • TEI Guidelines (link)