Paper 61

2018 Washington conference submission


Connecting TEI and IIIF

Nicholas Laiacona - Performant Software (United States), Ben Brumfield - Brumfield Labs (United States), Naoki Kokaze - The University of Tokyo (Japan), Kiyonori Nagasaki - International Institute for Digital Humanities (Japan), Makoto Goto - National Museum of Japanese History (Japan)

Abstract: The Text Encoding Initiative (TEI) is a mature standard for representing text as XML, with widespread adoption in the scholarly editing and digital humanities communities. Since many texts encoded in TEI represent transcriptions and translations of documents with digital facsimiles, it seems logical that TEI systems should use IIIF to display page images, and that IIIF systems should use TEI to represent texts. But while such linkages have been discussed, there has been little progress in identifying gaps in the existing standards or creating example implementations for projects to follow.

This presentation will review efforts by three projects, Juxta Editions, FromThePage, and Engi-Shiki, to use mixed IIIF+TEI data structures to establish interoperability of image and text, along with the lessons learned from them. Juxta Editions is a web app that allows users to transcribe texts and mark them up in TEI using an intuitive editing environment. In Juxta Editions, users can generate diplomatic transcriptions of manuscripts and mark people, places, and events, as well as structural features such as chapters. FromThePage is a crowdsourcing platform that lets users import documents as IIIF manifests, transcribe them, and export transcripts as TEI-XML. By identifying a TEI schema and encoding strategy that allows these two applications to exchange texts, we hope to establish a best practice for coordinating text in TEI with IIIF resources.

The Engi-Shiki project aims to encode the text of the Engi-Shiki, an administrative manual compiled in ancient Japan around the 10th century. It includes characteristics of ancient Chinese and Japanese texts in the East Asian tradition. Applying models of Western origin, such as TEI and IIIF, to Japanese materials should serve as a touchstone for exploring the potential of IIIF as a research framework. Juxta Editions and FromThePage will share documentation of their technical approach with the Engi-Shiki project to see how that approach interacts with that project's editorial objectives.

The IIIF Presentation API documentation notes a number of places where a IIIF file might logically reference an XML representation of the text. However, there is no corresponding recommendation for how TEI texts should reference Presentation API elements. We will primarily consider use cases involving the transcription of a primary source represented by a facsimile. Should a TEI text point to a canvas or to an Image API endpoint? Is there a logical way to share annotations between TEI and IIIF representations? We may also investigate surfaces and zones in TEI and where image coordinate data is best stored.
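
To make the canvas-versus-Image-API question concrete, the following Python sketch shows one possible way a TEI zone could be resolved to either kind of target. It is an illustration under stated assumptions, not a convention proposed by the projects or by either standard: the use of `corresp` on `<surface>` to carry a canvas URI, the `example.org` URLs, and the helper name `zone_targets` are all hypothetical, and the sketch assumes that the TEI coordinate space matches the pixel dimensions of the canvas and image.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical TEI fragment. The @corresp-to-canvas link and all URLs are
# assumptions made for this illustration only.
SAMPLE_TEI = """\
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <facsimile>
    <surface xml:id="f1" ulx="0" uly="0" lrx="3000" lry="4000"
             corresp="https://example.org/iiif/book1/canvas/p1">
      <graphic url="https://example.org/iiif/image/book1-p1"/>
      <zone xml:id="f1-z1" ulx="200" uly="350" lrx="2600" lry="700"/>
    </surface>
  </facsimile>
</TEI>
"""


def zone_targets(tei_xml):
    """Map each zone's xml:id to a canvas fragment URI and an Image API URL."""
    root = ET.fromstring(tei_xml)
    targets = {}
    for surface in root.iter(f"{{{TEI_NS}}}surface"):
        canvas = surface.get("corresp")
        graphic = surface.find(f"{{{TEI_NS}}}graphic")
        image_base = graphic.get("url") if graphic is not None else None
        for zone in surface.iter(f"{{{TEI_NS}}}zone"):
            # TEI stores two corners; IIIF regions use x,y,width,height.
            x, y = int(zone.get("ulx")), int(zone.get("uly"))
            w = int(zone.get("lrx")) - x
            h = int(zone.get("lry")) - y
            region = f"{x},{y},{w},{h}"
            targets[zone.get(XML_ID)] = {
                # Presentation API style: canvas URI plus an xywh fragment.
                "canvas": f"{canvas}#xywh={region}" if canvas else None,
                # Image API style: {base}/{region}/{size}/{rotation}/{quality}.{format}
                "image": (
                    f"{image_base}/{region}/max/0/default.jpg"
                    if image_base
                    else None
                ),
            }
    return targets


if __name__ == "__main__":
    for zone_id, target in zone_targets(SAMPLE_TEI).items():
        print(zone_id, target["canvas"], target["image"], sep="\n  ")
```

In this sketch the coordinates live in the TEI file and both IIIF references are derived from them. Storing them instead as IIIF annotations that the TEI text points to would be an equally plausible design, which is part of what the question above leaves open.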

The combined efforts of the three projects should yield a model that mixed TEI-IIIF projects, targeting various types of texts, can follow when using IIIF for research activities.

Presentation type: 20 minute presentations (plus 5 mins questions)

Topics:

  • IIIF and archival collections,
  • IIIF implementations from outside Europe and North America,
  • Emerging use cases for IIIF technical specifications,
  • IIIF-compatible software and experimentation

Keywords:

  • tei,
  • text,
  • digital editions,
  • crowdsourcing,
  • scholarly editing,
  • interoperability