IIIF Discovery Technical Specification Group Charter

Introduction

Interoperable resources are only useful if they can be found. This has been well-recognized since the early days of the IIIF community, and more recently it has become clear that a concerted effort to standardize patterns that will facilitate discovery, harvesting and synchronization, indexing, and importing of IIIF resources is required.

This group will create specifications that improve the discovery process for IIIF resources, with a focus on leveraging existing techniques and tools, and promoting widespread adoption within the community. It will assist with and steer the implementation of community infrastructure, such as a registry of adopters, validators for the implementations, and transformation tools to generate the required data from existing systems and APIs.

If successful, the work will enable the collaborative development of global or thematic registries, search engines and portal applications that allow developers and end users to easily find and use content available via existing IIIF APIs.

Scope

The scope of the group’s efforts is divided into four separate but interlinked areas. Each area is important to make progress on for the overall effort to be successful and sustainable. This Technical Specifications Group expands the scope of the IIIF community into these areas, as agreed by the community and according to the community roadmap.

Work that is out of scope for this group includes the selection or creation of any descriptive metadata formats, and the selection or creation of metadata search APIs or protocols. These are out of scope as the diverse domains represented within the IIIF community already have different standards in these spaces, and previous attempts to reconcile these specifications across domains have not been successful.

The group commits to following the requirements for the specification process, including the production of two independent implementations of each feature specified, and to reach out to other communities aligned with IIIF for feedback and to encourage adoption.

1. Crawling and harvesting

The first, necessary step for enabling the discovery of IIIF resources is to have a consistent and well understood pattern for providers to publish lists of links to their available IIIF content. The work does not include transmission optimization for the content itself, for example transferring any source image content between systems, only for the discovery of the existing content. There are two distinct audiences which might require distinct solutions, as described in the following sections.

1A. IIIF community

As a community, we must solve our own problems as best fits our particular requirements and capacities. These solutions might benefit others outside of IIIF, directly or indirectly, but our primary stakeholders are those organizations and individuals that are working together to make our content more accessible. The scope of this section of work is to provide an integrated and easy to adopt specification that lets us build IIIF specific discovery platforms.

The anticipated deliverables:

Specification of how content providers publish lists of resources for discovery by the IIIF community
Recommendations for how consuming applications process those lists
Validation service that checks the lists are correctly generated
Registry of institutions’ lists
Reference implementations of both producer and consumer applications

Background work:

IIIF Collections
ResourceSync
ActivityStreams

1B. Search engines

Our secondary audience is the rest of the web, and especially web-wide search engines. These platforms have much greater reach, however they are (for the most part) not willing to adopt community-specific solutions as they don’t scale. Instead this section of work is to determine how best to promote our content to industry search providers using their technologies, regardless of how difficult or integrated that might be. The resources that will be discovered are likely to be HTML pages, rather than the IIIF resources directly.

The anticipated deliverables:

Specification of how content providers make their IIIF resources discoverable to web search engines
Reference implementations of the specification

Background work:

Sitemap Protocol
Schema.Org

2. Content indexing

The IIIF Presentation API does not include any information about the objects being presented that would allow for a fielded or advanced search, however it does have the facilities for linking to external non-IIIF descriptions of the objects. For example, the Presentation API’s manifest for a book contains a description of the book intended for humans, and might link to a description of the book intended for machines.

In order to facilitate advanced search and thematic portals where a subset of the available resources are indexed in more detail, work will be carried out to identify the common formats used in the various communities that have adopted IIIF, and provide best practices for how to reference those descriptions. The work will not include further best practices on how institutions should use those formats, but might include outreach to the maintainers of the formats to establish how to include the reverse link from the format to the IIIF resources.

The anticipated deliverables:

Recommendations for profile identifiers and formats, following existing practice.

Background work:

Newspaper Group work on METS/ALTO
EDM / IIIF Alignment work

3. Change notification

Once a system has crawled the list of resources available, there are several benefits to being updated with the changes, rather than requiring the list be re-crawled every time. In particular, it is easier to stay up to date in a timely fashion, it is easier on the providing organization to not have their content constantly crawled by robots, and it is more efficient to index only known changes rather than detect if a resource has changed. The work will include analysis and prototyping of notification systems, built on top of existing standards, to promote this efficiency.

The anticipated deliverables:

Specification for notification interactions
Reference implementations for notification generators and consumers
Validation service for notifications
If necessary, a central hub for distribution of notifications

Background work:

ResourceSync
WebMention
Linked Data Notifications
PubSub

4. Import to viewers

IIIF resources are intended to be used in different contexts, with different viewing applications, as appropriate to the needs of the user. In order to enable users to work with the content once it has been discovered, the fourth part of the work is to establish a specification of how content providers and discovery applications can allow the user to import the IIIF content into external viewing or processing systems.

The anticipated deliverables:

Specification of content reference import
Recommendations around consistent UI/UX patterns
Validation service for the import process
Reference implementations for generators and consumers

Background work:

IIIF Drag and Drop implementations

Estimated timeline

Q4 2016: Group established, work commences
Q1 2017: Gather use cases
Q2 2017: Discuss use cases and technologies
Q3 2017:
Q4 2017: Initial technology decisions & experimentation
Q1 2018:
Q2 2018:
Q3 2018:
Q4 2018: Draft specifications

Communication channels

Github Repository: http://github.com/IIIF/discovery
Slack: #discovery
Email: IIIF-Discuss list ; subject line: [discovery]
Face to Face: Annual IIIF events such as Conferences and Working Group meetings, plus as incidental travel allows
Calls: Initially bi-weekly, plus standing updates/feedback on Technical Call

Community support

Organizations

Bavarian State Library
Biblissima - Campus Condorcet
Brumfield Labs
Carnegie Museum of Art
Cornell University
Digirati
Europeana
Harvard University
J. Paul Getty Trust
Los Alamos National Laboratory
National Gallery of Art
National Library of Israel
National Library of Wales
North Carolina State University Libraries
Loyola University Maryland
Oxford University
Princeton University
Stanford University
University of Edinburgh
University of Michigan
University of Toronto
Yale Center for British Art

Technical editors

Michael Appleby
Tom Crane
Rob Sanderson
Jon Stroop
Simeon Warner

ChangeLog

Date	Description
2018-XX-YY	Revision of timeline, clarification of SEO versus Internal audiences
2017-XX-YY	Initial charter