IIIF Change Discovery API 0.4

Status of this Document

This Version: 0.4.0

Latest Stable Version: None

Previous Version: 0.3

Editors:

Copyright © 2012-2024 Editors and contributors. Published by the IIIF Consortium under the CC-BY license, see disclaimer.

Status Warning: This is a work in progress and may change without notice. Implementers should be aware that this document is not stable and is likely to change in incompatible ways. Those interested in implementing this document before it reaches beta or release stages should join the IIIF mailing list and the Discovery Specification Group, take part in the discussions, and follow the emerging issues on GitHub.


1. Introduction

The resources made available via the IIIF (pronounced “Triple-Eye-Eff”) Image and Presentation APIs are useful only if they can be found. Users cannot interact directly with a distributed, decentralized ecosystem but instead must rely on services that harvest and process the available content, and then provide a user interface enabling navigation to that content via searching, browsing or other paradigms. Once the user has discovered the content, they can then display it in their viewing application of choice. Machine-to-machine interfaces are also enabled by this approach, where software agents can interact via APIs to discover the same content and retrieve it for further analysis or processing.

This specification leverages existing techniques, specifications, and tools in order to promote widespread adoption of an easy-to-implement service. The service describes changes to IIIF content resources and the location of those resources to harvest. Content providers can implement this API to enable the collaborative development of global or thematic search engines and portal applications that ultimately allow users to easily find and engage with content available via existing IIIF APIs.

1.1. Objectives and Scope

The objective of the IIIF Change Discovery API is to provide the information needed to discover and subsequently make use of IIIF resources. The intended audience is other IIIF-aware systems that can leverage the content and APIs. While this work may benefit others outside of the IIIF community directly or indirectly, the objective of the API is to specify an interoperable solution that best and most easily fulfills the discovery needs within the community of participating organizations.

The discovery of IIIF resources requires a consistent and well understood pattern for content providers to publish lists of links to their available content. This allows a baseline implementation of discovery systems that process the list, looking for resources that have been added or changed.

This process can be optimized by allowing the content providers to publish descriptions of when their content has changed, enabling consuming systems to only retrieve the resources that have been modified since they were last retrieved. These changes might include when content is deleted or otherwise becomes unavailable. Finally, for rapid synchronization, a system of notifications pushed from the publisher to a set of subscribers can reduce the amount of effort required to constantly poll all of the systems to see if anything has changed.

Work that is out of scope of this API includes the recommendation or creation of any descriptive metadata formats, and the recommendation or creation of metadata search APIs or protocols. The diverse domains represented within the IIIF community already have successful standards fulfilling these use cases, and the results of previous attempts to reconcile these standards across domains have not seen widespread adoption. Also out of scope is optimization of the transmission of content, for example recommendations about transferring any source media or data between systems.

Notification of Changes
This draft version of the specification does not include the subscription mechanism for enabling change notifications to be pushed to remote systems. The current specification only enables the polling pattern where the set of changes must be periodically reprocessed. Notifications are likely to be added in a future version before 1.0.

1.2. Terminology

The specification uses the following terms:

  • HTTP(S): The HTTP or HTTPS URI scheme and internet protocol.

The terms array, JSON object, number, and string in this document are to be interpreted as defined by the JavaScript Object Notation (JSON) specification.

The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this document are to be interpreted as described in RFC 2119.

2. Overview of IIIF Resource Discovery

In order to discover IIIF resources, the state of those resources in the systems that publish them needs to be communicated succinctly and easily to a consuming system. The consumer can then use that information to retrieve and process the resources of interest and provide machine or user interfaces that enable discovery. This communication takes place via the IIIF Change Discovery API, which uses the W3C Activity Streams specification to describe and serialize changes to resources, where the semantics of those changes and the interactions between publishers and consumers are based on those identified by the ResourceSync framework.

Activities are used to describe the state of the publishing system by recording each individual change, in the order that they occur. The changes described are the creation, modification and deletion of IIIF Presentation API resources, primarily Collections and Manifests. If the consuming application is aware of all of the changes that took place in the publishing system, it would have full knowledge of the set of resources available. The focus is on IIIF Collections and Manifests because these are the main access points to published content and to references to descriptive metadata about that content. However, Activities describing changes to other resources, such as the IIIF Image API endpoints or the descriptive metadata about the real world objects, could also be published in this way.

The Presentation API does not directly include descriptive metadata fields suitable for indexing beyond a simple full text search. The data intentionally lacks the semantics needed to construct indexes that enable advanced or fielded search. Instead, the Presentation API uses the seeAlso property to link to external documents that can have richer and domain-specific information about the content being presented. For example, a museum object might have a seeAlso reference to a CIDOC-CRM or LIDO description, while a bibliographic resource might reference a Dublin Core or MODS description. These external descriptions should be used when possible to provide interfaces giving access to more precise matching algorithms.

This specification describes three levels of conformance that build upon each other in terms of functionality enabled and precision of the information published. Sets of changes are published in pages, which are then collected together in a collection per publisher. To reduce barriers to implementation, care has been taken to allow the implementation of all levels using only static files on a web server, rather than requiring dynamic access to a database.

2.1. Listing Resources and their Changes

There are three levels of conformance at which changes can be described. Level 0 is simply a list of the resources available. Level 1 adds timestamps and ordering from earliest change to most recent, allowing a consuming application to work backwards through the list and stop processing once it encounters a change that it has already seen from a previous run. Level 2 adds information about the types of activities, enabling the explicit description of the creation and deletion of resources.

The subsections below describe first how to construct the description of the changes for each level, and then in the next section how to embed them into ordered lists.

2.1.1. Level 0: Resource References

The core information required to provide a minimally effective set of links to IIIF resources is just the URIs of those resources. However, by wrapping a small amount of JSON structure around those URIs, the document can also be on the path towards a robust set of information that clients can use to optimize their processing.

Starting with the IIIF resource URIs, we add an “Update” Activity wrapper around them. The order of the resources in the resulting list is unimportant, but each should only appear once. In terms of optimization, this approach provides no additional benefit over any other simpler list format, but is compatible with the following levels which introduce significant benefits. This is the minimum level for any effective interoperability.

If resources are deleted after being referred to in the resource list, the entire list should be republished without the reference to the deleted resource. Clients should also expect to encounter resource URIs that are out of date and no longer resolve to a IIIF Manifest or Collection.

Example Level 0 Activity:

{
  "type": "Update",
  "object": {
    "id": "https://example.org/iiif/1/manifest",
    "type": "Manifest"
  }
}
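
As an illustration, the wrapping step can be sketched in Python. This is a minimal sketch, assuming the publisher already holds a list of resource URIs and their types; the function name `level0_activities` and the shape of its input are hypothetical and not defined by this specification.

```python
import json

def level0_activities(resources):
    """Wrap (uri, type) pairs in "Update" Activities, listing each URI once."""
    seen = set()
    activities = []
    for uri, resource_type in resources:
        if uri in seen:
            # Each resource should appear only once in a Level 0 list.
            continue
        seen.add(uri)
        activities.append({
            "type": "Update",
            "object": {"id": uri, "type": resource_type},
        })
    return activities

# Hypothetical input: the second entry repeats the first and is skipped.
example = level0_activities([
    ("https://example.org/iiif/1/manifest", "Manifest"),
    ("https://example.org/iiif/1/manifest", "Manifest"),
    ("https://example.org/iiif/top", "Collection"),
])
print(json.dumps(example, indent=2))
```

Because the order of a Level 0 list is unimportant, the sketch simply preserves the input order.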

2.1.2. Level 1: Resource Changes

When dealing with large sets of resources, it can be useful to work with only those resources that have changed since the last time the list was processed. This can be facilitated by the addition of a timestamp that indicates when a resource was last modified or initially created. This is included using the endTime property, representing the time at which the activity of publishing the resource was finished. Lists with multiple activities are then ordered such that the most recent activities occur last. Consumers will then process the list of Activities in reverse order, from last to first, stopping when they encounter an Activity they have already processed in a previous run.

Example Level 1 Activity:

{
  "type": "Update",
  "object": {
    "id": "https://example.org/iiif/1/manifest",
    "type": "Manifest"
  },
  "endTime": "2017-09-20T00:00:00Z"
}
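
The ordering and the reverse walk can be sketched as follows. This is a hedged illustration rather than a required implementation: the helper names `parse_time`, `order_activities` and `changed_since` are hypothetical, and the parser assumes timestamps with whole seconds and a literal Z suffix, as in the examples in this document.

```python
from datetime import datetime, timezone

def parse_time(value):
    """Parse a UTC timestamp such as 2017-09-20T00:00:00Z (whole seconds only)."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

def order_activities(activities):
    """Publisher side: order so that the most recent Activity comes last."""
    return sorted(activities, key=lambda a: parse_time(a["endTime"]))

def changed_since(ordered, last_crawl):
    """Consumer side: walk from last to first, stopping at changes already seen."""
    changed = []
    for activity in reversed(ordered):
        if parse_time(activity["endTime"]) < last_crawl:
            break
        changed.append(activity["object"]["id"])
    return changed

activities = order_activities([
    {"type": "Update",
     "object": {"id": "https://example.org/iiif/2/manifest", "type": "Manifest"},
     "endTime": "2018-03-11T16:30:00Z"},
    {"type": "Update",
     "object": {"id": "https://example.org/iiif/1/manifest", "type": "Manifest"},
     "endTime": "2017-09-20T00:00:00Z"},
])
recent = changed_since(activities, parse_time("2018-01-01T00:00:00Z"))
print(recent)
```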

2.1.3. Level 2: Resource Creation, Change and Deletion

At the most detailed level, a log of all of the Activities that have taken place can be recorded, with the likelihood of multiple Activities per IIIF resource. Use of the additional types of “Create” and “Delete” allows explicit description of creations and deletions, enabling a synchronization process to remove resources as well as add or update them.

A complete change history is not required, and sometimes not even desirable. If there are many, frequent changes to a resource, then an implementation may omit any number of individual changes, but should always have the most recent change included in the list. Changes that are deemed to be insignificant to the publisher of the list may also be omitted, such as changes that affect only the syntax of the document but not the content.

Example Level 2 Activity:

{
  "type": "Create",
  "object": {
    "id": "https://example.org/iiif/1/manifest",
    "type": "Manifest"
  },
  "endTime": "2017-09-20T00:00:00Z"
}
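
To show how explicit Create and Delete types support synchronization, the sketch below replays a small Level 2 log, oldest first, and derives the set of resources still available. The function name `replay` and the sample log are assumptions made for illustration only.

```python
def replay(activities):
    """Replay Activities, oldest first, and return the URIs still available."""
    available = set()
    for activity in activities:
        uri = activity["object"]["id"]
        if activity["type"] in ("Create", "Update"):
            available.add(uri)
        elif activity["type"] == "Delete":
            # discard() tolerates a Delete for a resource never seen before.
            available.discard(uri)
    return available

def manifest(n):
    """Hypothetical helper building an object reference for the sample log."""
    return {"id": f"https://example.org/iiif/{n}/manifest", "type": "Manifest"}

log = [
    {"type": "Create", "object": manifest(1), "endTime": "2017-09-20T00:00:00Z"},
    {"type": "Create", "object": manifest(2), "endTime": "2017-09-21T00:00:00Z"},
    {"type": "Update", "object": manifest(1), "endTime": "2017-09-22T00:00:00Z"},
    {"type": "Delete", "object": manifest(2), "endTime": "2017-09-23T00:00:00Z"},
]
remaining = replay(log)
print(sorted(remaining))
```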

2.2. Pages of Changes

The Activities are collected together into pages that together make up the entire set of changes that the publishing system is aware of. Pages reference the previous and next pages in that set, and the overall collection of which they are part. The Activities are listed such that the most recent activities occur last.

Pages are subsequently collected together in ordered collections, described in the following section.

{
  "@context": "http://iiif.io/api/discovery/1/context.json",
  "id": "https://example.org/activity/page-1",
  "type": "OrderedCollectionPage",
  "partOf": {
    "id": "https://example.org/activity/all-changes",
    "type": "OrderedCollection"
  },
  "prev": {
    "id": "https://example.org/activity/page-0",
    "type": "OrderedCollectionPage"
  },
  "next": {
    "id": "https://example.org/activity/page-2",
    "type": "OrderedCollectionPage"
  },
  "orderedItems": [
    {
      "type": "Update",
      "object": {
        "id": "https://example.org/iiif/9/manifest",
        "type": "Manifest"
      },
      "endTime": "2018-03-10T10:00:00Z"
    },
    {
      "type": "Update",
      "object": {
        "id": "https://example.org/iiif/2/manifest",
        "type": "Manifest"
      },
      "endTime": "2018-03-11T16:30:00Z"
    }
  ]
}
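
One way a publisher might generate such pages as static files is sketched below. The function name `build_pages`, the page size, and the URI pattern under `https://example.org/activity` are illustrative assumptions; the specification leaves page size and URI layout to the implementer.

```python
def build_pages(activities, page_size, base="https://example.org/activity"):
    """Split ordered Activities into linked OrderedCollectionPage objects."""
    chunks = [activities[i:i + page_size]
              for i in range(0, len(activities), page_size)]
    pages = []
    for n, chunk in enumerate(chunks):
        page = {
            "@context": "http://iiif.io/api/discovery/1/context.json",
            "id": f"{base}/page-{n}",
            "type": "OrderedCollectionPage",
            "partOf": {"id": f"{base}/all-changes", "type": "OrderedCollection"},
        }
        if n > 0:
            # Every page except the first links back to its predecessor.
            page["prev"] = {"id": f"{base}/page-{n - 1}",
                            "type": "OrderedCollectionPage"}
        if n < len(chunks) - 1:
            # Every page except the last links forward to its successor.
            page["next"] = {"id": f"{base}/page-{n + 1}",
                            "type": "OrderedCollectionPage"}
        page["orderedItems"] = chunk
        pages.append(page)
    return pages

# Five sample Activities, already ordered from oldest to most recent.
activities = [
    {"type": "Update",
     "object": {"id": f"https://example.org/iiif/{i}/manifest", "type": "Manifest"},
     "endTime": f"2018-03-{i + 10}T10:00:00Z"}
    for i in range(5)
]
pages = build_pages(activities, page_size=2)
print([page["id"] for page in pages])
```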

2.3. Collections of Pages

As the number of Activities is likely too many to usefully be represented in a single Page, they are collected together into a Collection as the initial entry point. The Collection references the URIs of the first and last pages.

{
  "@context": "http://iiif.io/api/discovery/1/context.json",
  "id": "https://example.org/activity/all-changes",
  "type": "OrderedCollection",
  "totalItems": 21456,
  "first": {
    "id": "https://example.org/activity/page-0",
    "type": "OrderedCollectionPage"
  },
  "last": {
    "id": "https://example.org/activity/page-214",
    "type": "OrderedCollectionPage"
  }
}
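
The Collection can be derived from the pages it summarizes. The following sketch assumes the pages are already available as a non-empty list of Python dictionaries; `build_collection` is a hypothetical helper, and the stub items in the sample pages stand in for real Activities.

```python
def build_collection(pages, collection_id):
    """Summarize a non-empty list of page objects as an OrderedCollection."""
    def ref(page):
        # The Collection only references pages; it does not embed them.
        return {"id": page["id"], "type": "OrderedCollectionPage"}

    return {
        "@context": "http://iiif.io/api/discovery/1/context.json",
        "id": collection_id,
        "type": "OrderedCollection",
        "totalItems": sum(len(page["orderedItems"]) for page in pages),
        "first": ref(pages[0]),
        "last": ref(pages[-1]),
    }

# Stub pages: empty dictionaries stand in for real Activities.
pages = [
    {"id": "https://example.org/activity/page-0", "orderedItems": [{}, {}]},
    {"id": "https://example.org/activity/page-1", "orderedItems": [{}]},
]
collection = build_collection(pages, "https://example.org/activity/all-changes")
print(collection["totalItems"], collection["last"]["id"])
```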

3. Activity Streams Details

The W3C Activity Streams specification defines a “model for representing potential and completed activities”, and is compatible with the design patterns established for IIIF APIs. It is defined in terms of JSON-LD and can be seamlessly integrated with the existing IIIF APIs. The model can be used to represent activities carried out by content publishers of creating, updating, and deleting (or otherwise de-publishing) IIIF resources.

This section is a summary of the properties and types used by this specification, and defined by Activity Streams. This is intended to ease implementation efforts by collecting the relevant information together.

Properties, beyond those described in this specification, that the consuming application does not have code to process must be ignored. Other properties defined by Activity Streams may be used, such as origin or instrument, but there are no current use cases that would warrant their inclusion in this specification.

3.1. Ordered Collection

The top-most resource for managing the lists of Activities is an Ordered Collection, broken up into Ordered Collection Pages. This is the same pattern that the Web Annotation model uses for Annotation Collections and Annotation Pages. The Collection does not directly contain any of the Activities; instead, it refers to the first and last pages of the list.

The overall ordering of the Collection is from the oldest Activity as the first entry in the first page, to the most recent as the last entry in the last page. Consuming applications should therefore start at the end and walk backwards through the list, and stop when they reach a timestamp before the time they last processed the list.

Content providers must publish an Ordered Collection at the HTTP(S) URI listed in the id property of the Collection.

id

The identifier of the Ordered Collection.

Ordered Collections must have an id property. The value must be a string and it must be an HTTP(S) URI. The JSON representation of the Ordered Collection must be available at the URI.

{ "id": "https://example.org/activity/all-changes" }
type

The class of the Ordered Collection.

Ordered Collections must have a type property. The value must be OrderedCollection.

{ "type": "OrderedCollection" }
first

A link to the first Ordered Collection Page for this Collection.

Ordered Collections should have a first property. The value must be a JSON object, with the id and type properties. The value of the id property must be a string, and it must be the HTTP(S) URI of the first page of items in the Collection. The value of the type property must be the string OrderedCollectionPage.

{
  "first": {
    "id": "https://example.org/activity/page-0",
    "type": "OrderedCollectionPage"
  }
}
last

A link to the last Ordered Collection Page for this Collection. As the client processing algorithm works backwards from the most recent to least recent, the inclusion of last is required, but first is only recommended.

Ordered Collections must have a last property. The value must be a JSON object, with the id and type properties. The value of the id property must be a string, and it must be the HTTP(S) URI of the last page of items in the Collection. The value of the type property must be the string OrderedCollectionPage.

{
  "last": {
    "id": "https://example.org/activity/page-1234",
    "type": "OrderedCollectionPage"
  }
}
totalItems

The total number of Activities in the entire Ordered Collection.

Ordered Collections may have a totalItems property. The value must be a non-negative integer.

{ "totalItems": 21456 }
seeAlso

This property is used to refer to one or more documents that semantically describe the set of resources that are being acted upon in the Activities within the Ordered Collection. This would allow the Ordered Collection to refer to, for example, a DCAT description of the dataset. For Ordered Collections that aggregate activities and/or objects from multiple sources, the referenced description should describe the complete aggregation rather than an individual source.

Ordered Collections may have a seeAlso property. The value must be an array of one or more JSON objects, with the id and type properties. The value of the id property must be a string, and it must be the HTTP(S) URI of the description of the dataset. The value of the type property must be the string Dataset. The JSON object may have the format property, the value of which must be a string, and it must be the MIME media type of the referenced description document.

{
  "seeAlso": [
    {
      "id": "https://example.org/dataset/all-dcat.jsonld",
      "type": "Dataset",
      "format": "application/ld+json"
    }
  ]
}
partOf

This property is used to refer to a parent Ordered Collection, of which this Ordered Collection is part. This would allow a publisher to have thematic or temporal sets of activities, for example to keep the activities for their paintings separate from those for their sculptures, or those for their modern content separate from those for their archival content.

Ordered Collections may have a partOf property. The value must be an array of one or more JSON objects, with the id and type properties. The value of the id property must be a string, and it must be the HTTP(S) URI of the parent collection. The value of the type property must be the string OrderedCollection.

{
  "partOf": [
    {
      "id": "https://example.org/aggregated-changes",
      "type": "OrderedCollection"
    }
  ]
}

rights

A string that identifies a license or rights statement that applies to the usage of the Ordered Collection. The value must be drawn from the set of Creative Commons license URIs, the RightsStatements.org rights statement URIs, or those added via the extension mechanism. The inclusion of this property is informative, and for example could be used to display an icon representing the rights assertions.

The value must be a string. If the value is drawn from Creative Commons or RightsStatements.org, then the string must be a URI defined by that specification.

{ "rights": "https://creativecommons.org/licenses/by/4.0/" }
Complete Ordered Collection Example
{
  "@context": "http://iiif.io/api/discovery/1/context.json",
  "id": "https://example.org/activity/all-changes",
  "type": "OrderedCollection",
  "totalItems": 21456,
  "rights": "https://creativecommons.org/licenses/by/4.0/",
  "seeAlso": [
    {
      "id": "https://example.org/dataset/all-dcat.jsonld",
      "type": "Dataset",
      "format": "application/ld+json"
    }
  ],
  "partOf": [
    {
      "id": "https://example.org/aggregated-changes",
      "type": "OrderedCollection"
    }
  ],
  "first": {
    "id": "https://example.org/activity/page-0",
    "type": "OrderedCollectionPage"
  },
  "last": {
    "id": "https://example.org/activity/page-214",
    "type": "OrderedCollectionPage"
  }
}
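
A consumer might check the required properties before processing a Collection. The sketch below covers only the must-level requirements for id, type and last, plus the constraint on totalItems when present; the function name `check_collection` and the wording of the messages are assumptions, not part of this specification.

```python
def check_collection(doc):
    """Return a list of problems found in an Ordered Collection document."""
    problems = []
    doc_id = doc.get("id")
    if not isinstance(doc_id, str) or not doc_id.startswith(("http://", "https://")):
        problems.append("id must be an HTTP(S) URI string")
    if doc.get("type") != "OrderedCollection":
        problems.append("type must be OrderedCollection")
    last = doc.get("last")
    if (not isinstance(last, dict)
            or not isinstance(last.get("id"), str)
            or last.get("type") != "OrderedCollectionPage"):
        problems.append("last must reference an OrderedCollectionPage")
    total = doc.get("totalItems")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if total is not None and (
            isinstance(total, bool) or not isinstance(total, int) or total < 0):
        problems.append("totalItems must be a non-negative integer")
    return problems

good = {
    "id": "https://example.org/activity/all-changes",
    "type": "OrderedCollection",
    "totalItems": 21456,
    "last": {"id": "https://example.org/activity/page-214",
             "type": "OrderedCollectionPage"},
}
bad = {"id": "urn:example:1", "type": "OrderedCollection", "totalItems": -1}
print(check_collection(good), check_collection(bad))
```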

3.2. Ordered Collection Page

The list of Activities is ordered both from page to page by following prev (or next) relationships, and internally within the page in the orderedItems property. The number of entries in each page is up to the implementer, and cannot be modified at request time by the client. Pages are not required to have the same number of entries as any other page.

Content providers must publish at least one Ordered Collection Page at the HTTP(S) URI given in the id property of the Page.

id

The identifier of the Collection Page.

Ordered Collection Pages must have an id property. The value must be a string and it must be an HTTP(S) URI. The JSON representation of the Ordered Collection Page must be available at the URI.

{ "id": "https://example.org/activity/page-0" }
type

The class of the Ordered Collection Page.

Ordered Collection Pages must have a type property. The value must be OrderedCollectionPage.

{ "type": "OrderedCollectionPage" }
partOf

The Ordered Collection that this Page is part of.

Ordered Collection Pages should have a partOf property. The value must be a JSON object, with the id and type properties. The value of the id property must be a string, and must be the HTTP(S) URI of the Ordered Collection that this page is part of. The value of the type property must be the string OrderedCollection.

{
  "partOf": {
    "id": "https://example.org/activity/all-changes",
    "type": "OrderedCollection"
  }
}
startIndex

The position of the first item in this page’s orderedItems list, relative to the overall ordering across all pages within the Collection. The first entry in the overall list has a startIndex of 0. If the first page has 20 entries, the first entry on the second page would therefore be 20.

Ordered Collection Pages may have a startIndex property. The value must be a non-negative integer.

{ "startIndex": 20 }
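
Because pages need not all be the same size, startIndex is a running total of the entries on the preceding pages. A minimal sketch, with the hypothetical helper name `start_indexes`:

```python
def start_indexes(page_sizes):
    """Return the startIndex of each page given the entry count of each page."""
    indexes = []
    position = 0
    for size in page_sizes:
        # A page starts where all of the preceding pages end.
        indexes.append(position)
        position += size
    return indexes

indexes = start_indexes([20, 20, 15])
print(indexes)
```
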
next

A reference to the next page in the list of pages.

Ordered Collection Pages should have a next property, unless they are the last Page in the Collection. The value must be a JSON object, with the id and type properties. The value of the id property must be a string, and must be the HTTP(S) URI of the following Ordered Collection Page. The value of the type property must be the string OrderedCollectionPage.

{
  "next": {
    "id": "https://example.org/activity/page-2",
    "type": "OrderedCollectionPage"
  }
}
prev

A reference to the previous page in the list of pages.

Ordered Collection Pages must have a prev property, unless they are the first page in the Collection. The value must be a JSON object, with the id and type properties. The value of the id property must be a string, and must be the HTTP(S) URI of the preceding Ordered Collection Page. The value of the type property must be the string OrderedCollectionPage.

{
  "prev": {
    "id": "https://example.org/activity/page-1",
    "type": "OrderedCollectionPage"
  }
}
orderedItems

The Activities that are listed as part of this page.

Ordered Collection Pages must have an orderedItems property. The value must be an array, with at least one item. Each item must be a JSON object, conforming to the requirements of an Activity.

{
  "orderedItems": [
    {
      "type": "Update",
      "object": {
        "id": "https://example.org/iiif/1/manifest",
        "type": "Manifest"
      },
      "endTime": "2018-03-10T10:00:00Z"
    }
  ]
}
Complete Ordered Collection Page Example
{
  "@context": "http://iiif.io/api/discovery/1/context.json",
  "id": "https://example.org/activity/page-1",
  "type": "OrderedCollectionPage",
  "startIndex": 20,
  "partOf": {
    "id": "https://example.org/activity/all-changes",
    "type": "OrderedCollection"
  },
  "prev": {
    "id": "https://example.org/activity/page-0",
    "type": "OrderedCollectionPage"
  },
  "next": {
    "id": "https://example.org/activity/page-2",
    "type": "OrderedCollectionPage"
  },
  "orderedItems": [
    {
      "type": "Update",
      "object": {
        "id": "https://example.org/iiif/1/manifest",
        "type": "Manifest"
      },
      "endTime": "2018-03-10T10:00:00Z"
    }
  ]
}

3.3. Activities

The Activities are the means of describing the changes that have occurred in the content provider’s system.

Content providers may publish Activities separately from Ordered Collection Pages, and if so they must be at the HTTP(S) URI given in the id property of the Activity.

id

An identifier for the Activity.

Activities may have an id property. The value must be a string and it must be an HTTP(S) URI. The JSON representation of the Activity may be available at the URI.

{ "id": "https://example.org/activity/1" }
type

The type of Activity.

This specification uses the types described in the table below.

Type Definition
Create The initial creation of the resource. Each resource should have at most one Create Activity in which it is the object, but if the URI of the resource is re-used after a Delete activity, then there may be more than one.
Update Any change to the resource. In a system that does not distinguish creation from modification, all changes may have the Update type.
Delete The deletion of the resource, or its de-publication from the web. Each resource should have at most one Delete Activity in which it is the object, but may have more than one if it is subsequently republished and then deleted again.
Move The re-publishing of the resource at a new URI, with the same content. Each resource may have zero or more Move activities in which it is the object or target.

Activities must have the type property. The value must be a registered Activity class, and should be one of Create, Update, Delete, or Move.

{ "type": "Update" }
object

The IIIF resource that was affected by the Activity. It is an implementation decision whether there are separate lists of Activities, one per object type, or a single list with all of the object types combined.

In the case of the Move activity, the object property contains the id and type of the source from whence it was moved. The new location will be in the target property, described below.

Activities must have the object property. The value must be a JSON object, with the id and type properties. The id must be an HTTP(S) URI. The type should be a class defined in the IIIF Presentation API, and should be one of Collection or Manifest. The object may have a seeAlso property, as defined for OrderedCollection above, to reference a description document of the object resource. The document referenced in the seeAlso property may also be referenced with the seeAlso property in an instance of the IIIF Presentation API. The type of the document referenced in the seeAlso property should be given as Dataset, meaning that it is data rather than a human-readable document.

{
  "object": {
    "id": "http://example.org/iiif/1/manifest",
    "type": "Manifest",
    "seeAlso": [
      {
        "id": "https://example.org/dataset/single-item.jsonld",
        "type": "Dataset",
        "format": "application/ld+json"
      }
    ]
  }
}
target

The new location of the IIIF resource, after it was affected by a Move activity.

Move activities must have the target property. The value must be a JSON object, with the id and type properties. The id must be an HTTP(S) URI, and must be different from the URI given in the object property’s id. The type should be a class defined in the IIIF Presentation API, and should be the same as the object property’s type.

{
  "target": {
    "id": "http://example.org/a/manifest",
    "type": "Manifest",
    "seeAlso": [
      {
        "id": "https://example.org/single-item-a.jsonld",
        "type": "Dataset",
        "format": "application/ld+json"
      }
    ]
  }
}
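
The constraints relating object and target in a Move activity can be checked mechanically. The sketch below is an illustration under stated assumptions: `check_move` is a hypothetical helper, the sample activity is invented, and a type mismatch is reported even though it is only a should-level requirement.

```python
def check_move(activity):
    """Return a list of problems with the object/target pair of a Move Activity."""
    obj = activity.get("object")
    target = activity.get("target")
    if not isinstance(obj, dict):
        return ["object must be a JSON object"]
    if not isinstance(target, dict) or "id" not in target or "type" not in target:
        return ["Move activities must have a target with id and type"]
    problems = []
    if target["id"] == obj.get("id"):
        problems.append("target id must differ from object id")
    if target["type"] != obj.get("type"):
        problems.append("target type should match object type")
    return problems

move = {
    "type": "Move",
    "object": {"id": "https://example.org/iiif/1/manifest", "type": "Manifest"},
    "target": {"id": "https://example.org/a/manifest", "type": "Manifest"},
    "endTime": "2018-03-12T09:00:00Z",
}
unchanged = dict(move, target=move["object"])
print(check_move(move), check_move(unchanged))
```
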
endTime

The time at which the Activity was finished. It is up to the implementer to decide whether the value of endTime is the timestamp for the publication of the IIIF resource online or is the timestamp of the modification to the data in the managing system if these are different, but the decision must be consistently applied. The changed resource given in object must be available at its URI at or before the timestamp given in endTime. The value of endTime should be before the time that the Activity is published as part of its Ordered Collection.

Activities should have the endTime property. The value must be a datetime expressed in UTC in the xsd:dateTime format.

{ "endTime": "2017-09-21T00:00:00Z" }
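
Systems that record modification times in a local time zone need to convert them to UTC before publishing. A minimal sketch using the Python standard library follows; the function name `to_end_time` is hypothetical, and the output deliberately drops fractional seconds to match the examples in this document.

```python
from datetime import datetime, timedelta, timezone

def to_end_time(moment):
    """Format an aware datetime as a UTC xsd:dateTime string with a Z suffix."""
    if moment.tzinfo is None:
        # A naive datetime carries no offset, so it cannot be converted safely.
        raise ValueError("naive datetimes are ambiguous; attach a time zone first")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

# 20:00 at an offset of -04:00 is midnight UTC on the following day.
local = datetime(2017, 9, 20, 20, 0, 0, tzinfo=timezone(timedelta(hours=-4)))
stamp = to_end_time(local)
print(stamp)
```
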
startTime

The time at which the Activity was started.

Activities may have the startTime property. The value must be a datetime expressed in UTC in the xsd:dateTime format.

{ "startTime": "2017-09-20T23:58:00Z" }
summary

A short textual description of the Activity. This is intended primarily to be used for debugging purposes or explanatory messages.

Activities may have the summary property. The value must be a string.

{ "summary": "admin updated the manifest, fixing reported bug #15." }
actor

The organization, person, or software agent that carried out the Activity.

Activities may have the actor property. The value must be a JSON object, with the id and type properties. The id should be an HTTP(S) URI. The type must be one of Application, Organization, or Person.

{
  "actor": {
    "id": "https://example.org/person/admin1",
    "type": "Person"
  }
}
Complete Activity Example

A complete example Activity would thus look like the following. Note that it does not have a @context property, as it is embedded within an Ordered Collection Page. Note also that this example includes all of the fields, and most implementations will neither need nor expose this level of data.

{
  "id": "https://example.org/activity/1",
  "type": "Update",
  "summary": "admin updated the manifest, fixing reported bug #15.",
  "object": {
    "id": "https://example.org/iiif/1/manifest",
    "type": "Manifest",
    "seeAlso": [
      {
        "id": "https://example.org/dataset/single-item.jsonld",
        "type": "Dataset",
        "format": "application/ld+json"
      }
    ]
  },
  "endTime": "2017-09-21T00:00:00Z",
  "startTime": "2017-09-20T23:58:00Z",
  "actor": {
    "id": "https://example.org/person/admin1",
    "type": "Person"
  }
}

3.4. Linked Data Context and Extensions

3.4.1. @context

The top level resource in the response must have the @context property, and it should appear as the very first key/value pair of the JSON representation. This property lets Linked Data processors interpret the document as a graph. The value of the property must be either the URI of the IIIF Discovery context document, http://iiif.io/api/discovery/1/context.json, or an array of strings, where the URI of the IIIF Discovery context document is the last item in the array.

{
  "@context": "http://iiif.io/api/discovery/1/context.json"
}

3.4.2. Extensions

If any additional classes or properties are desired beyond the ones defined in this specification or the ActivityStreams specification, then those classes or properties should be mapped to RDF terms in one or more additional context documents. These extension contexts should be added to the top level @context property, and must be before the URI of the Discovery context. The JSON-LD 1.1 functionality of defining terms only within a specific property, known as scoped contexts, must be used to minimize cross-extension collisions. Extensions intended for broad use should be registered in the extensions registry.

{
  "@context": [
    "http://example.org/extension/context.json",
    "http://iiif.io/api/discovery/1/context.json"
  ]
}
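
The two permitted forms of the @context value can be checked as sketched below. The function name `context_is_valid` is an assumption for illustration; the check covers only the rules stated above and does not perform any JSON-LD processing.

```python
DISCOVERY_CONTEXT = "http://iiif.io/api/discovery/1/context.json"

def context_is_valid(doc):
    """Check that @context is the Discovery URI, or an array of strings ending in it."""
    context = doc.get("@context")
    if isinstance(context, str):
        return context == DISCOVERY_CONTEXT
    if isinstance(context, list) and context:
        all_strings = all(isinstance(entry, str) for entry in context)
        # Extension contexts must come before the Discovery context.
        return all_strings and context[-1] == DISCOVERY_CONTEXT
    return False

extended = {"@context": ["http://example.org/extension/context.json",
                         DISCOVERY_CONTEXT]}
misordered = {"@context": [DISCOVERY_CONTEXT,
                           "http://example.org/extension/context.json"]}
print(context_is_valid(extended), context_is_valid(misordered))
```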

3.5. Activity Streams Processing Algorithm

The aim of the processing algorithm is to inform consuming applications how to make best use of the available information. The specification does not require any particular processing of the information by the consuming application, but considers indexing of the resource as a common use case in section 3.5.3.

3.5.1. Collection Algorithm

Given the URI of an ActivityStreams Collection (collection) as input, a conforming processor should:

  1. Initialization:
    1. Let processedItems be an empty array
    2. Let lastCrawl be the timestamp of the previous time the algorithm was executed
  2. Retrieve the representation of collection via HTTP(S)
  3. Validate that the retrieved representation contains at least the features required for processing
  4. Find the URI of the last page at collection.last.id (pageN)
  5. Apply the page algorithm to pageN, passing processedItems and lastCrawl
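
The steps above can be sketched in Python as follows. This is a non-normative illustration: fetch_json and process_page are hypothetical callables supplied by the caller (an HTTP client returning parsed JSON, and an implementation of the page algorithm), and the validation shown checks only for the last.id value that this algorithm needs.

```python
from datetime import datetime
from typing import Callable, List


def process_collection(
    collection_uri: str,
    last_crawl: datetime,
    fetch_json: Callable[[str], dict],
    process_page: Callable[[str, List[str], datetime], None],
) -> List[str]:
    """Run the collection algorithm and return the list of processed URIs."""
    # Step 1: initialization. last_crawl is passed in by the caller.
    processed_items: List[str] = []

    # Step 2: retrieve the representation of the collection.
    collection = fetch_json(collection_uri)

    # Step 3: validate the minimum needed for processing (assumption: last.id only).
    last = collection.get("last")
    if not isinstance(last, dict) or "id" not in last:
        raise ValueError("collection has no usable 'last' page reference")

    # Steps 4 and 5: hand the last page over to the page algorithm.
    process_page(last["id"], processed_items, last_crawl)
    return processed_items
```

Injecting fetch_json keeps the sketch independent of any particular HTTP library and makes it straightforward to exercise with canned responses.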

3.5.2. Page Algorithm

Given the URI of an ActivityStreams CollectionPage (page), a list of processed items (processedItems), the date of last crawling (lastCrawl), and a processing function (process()) as input, a conforming processor should:

  1. Retrieve the representation of page via HTTP(S)
  2. Validate that the retrieved representation contains at least the features required for processing
  3. Find the set of updates of the page at page.orderedItems (items)
  4. In reverse order, iterate through the activities (activity) in items:
    1. If activity.endTime is before lastCrawl, then terminate;
    2. If the updated resource's URI at activity.object.id is in processedItems, then continue;
    3. Otherwise, if activity.type is Update or Create, then find the URI of the updated resource at activity.object.id (object) and process its inclusion;
    4. Otherwise, if activity.type is Delete, then find the URI of the deleted resource at activity.object.id and process its removal;
    5. Otherwise, if activity.type is Move, then find the original URI of the moved resource at activity.object.id and process its removal, and find the new URI of the moved resource at activity.target.id and process its inclusion;
    6. Add the processed resource's URI to processedItems
  5. Finally, find the URI of the previous page at page.prev.id (pageN1)
  6. If there is a previous page, apply the page algorithm to pageN1
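
A possible Python rendering of the page algorithm is sketched below. It is not normative and makes several assumptions, each marked in the comments: the previous page link is read from the page's own prev.id, the target of a Move is read from activity.target.id, both URIs of a Move are recorded as processed, and activities of unrecognized types are skipped. The names fetch_json, include and remove are hypothetical callables standing in for the HTTP client and the processing function. A loop replaces the recursive step to avoid deep recursion on long streams.

```python
from datetime import datetime
from typing import Callable, List, Optional


def parse_time(value: str) -> datetime:
    # Older Python versions of fromisoformat() reject a trailing "Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def process_page(
    page_uri: Optional[str],
    processed_items: List[str],
    last_crawl: datetime,  # must be timezone-aware to compare with endTime
    fetch_json: Callable[[str], dict],
    include: Callable[[str], None],
    remove: Callable[[str], None],
) -> None:
    while page_uri:
        page = fetch_json(page_uri)  # step 1
        items = page.get("orderedItems")  # step 3
        if not isinstance(items, list):  # step 2, minimal validation
            raise ValueError("page has no orderedItems list")
        for activity in reversed(items):  # step 4, newest first
            if parse_time(activity["endTime"]) < last_crawl:
                return  # everything older was seen in a previous crawl
            uri = activity["object"]["id"]
            if uri in processed_items:
                continue  # a more recent activity already covered this resource
            kind = activity["type"]
            if kind in ("Update", "Create"):
                include(uri)
            elif kind == "Delete":
                remove(uri)
            elif kind == "Move":
                target = activity["target"]["id"]  # assumption: target is an object with id
                remove(uri)
                include(target)
                processed_items.append(target)  # assumption: record the new URI too
            else:
                continue  # assumption: ignore unrecognized activity types
            processed_items.append(uri)
        prev = page.get("prev")  # step 5, assumption: prev lives on the page
        page_uri = prev["id"] if prev else None  # step 6
```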

3.5.3. Resource Processing: Indexing

While there are many possible algorithms to process the discovered resources and activities, a core use case for the Change Discovery API is to maintain an up-to-date index of the resources.

In this case, the objective of the consuming application is to find accurate, machine-readable descriptive information that might be used to build an index, and thus the application should use the IIIF Presentation API seeAlso property to retrieve such a description if available. For different types of resource, and for different domains, the referenced descriptive resources will have different formats and semantics. If there are no such descriptions, or none that can be processed, the data in the Manifest and in other IIIF resources might be used as a last resort, despite its presentational intent.
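
One way an indexer might apply this preference is sketched below in Python. It assumes the Manifest has already been retrieved and parsed, that seeAlso entries carry a format value the indexer can match against, and that a return value of None signals the last-resort fallback to the Manifest's own data. The name choose_description is hypothetical.

```python
from typing import Iterable, Optional


def choose_description(manifest: dict, supported_formats: Iterable[str]) -> Optional[dict]:
    """Return the first seeAlso entry in a supported format, or None.

    None means no processable description was found, so the caller may
    fall back to the data in the Manifest itself.
    """
    see_also = manifest.get("seeAlso", [])
    # Tolerate a single string or object as well as an array of entries.
    if isinstance(see_also, (str, dict)):
        see_also = [see_also]
    supported = set(supported_formats)
    for entry in see_also:
        if isinstance(entry, str):
            entry = {"id": entry}  # bare URI: the format is unknown
        if entry.get("format") in supported:
            return entry
    return None
```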

4. Network Considerations

4.1. Activities for Access-Restricted Content

Activities may be published about content that has access restrictions. Clients must not assume that they will be able to access every resource that is the object of an Activity. If a resource cannot be accessed, clients must not assume that it has been deleted, and must not remove it from their indexes on that basis. For example, the content might be protected by an authentication system that is denying access, or there might simply be a temporary network outage preventing the content from being retrieved. An end user might be able to provide the right credentials to gain access.
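
The rule can be expressed as a small decision function: only a Delete activity leads to removal, and a failed retrieval never does. The Python sketch below is illustrative; the function name, the string return values and the use of None for a network failure are all assumptions rather than anything defined by this specification.

```python
from typing import Optional


def index_action(activity_type: str, fetch_status: Optional[int]) -> str:
    """Decide what to do with an index entry after handling an activity.

    fetch_status is the HTTP status of the attempt to retrieve the resource,
    or None if the request failed at the network level (assumed convention).
    """
    if activity_type == "Delete":
        # Removal is driven by the publisher's Delete activity only.
        return "remove"
    if fetch_status is not None and 200 <= fetch_status < 300:
        return "index"
    # 401, 403, 404, 5xx, timeouts and similar: the resource may be protected
    # or temporarily unreachable, so keep any existing entry and retry later.
    return "keep"
```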

Content may also change state from being available to being protected by access restrictions, or become available having previously been protected. There are no new activity classes for these situations, and the publisher should issue the regular Update or Delete activities, if that is the behavior desired for harvesters.

4.2. Negotiable Resources

Some HTTP(S) URIs are able to respond with different representations of the same content in response to requests with different headers, such as the same URI being able to return both version 2 and version 3 of the IIIF Presentation API based on the Accept header. This is known as “content negotiation”, and such resources are known as “negotiable resources”. The representations that can be negotiated for are known as “variants”.

Negotiable resources are not supported by the Discovery API, only variants. This means that there would be one Activity entry for each of the representations that are available, and that each representation must have its own URI, even if it can also be reached via the negotiable resource. In the case of negotiating for different versions of the IIIF Presentation API, the format property can be used to include the full media type of the resource, where the version specific context document is given in the profile parameter. If there are additional descriptive resources available, each such resource would describe all of the variants, and thus the seeAlso property of each Activity would refer to the same descriptions, allowing the variants to be connected together.

Two variants of the same negotiable resource can be represented as follows.

{
  "orderedItems": [
    {
      "type": "Update",
      "object": {
        "id": "https://example.org/iiif/1/manifest/v2",
        "type": "Manifest",
        "seeAlso": "https://example.org/iiif/1/metadata.xml",
        "format": "application/ld+json;profile=\"https://iiif.io/api/presentation/2/context.json\""
      },
      "endTime": "2018-03-10T10:00:00Z"
    },
    {
      "type": "Update",
      "object": {
        "id": "https://example.org/iiif/1/manifest/v3",
        "type": "Manifest",
        "seeAlso": "https://example.org/iiif/1/metadata.xml",
        "format": "application/ld+json;profile=\"https://iiif.io/api/presentation/3/context.json\""
      },
      "endTime": "2018-03-10T10:00:00Z"
    }
  ]
}
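
A consumer could connect such variants by reading the profile parameter from format and grouping on the shared seeAlso value. The Python sketch below assumes, as in the example above, that seeAlso is a single string; the parameter parsing is deliberately simple and would not cope with a quoted value containing a semicolon. Both function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def profile_of(media_type: str) -> Optional[str]:
    """Extract the profile parameter from a media type string, if present."""
    for part in media_type.split(";")[1:]:
        name, _, value = part.partition("=")
        if name.strip().lower() == "profile":
            return value.strip().strip('"')
    return None


def group_variants(activities: List[dict]) -> Dict[Optional[str], List[Tuple[str, Optional[str]]]]:
    """Group (variant URI, profile) pairs by their shared seeAlso description."""
    groups = defaultdict(list)
    for activity in activities:
        obj = activity["object"]
        # Assumption: seeAlso is a plain string, so it can be used as a dict key.
        groups[obj.get("seeAlso")].append((obj["id"], profile_of(obj.get("format", ""))))
    return dict(groups)
```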

Appendices

A. Acknowledgements

Many thanks to the members of the IIIF community for their continuous engagement, innovative ideas, and feedback.

Many of the changes in this version are due to the work of the IIIF Discovery Technical Specification Group, chaired by Antoine Isaac (Europeana), Matthew McGrattan (Digirati) and Rob Sanderson (J. Paul Getty Trust). The IIIF Community thanks them for their leadership, and the members of the group for their tireless work.

B. Change Log

Date        Description
2019-11-01  Version 0.4 (unnamed)
2019-03-20  Version 0.3 (unnamed)
2018-11-12  Version 0.2 (unnamed)
2018-05-04  Version 0.1 (unnamed)