IIIF Content Search API 2.0

Status of this Document

This Version: 2.0.0

Latest Stable Version: 2.0.0

Previous Version: 1.0.0

Editors

Copyright © 2012-2023 Editors and contributors. Published by the IIIF Consortium under the CC-BY license, see disclaimer.


1. Introduction

In the IIIF (pronounced “Triple-Eye-Eff”) Presentation API, content is brought together from distributed systems via annotations. That content might include images, audio, video, rich or plain text, or anything else. In a vibrant and dynamic system, that content can come from many sources and be rich, varied and abundant. Of that list of content types, textual resources lend themselves to being searched, either as the transcription, translation or edition of the intellectual content, or commentary, description, tagging or other annotations about the object.

This specification lays out the interoperability mechanism for performing these searches within the IIIF context. The scope of the specification is searching annotation content within a single IIIF resource, such as a Manifest, Canvas, Range or Collection. Every effort is made to keep the interaction as consistent with existing IIIF patterns as possible. Searching for metadata or other descriptive properties is not in scope for this work.

In order to make searches easier against unknown content, a related service for the automatic completion of search terms is also specified.

We welcome feedback on all IIIF Specifications.

1.1. Use Cases

Use cases for being able to search the annotations within the Presentation API include:

  • Searching OCR-generated text to find words or phrases within a book, newspaper or other primarily textual content.
  • Searching transcribed content, provided by crowd-sourcing or transformation of scholarly output.
  • Searching multiple streams of content, such as the translation or edition, rather than the raw transcription of the content, to jump to the appropriate part of an object.
  • Searching on sections of text, such as defined chapters or articles.
  • Searching for user-provided commentary about the resource, either as a discovery mechanism for the resource or for the discussion.
  • Discovering similar sections of text to compare either the content or the object.
  • Searching for non-textual annotations, such as tags or highlights.
  • Searching within captions, subtitles or transcriptions of audio/visual material.

User interfaces that could be built using the search response include highlighting matching words in the display, providing a heatmap of where the matches occur within the object, and providing a mechanism to jump between points within the object.

1.2. Terminology

This specification uses the following terms:

  • embedded: When a resource (A) is embedded within an embedding resource (B), the complete JSON representation of resource A is present within the JSON representation of resource B, and dereferencing the URI of resource A will not result in additional information. Example: Canvas A is embedded in Manifest B.
  • referenced: When a resource (A) is referenced from a referencing resource (B), an incomplete JSON representation of resource A is present within the JSON representation of resource B, and dereferencing the URI of resource A will result in additional information. Example: Manifest A is referenced from Collection B.

The terms array, JSON object, number, string, and boolean in this document are to be interpreted as defined by the JavaScript Object Notation (JSON) specification.

The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this document are to be interpreted as described in RFC 2119.

1.3. Common Specification Features

All IIIF specifications share common features to ensure consistency across the IIIF ecosystem. These features are documented in the Presentation API and are foundational to this specification. Common principles for the design of the specifications are documented in the IIIF Design Principles.

2. Overview

The IIIF Presentation API provides just enough information to a viewer so that it can present the images and other content to the user in a rich and understandable way. Those content resources may have textual annotations associated with them. Annotations may also be associated with the structural components of the Presentation API, such as a Collection, Manifest, or Range. Further, annotations can be replied to by annotating them to form a threaded discussion about the commentary, transcription, edition or translation.

Annotations are made available in IIIF via Annotation Pages, where typically the included Annotations target the same resource or part of it. Where known, these Annotation Pages can be directly referenced from the Manifest to allow clients to simply follow the link to retrieve them. This specification uses Annotation Pages to deliver search results, in which the Annotations in one Annotation Page can target multiple Canvases or other resources, and the Annotation Pages are likely to be generated dynamically.

Beyond the ability to search for words or phrases, users find it helpful to have suggestions for what terms they should be searching for. This facility is often called autocomplete or type-ahead, and within the context of a single object can provide insight into the language and content.

This specification defines two services to be associated with IIIF resources: the Content Search service and the Autocomplete service.

3. Declaring Services

The Content Search and the Autocomplete services are associated with IIIF resources using the service property, defined by the Presentation API as an array of JSON objects. These objects must have the id and type properties. The value of the id property must be the URI used to interact with the service.

Any resource in the Presentation API may have a Content Search service associated with it. The resource determines the scope of the content that will be searched: a service associated with a Manifest will search all of the annotations on Canvases or other resources below the Manifest, a service associated with a particular Range will search only the Canvases within the Range, and a service on a Canvas will search only Annotations on that particular Canvas.

An example service description block:

{
  // ... the resource that the search service is associated with ...
  "service": [
    {
      "id": "https://example.org/services/identifier/search",
      "type": "SearchService2"
    }
  ]
}

Any Content Search service may have a nested Autocomplete service which provides term completion functionality specific to the Content Search service. This structure allows multiple Content Search services to be referenced, each with their own Autocomplete service.

The above service description block would become:

{
  // Resource that the services are associated with ...
  "service": [
    {
      "id": "https://example.org/services/identifier/search",
      "type": "SearchService2",
      "service": [
        {
          "id": "https://example.org/services/identifier/autocomplete",
          "type": "AutoCompleteService2"
        }
      ]
    }
  ]
}
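
As an illustration only, a client might discover the declared services along the following lines. This is a minimal sketch: the function name find_search_services and the surrounding Manifest are assumptions made for the example, not part of this specification.

```python
from typing import Any, Dict, List, Optional, Tuple


def find_search_services(resource: Dict[str, Any]) -> List[Tuple[str, Optional[str]]]:
    """Return (search URI, autocomplete URI or None) for each SearchService2."""
    found: List[Tuple[str, Optional[str]]] = []
    for service in resource.get("service", []):
        if service.get("type") != "SearchService2":
            continue  # some other kind of service
        autocomplete = None
        # The Autocomplete service, if any, is nested inside the search service.
        for nested in service.get("service", []):
            if nested.get("type") == "AutoCompleteService2":
                autocomplete = nested.get("id")
                break
        found.append((service["id"], autocomplete))
    return found


# Hypothetical Manifest carrying the service description block shown above.
manifest = {
    "id": "https://example.org/manifest",
    "type": "Manifest",
    "service": [
        {
            "id": "https://example.org/services/identifier/search",
            "type": "SearchService2",
            "service": [
                {
                    "id": "https://example.org/services/identifier/autocomplete",
                    "type": "AutoCompleteService2",
                }
            ],
        }
    ],
}
services = find_search_services(manifest)
```

Returning one pair per Content Search service keeps each Autocomplete service attached to the search service it completes terms for.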

4. Content Search

The Content Search service takes a query, typically including a search term or URI. Results may be constrained by other properties, such as the date the annotation was created or last modified, the motivation for the annotation, or the user that created the annotation.

4.1. Request

A search request is made to a service that is associated with a particular Presentation API resource. The URIs for services associated with different resources must be different to allow the client to use the correct one for the desired scope of the search. To perform a search, the client must use the HTTP GET method to make the request to the service, with query parameters to specify the search terms.

4.1.1. Query Parameters

The following query parameters are defined:

Parameter  | Definition
-----------+------------
q          | A space-separated list of search terms. For example, the search terms might be words (to search for within textual bodies) or URIs (to search identities of annotation body resources). The semantics of multiple, space-separated terms is server implementation dependent.
motivation | A space-separated list of motivation terms. If multiple motivations are supplied, an annotation matches the search if any of the motivations are present. Common values for the motivation parameter can be found in the IIIF Registry of Motivations, including the two Content Search motivations contextualizing and highlighting defined in sections 4.3.1 and 4.3.2 below.
date       | A space-separated list of date ranges. An annotation matches if the date on which it was created falls within any of the supplied date ranges. The date ranges must be supplied in the ISO 8601 start/end format: YYYY-MM-DDThh:mm:ssZ/YYYY-MM-DDThh:mm:ssZ. The start and end dates must be combined date and time values expressed in UTC and must use the Z timezone indicator.
user       | A space-separated list of URIs that are the identities of users. If multiple users are supplied, an annotation matches the search if any of the supplied users created the annotation.

Other than q, which is recommended, all other parameters are optional in the request. The default, if a parameter is empty or not supplied, is to not restrict the annotations that match the search by that parameter. If the value is supplied but the field is not present in an annotation, then the search does not match that annotation. For example, if an annotation does not have a creator, and the query specifies a user parameter, then the annotation does not match the query.

Servers should implement the q and motivation parameters and may implement the other parameters. Parameters defined by this specification that are received in a request but not implemented must be ignored, and must be included in the ignored property in the response, described below.

4.1.2. Example Request

Consider the example request:

https://example.org/service/manifest/search?q=bird&motivation=painting

This request would search for annotations with the word “bird” in their textual content, and have the motivation of painting. It would search annotations within the resource with which the service was associated.
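
A request URI of this shape could be assembled as in the following sketch. The helper names build_search_url and iso_range are assumptions for illustration; only the parameter names (q, motivation, date, user) and the date range format come from this specification.

```python
from datetime import datetime, timezone
from typing import Dict, Optional, Sequence, Tuple
from urllib.parse import urlencode


def iso_range(start: datetime, end: datetime) -> str:
    """Format a date range as YYYY-MM-DDThh:mm:ssZ/YYYY-MM-DDThh:mm:ssZ in UTC."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("timezone-aware datetimes are required")
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return "/".join(d.astimezone(timezone.utc).strftime(fmt) for d in (start, end))


def build_search_url(
    service_id: str,
    q: Optional[str] = None,
    motivation: Sequence[str] = (),
    date: Sequence[Tuple[datetime, datetime]] = (),
    user: Sequence[str] = (),
) -> str:
    """Build the GET URI; empty parameters are left out and so do not restrict."""
    params: Dict[str, str] = {}
    if q:
        params["q"] = q
    if motivation:
        params["motivation"] = " ".join(motivation)  # space-separated list
    if date:
        params["date"] = " ".join(iso_range(s, e) for s, e in date)
    if user:
        params["user"] = " ".join(user)
    return f"{service_id}?{urlencode(params)}" if params else service_id


url = build_search_url(
    "https://example.org/service/manifest/search", q="bird", motivation=["painting"]
)
```

Here urlencode percent-encodes reserved characters and encodes the separating spaces as "+", so user URIs and date ranges can be passed through unchanged.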

4.2. Responses

The response from the server must be an Annotation Page, following the format from the Presentation API with some additional features. This allows clients that already implement the Annotation Page format to avoid further implementation work to support search results.

4.2.1. Simple Lists

The simplest response is a normal Annotation Page, where all of the matching Annotations are returned in a single response.

The total number of matching Annotations is the length of the items array, as all matching Annotations have been returned in the response. Every Annotation must be fully embedded in the response.

Consider the example request:

https://example.org/service/manifest/search?q=bird&motivation=painting

This request might result in:

{
  "@context": "http://iiif.io/api/search/2/context.json",
  "id": "https://example.org/service/manifest/search?q=bird&motivation=painting",
  "type": "AnnotationPage",

  "items": [
    {
      "id": "https://example.org/identifier/annotation/anno-line",
      "type": "Annotation",
      "motivation": "painting",
      "body": {
        "type": "TextualBody",
        "value": "A bird in the hand is worth two in the bush",
        "format": "text/plain"
      },
      "target": "https://example.org/identifier/canvas1#xywh=100,100,250,20"
    }
    // Further matching annotations here ...
  ]
}

4.2.2. Paging Results

For long lists of Annotations, the server may divide the response into multiple Annotation Pages within one Annotation Collection. The initial response is the first Annotation Page, which includes an embedded Annotation Collection and references subsequent pages to be retrieved.

The URI of the Annotation Page reported in the id property may be different from the one used by the client to request the search. This would allow, for example, a page query parameter to be appended to the URI to allow the server to track which Annotation Page is being requested.

When results are paged, the Annotation Pages have several additional properties:

  • partOf - The Annotation Page must have a partOf property. The value is a JSON object, which is the embedded Annotation Collection resource, following the structure defined below.
  • next - The Annotation Page must have a next property if there is a subsequent page. The value is a JSON object with an id of the URI of the subsequent page, and type with a value of AnnotationPage.
  • prev - The Annotation Page should have a prev property if there is a previous page. The value is a JSON object with id containing the URI of the previous page, and type with a value of AnnotationPage.
  • startIndex - The Annotation Page may have a startIndex property, which is the position of the first Annotation in this page’s items list, relative to the overall ordering of Annotations across all pages within the Annotation Collection. The value is a zero-based integer.

The embedded Annotation Collection has the following properties:

  • first - The Annotation Collection must have a first property. The value is a JSON object, with an id of the URI of the first page, and type with a value of AnnotationPage.
  • last - The Annotation Collection may have a last property. The value is a JSON object, with an id of the URI of the last page, and type with a value of AnnotationPage.
  • total - The Annotation Collection may have a total property. The value is an integer, which is the total number of Annotations in the Collection, across all Annotation Pages.

Consider the example request:

https://example.org/service/manifest/search?q=bird

This request might result in the following response:

{
  "@context": "http://iiif.io/api/search/2/context.json",
  "id": "https://example.org/service/manifest/search?q=bird&page=1",
  "type": "AnnotationPage",

  "partOf": {
    "id": "https://example.org/service/manifest/search?q=bird",
    "type": "AnnotationCollection",
    "total": 125,
    "first": {
      "id": "https://example.org/service/manifest/search?q=bird&page=1",
      "type": "AnnotationPage"
    },
    "last": {
      "id": "https://example.org/service/manifest/search?q=bird&page=13",
      "type": "AnnotationPage"
    }
  },
  "next": {
    "id": "https://example.org/service/manifest/search?q=bird&page=2",
    "type": "AnnotationPage"
  },
  "startIndex": 0,

  "items": [
    {
      "id": "https://example.org/identifier/annotation/anno-line",
      "type": "Annotation",
      "motivation": "painting",
      "body": {
        "type": "TextualBody",
        "value": "A bird in the hand is worth two in the bush",
        "format": "text/plain"
      },
      "target": "https://example.org/identifier/canvas1#xywh=100,100,250,20"
    }
    // Further annotations from the first page here ...
  ]
}  
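
A client could walk paged results by following next references, roughly as sketched below. The fetch callable is an assumption standing in for whatever performs the HTTP GET and JSON parsing; the max_pages guard and the loop detection are likewise choices made for this example rather than requirements.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_annotations(
    first_page: Dict[str, Any],
    fetch: Callable[[str], Dict[str, Any]],
    max_pages: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Yield every Annotation, following `next` from page to page."""
    page = first_page
    seen = {first_page.get("id")}
    for _ in range(max_pages):
        yield from page.get("items", [])
        next_ref = page.get("next")
        # Stop on the last page, or if a server loops back to a visited page.
        if not next_ref or next_ref["id"] in seen:
            return
        seen.add(next_ref["id"])
        page = fetch(next_ref["id"])


def total_matches(page: Dict[str, Any]) -> Optional[int]:
    """Total number of matches, or None when a paged response omits `total`."""
    collection = page.get("partOf")
    if collection is None:
        # Simple list: all matching Annotations are in this one page.
        return len(page.get("items", []))
    return collection.get("total")


# Tiny in-memory stand-in for a server, used only for illustration.
pages = {"p2": {"id": "p2", "type": "AnnotationPage", "items": [{"id": "b"}]}}
first = {
    "id": "p1",
    "type": "AnnotationPage",
    "partOf": {"type": "AnnotationCollection", "total": 2},
    "next": {"id": "p2", "type": "AnnotationPage"},
    "items": [{"id": "a"}],
}
ids = [anno["id"] for anno in iter_annotations(first, pages.__getitem__)]
```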

4.2.3. References to Containing Resources

It is possible that Canvases referenced by the Annotations in the results are contained in Manifests that the client has not loaded, for example when searching in a Collection. In this and similar cases, it is important to have a reference to the containing resource so that a client can retrieve and render it.

This reference to the containing resource is included in the target structure of the Annotation, in a partOf property, with a JSON object as its value. The id and type of the containing resource must be given, and a label should be included.

Consider the example request:

https://example.org/service/collection/search?q=bird&motivation=painting

This request might result in:

{
  "@context": "http://iiif.io/api/search/2/context.json",
  "id": "https://example.org/service/collection/search?q=bird&motivation=painting",
  "type": "AnnotationPage",

  "items": [
    {
      "id": "https://example.org/identifier/annotation/anno-line",
      "type": "Annotation",
      "motivation": "painting",
      "body": {
        "type": "TextualBody",
        "value": "A bird in the hand is worth two in the bush",
        "format": "text/plain"
      },
      "target": {
        "id": "https://example.org/identifier/canvas1#xywh=100,100,250,20",
        "partOf": {
          "id": "https://example.org/manifest1868",
          "type": "Manifest",
          "label": {
            "en": [ "Example Manifest" ]
          }
        }
      }
    }
    // Further annotations here ...
  ]
}
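
One way a client might read both forms of target (a plain URI string, or an object with partOf) is sketched below. The function names are assumptions for illustration.

```python
from typing import Any, Dict, List, Optional, Tuple, Union

Target = Union[str, Dict[str, Any]]


def describe_target(target: Target) -> Tuple[str, Optional[str], Optional[Dict[str, Any]]]:
    """Return (canvas URI, fragment or None, containing resource or None)."""
    if isinstance(target, str):
        target_id, container = target, None
    else:
        target_id, container = target["id"], target.get("partOf")
    canvas_id, _, fragment = target_id.partition("#")
    return canvas_id, fragment or None, container


def group_by_container(page: Dict[str, Any]) -> Dict[Optional[str], List[str]]:
    """Map each containing resource URI (or None) to the Annotation ids in it."""
    groups: Dict[Optional[str], List[str]] = {}
    for anno in page.get("items", []):
        target = anno["target"]
        for t in target if isinstance(target, list) else [target]:
            _, _, container = describe_target(t)
            key = container["id"] if container else None
            groups.setdefault(key, []).append(anno["id"])
    return groups


page = {
    "type": "AnnotationPage",
    "items": [
        {
            "id": "https://example.org/identifier/annotation/anno-line",
            "type": "Annotation",
            "target": {
                "id": "https://example.org/identifier/canvas1#xywh=100,100,250,20",
                "partOf": {"id": "https://example.org/manifest1868", "type": "Manifest"},
            },
        }
    ],
}
groups = group_by_container(page)
```

Grouping by container lets a client load each Manifest once before rendering the Canvases that matched within it.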

4.2.4. Ignored Parameters

If the server has ignored any of the parameters in the request, then an ignored property must be present, and must contain a list of the ignored parameters. Servers may omit ignored query parameters from the id of the Annotation Page.

Consider the example request:

http://example.org/service/manifest/search?q=bird&user=https%3A%2F%2Fexample.com%2Fusers%2Fwigglesworth

If the user parameter was ignored when processing this request, the response could be:

{
  "@context": "http://iiif.io/api/search/2/context.json",
  "id": "http://example.org/service/manifest/search?q=bird&page=1",
  "type": "AnnotationPage",

  "ignored": [ "user" ],

  "items": [
    // Annotations ...
  ]
}
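
A client can compare the parameters it sent with the ignored list to learn which filters were not applied. The following is a minimal sketch; the function name unapplied_filters is an assumption for this example.

```python
from typing import Any, Dict, List


def unapplied_filters(requested: Dict[str, str], response: Dict[str, Any]) -> List[str]:
    """Return the requested parameter names that the server reported as ignored."""
    ignored = set(response.get("ignored", []))
    return [name for name in requested if name in ignored]


requested = {"q": "bird", "user": "https://example.com/users/wigglesworth"}
response = {"type": "AnnotationPage", "ignored": ["user"], "items": []}
not_applied = unapplied_filters(requested, response)
```

A user interface might use this list to tell the user that the results were not restricted by the corresponding filter.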

4.3. Extended Responses

Clients may require additional information about the matches in order to generate a rich user experience for search. This additional information about matches in search results is provided by further Annotations in a property called annotations. This structure maintains the distinction in the Presentation API, where the main content annotations are listed in items and additional annotations such as comments are listed in annotations. The value of annotations is an array containing a single Annotation Page, in which all of the Annotations reference Annotations in the items property.

The structure of extended responses is:

{
  "@context": "http://iiif.io/api/search/2/context.json",
  "id": "https://example.org/service/manifest/search?q=bird",
  "type": "AnnotationPage",

  "items": [
    // Matching Annotations here ...
  ],

  "annotations": [
    {
      "type": "AnnotationPage",
      "items": [
        // Annotations with additional information about matching Annotations here ...
      ]
    }
  ]
}

4.3.1. Match Context

Search interfaces often display text before and after the matching text in a search result, as a snippet which shows the match in the context of the surrounding content. This is most useful when the service has word-level boundaries of the text on the Canvas, such as when OCR has been used to generate the text positions.

To meet this requirement, the Annotations in annotations have a motivation of contextualizing, and a target that uses a Web Annotation Data Model TextQuoteSelector, with prefix and suffix giving the text immediately before and after the matching content in the annotation. The matching content is conveyed in the exact property. The target has the URI of the annotation it refers to in its source property, to be matched against the id property of the annotations in items.

Consider a search request for the query term “bird”:

https://example.org/service/manifest/search?q=bird

This request might match the plural “birds”:

{
  "@context": "http://iiif.io/api/search/2/context.json",
  "id": "https://example.org/service/manifest/search?q=bird",
  "type": "AnnotationPage",

  "items": [
    {
      "id": "https://example.org/identifier/annotation/anno-bird",
      "type": "Annotation",
      "motivation": "painting",
      "body": {
        "type": "TextualBody",
        "value": "birds",
        "format": "text/plain"
      },
      "target": "https://example.org/identifier/canvas1#xywh=200,100,40,20"
    }
    // Further 'bird' annotations here ...
  ],

  "annotations": [
    {
      "type": "AnnotationPage",
      "items": [
        {
          "id": "https://example.org/identifier/annotation/match-1",
          "type": "Annotation",
          "motivation": "contextualizing",
          "target": {
            "type": "SpecificResource",
            "source": "https://example.org/identifier/annotation/anno-bird",
            "selector": [
              {
                "type": "TextQuoteSelector",
                "prefix": "There are two ",
                "exact": "birds",
                "suffix": " in the bush"
              }
            ]
          }
        }
      ]
    }
  ]
}
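
A client might turn contextualizing Annotations into display snippets as sketched below. The function name and the square-bracket markup around the match are assumptions chosen for illustration.

```python
from typing import Any, Dict, List, Tuple


def _as_list(value: Any) -> List[Any]:
    """Normalise a missing, single, or list value to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def context_snippets(page: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (matched Annotation URI, 'prefix[exact]suffix') pairs."""
    snippets: List[Tuple[str, str]] = []
    for extra_page in page.get("annotations", []):
        for anno in extra_page.get("items", []):
            if anno.get("motivation") != "contextualizing":
                continue
            for target in _as_list(anno.get("target")):
                if not isinstance(target, dict):
                    continue  # a plain URI target carries no selector
                for sel in _as_list(target.get("selector")):
                    if sel.get("type") != "TextQuoteSelector":
                        continue
                    text = sel.get("prefix", "") + "[" + sel["exact"] + "]" + sel.get("suffix", "")
                    snippets.append((target["source"], text))
    return snippets


response = {
    "type": "AnnotationPage",
    "items": [],
    "annotations": [{
        "type": "AnnotationPage",
        "items": [{
            "id": "https://example.org/identifier/annotation/match-1",
            "type": "Annotation",
            "motivation": "contextualizing",
            "target": {
                "type": "SpecificResource",
                "source": "https://example.org/identifier/annotation/anno-bird",
                "selector": [{
                    "type": "TextQuoteSelector",
                    "prefix": "There are two ",
                    "exact": "birds",
                    "suffix": " in the bush",
                }],
            },
        }],
    }],
}
snippets = context_snippets(response)
```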

4.3.2. Match Highlighting

Many systems do not have full word-level coordinate information, and are restricted to line or paragraph level boundaries. In these cases the client may display the entire annotation and highlight the matches within it. This is similar to, but distinct from, the match context use case. Here, the match is somewhere within the body property of the annotation and the client needs to make it more prominent.

The client needs to know the text that matched and enough information about where it occurs in the content to reliably highlight it and not highlight non-matching content. To do this, the service can use the selector pattern to supply the text before and after the matching term within the content of the annotation, again via a TextQuoteSelector object. The value of the motivation property is highlighting in this case, to distinguish from the match context use case. Non-textual content, such as audio or video resources, would use other selectors instead, but the pattern would otherwise remain the same.

Consider the example request:

https://example.org/service/manifest/search?q=bird

This request might have the response:

{
  "@context": "http://iiif.io/api/search/2/context.json",
  "id": "https://example.org/service/manifest/search?q=bird",
  "type": "AnnotationPage",

  "items": [
    {
      "id": "https://example.org/identifier/annotation/anno-bird",
      "type": "Annotation",
      "motivation": "painting",
      "body": {
        "type": "TextualBody",
        "value": "There are two birds in the bush",
        "format": "text/plain"
      },
      "target": "https://example.org/identifier/canvas1#xywh=200,100,200,20"
    }
    // Further 'bird' annotations here ...
  ],

  "annotations": [
    {
      "type": "AnnotationPage",
      "items": [
        {
          "id": "https://example.org/identifier/annotation/match-1",
          "type": "Annotation",
          "motivation": "highlighting",
          "target": {
            "type": "SpecificResource",
            "source": "https://example.org/identifier/annotation/anno-bird",
            "selector": [
              {
                "type": "TextQuoteSelector",
                "prefix": "There are two ",
                "exact": "birds",
                "suffix": " in the bush"
              }
            ]
          }
        }
      ]
    }
  ]
}
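
Locating the match inside the body text could be done as in the following sketch. It assumes that prefix, exact and suffix occur contiguously in the body value, and it marks the match with square brackets by default; both are choices made for this example.

```python
from typing import Optional, Tuple


def locate_quote(text: str, exact: str, prefix: str = "", suffix: str = "") -> Optional[Tuple[int, int]]:
    """Return (start, end) of `exact` where it sits between prefix and suffix."""
    pos = text.find(prefix + exact + suffix)
    if pos == -1:
        return None
    start = pos + len(prefix)
    return start, start + len(exact)


def highlight(text: str, exact: str, prefix: str = "", suffix: str = "",
              open_mark: str = "[", close_mark: str = "]") -> str:
    """Wrap the located match in markers; return the text unchanged if not found."""
    span = locate_quote(text, exact, prefix, suffix)
    if span is None:
        return text
    start, end = span
    return text[:start] + open_mark + text[start:end] + close_mark + text[end:]


body = "There are two birds in the bush"
marked = highlight(body, "birds", prefix="There are two ", suffix=" in the bush")
```

Using the prefix and suffix, rather than searching for the exact text alone, avoids highlighting a non-matching occurrence of the same characters elsewhere in the body.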

4.3.3. Multi-Match Annotations

A query might result in multiple matches within a single annotation, especially if wildcards or stemming are enabled or the content of the annotation is long. This is handled by including the matching Annotation once in items, and the multiple entries that refer to it in the annotations list. Each entry then uses a different TextQuoteSelector on the same matching Annotation to describe where the matching content can be found. A client can process each entry in turn to highlight each match in the Annotation.

The Annotation Page in annotations may embed an Annotation Collection allowing the response to include the total number of additional information Annotations. This Annotation Collection must be different from the Annotation Collection embedded within the Annotation Page at the top level of the response. In this and similar scenarios, the values of the total properties of the two Annotation Collections can differ: the total number of matching annotations is determined by the division of the content, whereas the total number of additional annotations is determined by the query.

Consider the request for words beginning with “b”:

https://example.org/service/manifest/search?q=b*

The request might have the response:

{
  "@context": "http://iiif.io/api/search/2/context.json",
  "id": "https://example.org/service/manifest/search?q=b*&page=1",
  "type": "AnnotationPage",

  "partOf": {
    "id": "https://example.org/service/manifest/search?q=b*",
    "type": "AnnotationCollection",
    "total": 129
  },

  "items": [
    {
      "id": "https://example.org/identifier/annotation/anno-bird",
      "type": "Annotation",
      "motivation": "painting",
      "body": {
        "type": "TextualBody",
        "value": "There are two birds in the bush",
        "format": "text/plain"
      },
      "target": "https://example.org/identifier/canvas1#xywh=200,100,200,20"
    }
    // Further 'b' annotations here ...
  ],

  "annotations": [
    {
      "id": "https://example.org/service/additional/search?q=b*&page=1",
      "type": "AnnotationPage",
      "partOf": {
        "id": "https://example.org/service/additional/search?q=b*",
        "type": "AnnotationCollection",
        "total": 521
      },
      "items": [
        {
          "id": "https://example.org/additional/annotation/match-1",
          "type": "Annotation",
          "motivation": "highlighting",
          "target": {
            "type": "SpecificResource",
            "source": "https://example.org/identifier/annotation/anno-bird",
            "selector": [
              {
                "type": "TextQuoteSelector",
                "prefix": "There are two ",
                "exact": "birds",
                "suffix": " in the bush"
              }
            ]
          }
        },
        {
          "id": "https://example.org/additional/annotation/match-2",
          "type": "Annotation",
          "motivation": "highlighting",
          "target": {
            "type": "SpecificResource",
            "source": "https://example.org/identifier/annotation/anno-bird",
            "selector": [
              {
                "type": "TextQuoteSelector",
                "prefix": "birds in the ",
                "exact": "bush"
              }
            ]
          }
        }
      ]
    }
  ]
}
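
Collecting every highlight for each matching Annotation might look like the sketch below, which computes character spans in the body text. The function name and the assumption that prefix, exact and suffix are contiguous in the body are illustrative only.

```python
from typing import Any, Dict, List, Tuple


def _as_list(value: Any) -> List[Any]:
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def highlight_spans(page: Dict[str, Any]) -> Dict[str, List[Tuple[int, int]]]:
    """Map each matching Annotation URI to sorted (start, end) spans in its body."""
    bodies = {
        anno["id"]: anno["body"]["value"]
        for anno in page.get("items", [])
        if isinstance(anno.get("body"), dict) and "value" in anno["body"]
    }
    spans: Dict[str, List[Tuple[int, int]]] = {}
    for extra_page in page.get("annotations", []):
        for anno in extra_page.get("items", []):
            if anno.get("motivation") != "highlighting":
                continue
            for target in _as_list(anno.get("target")):
                if not isinstance(target, dict) or target.get("source") not in bodies:
                    continue
                text = bodies[target["source"]]
                for sel in _as_list(target.get("selector")):
                    prefix = sel.get("prefix", "")
                    pos = text.find(prefix + sel["exact"] + sel.get("suffix", ""))
                    if pos != -1:
                        start = pos + len(prefix)
                        spans.setdefault(target["source"], []).append((start, start + len(sel["exact"])))
    return {source: sorted(found) for source, found in spans.items()}


def _match(prefix: str, exact: str, suffix: str = "") -> Dict[str, Any]:
    """Build a highlighting Annotation for the example (ids omitted for brevity)."""
    selector = {"type": "TextQuoteSelector", "prefix": prefix, "exact": exact}
    if suffix:
        selector["suffix"] = suffix
    return {"type": "Annotation", "motivation": "highlighting",
            "target": {"type": "SpecificResource", "source": "anno-bird", "selector": [selector]}}


page = {
    "items": [{"id": "anno-bird", "type": "Annotation",
               "body": {"type": "TextualBody", "value": "There are two birds in the bush"}}],
    "annotations": [{"type": "AnnotationPage",
                     "items": [_match("There are two ", "birds", " in the bush"),
                               _match("birds in the ", "bush")]}],
}
spans = highlight_spans(page)
```

A client can then apply the spans from last to first, so that inserting markers does not shift the offsets of the spans still to be processed.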

4.3.4. Multi-Annotation Matches

For some queries, matching result text may be spread across multiple annotations that encode the sections of the source text. This means that multiple matching annotations may be required to match a single search.

For example, imagine a set of manual transcription annotations which are divided up line by line, and that there are two lines of text. In this example the first line is “A bird in the hand”, the second line is “is worth two in the bush”, and the search is for the phrase “hand is”. Therefore the match comprises parts of both line-based annotations.

In cases like this, there are more annotations in the items list than in the annotations list, because two or more annotations are needed to make a single match. This is handled by referencing all of the required matching annotations as multiple targets in a single annotation with the highlighting motivation in the annotations list.

Consider the example request:

https://example.org/service/manifest/search?q=hand+is

This request might have the response:

{
  "@context": "http://iiif.io/api/search/2/context.json",
  "id": "https://example.org/service/manifest/search?q=hand+is",
  "type": "AnnotationPage",

  "items": [
    {
      "id": "https://example.org/identifier/annotation/anno-hand",
      "type": "Annotation",
      "motivation": "painting",
      "body": {
        "type": "TextualBody",
        "value": "A bird in the hand",
        "format": "text/plain"
      },
      "target": "https://example.org/identifier/canvas1#xywh=200,100,150,30"
    },
    {
      "id": "https://example.org/identifier/annotation/anno-is",
      "type": "Annotation",
      "motivation": "painting",
      "body": {
        "type": "TextualBody",
        "value": "is worth two in the bush.",
        "format": "text/plain"
      },
      "target": "https://example.org/identifier/canvas1#xywh=200,140,170,30"
    }
  ],

  "annotations": [
    {
      "id": "https://example.org/service/additional/search?q=hand+is&page=1",
      "type": "AnnotationPage",
      "partOf": {
        "id": "https://example.org/service/additional/search?q=hand+is",
        "type": "AnnotationCollection",
        "total": 1
      },
      "items": [
        {
          "id": "https://example.org/additional/annotation/match-1",
          "type": "Annotation",
          "motivation": "highlighting",
          "target": [
            {
              "type": "SpecificResource",
              "source": "https://example.org/identifier/annotation/anno-hand",
              "selector": [
                {
                  "type": "TextQuoteSelector",
                  "prefix": "bird in the ",
                  "exact": "hand"
                }
              ]
            },
            {
              "type": "SpecificResource",
              "source": "https://example.org/identifier/annotation/anno-is",
              "selector": [
                {
                  "type": "TextQuoteSelector",
                  "exact": "is",
                  "suffix": " worth two in the"
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
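
A client might reassemble a match that spans several Annotations as sketched below. The function name is an assumption, and joining the parts with a single space is a simplification for this example, since the specification does not say how the parts are separated in the source text.

```python
from typing import Any, Dict, List, Tuple


def match_parts(match_annotation: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (source Annotation URI, exact text) for each target, in order."""
    target = match_annotation["target"]
    parts: List[Tuple[str, str]] = []
    for t in target if isinstance(target, list) else [target]:
        selectors = t.get("selector", [])
        if isinstance(selectors, dict):
            selectors = [selectors]
        for sel in selectors:
            if sel.get("type") == "TextQuoteSelector":
                parts.append((t["source"], sel["exact"]))
    return parts


match = {
    "id": "https://example.org/additional/annotation/match-1",
    "type": "Annotation",
    "motivation": "highlighting",
    "target": [
        {"type": "SpecificResource",
         "source": "https://example.org/identifier/annotation/anno-hand",
         "selector": [{"type": "TextQuoteSelector", "prefix": "bird in the ", "exact": "hand"}]},
        {"type": "SpecificResource",
         "source": "https://example.org/identifier/annotation/anno-is",
         "selector": [{"type": "TextQuoteSelector", "exact": "is", "suffix": " worth two in the"}]},
    ],
}
parts = match_parts(match)
phrase = " ".join(exact for _, exact in parts)  # assumed separator: one space
```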

5. Autocomplete

Given some characters, an Autocomplete service returns terms that can be used to perform a search using the related Content Search service.

5.1. Autocomplete Request

An Autocomplete request takes the same parameters as a Content Search request, with one addition:

Parameter  | Definition
-----------+------------
min        | The minimum number of occurrences for a term in the index in order for it to appear within the response; default is 1 if not present. Support for this parameter is optional.

The q parameter must be present. Its value is interpreted as a single character string to match within terms in the index, often against their beginning characters. For example, the query term of ‘bir’ might complete to ‘bird’, ‘biro’, ‘birth’, and ‘birthday’.

The other parameters (motivation, date and user), if supported, refine the set of terms in the response to only ones from the annotations that match those filters. For example, if the motivation is given as painting, then only text from painting transcriptions will contribute to the list of terms in the response.

An example request would be:

https://example.org/service/identifier/autocomplete?q=bir&motivation=painting&user=https%3A%2F%2Fexample.com%2Fusers%2Fwigglesworth

5.2. Autocomplete Response

Most autocomplete scenarios can be fulfilled by a simple list of terms. These terms can be converted into a search by using them as the value of the q parameter of the related Content Search service.

In order to accommodate this use case, a new class TermPage is introduced for the response, and a class Term to describe the values.

The TermPage has the following properties:

  • id - The TermPage must have an id property. The value is a URI and may be different from the one requested.
  • type - The TermPage must have a type property. The value must be TermPage.
  • ignored - The TermPage may have an ignored property. The value must be an array of strings, each of which is the name of a query parameter which was ignored by the server.
  • items - The TermPage must have an items property. The value must be an array of zero or more items. Each item must be a JSON object, each of which is a Term.

In this simple case, Term resources have a single property value which contains the term string. The number of terms provided in the list is determined by the server.

Consider the example request:

https://example.org/service/identifier/autocomplete?q=bir

This request might have the response:

{
  "@context": "http://iiif.io/api/search/2/context.json",
  "id": "https://example.org/service/identifier/autocomplete?q=bir",
  "type": "TermPage",  
  "items": [
    {
      "value": "bird"
    },
    {
      "value": "biro"
    },
    {
      "value": "birthday"
    }
  ]
}
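A client might turn each returned term into a search request by using it as the value of the q parameter, as sketched below. The search endpoint URL is an assumption made for illustration; in practice it would be the related Content Search service.

```python
import json
from urllib.parse import urlencode

# Assumed URL of the related Content Search service (illustrative only).
SEARCH_ENDPOINT = "https://example.org/service/identifier/search"

response_body = """
{
  "@context": "http://iiif.io/api/search/2/context.json",
  "id": "https://example.org/service/identifier/autocomplete?q=bir",
  "type": "TermPage",
  "items": [
    {"value": "bird"},
    {"value": "biro"},
    {"value": "birthday"}
  ]
}
"""

page = json.loads(response_body)

# In the simple case every term value can be used directly as q.
search_urls = [
    f"{SEARCH_ENDPOINT}?{urlencode({'q': term['value']})}"
    for term in page["items"]
]
for search_url in search_urls:
    print(search_url)
```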

5.3. Extended Autocomplete Responses

There are cases where a simple list of terms is not sufficient to support use of the Autocomplete service. These include when a label is required to allow a user to select an appropriate term, when the term cannot be used as a query parameter directly, and when it is useful to know the total number of occurrences of a term within the index.

The use cases are fulfilled by extending the properties of the Term resources to include further information.

Term resources have the following properties:

  • type - The Term may have a type property. If present, the value must be Term. Use of this property is not recommended, in order to keep the response shorter.
  • value - The Term must have a value property. The value is the string form of the term.
  • total - The Term may have a total property. The value is an integer, which is the number of times the term occurs in the index.
  • label - The Term may have a label property. The value is a JSON object which follows the definition of a language map. This label should be displayed to the user instead of the value, for example when the value is a URI or a string that has been manipulated with stemming or other normalization.
  • language - The Term may have a language property. The value is a string conforming to the BCP 47 language code specification, and gives the language of the term string in the value property.
  • service - The Term may have a service property. The value is an array of JSON objects, where each object is a Service. The Term must include an entry for the full link to the related SearchService2, when the value cannot be used directly in the q parameter. In this case, the id of the service is the full link including the q and other parameters.

The usage of the properties in terms need not be consistent within a single response, and properties should only be included when needed.

The terms should be provided in ascending alphabetical order, but other orders are allowed, such as descending order of the term’s total count to put the most common matches first, or alphabetical order on the label rather than the value.
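The orderings mentioned above could be implemented along the following lines. This is only a sketch: the function names, the order keywords, and the choice to fall back to the value when a term has no label in the requested language are all assumptions, not requirements of this specification.

```python
def display_text(term, lang="en"):
    """Return the first label in lang, falling back to the term's value."""
    labels = term.get("label", {}).get(lang)
    return labels[0] if labels else term["value"]


def sort_terms(terms, order="value", lang="en"):
    """Return a new list of terms in the requested order (hypothetical helper)."""
    if order == "value":
        # Ascending alphabetical order on the term string.
        return sorted(terms, key=lambda t: t["value"].casefold())
    if order == "total":
        # Most common matches first; terms without a total sort last.
        return sorted(terms, key=lambda t: -t.get("total", 0))
    if order == "label":
        # Alphabetize on what the user will actually see.
        return sorted(terms, key=lambda t: display_text(t, lang).casefold())
    raise ValueError(f"unknown order: {order}")


terms = [
    {"value": "birth", "total": 9},
    {"value": "bird", "total": 15},
    {
        "value": "https://semtag.example.org/tag/biro",
        "total": 3,
        "label": {"en": ["biro"]},
    },
]

for order in ("value", "total", "label"):
    print(order, [display_text(t) for t in sort_terms(terms, order)])
```

Sorting on the value places the URI-valued term last, whereas sorting on the label places it between "bird" and "birth", which is one reason a server might prefer the label ordering.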

Consider the example request:

https://example.org/service/identifier/autocomplete?q=bir&user=https%3A%2F%2Fexample.com%2Fusers%2Fwigglesworth

This request might generate the response:

{
  "@context": "http://iiif.io/api/search/2/context.json",
  "id": "https://example.org/service/identifier/autocomplete?q=bir",
  "type": "TermPage",
  "ignored": [ "user" ],
  "items": [
    {
      "value": "bird",
      "language": "en",
      "total": 15
    },
    {
      "type": "Term",
      "value": "https://semtag.example.org/tag/biro",
      "total": 3,
      "label": {
        "en": [ "biro" ]
      },
      "service": [
        {
          "id": "https://example.org/service/identifier/search?motivation=tagging&q=semtag:biro",
          "type": "SearchService2"
        }
      ]
    },
    {
      "value": "birth",
      "total": 9,
      "label": {
        "en": [ "birth" ],
        "fr": [ "naissance" ]
      }
    }
  ]
}
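When consuming an extended response, a client has to decide which search link to follow for each term and may want to notice which of its parameters were ignored. The sketch below shows one way to do this; the function name search_url_for and the fallback search endpoint are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Assumed URL of the related Content Search service (illustrative only).
SEARCH_ENDPOINT = "https://example.org/service/identifier/search"


def search_url_for(term, search_endpoint=SEARCH_ENDPOINT):
    """Return the search URL to follow for a term (hypothetical helper)."""
    # Prefer the full link supplied by the server, which is required when
    # the value cannot be used directly in the q parameter.
    for service in term.get("service", []):
        if service.get("type") == "SearchService2":
            return service["id"]
    # Otherwise the value itself can be used as q.
    return f"{search_endpoint}?{urlencode({'q': term['value']})}"


page = {
    "type": "TermPage",
    "ignored": ["user"],
    "items": [
        {"value": "bird", "language": "en", "total": 15},
        {
            "value": "https://semtag.example.org/tag/biro",
            "label": {"en": ["biro"]},
            "service": [
                {
                    "id": "https://example.org/service/identifier/search?motivation=tagging&q=semtag:biro",
                    "type": "SearchService2",
                }
            ],
        },
    ],
}

# Parameters the client sent with the autocomplete request.
requested = {"q": "bir", "user": "https://example.com/users/wigglesworth"}
# Requested filters that the server reports it did not apply.
unapplied = [name for name in page.get("ignored", []) if name in requested]
print("ignored by server:", unapplied)

for term in page["items"]:
    print(search_url_for(term))
```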

Appendices

A. Versioning

This specification follows Semantic Versioning. See the note Versioning of APIs for details regarding how this is implemented.

B. Acknowledgements

Many thanks to the members of the IIIF for their continuous engagement, innovative ideas and feedback.

This specification is due primarily to the work of the IIIF Content Search Technical Specification Group, chaired by Mike Bennett, Dawn Childress (UCLA), Tom Crane (Digirati), and the IIIF Editors, including Maria Whitaker (Indiana University, Editor 2020-2021). The IIIF Community thanks them for their leadership, and the members of the group for their tireless work.

C. Change Log

Date        Description
2022-11-15  Version 2.0 (Mr. Wigglesworth) View change log
2016-05-12  Version 1.0 (Lost Summer)
2015-07-20  Version 0.9 (Trip Life)