CLARIAH Expert Meeting on Video Annotation Interoperability.

By: Liliana Melgar and Marijn Koolen

CLARIAH had the pleasure of organizing a two-day face-to-face expert meeting and workshop that aimed to improve annotation interoperability within the CLARIAH infrastructure as well as with external annotation tools. The meeting was organized by Marijn Koolen (KNAW Humanities Cluster) and Liliana Melgar (University of Amsterdam) and took place on July 12-13, 2018, in Amsterdam.

While analyzing data, scholars often need to annotate various types of resources (texts, images, or audio-visual material). Annotating, in the broadest sense of the term, consists of activities such as segmenting, transcribing, adding notes and tags, classifying those tags, grouping resources in bookmarks, linking, or adding other user metadata to those sources and their fragments. Since these activities are essential to interpretation, annotation has been recognized as one of the “scholarly primitives” (Unsworth, 2000), and as an essential part of knowledge production from the perspective of scholarly work as a research ecology (Walkowsky, 2016).

As an infrastructure project, CLARIAH aims to support humanities scholars in their research at different levels. In relation to scholarly annotation, during the first phase of CLARIAH (Core, 2015-2018), we conducted research on scholarly annotating behavior and on the more technical aspects needed to provide structural semantics to annotations (see the reference list at the end of this post). We also organized an annotation symposium in 2017, in which scholars and developers of annotation tools and standards gathered for one day in Amsterdam to align their work and look for future collaborations.

In CLARIAH Plus (2019-2023), one of the aims is to increase support for creating and exchanging annotation data between applications. Thus, as a follow-up to the previous activities, and as a way to initiate the next phase of CLARIAH, we decided to focus on the interoperability of annotations, beginning with annotations of audio-visual sources. Our aim is to facilitate data interoperability between the tools that scholars commonly use in their work, and to arrive at an exchange format for scholarly annotations of audio-visual data based on the W3C annotation model.

For this reason, expert software developers of audiovisual annotation tools gathered in Amsterdam during those two days in July 2018 to discuss the interoperability between their tools. One important distinction we made in this work was between interoperability at the level of data models and interoperability at the semantic level (which we did not tackle in this meeting). As Walkowsky (2016) clarifies: “knowledge about structure and properties of annotations is one thing, knowledge about concepts which are used in annotations is another.”

Participants Annotation Expert Meeting

The participants of the expert meeting were the programmers of applications that are being developed in important projects in the Humanities where audio-visual media is at the center. The experts who participated in this meeting were:

The two-day meeting included presentations of the tools' data models. Following the methodology used in previous interoperability efforts (e.g., by Schmidt et al., 2016), we had a plenary discussion in which the developers identified the main challenges for data interoperability. These included:

These topics were grouped into four themes, and participants worked in groups on each of them:

  1. Selectors and targets for different media types, such as selections of arbitrary and dynamic shape, tracing objects across a temporal segment and how to represent these in W3C Web Annotations.
  2. User projects (so-called “hermeneutic units” by scholars) vs. annotation collections, and how tiers relate to these.
  3. Provenance and context: Where, by whom, and how annotations were made.
  4. Annotation content and motivation: what the content of an annotation looks like or can contain, such as a code, a plain text string, a URI to an external term from a vocabulary, etc.
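
To make themes 1, 3 and 4 more concrete, here is a minimal sketch of how a temporal video segment, its provenance, and two kinds of annotation content could be expressed with the W3C Web Annotation vocabulary. It is an illustration only, not a format agreed on at the meeting: the video URL, the identifiers, the names, and the helper function `make_annotation` are all hypothetical. The property names (`motivation`, `creator`, `generator`, `TextualBody`, `SpecificResource`, `FragmentSelector`) come from the W3C model, and the selector value uses the Media Fragments syntax for a time range.

```python
import json

# Hypothetical media URL, used only for illustration.
VIDEO_URL = "https://example.org/media/film.mp4"


def make_annotation(anno_id, start, end, body, motivation):
    """Build a W3C Web Annotation (as a JSON-LD dict) on a temporal video segment."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        # Theme 4: why the annotation was made.
        "motivation": motivation,
        # Theme 3: provenance (by whom, when, and with which tool).
        "creator": {
            "id": "https://example.org/users/scholar1",  # hypothetical user
            "type": "Person",
            "name": "Example Scholar",
        },
        "created": "2018-07-12T10:00:00Z",
        "generator": {
            "id": "https://example.org/tools/annotator",  # hypothetical tool
            "type": "Software",
            "name": "Example Annotation Tool",
        },
        # Theme 4: what the annotation contains.
        "body": body,
        # Theme 1: which part of the video is annotated.
        "target": {
            "source": VIDEO_URL,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"t={start},{end}",  # seconds from the start of the video
            },
        },
    }


# A free-text comment on seconds 30 to 60.
comment = make_annotation(
    "https://example.org/annotations/1",
    30,
    60,
    {"type": "TextualBody", "value": "Long take, no dialogue.", "format": "text/plain"},
    "commenting",
)

# A classification that points to a term in a (hypothetical) external vocabulary.
tag = make_annotation(
    "https://example.org/annotations/2",
    30,
    60,
    {
        "type": "SpecificResource",
        "source": "https://example.org/vocab/long-take",
        "purpose": "classifying",
    },
    "classifying",
)

print(json.dumps(comment, indent=2))
```

Selections of arbitrary or moving shape would need a different selector than the plain time range used here, which is one reason the first theme required discussion.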

On the second day, participants summarized the main conclusions of the previous day and came up with concrete test cases to start exploring options for interoperability. Based on the W3C annotation model, the participants worked on possible cross-walks, i.e., alignments of the key elements that their tools and data models use, so that annotation data can be exchanged. The idea of this second group exercise was to compare, per tool, how each would handle the following cases when exporting/importing:

This exercise brought some major hurdles to the surface, but also generated potential solutions and future plans for improving annotation data interoperability between the involved tools. Besides agreeing on the importance of interoperability for supporting scholars in their work, the participants also agreed that it is needed to preserve annotation data for archival purposes, since scholarly annotations are commonly lost when tools are no longer supported.
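
A cross-walk of the kind explored in this exercise can be sketched as a pair of export and import functions, checked with a round trip. The example below is an assumption-laden illustration rather than the mapping of any participating tool: the native record layout (`begin_ms`, `end_ms`, `text`, `author`) belongs to an imaginary tool, and the function names and URLs are hypothetical. Only the W3C property names and the Media Fragments time syntax are taken from the standards.

```python
MEDIA_FRAGS = "http://www.w3.org/TR/media-frags/"

# Native record of an imaginary tool; all field names are assumptions.
native = {
    "begin_ms": 30000,
    "end_ms": 61500,
    "text": "Close-up of the protagonist",
    "author": "https://example.org/users/scholar1",
}


def ms_to_seconds(ms):
    """Format integer milliseconds as seconds with three decimals."""
    return f"{ms // 1000}.{ms % 1000:03d}"


def seconds_to_ms(text):
    """Parse a seconds string back into integer milliseconds."""
    return round(float(text) * 1000)


def export_record(record, anno_id, source_url):
    """Map the native record onto a W3C Web Annotation dict."""
    begin = ms_to_seconds(record["begin_ms"])
    end = ms_to_seconds(record["end_ms"])
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "creator": record["author"],
        "body": {"type": "TextualBody", "value": record["text"]},
        "target": {
            "source": source_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": MEDIA_FRAGS,
                "value": f"t={begin},{end}",
            },
        },
    }


def import_annotation(anno):
    """Map a W3C Web Annotation dict back onto the native record layout."""
    value = anno["target"]["selector"]["value"]
    if not value.startswith("t="):
        raise ValueError("only temporal fragments are handled in this sketch")
    begin, end = value[2:].split(",")
    return {
        "begin_ms": seconds_to_ms(begin),
        "end_ms": seconds_to_ms(end),
        "text": anno["body"]["value"],
        "author": anno["creator"],
    }


exported = export_record(
    native,
    "https://example.org/annotations/1",
    "https://example.org/media/film.mp4",
)
# The round trip should return the original record unchanged.
assert import_annotation(exported) == native
```

A round trip like this only succeeds when every native field has a counterpart in the exchange format; fields without one are where information is at risk of being lost on export.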

The meeting concluded by listing the most important challenges and tasks for the next meeting, which will most likely take place in Spring 2019. In preparation for that follow-up, the developers in this group of experts will collaborate on developing the test cases further and on conducting these interoperability tests with their own data. More information can be found in the project GitHub repository.

REFERENCES

Boot, P., Dekker, R. H., Koolen, M., & Melgar Estrada, L. (2017). Facilitating Fine-Grained Open Annotations of Scholarly Sources. Presented at Digital Humanities 2017 (DH2017), Montreal, Canada.

Melgar Estrada, L., Hielscher, E., Koolen, M., Olesen, C., Noordegraaf, J., & Blom, J. (2017). Film analysis as annotation: Exploring current tools and their affordances. The Moving Image: The Journal of the Association of Moving Image Archivists, in print.

Melgar Estrada, L., & Koolen, M. (2017). Audiovisual media annotation using Qualitative Data Analysis Software: a comparative analysis. The Qualitative Report, in print.

Melgar Estrada, L., Koolen, M., Huurdeman, H., & Blom, J. (2017). A process model of time-based media annotation in a scholarly context. In CHIIR 2017: ACM SIGIR Conference on Human Information Interaction and Retrieval. Oslo.