
Dealing with Resource Structure for Annotations

Table of Contents

  1. Use Cases
  2. Requirements
  3. Architecture and Responsibilities
  4. Representing Structure and Annotations
  5. Further reading

1. Use Cases

To do: elaborate on use cases and integrate them more in the later sections to make those sections easier to understand.

The initial use case for this project was presented at IAnnotate 2016 by Peter Boot:

Potential other use cases:

2. Requirements

The annotation client is loaded in a browser window together with one or more annotatable resources, which are marked up with structural information embedded as RDFa. Each top-level resource in the browser window can have individually annotatable sub-resources.

2.1 Annotation Functionalities

The annotation client should be able to:

2.2 Domain constraints

The requirements above provide few constraints on what structures of resources and sub-resources are possible. In principle, resources could be linked in cycles, where for instance one digital edition representing the translation of a letter is a sub-resource of the original letter, but a different edition may represent the original as a sub-resource of the translation.
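The cyclic situation described above can be made concrete with a small check. The following is a minimal sketch, assuming the resource structure is available as a mapping from each resource identifier to the identifiers of its direct sub-resources. The function name `find_cycle` and the `urn:example:` identifiers are hypothetical and only serve as illustration.

```python
from typing import Dict, List, Optional


def find_cycle(sub_resources: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of resource ids, or None if the structure is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = GREY
        path.append(node)
        for child in sub_resources.get(node, []):
            if state.get(child, WHITE) == GREY:
                # The child is already on the current path, so this edge closes a cycle.
                return path[path.index(child):] + [child]
            if state.get(child, WHITE) == WHITE:
                found = visit(child)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for node in list(sub_resources):
        if state.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Hypothetical identifiers: two editions that each present the other as a sub-resource.
cyclic = {
    "urn:example:letter": ["urn:example:translation"],
    "urn:example:translation": ["urn:example:letter"],
}
acyclic = {
    "urn:example:letter": ["urn:example:translation"],
    "urn:example:translation": [],
}

print(find_cycle(cyclic))
print(find_cycle(acyclic))
```

A server could run a check of this kind when it receives new structural relations and reject those that would close a cycle; whether rejection is the right response is a design decision this sketch does not settle.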

To ensure that the set of annotations to retrieve for a given resource is clearly defined and computationally tractable, it’s desirable to avoid cycles in the relational structure of resources. Therefore, we propose the following constraints:

2.3 Domain model characteristics

The domain constraints described above have a number of consequences for the domain model:

3. Architecture and Responsibilities

In Peter’s IAnnotate presentation, he suggested an architecture with three components:

4. Representing Structure and Annotations

There are three general approaches to representing the resource structure in the context of an annotation:

  1. Structure Embedded in Annotation: embedding structural relation information in annotation targets.
  2. Structure as Separate Annotation: representing structural relation information as separate annotation.
  3. Structure as Separate Model: representing structural relation information in a separate data model.

4.1 Desirable data model characteristics

The annotation client and server have to use an exchange protocol and data model to exchange information about annotations and resources. Below is a (rather ad hoc) list of characteristics by which to compare different data models and help decide on a model that best fits the problem domain and constraints.

4.2. Structure Embedded in Annotation

Each annotation contains the structure information about the resource in the annotation target, using selectors and refinements.

The example below is an annotation conforming to the W3C Web Annotation model, where the target is the receiver of a letter from the Van Gogh correspondence. The body of the annotation indicates that the receiver is classified with a term from DBpedia, i.e. the person referred to as receiver:

{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "created": 1483949925,
  "body": [
    {
      "vocabulary": "DBpedia",
      "value": "Theo van Gogh (art dealer)",
      "purpose": "classifying",
      "id": "http://dbpedia.org/resource/Theo_van_Gogh_(art_dealer)"
    }
  ],
  "motivation": "classifying",
  "creator": "marijn",
  "type": "Annotation",
  "target": [
    {
      "type": "Text",
      "source": "urn:vangogh:let001",
      "selector": {
        "conformsTo": "http://boot.huygens.knaw.nl/annotate/vangoghontology.ttl#",
        "value": "urn:vangogh:let001.receiver",
        "type": "FragmentSelector"
      }
    }
  ],
  "id": "urn:uuid:8f62be7d-de56-464b-8d9a-5fb0b69fc00b"
}
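Because the structure lives inside the target, a server has to read it out of every annotation it receives. The following is a minimal sketch of that step, assuming targets always carry a `source` and a selector `value` as in the example above. The function name `embedded_relations` is hypothetical.

```python
from typing import Any, Dict, List, Tuple


def embedded_relations(annotation: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (resource, sub-resource) pairs stated in the annotation targets."""
    relations = []
    for target in annotation.get("target", []):
        selector = target.get("selector")
        # Only targets with both a source and a selector value state a relation.
        if "source" in target and selector and "value" in selector:
            relations.append((target["source"], selector["value"]))
    return relations


# A trimmed copy of the example annotation, keeping only the fields used here.
annotation = {
    "type": "Annotation",
    "target": [
        {
            "type": "Text",
            "source": "urn:vangogh:let001",
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://boot.huygens.knaw.nl/annotate/vangoghontology.ttl#",
                "value": "urn:vangogh:let001.receiver",
            },
        }
    ],
}

relations = embedded_relations(annotation)
print(relations)
```

In this approach the same relation is repeated in every annotation on the same sub-resource, so the server would need to deduplicate the extracted pairs.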

4.3. Structure as Separate Annotation

It is possible to separate resource structure information from annotation information by representing them as separate data structures. One approach is to represent the structural relation between a resource and a sub-resource as a separate annotation.

With the structural information removed from the actual annotation, the receiver annotation is represented as follows:

{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "created": 1483949925,
  "body": [
    {
      "vocabulary": "DBpedia",
      "value": "Theo van Gogh (art dealer)",
      "purpose": "classifying",
      "id": "http://dbpedia.org/resource/Theo_van_Gogh_(art_dealer)"
    }
  ],
  "motivation": "classifying",
  "creator": "marijn",
  "type": "Annotation",
  "target": [
    {
      "id": "urn:vangogh:let001.receiver",
      "type": "Text"
    }
  ],
  "id": "urn:uuid:8f62be7d-de56-464b-8d9a-5fb0b69fc00b"
}

The structural relation is represented in a separate annotation as follows:

{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "created": 1483949925,
  "body": [
    {
      "vocabulary": "http://boot.huygens.knaw.nl/annotate/vangoghontology.ttl#",
      "id": "urn:vangogh:let001.receiver"
    }
  ],
  "motivation": "linking",
  "creator": "marijn",
  "type": "Annotation",
  "target": [
    {
      "id": "urn:vangogh:let001"
    }
  ],
  "id": "urn:uuid:8f62be7d-de56-464b-8d9a-5fb0b70fc00b"
}
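To retrieve annotations for the letter as a whole, a server would have to combine the two kinds of annotation. The following is a minimal sketch, assuming that structural annotations can be recognised by the `linking` motivation, that their target is the parent resource, and that their body is the sub-resource, as in the example above. The function name `annotations_for` is hypothetical.

```python
from typing import Any, Dict, List, Set


def annotations_for(resource_id: str, annotations: List[Dict[str, Any]]) -> List[str]:
    """Return ids of non-structural annotations on a resource or its sub-resources."""
    # Read parent -> sub-resource relations from the structural annotations.
    children: Dict[str, Set[str]] = {}
    for anno in annotations:
        if anno.get("motivation") == "linking":
            for target in anno["target"]:
                for body in anno["body"]:
                    children.setdefault(target["id"], set()).add(body["id"])

    # Collect the resource itself and everything below it.
    reachable: Set[str] = set()
    stack = [resource_id]
    while stack:
        current = stack.pop()
        if current not in reachable:
            reachable.add(current)
            stack.extend(children.get(current, ()))

    return [
        anno["id"]
        for anno in annotations
        if anno.get("motivation") != "linking"
        and any(target.get("id") in reachable for target in anno["target"])
    ]


# Trimmed copies of the two example annotations above.
annotations = [
    {
        "id": "urn:uuid:8f62be7d-de56-464b-8d9a-5fb0b69fc00b",
        "motivation": "classifying",
        "body": [{"id": "http://dbpedia.org/resource/Theo_van_Gogh_(art_dealer)"}],
        "target": [{"id": "urn:vangogh:let001.receiver", "type": "Text"}],
    },
    {
        "id": "urn:uuid:8f62be7d-de56-464b-8d9a-5fb0b70fc00b",
        "motivation": "linking",
        "body": [{"id": "urn:vangogh:let001.receiver"}],
        "target": [{"id": "urn:vangogh:let001"}],
    },
]

print(annotations_for("urn:vangogh:let001", annotations))
```

The sketch relies on the motivation alone to tell structural annotations apart from ordinary ones, which is an assumption; a `linking` annotation made by a user for another purpose would be misread as structure.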

4.4. Structure as Separate Model

A way to solve the problem of ambiguity is to represent structural information in a different data model, based on e.g. the Annotatable Thing ontology, or using an existing data model such as the IIIF Presentation model or a schema definition from Schema.org.

In this case, a choice has to be made on when the client sends structural information to the server and what structure information to send. A lazy client sends only structural relations between an annotated target and its ancestors when that annotation is made. A pro-active client sends the entire resource structure (i.e. the resource and all its sub-resources) when a new resource is loaded in the browser window.
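The difference between the two client strategies can be sketched as follows. This is a minimal illustration, assuming the client holds the structure as (parent, relation, child) triples and that every sub-resource has at most one parent. The function names `lazy_payload` and `proactive_payload` and the triple layout are hypothetical; the identifiers follow the van Gogh examples in this document.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (parent id, relation, child id)


def proactive_payload(relations: List[Triple]) -> List[Triple]:
    """Pro-active client: send every known relation when the resource is loaded."""
    return list(relations)


def lazy_payload(relations: List[Triple], target: str) -> List[Triple]:
    """Lazy client: send only the relations between the target and its ancestors."""
    parent_of: Dict[str, Triple] = {triple[2]: triple for triple in relations}
    payload: List[Triple] = []
    seen = set()
    current = target
    # Walk upwards until a resource without a parent is reached.
    while current in parent_of and current not in seen:
        seen.add(current)
        triple = parent_of[current]
        payload.append(triple)
        current = triple[0]
    return payload


relations: List[Triple] = [
    ("urn:vangogh:letter001", "hasPart", "urn:vangogh:letter001:p.1"),
    ("urn:vangogh:letter001", "hasEnrichment", "urn:vangogh:letter001.translation"),
    ("urn:vangogh:letter001.translation", "hasPart", "urn:vangogh:letter001:translation:p.1"),
    ("urn:vangogh:letter001.translation", "hasPart", "urn:vangogh:letter001:translation:p.2"),
]

print(lazy_payload(relations, "urn:vangogh:letter001:translation:p.2"))
print(len(proactive_payload(relations)))
```

The lazy payload grows with the depth of the annotated target, while the pro-active payload grows with the size of the whole resource, regardless of whether anything is annotated.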

To compare the different models for handling hierarchical structure, a different example annotation is used, which identifies the paragraph in the translation of a letter that contains the salutation:

{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "created": 1483949925,
  "body": [
    {
      "vocabulary": "DBpedia",
      "value": "Salutation",
      "purpose": "classifying",
      "id": "http://dbpedia.org/resource/Salutation"
    }
  ],
  "motivation": "classifying",
  "creator": "marijn",
  "type": "Annotation",
  "target": [
    {
      "id": "urn:vangogh:let001:translation:p.2",
      "type": "Text"
    }
  ],
  "id": "urn:uuid:8f62be7d-de56-464b-8d9a-5fb0b69fc00b"
}

The target is the URN of the second paragraph of the English translation of the letter. This paragraph contains the salutation to the receiver: “Dear Theo” in the translation, “Waarde Theo” in the Dutch original. All information regarding the relation between the annotated paragraph and the original letter, its translation and the larger correspondence should be handled separately in a structure-oriented data model.

4.4.1. Structural representation via Annotatable Thing Ontology

A straightforward way for the client to communicate structural information about the resource is to send the RDFa information of a resource using the vocabulary that it’s based on as context. In the case of the van Gogh Correspondence, this is the Annotatable Thing ontology created by Peter Boot:

{
  "@context": "http://boot.huygens.knaw.nl/annotate/vangoghontology.json",
  "@type": "Letter",
  "id": "urn:vangogh:letter001",
  "hasMetadataItem": [
    {
      "@id": "urn:vangogh:letter001.sender",
      "@type": "Sender"
    },
    {
      "@id": "urn:vangogh:letter001.receiver",
      "@type": "Receiver"
    },
    {
      "@id": "urn:vangogh:letter001.date",
      "@type": "Date"
    }
  ],
  "hasPart": [
    {
      "@id": "urn:vangogh:letter001:p.1",
      "@type": "ParagraphInLetter"
    },
    {
      "@id": "urn:vangogh:letter001:p.2",
      "@type": "ParagraphInLetter"
    },
    {
      "@id": "urn:vangogh:letter001:p.3",
      "@type": "ParagraphInLetter"
    }
  ],
  "hasNote": [
    {
      "@id": "urn:vangogh:letter001:note.1",
      "@type": "Note"
    },
    {
      "@id": "urn:vangogh:letter001:note.2",
      "@type": "Note"
    },
    {
      "@id": "urn:vangogh:letter001:note.3",
      "@type": "Note"
    }
  ],
  "hasEnrichment": [
    {
      "@id": "urn:vangogh:letter001.translation",
      "@type": "CreativeWork Translation",
      "hasPart": [
        {
          "@id": "urn:vangogh:letter001:translation:p.1",
          "@type": "ParagraphInLetter"
        },
        {
          "@id": "urn:vangogh:letter001:translation:p.2",
          "@type": "ParagraphInLetter"
        },
        {
          "@id": "urn:vangogh:letter001:translation:p.3",
          "@type": "ParagraphInLetter"
        }
      ],
      "hasNote": [
        {
          "@id": "urn:vangogh:letter001:translation:note.1",
          "@type": "Note"
        },
        {
          "@id": "urn:vangogh:letter001:translation:note.2",
          "@type": "Note"
        },
        {
          "@id": "urn:vangogh:letter001:translation:note.3",
          "@type": "Note"
        }
      ]
    }
  ]
}

The annotation server can store all structural relations including those between the original letter and its translation, and between the translation and its second paragraph. This allows traversal from any of these three resources to the annotation about the salutation.
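The traversal can be sketched as follows. This is a minimal illustration, assuming the server flattens the nested structure into (parent, relation, child) triples and that the annotation target uses the same `letter001` identifiers as the structure. The function names and the in-memory layout are hypothetical and say nothing about how an actual server stores relations.

```python
from typing import Any, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (parent id, relation, child id)


def resource_id(node: Dict[str, Any]) -> str:
    """Read the identifier, accepting both "@id" and "id" as the examples do."""
    return node.get("@id") or node["id"]


def flatten(node: Dict[str, Any]) -> List[Triple]:
    """Turn a nested structure description into a flat list of relations."""
    triples: List[Triple] = []
    parent = resource_id(node)
    for key, value in node.items():
        if isinstance(value, list):
            for child in value:
                if isinstance(child, dict):
                    triples.append((parent, key, resource_id(child)))
                    triples.extend(flatten(child))
    return triples


def with_descendants(root: str, triples: List[Triple]) -> Set[str]:
    """Return the root together with every resource reachable below it."""
    children: Dict[str, List[str]] = {}
    for parent, _, child in triples:
        children.setdefault(parent, []).append(child)
    found: Set[str] = set()
    stack = [root]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children.get(current, []))
    return found


def annotations_on(root: str, triples: List[Triple],
                   targets_by_annotation: Dict[str, List[str]]) -> List[str]:
    """Return ids of annotations that target the root or any of its descendants."""
    reachable = with_descendants(root, triples)
    return sorted(anno for anno, targets in targets_by_annotation.items()
                  if reachable & set(targets))


# A trimmed version of the structure above: letter -> translation -> paragraph 2.
letter = {
    "@type": "Letter",
    "id": "urn:vangogh:letter001",
    "hasEnrichment": [{
        "@id": "urn:vangogh:letter001.translation",
        "@type": "CreativeWork Translation",
        "hasPart": [{"@id": "urn:vangogh:letter001:translation:p.2",
                     "@type": "ParagraphInLetter"}],
    }],
}
triples = flatten(letter)
salutation = {"urn:uuid:8f62be7d-de56-464b-8d9a-5fb0b69fc00b":
              ["urn:vangogh:letter001:translation:p.2"]}

for root in ("urn:vangogh:letter001", "urn:vangogh:letter001.translation",
             "urn:vangogh:letter001:translation:p.2"):
    print(root, annotations_on(root, triples, salutation))
```

Starting from the letter, the translation, or the paragraph itself, the same salutation annotation is found, because each starting point includes the annotated paragraph among its descendants.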

Note:

{
	"@context": "http://boot.huygens.knaw.nl/annotate/vangoghontology.json", 
	"@type": "Correspondence", 
	"id": "urn:vangogh:letter001", 
	"hasPart": [
		{
			"id": "urn:vangogh:letter001", 
			"type": "Letter"
		},
		{
			"id": "urn:vangogh:letter002", 
			"type": "Letter"
		},
		{
			"id": "urn:vangogh:letter003", 
			"type": "Letter"
		},
		{
			"id": "urn:vangogh:letter004", 
			"type": "Letter"
		},
		...
	]
}

This collection level registration introduces no new requirements for the annotation client and server and allows for discovery of annotations made on collection items in different contexts.

4.4.2. Using the Annotatable Thing Ontology as an abstract class

A question to consider is whether it is possible and preferable to send only (a reference to) the ontology as an abstract class that explains the structural relations for letters in general, such that the server knows that a Letter has a Sender and a Receiver and may have a Translation without having to explicitly receive and store all the relations between the URNs of the sub-resources. Thus, for an annotation on the Translation, the target refers to the Translation property of the letter identified by its URN urn:vangogh:letter001.

The gain would be that potentially less information is sent by the client. The server only needs to retrieve the ontology once to store it as an abstract class so it knows what properties a Letter has. However, if the ontology is very elaborate or complex while individual resources based on it are typically much simpler, it might require a smaller payload to send only the few resource IRIs and their structural relations.

To work with ontologies as abstract classes, the annotations themselves should also rely on ontology information, that is, use the URN of the letter as target and use the path to the annotated sub-resource(s) as selectors within that target. There are several problems that need to be solved. [ which problems? ]

In a way, using the ontology as an abstract class requires a solution similar to the All-in-one approach (Structure Embedded in Annotation): the entire path from the top resource (i.e. the letter) to the annotated sub-resource (e.g. a paragraph in the translation of the letter) has to be represented in the annotation target. An unsolved problem remains in identifying the relation between a translation of a letter and its original when only the translation is displayed: does the translation have its own URN? If so, how is its relation with the original letter stored via the abstract class?

4.4.3. Structural representation via IIIF

An example has been worked out in an earlier document analysing the IIIF Presentation model, in the section IIIF Collections and Manifests.

4.4.4. Structural representation via Schema.org

An alternative to using our own Annotatable Thing ontology is to rely on Schema.org. For instance, the van Gogh correspondence can be modelled using a combination of a number of pre-defined schemas:

{
  "@context": "http://schema.org/docs/jsonldcontext.json",
  "@type": "Message",
  "id": "urn:vangogh:letter001",
  "hasPart": [
    {
      "id": "urn:vangogh:letter001.sender"
    },
    {
      "id": "urn:vangogh:letter001.receiver"
    },
    {
      "id": "urn:vangogh:letter001.date"
    },
    {
      "id": "urn:vangogh:letter001.locationnote"
    },
    {
      "id": "urn:vangogh:letter001.sourcenote"
    },
    {
      "id": "urn:vangogh:letter001:p.1"
    },
    {
      "id": "urn:vangogh:letter001:p.2"
    },
    {
      "id": "urn:vangogh:letter001:p.3"
    },
    {
      "id": "urn:vangogh:letter001:p.4"
    },
    {
      "id": "urn:vangogh:letter001:p.5"
    },
    {
      "id": "urn:vangogh:letter001:p.6"
    },
    {
      "id": "urn:vangogh:letter001:p.7"
    },
    {
      "id": "urn:vangogh:letter001:note.1"
    },
    {
      "id": "urn:vangogh:letter001:note.2"
    },
    {
      "id": "urn:vangogh:letter001:note.3"
    },
    {
      "@type": "TranslationOfWork",
      "id": "urn:vangogh:letter001.translation",
      "hasPart": [
        {
          "id": "urn:vangogh:letter001:translation.p.1"
        },
        {
          "id": "urn:vangogh:letter001:translation.p.2"
        },
        {
          "id": "urn:vangogh:letter001:translation.p.3"
        },
        {
          "id": "urn:vangogh:letter001:translation.p.4"
        },
        {
          "id": "urn:vangogh:letter001:translation.p.5"
        },
        {
          "id": "urn:vangogh:letter001:translation.p.6"
        },
        {
          "id": "urn:vangogh:letter001:translation.p.7"
        }
      ]
    }
  ]
}

The Message schema also has the properties sender, recipient and dateCreated, but it’s probably clearer to refer to all sub-resources uniformly as parts.

4.5 Choosing between Modelling Options

The first two options (All-in-one and Structure as annotation) lead to severe problems:

Of the Structure as Separate Model options, the IIIF and Ontology as Abstract Class models have severe problems:

From the above, it seems that the most viable option is to use the Annotatable Thing ontology as a context to represent resource structure. Perhaps this could be treated as a special case of Schema.org, that is, as a specific schema for annotation. Alternatively, it could be seen as a general ontology that can be extended with domain-specific schemas from Schema.org.

5. Further reading

5.1 FRBR

5.2 Europeana Data Model

5.3 Schema.org

5.4 IIIF