Research Notes

The following pages constitute the collected research notes for the creation of STEMMA®. I’ve tried to assemble these in a coherent fashion so that they might be used as a resource for similar work elsewhere.


The work is not claimed to be exhaustive, but it covers a huge range of topics related to family history data. The notes contain many links to other resources, and also make many independent points and observations that should be considered.


Note that the STEMMA specification, as written up here, is still a working research project and does not yet address every point raised in these research notes.


In addition to the subjects listed below, there is also an exploration of the many cultural variations around the world, including a short introduction to globalisation from a computer software perspective.



Relationship to Citations


Data Control

Data sensitivity

Data protection


Informal permissions and prohibitions




Machine-readable date values

Machine-readable calendar specifications



Synchronised dates (aka dual dates, or double dates)



Representation of transcription anomalies

Representation of original emphasis, footnotes, marginalia, etc.

Adding alternative meaning/spelling to transcribed text

Proof and GPS

Conclusion sharing

Reasoning and Proof Arguments

Linking conclusions to reasoning, evidence, information, and sources

Personae, and equivalents for other subject entities

Source mining



Simple and protracted events

Hierarchical events

Relation to Persons, Places, Animals, and Groups

Inheritance of event properties

Relational constraints between events



Partially controlled vocabularies for tag values

Custom Properties, including units

Schema extensions

Additional types of subject entity (beyond Person, Animal, Place, Group)



Family units

Time-dependent Person and Animal association

Group hierarchies

Alternative names and spelling

Creation and Demise

Related entities (splits, joins, and other connections)



Name structure

Name sorting, collation, case conversion

Formal and informal presentation styles

Alternative names and spelling

Time dependency

Personal Name Authorities

Membership of Groups

Include animals in a parallel fashion to persons


Physical data formats

Data Model versus Serialisation Format

Container format for data and attachments (or Bundle)

Run-time object model



Distinguished from Location and Postal Address

Place hierarchies

Alternative names and spelling

Time dependency

Place Authorities

Creation and Demise

Related entities (splits, joins, and other connections)




Connection to Person, Animal, Place, or Group

Separation of original value from interpreted value

Transcription anomalies

Data-types, fractional values, and units of measurement

Conclusion links


Personae, and equivalents for other subject entities



Attachments (e.g. images, documents)

Physical Artefacts

Transmission format


Sources & citations

Simple and complex citations

Citation elements

Analytical commentary

Discursive notes

Citation styles & modes

Multi-source and conflated citations

Citation Templates

Pre-formed citation strings


Structured narrative

Mark-up (descriptive and semantic)

Linking text to text, and text to data

Original attributes used for emphasis

General annotation-style notes

Transcription of manuscript, typescript, and audio sources


® STEMMA is a registered trademark of Tony Proctor.