STEMMA History
Development of the STEMMA® (“Source Text for Event and
Ménage MApping”) data model and source format began around 2011. This page
charts its chronological history.
Draft [2012-01-02]
First draft specification uploaded to the new STEMMA Web site.
Draft [2012-04-27]
The STEMMA research notes were collected together and made (almost) readable.
The 70+ pages were uploaded to the STEMMA site as a resource for any similar
work on family history data to utilise.
V1.0 [2012-07-16]
STEMMA passed from being a draft specification to the first fully working
version.
A number of its features were streamlined or revised as a result of it being
applied to my own data, and following further research. The copious associated
Research Notes were updated in keeping with the new specification, and
supplemented by a Data Model section that shows the model being applied to a
number of case studies.
New or improved features include:
- Rationalised way of
extending partially controlled vocabularies in order to support custom
types, subtypes, roles, styles, and other tag values.
- Unified approach to
defining core and custom properties for Persons and Places.
- Streamlined handling of
multi-valued properties (and of Citation/Resource parameters) such as
'Roles'.
- Support for local-events
(i.e. events that affect only one person).
- Support for Events with
multiple sources of information, i.e. multiple sets of properties for each
associated Person.
- Support for Dublin Core
semantic tags for both Person/Place properties and Resource/Citation
parameters. Support for their machine-readable OpenURLs.
- Streamlined approach to
Person and Place names that retains their unified handling but
accommodates name types, name styles and sorting for different cultures.
- Support for Dual Dates
(aka Double Years).
- Support for URL hyperlinks
in narrative.
- Support for general
reference notes in narrative.
- Copyright and other
permissions/prohibitions.
- Identification of physical
artefacts as Resources.
- Extended inheritance
mechanism to Resources (e.g. attachments) and Citations (e.g. sources) so
that the same details may be shared between multiple entities.
V2.0 [2013-05-28]
STEMMA underwent a considerable number of refinements to
both strengthen and streamline its specification. Features include:
- Better support for
recording transcriptions, including uncertain characters, marginalia,
original emphasis, alternative spellings/meanings.
- Better separation of
evidence from conclusion for marked-up references and for Property values.
- Generic Group concept that
can be used to model time-dependent Sets of Person, e.g. family units.
- Support for attribution of
individuals, whether represented within the family history or external to
it. Contact details, including address, phone, email addresses, Web sites,
and messaging systems.
- Revised date-string
representation for world calendars.
- Downloads section added to
Web site.
V2.1 [2013-10-16]
Changes include:
- Changes to NoteRef. Added
new Anom element for transcription anomalies.
- Allow <br> in
orig-text data.
- Added Hamlet place-type.
V2.2 [2014-04-17]
Changes include:
- Changed Group to also represent
real-life entities rather than just Person Sets. Includes hierarchy, events,
alternative names, subtype, resources (docs & photos).
- Added GroupRef mark-up.
- Added GroupRef data-type.
- Move Person GroupLnk
inside EventLnk/Eventlet.
- Split
BirthEvent/DeathEvent to allow for Eventlet.
- Have equivalent to
Birth/Death (Creation & Demise) for Group and Place. Replaces Void in
Places.
- Handle "related
entities" with JoinFrom (in Creation), SplitTo (in Demise), and
RelatedTo elements in Place/Group.
- Simplify Eventlet by
removing hierarchy support.
- Persisted Counters in
Dataset header for assisted key generation.
- Added GroupProperties to
ExtendedProperties.
- Event Place optional in
Eventlet.
- Resource entity
distinguishes physical artefacts and images thereof.
- Improved digital data-types
for Resources.
- Sensitivity levels
accepted on Resources (e.g. photographs and documents).
- External IDs accepted on
Person, Place, Group, and Event entities.
V3.0 [2014-10-20]
Major change to trim excess flexibility, and to address
certain known failings:
- Could not represent the
Properties for unidentified or incidental people in a given source.
- Overloading of Role
Property with relationships.
- Problems representing a
“directed Property”, to another entity reference, as opposed to, say,
Head.Wife.
- Could not inherit from an
Event when it had Detail elements in it.
- Representation of
top-level research reports.
Changes include:
- Reversal of
Person-to-Event (etc.) links to place Properties in the Event, alongside
the respective source details.
- Added References element
to Event for representing subject references in the sources, and their
respective Properties. This element supersedes the previous Detail
element.
- Introduction of “abstract”
entities for the sole purpose of inheritance.
- Make Event hierarchies
bottom-up rather than top-down for consistency and ease of validation.
- Deprecation of parameter
substitution into Citation URIs; both named parameter markers and the ‘=?’
form.
- Inclusion of NARRATIVE as
a top-level Dataset entity for research reports and authored works.
- Changed semantic types on Properties
and Parameters to use “DC:” namespace prefix rather than simply “DC.”.
DCType attribute changed to SemType.
- Reinstatement of
Event-specific Property values to represent named items of information for
an event.
- Explicit control over
entity-Key imports for multi-Dataset Documents and multi-Document
collections.
- PersonEL, PlaceEL, and
GroupEL data-types added for Properties that describe a relationship
between two evidential subjects, such as person-to-person.
- Addition of ‘Header’
TEXT_TYPE for details of authorship, title, etc., in narrative works.
- Adjustments to
NAME_VARIANTS to move the Type attribute, add an Initial=’boolean’ option
for using initials, an indication of cultural style, and an optional
override for character sorting.
- Added optional PersonalName
(within Person entity) to complement PlaceName (in Place) and GroupName
(in Group).
- Revise syntax of
<Constraints> element to associate narrative with a specific
constraint, e.g. to express causal relationships.
- Added optional coordinates
to a Place in order to represent a point, an enclosed area (i.e. polygon),
or an open line (e.g. for a street).
- Separation of Relationship
from Role.
- Several new event-types
and event-subtypes.
V4.0 [2015-11-22]
Major change to finally accommodate sources, information,
evidence, and conclusions in a single model that supports the major approaches
to research and representation.
Changes include:
- Introduction of a new
Source entity that embraces both Citations and Resources for a particular
information source. Citations and Resource entities are now connected to
Source entity rather than to each other.
- Support for source
assimilation & analysis, source
mining, and the ability to drill-down on conclusions, all provided via
the Source entity.
- The <References>
element, within Events, is now superseded by <SourceLnk> which links
to the new Source entity. Enclosed *Ref elements (e.g. <PersonRef>)
changed to *Lnk elements for consistency. Removal of the ID attribute introduced
in V3.0.
- Support for cross-source
analysis and correlation via a new Matrix entity.
- Support for a generalised
approach to multi-tier personae.
- Addition of an Animal
entity, strongly modelled on the Person entity, including related mark-up and
namespaces.
- <CitationLnk>/<ResourceLnk>
from Person, Place, Group, and Event entities, changed to <SourceLnk>.
- Review of the goal of
sticking to XHTML tags for presentation:
replacement of the <Hi> element with HTML-like ones, and the
addition of support for <sup>/<sub> elements, columnar text, simple
tables, and indentation.
- Removal of ‘Unreadable’
mode from the <Anom> element.
- Support for distinguishing
manuscript and typescript transcriptions in the <Text> element.
Support for numbering lines and pages in transcriptions. Positional
control over annotations such as marginalia.
- <FromText> element
added to <Narrative> in order to share re-usable sections of text.
As a result, the NoteKey attribute in the semantic mark-up was no
longer required and so was removed.
- Categorisation of the
layers in a Citation chain.
- The optional
<DisplayFormat> element of the Citation entity has been
re-interpreted as a set of pre-formatted language-specific strings. This may
exist in addition to the mandatory set of named parameter values, and the
two together can also be used as a simple citation-template.
- The Intrinsic Functions,
mentioned at the end of Semantic
Mark-up, have been changed to Intrinsic Methods in preparation for defining
a run-time object model. The set is also supplemented by ones for
accessing subject-entity names.
- Small changes to
subject-entity *-name-mode vocabularies to factor-out a generic name-mode
(missing from previous specification).
- Place coordinates
(including bounding shapes) are now time-dependent, the same as any
parent-Place link.
- Added Canton and Colony to
place-type vocabulary. The place-type of House is now replaced by Number
and Apartment for flexibility.
- <Quality>,
<Reliability>, and <Credibility> elements moved from the
Citation entity to the new Source entity.
Although refinements will continue, I expect this to be
the last major change to the STEMMA specification. I will, therefore,
concentrate subsequent efforts on describing its advantages and philosophy, and
on providing more worked examples.
V4.1 [2017-04-19]
Refinements to STEMMA specification, especially in the areas
of transcription (multiple contributors, audio, and linking to images or
recordings) and narrative mark-up (tabulated data, and citations).
Employment of the revised narrative support may be viewed in
the fully-worked examples at: parallaxview.co/stemma/Resources/JessonLesson.xml
and parallaxview.co/stemma/Resources/MoreOnGeorgeHearson.xml.
- ‘WhereIn’ attribute added
to Citation Parameter definitions. This finally provides the missing
criteria necessary for the automatic generation of shortened subsequent
reference-note citations. ‘Subst’ attribute added to Citation Parameter
values in order to override formatting, or to provide a substitution in
cases of a value being unavailable.
- <ParentCitationLnk>
now allowed in both <CitationLnk> and <CitationRef> elements
in order to create transient chained citations.
- Quality element, within
Source entity, moved inside the Frame element.
- Review of entries in citation-layer-type
namespace.
- DataControl element of
Resource entity supports attribution text.
- Control of table widths,
and individual column widths and alignments.
- Ability to align images when
embedded within narrative.
- Ability to hyperlink
images embedded in narrative.
- Requirement for enclosing
Narrative element dropped for Text elements, except for top-level
Narrative entities. Text elements can now be nested.
- <cb> replaced with
<col>, and relationship between paragraphs and columns now reversed
(paragraphs now within columns).
- ResourceRef
Mode=SynchImage allows synchronisation between images and transcriptions.
- Corresponding SVG-x/y
coordinates added to elements <page>, <col>, <p>, and
<line>. Additional <posn> element defined to associate
coordinates with arbitrary text locations.
- <Page>/<Line>
renamed to <page>/<line> and moved alongside
<p>/<col> as related to structure and content rather than
semantics.
- Mode=Tablenote attribute supplementing
Footnote and Endnote in various places.
- Text-element
Header=boolean attribute replaced with Class=Header | H1 | H2 | H3 | Caption
| Footnote | Endnote | Legend | Tablenote.
- Text-element Class=Caption
attribute used in Resource/ResourceRef and tables for generating captions.
- Text-element Class=Footnote
| Endnote | Tablenote attribute used in CitationRef to allow pre-formed
(preferred) citations.
- Deprecated the <Text>
attributes Abstract=boolean, Extract=boolean, Manuscript=boolean, and Transcript=boolean.
- <voice> mark-up
added to supplement existing <ts>/<ms> mark-up. <ts>/<ms>/<voice>
all enhanced to cope with different hands, voices, fonts, colours, etc.
- In transcripts of audio recordings,
support for multiple voices, overlapping dialogue, intonation, gestures,
noises, pauses, timestamps, etc.
- ResourceRef
Mode=SynchAudio allows synchronisation between audio recordings and
transcriptions, analogous to SynchImage for textual transcription (above).
- Complete revision of Mode
values for CitationRef element.
- Relaxation of Date
Parameters in order to cover the full range of calendars. One requirement
was to represent the date-of-issue for newspaper sources that predated the
Julian-to-Gregorian changeover.
V4.1 still [2020-12-10]
Canonical URL changed to https://parallaxview.co/stemma following a move from Google Sites to neocities.org. The previous URLs (http://www.familyhistorydata.parallaxview.co, http://stemma.parallaxview.co, and http://www.parallaxview.co/familyhistorydata) will be redirected to the new URL.
V4.1 still [2024-01-01]
At the end of 2020, the public version of the STEMMA specification was fixed at V4.1, although a number of small changes in specification and direction have occurred internally for private usage. Little work has taken place on the informational sub-model that was to support a dynamic research process, but a spin-off of the associated experimentation was the SVG Family-Tree Generator (SVG-FTG) that is now a separate product. Work has focused, instead, on the conclusional sub-model (see STEMMA Latest). Since that time, GEDCOM v7.0 has been defined, but STEMMA remains in a niche of its own.