Symbolic Names

This section defines the format of the symbolic names used within STEMMA. This includes the Key attribute associated with various entities (both top-level and enclosed ones), the Dataset Name attribute, and the names of any Source/Citation/Resource Parameters.


Symbolic names are composed of ASCII alphanumeric characters (i.e. A-Z, a-z, 0-9), underscore, and hyphen, but must start with a letter or underscore. They are case-sensitive.
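The rule above can be sketched as a regular expression. This is only an illustrative check, not part of the specification; the names SYMBOLIC_NAME and is_symbolic_name are hypothetical, and no length limit is applied because none is currently defined.

```python
import re

# First character: ASCII letter or underscore.
# Remaining characters: ASCII letters, digits, underscore, or hyphen.
# No length limit is enforced here (the specification leaves that open).
SYMBOLIC_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")


def is_symbolic_name(text: str) -> bool:
    """Return True if the whole string satisfies the symbolic-name rule."""
    return SYMBOLIC_NAME.fullmatch(text) is not None
```

Because names are case-sensitive, a caller comparing two names would use plain string equality rather than any case-folded comparison.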


Q: What length limits should apply? Any limit should serve only as a sanity constraint. Q: Should any other characters be allowed?


Entity Key names are local to the respective Dataset. If references need to be analysed across multiple Datasets (e.g. when comparing them) then their Key names should be decorated with the parent Dataset name in order to make them unique, e.g. TonysTree:Tony123. If multiple STEMMA Documents are loaded concurrently then a similar decoration must be applied using the external Document names, e.g. “Example”:TonysTree:Tony123. This effectively defines a hierarchical namespace and the syntax of the decorations is designed to follow the precedent set by the URN standard.
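As a minimal sketch of the decoration described above, the following helper builds a qualified name from its parts. The function name qualify_key is hypothetical, and the use of plain double quotes around the Document name is an assumption based on the example given.

```python
from typing import Optional


def qualify_key(key: str, dataset: str, document: Optional[str] = None) -> str:
    """Decorate a Key name with its Dataset name and, optionally, its Document name.

    Assumption: the Document name is wrapped in plain double quotes,
    following the example in the text.
    """
    parts = [dataset, key]
    if document is not None:
        parts.insert(0, f'"{document}"')
    return ":".join(parts)
```

For instance, qualify_key("Tony123", "TonysTree") yields TonysTree:Tony123, and passing document="Example" adds the quoted Document name in front.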


All Key names are considered to be at the same scope level within their Dataset. That is, even though some entities may have named sub-elements, the names of those sub-elements are not considered to be subordinate to that of their parent entity. Although a subordinate scheme was considered, it would introduce an unjustifiable complexity since some cases involve three levels, and access to the sub-elements is not constrained by their parents. A reference to Key.Key2 might otherwise require decomposition using separate attributes to facilitate XML query languages such as XPath, XSL Pattern, etc.


Parameter names are local to their respective Source/Citation/Resource entity. When they are substituted into the allowed text values, they employ the ${name} style of substitution syntax.
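The ${name} substitution can be illustrated with a short sketch. The function name substitute and the sample parameter names are hypothetical, and leaving an unknown reference untouched is an assumption, since the behaviour for undefined parameters is not stated here.

```python
import re
from typing import Dict

# Matches ${name}, where name follows the symbolic-name rule:
# a letter or underscore, then letters, digits, underscores, or hyphens.
PARAM_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_-]*)\}")


def substitute(text: str, params: Dict[str, str]) -> str:
    """Replace each ${name} in text with the matching parameter value.

    Assumption: references to unknown parameters are left unchanged.
    Lookup is case-sensitive, as symbolic names are.
    """
    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        return params.get(name, match.group(0))

    return PARAM_REF.sub(replace, text)
```

Python's string.Template uses the same ${name} syntax, but its default identifier pattern does not accept hyphens, which is why an explicit pattern is used in this sketch.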


There are a number of vocabularies used in the attribute values and element data of STEMMA, and these may be controlled (predefined) or partially controlled (extensible). These include type names, role names, mode names, and Property names. As section Extended Vocabularies explains, these values constitute Fragment identifiers and may be qualified with their corresponding namespace URI. They are therefore case-sensitive and limited to the ‘unreserved character’ set defined in RFC3986 (URI: Generic Syntax). This is the character set described above for symbolic names, with the addition of period and tilde.
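A check against the RFC3986 unreserved set (ASCII letters, digits, hyphen, period, underscore, and tilde) might look like the following sketch. The function name is hypothetical, and rejecting the empty string is an assumption rather than a stated rule.

```python
import re

# RFC 3986 "unreserved" characters: ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = re.compile(r"[A-Za-z0-9._~-]+")


def is_vocabulary_value(text: str) -> bool:
    """Return True if text uses only RFC 3986 unreserved characters.

    Assumption: an empty value is not acceptable.
    """
    return UNRESERVED.fullmatch(text) is not None
```

Unlike the symbolic-name rule, this check places no restriction on the first character.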


In the STEMMA examples provided, a single leading character is used in Key names to reflect their declared type. This is purely a convention and not prescribed by the specification. The following characters are currently used:


            a          Animal entity.

            c          Citation entity.

            d          Detail link, for drill-down/drill-up (see below).

            e          Event entity.

            g          Group entity.

            l          SourceLet.

            m          Matrix entity.

            n          Narrative element.

            p          Person entity (and sometimes a Contact entity).

            r          Resource entity.

            s          Source entity.

            t          Text element.

            w          Place entity (for “where”).


For detail links (explained under Source), a second character usually indicates the nature of the target as follows:


a          ProtoAnimal.

c          Commentary.

d          ProtoDate.

e          ProtoEvent.

g          ProtoGroup.

p          ProtoPerson.

s          Source fragment.

w          ProtoPlace.
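Since this prefix scheme is only a convention in the examples, any tool relying on it can do no more than guess. The sketch below shows one way such a guess could be made; the table and function names are hypothetical, as are the sample keys in the usage note.

```python
from typing import Optional

# First-character convention used in the STEMMA examples (not prescribed).
ENTITY_PREFIXES = {
    "a": "Animal", "c": "Citation", "d": "Detail link", "e": "Event",
    "g": "Group", "l": "SourceLet", "m": "Matrix", "n": "Narrative",
    "p": "Person", "r": "Resource", "s": "Source", "t": "Text", "w": "Place",
}

# Second-character convention, applying to detail links only.
DETAIL_TARGETS = {
    "a": "ProtoAnimal", "c": "Commentary", "d": "ProtoDate",
    "e": "ProtoEvent", "g": "ProtoGroup", "p": "ProtoPerson",
    "s": "Source fragment", "w": "ProtoPlace",
}


def guess_kind(key: str) -> Optional[str]:
    """Guess what a Key name refers to from its leading character(s).

    Returns None when the key does not follow the convention.
    """
    kind = ENTITY_PREFIXES.get(key[:1])
    if kind == "Detail link":
        target = DETAIL_TARGETS.get(key[1:2])
        if target is not None:
            return f"Detail link to {target}"
    return kind
```

With hypothetical keys, guess_kind("pJohnSmith") gives "Person" and guess_kind("dpJohn") gives "Detail link to ProtoPerson", while a key such as Tony123 returns None because the lookup is case-sensitive.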


Various free-form identifying tags are used on the <Counter> element (Document Structure); the <ts>, <ms>, and <voice> ‘id’ and ‘scheme’ attributes (Descriptive Mark-up); and various elements in <ContactDetails> (Contacts). There are currently no restrictions on their content other than that the characters must be printable (space included) and that any leading “prefix:” should be associated with a corresponding namespace. They therefore follow the same rules as external identifiers in the <ExtID> element (see EXTERNAL_ID).