
Object Data Instance Notation (ODIN)

Issuer: openEHR Specification Program

Release: Release-1.0.3

Status: STABLE

Revision: [latest_issue]

Date: [latest_issue_date]

Keywords: data, notation, JSON, syntax

© 2003 - 2015 The openEHR Foundation

The openEHR Foundation is an independent, non-profit community organisation, facilitating the sharing of health records by consumers and clinicians via open-source, standards-based implementations.


Creative Commons Attribution-NoDerivs 3.0 Unported.



Amendment Record

Issue Details Raiser Completed

R E L E A S E     1.0.3


Separate from ADL spec and rename to ODIN.

T Beale

15 Apr 2013


Add fractional seconds to dADL grammar.

T Beale

15 Aug 2012

R E L E A S E     1.0.2


SPEC-268: Correct missing parentheses in dADL type identifiers. dADL grammar rules updated.

R Chen

12 Dec 2008

R E L E A S E     1.0.1


SPEC-213: Include 'T' in dADL Date/time, Time and Duration values.

T Beale

13 Mar 2007


SPEC-216: Allow mixture of W, D etc in ISO8601 Duration (deviation from standard).

S Heard


R E L E A S E     1.0


SPEC-143. Add partial date/time values to dADL syntax.

S Heard

18 Jun 2005

SPEC-149. Add URIs to dADL and remove query() syntax.

T Beale

R E L E A S E     0.95


SPEC-110. Added explanatory material; added domain type support; rewrote most dADL sections.

T Beale

15 Nov 2004

R E L E A S E     0.9


SPEC-75. Added dADL object model.

T Beale

28 Dec 2003


SPEC-70. Copyright Assigned by Ocean Informatics P/L Australia to The openEHR Foundation.

T Beale,
S Heard

29 Nov 2003


Added simple value list continuation (",…​").
Changed path syntax so that trailing '/' required for object paths.
Remove ranges with excluded limits.
Added terms and term lists to dADL leaf types.

T Beale

01 Nov 2003


Additions during HL7 WGM Memphis Sept 2003

T Beale

09 Sep 2003


Renamed dDL to dADL. Changed path syntax to conform (nearly) to Xpath.

T Beale

03 Sep 2003


Rewritten with sections on dDL.

T Beale

28 July 2003


Initial Writing

T Beale

10 July 2003


The work reported in this paper has been funded by the following organisations:

  • University College London - Centre for Health Informatics and Multi-professional Education (CHIME);

  • Ocean Informatics.

Special thanks to Prof David Ingram, head of CHIME, who provided a vision and collegial working environment ever since the days of GEHR (1992).


  • 'openEHR' is a trademark of the openEHR Foundation

  • 'Java' is a registered trademark of Oracle Corporation

  • 'Microsoft' and '.Net' are trademarks of the Microsoft Corporation

1. Preface

1.1. Purpose

This document describes the syntax of the Object Data Instance Notation (ODIN), previously known as the 'dADL' syntax from the openEHR ADL specifications. It is intended for software developers, technically-oriented domain specialists and subject matter experts (SMEs). ODIN is designed as a human-readable and computer-processible data representation syntax, and can be hand-edited using a normal text editor.

1.2. Nomenclature

In this document, the term 'attribute' denotes any stored property of a type defined in an object model, including primitive attributes and any kind of relationship such as an association or aggregation. Where XML is mentioned, XML 'attributes' are always referred to explicitly as 'XML attributes'.

1.3. Status

This specification is in the STABLE state. The development version of this document can be found at

Known omissions or questions are indicated in the text with a 'to be determined' paragraph, as follows:

TBD: (example To Be Determined paragraph)

Users are encouraged to comment on and/or advise on these paragraphs as well as the main content. Feedback should be provided either on the technical mailing list, or on the specifications issue tracker.

2. Overview

The ODIN syntax provides a formal means of expressing instance data based on an underlying object-oriented or relational information model, which is readable both by humans and machines. The general appearance is exemplified by the following:

person = (List<PERSON>) <
    [01234] = <
        name = < -- person's name
            forenames = <"Sherlock">
            family_name = <"Holmes">
            salutation = <"Mr">
        >
        address = < -- person's address
            habitation_number = <"221B">
            street_name = <"Baker St">
            city = <"London">
            country = <"England">
        >
    >
    [01235] = < -- etc >
>

In the above, the attribute names person, name, address etc., and the type List<PERSON>, are all assumed to come from an information model. The [01234] and [01235] tags identify container items.

The basic design principle of ODIN is to be able to represent data in a way that is both machine-processible and human readable, while making the fewest assumptions possible about the information model to which the data conforms. To this end, type names are optional; often, only attribute names and values are explicitly shown. No syntactical assumptions are made about whether the underlying model is relational, object-oriented or what it actually looks like. More than one information model can be compatible with the same ODIN-expressed data. The UML semantics of composition/aggregation and association are expressible, as are shared objects. Literal leaf values are only of 'standard' widely recognised types, i.e. Integer, Real, Boolean, String, Character and a range of Date/time types. In standard ODIN, all complex types are expressed structurally.

The ODIN syntax as described above has a number of useful characteristics that enable the extensive use of paths to navigate it, and express references. These include:

  • each <> block corresponds to an object (i.e. an instance of some type in an information model);

  • the name before an '=' is always an attribute name or else a container element key, which attaches to the attribute of the enclosing block;

  • the formal type of leaf values can be inferred purely from syntax;

  • paths can be formed by navigating down a tree branch and concatenating attribute names, container keys (where they are encountered) and '/' characters;

  • every node is reachable by a path;

  • dynamically bound types can be explicitly indicated;

  • shared objects can be referred to by path references.

3. Basics

3.1. File Encoding

ODIN files are intended to be capable of accommodating characters from any language, and may be used for multiple languages at once, e.g. as translation files for software. Accordingly, the assumed encoding for an ODIN file is UTF-8 unicode.

ODIN files encoded in latin 1 (ISO-8859-1) or another variant of ISO-8859 - both containing accented characters with unicode codes outside the ASCII 0-127 range - may work perfectly well, for various reasons:

  • they contain nothing but ASCII, i.e. unicode code-points 0 - 127; this will be the case in English-language authored archetypes containing no foreign words;

  • some layer of the operating system is smart enough to do an on-the-fly conversion of characters above 127 into UTF-8, even if the archetype tool being used is designed for pure UTF-8 only;

  • a tool using the ODIN file might support UTF-8 and ISO-8859 variants.

For situations where binary UTF-8 cannot be supported, ASCII encoding of unicode characters above code-point 127 should only be done using the system supported by many programming languages today, namely \u escaped UTF-16. In this system, unicode codepoints are mapped to either:

  • \uHHHH - 4 hex digits which will be the same (possibly 0-filled on the left) as the unicode code-point number expressed in hexadecimal; this applies to unicode codepoints in the range U+0000 - U+FFFF (the 'base multi-lingual plane', BMP);

  • \uHHHHHHHH - 8 hex digits to encode unicode code-points in the range U+10000 through U+10FFFF (non-BMP planes); the algorithm is described in [rfc_2781].

It is not expected that the above approach will be commonly needed, and it may not be needed at all; it is preferable to find ways to ensure that native UTF-8 can be supported, since this reduces the burden for ODIN parser and tool implementers. The above guidance is therefore provided only to ensure a standard approach is used for ASCII-encoded unicode, if it becomes unavoidable.
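The \u mapping described above can be sketched in Python as follows. This is an illustrative sketch only, not part of the specification: the function name `ascii_escape` is hypothetical, and the layout of the 8-digit form (the two UTF-16 surrogate code units of [rfc_2781] concatenated after a single `\u`) is an assumption about how the \uHHHHHHHH form is meant to be read.

```python
def ascii_escape(text: str) -> str:
    """Replace every character above code-point 127 with a \\u escape."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                       # plain ASCII is kept as-is
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)           # BMP: 4 hex digits, 0-filled
        else:
            v = cp - 0x10000                     # RFC 2781 surrogate computation
            high = 0xD800 + (v >> 10)            # top 10 bits
            low = 0xDC00 + (v & 0x3FF)           # bottom 10 bits
            # Assumption: the 8 hex digits are the two surrogates concatenated.
            out.append("\\u%04X%04X" % (high, low))
    return "".join(out)


print(ascii_escape("café"))
```

A tool that cannot emit native UTF-8 could apply such a function to string values before writing the file; a reader would need the inverse mapping.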

Thus, while the only officially designated encoding for ODIN and its constituent syntaxes is UTF-8, real software systems may be more tolerant. This document therefore specifies that any tool designed to process ODIN files need only support UTF-8; supporting other encodings is an optional extra. This could change in the future, if required by the ODIN user community.

URIs, which have their own data type in ODIN, are handled in a specific way: all characters outside the 'unreserved set' defined by [rfc_3986] are 'percent-encoded'. The unreserved set is:

unreserved : ALPHA | DIGIT | '-' | '.' | '_' | '~' ;

ALPHA : [a-zA-Z] ;
DIGIT : [0-9] ;
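As an illustration of the percent-encoding rule, the following Python sketch encodes every character outside the unreserved set. It relies on the standard library function `urllib.parse.quote`, whose always-safe characters are the ASCII letters, digits and `_.-~`; the helper name `percent_encode` is hypothetical, and it is assumed here that encoding is applied to an individual URI component, since applying it to a whole URI would also encode delimiters such as '/'.

```python
from urllib.parse import quote


def percent_encode(component: str) -> str:
    """Percent-encode everything outside the RFC 3986 unreserved set."""
    # safe="" removes '/' from quote()'s default safe characters, leaving
    # only letters, digits and '-', '.', '_', '~' unencoded.
    return quote(component, safe="")


print(percent_encode("Baker St/221B"))
```

Non-ASCII characters are first converted to UTF-8 bytes, each of which is then percent-encoded.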

3.2. Special Character Sequences

In string and character data values, characters not in the lower ASCII (0-127) range should normally be UTF-8 encoded, but with the option of using the following quoted forms customary in software development:

  • \r - carriage return

  • \n - linefeed

  • \t - tab

  • \\ - backslash

  • \" - literal double quote within a String terminal value

  • \' - literal single quote within a Character terminal value

Any other character combination starting with a backslash is illegal; to get the effect of a literal backslash, the \\ sequence should always be used.

Typically in a normal string, including multi-line paragraphs as used in ODIN, only \\ and \" are likely to be necessary, since all of the others can be accommodated in their literal forms; the same goes for single characters - only \\ and \' are likely to commonly occur. However, some authors may prefer to use \n and \t to make intended formatting clearer, or to allow for text editors that do not react properly to such characters. Parsers should therefore support the above list.
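A parser's handling of the list above might look like the following Python sketch. It is a minimal illustration under stated assumptions: the function name `unescape` is hypothetical, the input is taken to be the text between the delimiting quotes, and for brevity both \" and \' are accepted regardless of whether the value is a String or a Character. The \u forms described under File Encoding are not handled.

```python
# The six backslash sequences listed above, mapped to the characters they denote.
ESCAPES = {"r": "\r", "n": "\n", "t": "\t", "\\": "\\", '"': '"', "'": "'"}


def unescape(body: str) -> str:
    """Decode backslash sequences; reject any sequence not in the list."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        # A backslash must be followed by one of the permitted characters.
        if i + 1 >= len(body) or body[i + 1] not in ESCAPES:
            raise ValueError(f"illegal escape sequence at offset {i}")
        out.append(ESCAPES[body[i + 1]])
        i += 2
    return "".join(out)


print(unescape(r"The clinician said \"rest\"."))
```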

3.3. Keywords

ODIN has no keywords of its own: all identifiers are assumed to come from an information model.

3.4. Reserved Characters

In ODIN, a small number of characters are reserved and have the following meanings:

  • <: open an object block;

  • >: close an object block;

  • =: indicate attribute value = object block;

  • (, ): type name or plug-in syntax type delimiters;

  • <#: open an object block expressed in a plug-in syntax;

  • #>: close an object block expressed in a plug-in syntax.

Within <> delimiters, various characters are used as follows to indicate primitive values:

  • ": double quote characters are used to delimit string values;

  • ': single quote characters are used to delimit single character values;

  • |: bar characters are used to delimit intervals;

  • []: brackets are used to delimit coded terms.


3.5. Comments

In an ODIN text, comments are indicated by the characters "--". Multi-line comments are achieved using the "--" leader on each line where the comment continues.

In this document, ODIN comments are shown in brown.

3.6. Information Model Identifiers

Two types of identifiers from information models are used in ODIN: type names and attribute names. A type name is any identifier with an initial upper case letter, followed by any combination of letters, digits and underscores. A generic type name (including nested forms) additionally may include commas and angle brackets, and must be syntactically correct as per the UML. An attribute name is any identifier with an initial lower case letter, followed by any combination of letters, digits and underscores. Any convention that obeys this rule is allowed.

At least two well-known conventions that are ubiquitous in information models obey the above rule. One of these is the following convention:

  • type names are in all uppercase, e.g. PERSON, except for 'built-in' types, such as primitive types (Integer, String, Boolean, Real, Double) and assumed container types (List<T>, Set<T>, Hash<T, U>), which are in mixed case, in order to provide easy differentiation of built-in types from constructed types defined in the reference model. Built-in types are the same types assumed by UML, OCL, IDL and other similar object-oriented formalisms.

  • attribute names are shown in all lowercase, e.g. home_address .

  • in both type names and attribute names, underscores are used to represent word breaks. This convention is used to maximise the readability of this document.

The other common style is the programmer’s mixed-case or "camel case" convention exemplified by Person and homeAddress , as long as they obey the rule above. The convention chosen for any particular ODIN document should be based on the convention used in the underlying information model. Identifiers are shown in green in this document.
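The identifier rule at the start of this section can be expressed as two regular expressions. The Python sketch below is illustrative only: the names `TYPE_NAME`, `ATTR_NAME` and `classify` are hypothetical, 'letters' is assumed to mean ASCII letters, and generic type names such as List<T> are deliberately not covered.

```python
import re

# Initial upper-case letter, then letters, digits and underscores.
TYPE_NAME = re.compile(r"[A-Z][A-Za-z0-9_]*")
# Initial lower-case letter, then letters, digits and underscores.
ATTR_NAME = re.compile(r"[a-z][A-Za-z0-9_]*")


def classify(identifier: str) -> str:
    """Return 'type', 'attribute' or 'invalid' for a simple identifier."""
    if TYPE_NAME.fullmatch(identifier):
        return "type"
    if ATTR_NAME.fullmatch(identifier):
        return "attribute"
    return "invalid"


for name in ("PERSON", "Person", "home_address", "homeAddress"):
    print(name, classify(name))
```

Both conventions described above pass the same check, which is why either may be used in an ODIN document.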

3.7. Semi-colons

Semi-colons can be used to separate ODIN blocks, for example when it is preferable to include multiple attribute/value pairs on one line. Semi-colons make no semantic difference at all, and are included only as a matter of taste. The following examples are equivalent:

term = <text = <"plan">; description = <"The clinician's advice">>
term = <text = <"plan"> description = <"The clinician's advice">>

term = <
    text = <"plan">
    description = <"The clinician's advice">
>

Semi-colons are completely optional in ODIN.

4. ODIN Artefacts

An ODIN text may be created in two different physical forms: embedded fragments and documents. For both the following general structure applies:

odin_text           ::=   ( schema_identifier )? main_text ;
schema_identifier   ::=   `@` schema '=' URI ;

where 'URI' is a value of the URI primitive type. This is used to indicate the schema, including its version, on which the main ODIN text is based. It is optional because in many cases the schema is known a priori, or can be inferred from context.

The following sections describe various types of ODIN artefact.

4.1. Embedded Fragment

Fragments of ODIN text can appear embedded within other artefacts. A fragment typically includes neither object identifiers nor a schema identifier, since both of these are assumed to be inferred from the surrounding context. A typical fragment has the following appearance:

... other formalism text ...
... delimiter ...

-- ODIN Embedded Fragment
    attr_1 = <
        attr_12 = <
            attr_13 = <leaf_value>
        >
    >
    attr_2 = <
        attr_22 = <leaf_value>
    >

... delimiter ...
... other formalism text ...

4.2. Document

An ODIN document is considered a standalone artefact whose contents can take various forms, all assumed to correspond to one or more serialised objects.

4.2.1. Implicit Object Document

A document containing only an embedded fragment such as shown above is considered to be an 'implicit' object document, and its contents are assumed to consist of values of various object properties. This format is a degenerate form of the 'anonymous' object document, but is so common and useful that it is treated as a legal ODIN form.

Implicit object documents are commonly used to serialise models, such as information model schemas, application configuration files and so on. The usual assumption is that the filename and/or ODIN content will identify the artefact sufficiently for tools to know what its information model is.

4.2.2. Anonymous Object Document

Any other kind of ODIN document contains one or more explicit objects. The anonymous object form has one object per document, with no object identifier and consists of an outer '<>' delimiter pair containing an ODIN embedded fragment, i.e.:

-- ODIN Anonymous Object Document
<
    attr_1 = <
        attr_12 = <
            attr_13 = <leaf_value>
        >
    >
    attr_2 = <
        attr_22 = <leaf_value>
    >
>

This form has no practical benefit over the implicit document form, but is syntactically more correct, and should be supported by parsers.

4.2.3. Identified Object Document

The next variant corresponds to serialisations of multiple objects, each of which is identified.

-- ODIN Identified Object Document
["id_1"] = <
    attr_1 = <
        attr_12 = <
            attr_13 = <leaf_value>
        >
    >
>

["id_2"] = <
    attr_1 = <
        attr_12 = <
            attr_13 = <leaf_value>
        >
    >
>


["id_N"] = <
    attr_1 = <
        attr_12 = <
            attr_13 = <leaf_value>
        >
    >
>

Identifiers can be values of the String, Integer or any Date/Time primitive types. Strings are most commonly used e.g.:

-- ODIN Identified Object Document

["aaa"] = <...>
["bbb"] = <...>
["ccc"] = <...>

Identified Object Documents are most commonly used for representing serialised in-memory instance networks, i.e. the notion of an 'object dump'.

5. Content

5.1. General Structure

Within any kind of ODIN object instance (i.e. implicit, anonymous or identified), the structure is a hierarchy of attribute names and object values. In its simplest form, an ODIN object consists of repetitions of the following pattern:

object = attribute_name '=' '<' value '>' ;

Each attribute name is the name of an attribute in an implied or actual object or relational model. Each "value" is either a literal value of a primitive type (see section Primitive Types) or a further nesting of attribute names and values, terminating in leaf nodes of primitive type values. Where sibling attribute nodes occur, the attribute identifiers must be unique, just as in a standard object or relational model.

The following shows a typical structure.

attr_1 = <
    attr_2 = <
        attr_3 = <leaf_value>
        attr_4 = <leaf_value>
    >
    attr_5 = <
        attr_3 = <
            attr_6 = <leaf_value>
        >
        attr_7 = <leaf_value>
    >
>
attr_8 = <...>

In the above structure, each "<>" encloses an instance of some type. The hierarchical structure corresponds to the part-of relationship between objects, otherwise known as composition and aggregation relationships in object-oriented formalisms such as UML (the difference between the two is usually described as being "sub-objects related by aggregation can have independent lifetimes, whereas sub-objects related by composition have co-terminal lifetimes and are always destroyed with the parent"; ODIN does not differentiate between the two, since it is the business of a model, not the data, to express such semantics). Associations between instances in ODIN are also representable by references, and are described in the section Associations and Shared Objects.

Validity rules for object-structuring of ODIN are as follows:

VDATU: attribute name uniqueness: sibling attributes occurring within an object node must be uniquely named with respect to each other, in the same way as in class definitions in an information model.
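A check for the VDATU rule could be sketched as below. The representation is an assumption made for illustration: sibling attributes are held as a list of (name, value) pairs in document order, because a parser that stores them directly in a dictionary would silently overwrite duplicates. The function name `duplicate_names` is hypothetical.

```python
from collections import Counter


def duplicate_names(siblings):
    """Return the attribute names that occur more than once among siblings."""
    counts = Counter(name for name, _value in siblings)
    return sorted(name for name, n in counts.items() if n > 1)


# 'attr_3' appears twice within the same object node, violating VDATU.
node = [("attr_3", "leaf_value"), ("attr_4", "leaf_value"), ("attr_3", "leaf_value")]
print(duplicate_names(node))
```

The same attribute name may legitimately recur at different levels of the hierarchy, as attr_3 does in the example structure above; only siblings are compared.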

5.2. Paths

For any ODIN structure, a set of paths can be extracted that correspond to the tree structure of the data. The complete set of paths for the above example is as follows.

    /attr_1/attr_2/attr_3           -- path to a leaf value
    /attr_1/attr_2/attr_4           -- path to a leaf value
    /attr_1/attr_5/attr_3/attr_6    -- path to a leaf value
    /attr_1/attr_5/attr_7           -- path to a leaf value

The path syntax used with ODIN maps trivially to W3C Xpath and Xquery paths, and is described in section Path Syntax.
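Path extraction of this kind can be sketched in Python by walking a nested structure. In the sketch below, which is illustrative only, an ODIN object block is modelled as a `dict` from attribute names to values, any other value is treated as a leaf, and the void attr_8 block is omitted; the function name `leaf_paths` is hypothetical.

```python
def leaf_paths(node, prefix=""):
    """Yield the path of every leaf value below `node`, in document order."""
    if isinstance(node, dict):
        for name, child in node.items():
            # Each attribute name adds one '/'-separated path segment.
            yield from leaf_paths(child, f"{prefix}/{name}")
    else:
        yield prefix


data = {
    "attr_1": {
        "attr_2": {"attr_3": "leaf_value", "attr_4": "leaf_value"},
        "attr_5": {"attr_3": {"attr_6": "leaf_value"}, "attr_7": "leaf_value"},
    }
}

for path in leaf_paths(data):
    print(path)
```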

5.3. Void Objects

A void object, i.e. an object attribute that has no value, is allowed in an ODIN text, but is ignored by parsers. It is legal to output void objects, but not recommended. A void object looks as follows:

    address = <...>    -- person's address

5.4. Container Objects

The syntax described so far allows an instance of an arbitrarily large object to be expressed, but does not support attributes of container types such as lists, sets and hash tables, i.e. items whose type in an underlying reference model is something like attr:List<Type>, attr:Set<Type> or attr:Hash<ValueType, KeyType>. There are two ways instance data of such container objects can be expressed in ODIN. The first applies to leaf values, and uses a list-style literal value, where the "list nature" of the data is expressed within the manifest value itself, as in the following examples.

fruits = <"pear", "cumquat", "peach">
some_primes = <2, 3, 5, 7>

See Lists of Built-in Types for the complete description of list leaf types.

However for containers holding non-primitive values, including more container objects, a different syntax is needed. Consider by way of example that an instance of the container List<Person> could in theory be expressed as follows.


people = <
    <name = <...> date_of_birth = <...> sex = <...> interests = <...> >
    <name = <...> date_of_birth = <...> sex = <...> interests = <...> >
    -- etc
>

Here, 'anonymous' blocks of data are repeated inside the outer block. However, this makes the data hard to read, and does not provide an easy way of constructing paths to the contained items. A better syntax becomes more obvious when we consider that members of container objects in their computable form are nearly always accessed by a method such as member(i), item[i] or just plain [i], in the case of array access in the C-based languages.

ODIN opts for the array-style syntax, known in ODIN as container member keys. No attribute name is explicitly given; any primitive comparable value is allowed as the key, rather than just integers used in C-style array access. Further, if integers are used, it is not assumed that they dictate ordinal indexing, i.e. it is possible to use a series of keys [2], [4], [8] etc. The following example shows one version of the above container in valid ODIN:

people = <
    [1] = <name = <...> birth_date = <...> interests = <...> >
    [2] = <name = <...> birth_date = <...> interests = <...> >
    [3] = <name = <...> birth_date = <...> interests = <...> >
>

Strings and dates may also be used. Keys are coloured blue in this specification in order to distinguish the run-time status of key values from the design-time status of class and attribute names. The following example shows the use of string values as keys for the contained items.

people = <
    ["akmal:1975-04-22"] = <name = <...> birth_date = <...> interests = <...> >
    ["akmal:1962-02-11"] = <name = <...> birth_date = <...> interests = <...> >
    ["gianni:1978-11-30"] = <name = <...> birth_date = <...> interests = <...> >
>

The syntax for primitive values used as keys follows exactly the same syntax described below for data of primitive types. It is convenient in some cases to construct key values from one or more of the values of the contained items, in the same way as relational database keys are constructed from sufficient field values to guarantee uniqueness. However, they need not be - they may be independent of the contained data, as in the case of hash tables, where the keys are part of the hash table structure, or equally, they may simply be integer index values, as in the 'locations' attribute in the 'school_schedule' structure shown below.

Container structures can appear anywhere in an overall instance structure, allowing complex data such as the following to be expressed in a readable way.

school_schedule = <
    lesson_times = <08:30:00, 09:30:00, 10:30:00, ...>

    locations = <
        [1] = <"under the big plane tree">
        [2] = <"under the north arch">
        [3] = <"in a garden">
    >

    subjects = <
        ["philosophy:plato"] = < -- note construction of key
            name = <"philosophy">
            teacher = <"plato">
            topics = <"meta-physics", "natural science">
            weighting = <76%>
        >
        ["philosophy:kant"] = <
            name = <"philosophy">
            teacher = <"kant">
            topics = <"meaning and reason", "meta-physics", "ethics">
            weighting = <80%>
        >
        ["art"] = <
            name = <"art">
            teacher = <"goya">
            topics = <"technique", "portraiture", "satire">
            weighting = <78%>
        >
    >
>

The example above conforms directly to the object-oriented type specification (given in a pascal-like syntax):

    lesson_times: List<Time>
    locations: List<String>
    subjects: List<SUBJECT> -- or it could be Hash<SUBJECT>

    name: String
    teacher: String
    topics: List<String>
    weighting: Real

Other class specifications corresponding to the same data are possible, but will all be isomorphic to the above.

How key values relate to a particular object structure depends on the object model being used during the ODIN parsing process. It is possible to write a parser which makes reasonable inferences from an information model whose instances are represented as ODIN text; it is also possible to include explicit typing information in the ODIN itself (see section Adding Type Information below).

The validity rule for objects within a container attribute is as follows:

VDOBU: object identifier uniqueness: sibling objects occurring within a container attribute must be uniquely identified with respect to each other.

Paths through container objects are formed in the same way as paths in other structured data, with the addition of the key, to ensure uniqueness. The key is included syntactically enclosed in brackets, in a similar way to Xpath predicates. Paths through containers in the above example include the following:

/school_schedule/locations[1]                   -- path to "under the big..."
/school_schedule/subjects["philosophy:kant"]    -- path to "kant"
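The formation of paths through containers can be sketched in Python as follows. The modelling choices are assumptions made for illustration: an object block is a plain `dict`, a container is marked by a hypothetical `Container` subclass of `dict` whose keys are the container member keys, string keys are written in double quotes and integer keys bare. Only a cut-down version of the school_schedule data is used.

```python
class Container(dict):
    """Hypothetical marker: a dict whose keys are container member keys."""


def format_key(key):
    """Render a key in brackets; strings are quoted, integers are not."""
    return f'["{key}"]' if isinstance(key, str) else f"[{key}]"


def leaf_paths(node, prefix=""):
    """Yield the path of every leaf value below `node`."""
    if isinstance(node, Container):          # test before dict: it is a subclass
        for key, child in node.items():
            yield from leaf_paths(child, prefix + format_key(key))
    elif isinstance(node, dict):
        for name, child in node.items():
            yield from leaf_paths(child, f"{prefix}/{name}")
    else:
        yield prefix


schedule = {
    "school_schedule": {
        "locations": Container({1: "under the big plane tree", 2: "under the north arch"}),
        "subjects": Container({"philosophy:kant": {"teacher": "kant"}}),
    }
}

for path in leaf_paths(schedule):
    print(path)
```

Because the key is appended directly to the attribute name, with no intervening '/', each contained item remains uniquely addressable.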

5.5. Nested Container Objects

In some cases the data of interest are instances of nested container types, such as List<List<Message>> (a list of Message lists) or Hash<List<Integer>, String> (a hash of integer lists keyed by strings). The ODIN syntax for such structures follows directly from the syntax for a single container object. The following example shows an instance of the type List<List<String>> expressed in ODIN syntax.

list_of_string_lists = <
    [1] = <
        [1] = <"first string in first list">
        [2] = <"second string in first list">
    >
    [2] = <
        [1] = <"first string in second list">
        [2] = <"second string in second list">
        [3] = <"third string in second list">
    >
    [3] = <
        [1] = <"only string in third list">
    >
>

The paths of the above example are as follows:


5.6. Adding Type Information

In many cases, ODIN data is of a simple structure, very regular, and highly repetitive, such as the expression of simple demographic data. In such cases, it is preferable to express as little as possible about the information model on which the data are based, since various software components want to use the data, and use it in different ways. However, there are also cases where the data is highly complex, and more model information is needed to help a parser. Examples include large design databases for aircraft and health records. Data obeying more complex models typically include sub-objects that are of a subtype of the statically declared type in the information model, in other words, dynamically bound types.

Where dynamic binding occurs in the data, it must be indicated in an ODIN document. Typing information is added using a syntactical addition inspired by the (type) casting operator of the C language, whose meaning is approximately: force the type of the result of the following expression to be type. In ODIN, typing is therefore done by including the type name in parentheses after the '=' sign, as in the following example.

destinations = <
    ["seville"] = (TOURIST_DESTINATION) <
        profile = (DESTINATION_PROFILE) <...>
        hotels = <
            ["gran sevilla"] = (HISTORIC_HOTEL) <...>
            ["sofitel"] = (LUXURY_HOTEL) <...>
            ["hotel real"] = (PENSION) <...>
        >
        attractions = <
            ["la corrida"] = (SPORT_VENUE) <...>
            ["Alcázar"] = (HISTORIC_SITE) <...>
        >
    >
>

The path set from the above example is as follows:

/destinations["seville"]/hotels["gran sevilla"]
/destinations["seville"]/hotels["hotel real"]


/hotels["hotel real"]
/hotels["gran sevilla"]

In the above, no type identifiers are included after the hotels and attractions attributes, so it is assumed by the parser that they are of their statically declared type, typically something like List<HOTEL> and List<ATTRACTION> respectively. Nevertheless, complete typing information can be included, as follows.

hotels = (List<HOTEL>) <
    ["gran sevilla"] = (HISTORIC_HOTEL) <...>
>

This illustrates the use of generic, i.e. template types, expressed in the standard UML syntax, using angle brackets. Any number of template arguments and any level of nesting is allowed, as in the UML. At first view, there may appear to be a risk of confusion between template type '<>' delimiters and the standard ODIN block delimiters. However the parsing rules are easy to state; essentially the difference is that an ODIN data block is always preceded by an '=' symbol.

Type identifiers can also include namespace information, which is necessary when same-named types appear in different packages of a model. Namespaces are included by prepending package names, separated by the '.' character, in the same way as in most programming languages, as in the qualified type names org.openehr.rm.ehr.content.ENTRY and Core.Abstractions.Relationships.Relationship.

6. References

6.1. Associations and Shared Objects

All of the facilities described so far allow any object-oriented data to be faithfully expressed in a formal, systematic way which is both machine- and human-readable, and allow any node in the data to be addressed using an Xpath-style path. The availability of reliable paths allows not only the representation of single 'business objects', which are the equivalent of UML aggregation (and composition) hierarchies, but also the representation of associations between objects, and by extension, shared objects.

6.1.1. Within An Object

Consider that in the example above, 'hotel' objects may be shared objects, referred to by association. This can be expressed as follows.

destinations = <
    ["seville"] = <
        hotels = <
            ["gran sevilla"] = </hotels["gran sevilla"]>
            ["sofitel"] = </hotels["sofitel"]>
            ["hotel real"] = </hotels["hotel real"]>
        >
    >
>

bookings = <
    ["seville:0134"] = <
        customer_id = <"0134">
        period = <...>
        hotel = </hotels["sofitel"]>
    >
>

hotels = <
    ["gran sevilla"] = (HISTORIC_HOTEL) <...>
    ["sofitel"] = (LUXURY_HOTEL) <...>
    ["hotel real"] = (PENSION) <...>
>

Associations are expressed via the use of fully qualified paths as the data for an attribute. In this example, there are references from a list of destinations, and from a booking list, to the same hotel object. If type information is included, it should go in the declarations of the relevant objects; type declarations can also be used before path references, which might be useful if the association type is an ancestor type (i.e. more general type) of the type of the actual object being referred to.
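Resolving such a path reference against parsed data might be sketched as follows in Python. This is a minimal illustration, not a full implementation of the path syntax: the data are modelled as nested `dict` objects, only string and integer keys are recognised, references within a single object are assumed, and the names `resolve` and `stars` are hypothetical.

```python
import re

# One path segment: '/' + attribute name, optionally followed by ["key"] or [n].
SEGMENT = re.compile(r'/([A-Za-z_][A-Za-z0-9_]*)(?:\[(?:"([^"]*)"|(\d+))\])?')


def resolve(root, path):
    """Follow an absolute path from `root` and return the node it addresses."""
    node = root
    pos = 0
    while pos < len(path):
        m = SEGMENT.match(path, pos)
        if m is None:
            raise ValueError(f"cannot parse path at offset {pos}: {path!r}")
        name, str_key, int_key = m.groups()
        node = node[name]                    # step into the attribute
        if str_key is not None:
            node = node[str_key]             # step into a string-keyed member
        elif int_key is not None:
            node = node[int(int_key)]        # step into an integer-keyed member
        pos = m.end()
    return node


data = {
    "hotels": {"sofitel": {"stars": 5}},     # 'stars' is a made-up attribute
    "bookings": {"seville:0134": {"hotel": '/hotels["sofitel"]'}},
}

reference = data["bookings"]["seville:0134"]["hotel"]
print(resolve(data, reference))
```

Since the reference resolves to the single shared hotel object rather than to a copy, the association semantics are preserved after parsing.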

6.1.2. Across Objects

In an ODIN document containing identified objects, with references across objects, reference paths will include object identifiers, as shown below:

["travel_db_0293822"] = <
    destinations = <
        ["seville"] = <
            hotels = <
                ["gran sevilla"] = <["tourism_db_13"]/hotels["gran sevilla"]>
                ["sofitel"] = <["tourism_db_13"]/hotels["sofitel"]>
                ["hotel real"] = <["tourism_db_13"]/hotels["hotel real"]>
            >
        >
    >

    bookings = <
        ["seville:0134"] = <
            customer_id = <"0134">
            period = <...>
            hotel = <["tourism_db_13"]/hotels["sofitel"]>
        >
    >
>

["tourism_db_13"] = <
    hotels = <
        ["gran sevilla"] = (HISTORIC_HOTEL) <...>
        ["sofitel"] = (LUXURY_HOTEL) <...>
        ["hotel real"] = (PENSION) <...>
    >
>

6.1.3. Across ODIN Documents

Data in other ODIN documents can be referred to using a URI containing a reference path to locate the document, with the internal path included as described above.

7. Leaf Data

All ODIN data eventually devolve to instances of the primitive types String, Integer, Real, Double, Character, various date/time types, lists or intervals of these types, and a few special types. ODIN does not use type or attribute names for instances of primitive types, only manifest values, making it possible to assume as little as possible about type names and structures of the primitive types. In all the following examples, the manifest data values are assumed to appear immediately inside a leaf pair of angle brackets, i.e.:

some_attribute = <manifest value>

7.1. Primitive Types

7.1.1. Character Data

Characters are shown in a number of ways. In the literal form, a character is shown in single quotes, as follows:

    'a'

Characters outside the low ASCII (0-127) range must be UTF-8 encoded, with a small number of backslash-quoted ASCII characters allowed, as described in the section File Encoding.

7.1.2. String Data

All strings are enclosed in double quotes, as follows:

    "this is a string"

Quotes are encoded using ISO/IEC 10646 codes, e.g.:

    "this is a much longer string, what one might call a &quot;phrase&quot;."

Line extension of strings is done simply by including returns in the string. The exact contents of the string are computed as being the characters between the double quote characters, with the removal of white space leaders up to the left-most character of the first line of the string. This has the effect of allowing the inclusion of multi-line strings in ODIN texts, in their most natural human-readable form, e.g.:

    text = <"And now the STORM-BLAST came, and he
            Was tyrannous and strong :
            He struck with his o'ertaking wings,
            And chased us south along.">

String data can be used to contain almost any other kind of data, which is intended to be parsed as some other formalism. Characters outside the low ASCII (0-127) range must be UTF-8 encoded, with a small number of backslash-quoted ASCII characters allowed, as described in section File Encoding.
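One possible reading of the multi-line rule above can be sketched in Python. The sketch assumes that the "left-most character of the first line" is identified by its column in the source text, and that each continuation line loses at most that many leading white space characters; the function name `strip_leaders` and the parameter `first_col` are hypothetical.

```python
def strip_leaders(raw: str, first_col: int) -> str:
    """Remove white space leaders from the continuation lines of a string.

    raw       -- the characters found between the double quotes
    first_col -- 0-based column of the first character after the opening quote
                 (an assumption of this sketch about how the rule is applied)
    """
    lines = raw.split("\n")
    out = [lines[0]]
    for line in lines[1:]:
        lead = len(line) - len(line.lstrip(" \t"))
        # Never strip more than the leader that is actually present.
        out.append(line[min(lead, first_col):])
    return "\n".join(out)


raw = "And now the STORM-BLAST came, and he\n" + " " * 12 + "Was tyrannous and strong :"
verse = strip_leaders(raw, 13)
```

With this reading, indentation deeper than the first line's column survives as part of the string, while the layout indentation used to align the text in the ODIN document does not.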

7.1.3. Integer Data

Integers are represented simply as numbers, e.g.:

    25
    300000

Commas or periods for breaking long numbers are not allowed, since they would be confused with the commas used to separate list items (see Lists of Built-in Types below).

7.1.4. Real Data

Real numbers are assumed whenever a decimal point is detected in a number, e.g.:

    25.0
    3.1415926

Commas or periods for breaking long numbers are not allowed. Only periods may be used to separate the decimal part of a number; unfortunately, the European use of the comma for this purpose conflicts with the use of the comma to distinguish list items (see Lists of Built-in Types below).

7.1.5. Boolean Data

Boolean values can be indicated by the following values (case-insensitive):

    True
    False

7.1.6. Dates and Times

Complete Date/Times

In ODIN, full and partial dates, times and durations can be expressed. All full dates, times and durations are expressed using a subset of ISO8601. The Support IM provides a full explanation of the ISO8601 semantics supported in openEHR.

In ODIN, the use of ISO 8601 allows extended form only (i.e. ':' and '-' must be used). The ISO 8601 representations of partial dates consisting of a single year number, and of partial times consisting of hours only, are not supported, since they are ambiguous. See below for partial forms.

Patterns for complete dates and times in ODIN include the following:

ISO8601_DATE      : YEAR '-' MONTH ( '-' DAY )? ;
ISO8601_TIME      : HOUR ':' MINUTE ( ':' SECOND ( ',' INTEGER )?)? ( TIMEZONE )? ;
ISO8601_DATE_TIME : YEAR '-' MONTH '-' DAY 'T' HOUR (':' MINUTE (':' SECOND ( ',' DIGIT+ )?)?)? ( TIMEZONE )? ;
// ISO8601 DURATION PnYnMnWnDTnnHnnMnn.nnnS
// here we allow a deviation from the standard to allow weeks to be mixed in with the rest since this commonly occurs in medicine
ISO8601_DURATION  : 'P'(DIGIT+[yY])?(DIGIT+[mM])?(DIGIT+[wW])?(DIGIT+[dD])?('T'(DIGIT+[hH])?(DIGIT+[mM])?(DIGIT+('.'DIGIT+)?[sS])?)? ;

fragment TIMEZONE :     'Z' | ('+'|'-') HOUR_MIN ;   // hour offset, e.g. `+0930`, or else literal `Z` indicating +0000.
fragment YEAR     :     [1-9][0-9]* ;
fragment MONTH    :     ( [0][0-9] | [1][0-2] ) ;    // month in year
fragment DAY      :     ( [012][0-9] | [3][0-2] ) ;  // day in month
fragment HOUR     :     ( [01]?[0-9] | [2][0-3] ) ;  // hour in 24 hour clock
fragment MINUTE   :     [0-5][0-9] ;                 // minutes
fragment HOUR_MIN :     ( [01]?[0-9] | [2][0-3] ) [0-5][0-9] ;  // hour / minutes quad digit pattern
fragment SECOND   :     [0-5][0-9] ;                 // seconds

Durations are expressed using a string which starts with 'P', and is followed by a list of periods, each appended by a single letter designator: 'Y' for years, 'M' for months, 'W' for weeks, 'D' for days, 'H' for hours, 'M' for minutes, and 'S' for seconds. The literal 'T' separates the YMWD part from the HMS part, ensuring that months and minutes can be distinguished. Examples of date/time data include:

    1919-01-23                 -- birthdate of Django Reinhardt
    16:35:04,5                 -- rise of Venus in Sydney on 24 Jul 2003
    2001-05-12T07:35:20+1000   -- timestamp on an email received from Australia
    P22DT4H15M0S               -- period of 22 days, 4 hours, 15 minutes
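As an illustration of the duration pattern, the following Python sketch mirrors the ISO8601_DURATION rule with a regular expression and returns the periods that are present. It is not a reference implementation; the names `DURATION` and `parse_duration` are hypothetical, and the rejection of a bare 'P' is a choice made in this sketch.

```python
import re

# Mirrors ISO8601_DURATION: optional Y, M, W, D periods, then an optional
# 'T' part with H, M and (possibly fractional) S periods.
DURATION = re.compile(
    r"P(?:(?P<years>\d+)[yY])?(?:(?P<months>\d+)[mM])?"
    r"(?:(?P<weeks>\d+)[wW])?(?:(?P<days>\d+)[dD])?"
    r"(?:T(?:(?P<hours>\d+)[hH])?(?:(?P<minutes>\d+)[mM])?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)[sS])?)?"
)


def parse_duration(text):
    """Return the periods present in a duration string as a dict."""
    m = DURATION.fullmatch(text)
    # A bare 'P' (or 'PT') carries no period at all, so it is rejected here.
    if m is None or not any(m.groupdict().values()):
        raise ValueError(f"not a duration: {text!r}")
    return {
        name: float(value) if name == "seconds" else int(value)
        for name, value in m.groupdict().items()
        if value is not None
    }


period = parse_duration("P22DT4H15M0S")
```

Because the 'T' separator is part of the pattern, an 'M' before it is read as months and an 'M' after it as minutes.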

Partial Date/Times

Two ways of expressing partial (i.e. incomplete) date/times are supported in ODIN. The ISO 8601 incomplete formats are supported in extended form only (i.e. with '-' and ':' separators) for all patterns that are unambiguous on their own. Dates consisting of only the year, and times consisting of only the hour are not supported, since both of these syntactically look like integers. The supported ISO 8601 patterns are as follows:

    yyyy-MM            -- a date with no days
    hh:mm              -- a time with no seconds
    yyyy-MM-ddThh:mm   -- a date/time with no seconds
    yyyy-MM-ddThh      -- a date/time, no minutes or seconds

To deal with the limitations of ISO 8601 partial patterns in a context-free parsing environment, a second form of pattern is supported in ODIN, based on ISO 8601. In this form, '?' characters are substituted for missing digits. Valid partial dates follow the patterns:

    yyyy-MM-??         -- date with unknown day in month
    yyyy-??-??         -- date with unknown month and day

Valid partial times follow the patterns:

    hh:mm:??           -- time with unknown seconds
    hh:??:??           -- time with unknown minutes and seconds

Valid partial date/times follow the patterns:

    yyyy-MM-ddThh:mm:??     -- date/time with unknown seconds
    yyyy-MM-ddThh:??:??     -- date/time with unknown minutes and seconds
    yyyy-MM-ddT??:??:??     -- date/time with unknown time
    yyyy-MM-??T??:??:??     -- date/time with unknown day and time
    yyyy-??-??T??:??:??     -- date/time with unknown month, day and time
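The '?' patterns listed above share one property: the unknown fields always form a trailing run. A small Python sketch of a recogniser built on that observation follows; the function names are hypothetical, and the check deliberately ignores value ranges (e.g. month 13), which the grammar handles separately.

```python
import re

_F = r"(\d\d|\?\?)"  # a two-digit field, either known or unknown
_DATE = re.compile(rf"\d{{4}}-{_F}-{_F}")
_TIME = re.compile(rf"\d\d:{_F}:{_F}")
_DATE_TIME = re.compile(rf"\d{{4}}-{_F}-{_F}T{_F}:{_F}:{_F}")


def _unknowns_trail(fields):
    """True if at least one field is '??' and no known field follows an unknown one."""
    flags = [f == "??" for f in fields]
    return any(flags) and flags == sorted(flags)


def partial_kind(text):
    """Return 'date', 'time' or 'date_time' for a valid '?' pattern, else None."""
    for kind, pattern in (("date", _DATE), ("time", _TIME), ("date_time", _DATE_TIME)):
        m = pattern.fullmatch(text)
        if m and _unknowns_trail(m.groups()):
            return kind
    return None
```

A value such as `2004-??-12` is rejected by this sketch, since a known day following an unknown month does not correspond to any of the listed patterns.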

7.2. Intervals of Ordered Primitive Types

Intervals of any ordered primitive type, i.e., Integer, Real, Date, Time, Date_time and Duration, can be stated using the following uniform syntax, where N, M are instances of any of the ordered types:

    |N..M|        -- the two-sided range N <= x <= M;
    |>N..M|       -- the two-sided range N < x <= M;
    |N..<M|       -- the two-sided range N <= x < M;
    |>N..<M|      -- the two-sided range N < x < M;
    |<N|          -- the one-sided range x < N;
    |>N|          -- the one-sided range x > N;
    |>=N|         -- the one-sided range x >= N;
    |<=N|         -- the one-sided range x <= N;
    |N +/-M|      -- interval of N ± M.

The allowable values for N and M include any value in the range of the relevant type.

Examples of this syntax include:

    |0..5|              -- integer interval
    |0.0..1000.0|       -- real interval
    |0.0..<1000.0|      -- real interval 0.0 <= x < 1000.0
    |08:02..09:10|      -- interval of time
    |>=1939-02-01|      -- open-ended interval of dates
    |5.0 +/-0.5|        -- 4.5 - 5.5
    |>=0|               -- >= 0
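The interval forms can be read into a simple lower/upper structure, as the following Python sketch shows. It covers numeric intervals only (dates, times and durations would need their own value parsers), the class and function names are hypothetical, and treating an unbounded side as "not included" is an assumption of the sketch.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interval:
    lower: Optional[float]   # None means unbounded below
    upper: Optional[float]   # None means unbounded above
    lower_included: bool
    upper_included: bool


_NUM = r"[+-]?\d+(?:\.\d+)?"
_TWO_SIDED = re.compile(rf"\|(>?)({_NUM})\.\.(<?)({_NUM})\|")
_ONE_SIDED = re.compile(rf"\|(>=|<=|>|<)({_NUM})\|")
_PLUS_MINUS = re.compile(rf"\|({_NUM}) \+/-({_NUM})\|")


def parse_interval(text):
    """Parse a numeric interval such as |0..5|, |>=0| or |5.0 +/-0.5|."""
    m = _TWO_SIDED.fullmatch(text)
    if m:
        gt, lo, lt, hi = m.groups()
        # A leading '>' excludes the lower limit, a '<' excludes the upper one.
        return Interval(float(lo), float(hi), gt == "", lt == "")
    m = _ONE_SIDED.fullmatch(text)
    if m:
        op, n = m.group(1), float(m.group(2))
        if op.startswith(">"):
            return Interval(n, None, op == ">=", False)
        return Interval(None, n, False, op == "<=")
    m = _PLUS_MINUS.fullmatch(text)
    if m:
        n, delta = float(m.group(1)), float(m.group(2))
        return Interval(n - delta, n + delta, True, True)
    raise ValueError(f"not a numeric interval: {text!r}")
```

For example, `parse_interval("|0.0..<1000.0|")` yields a lower limit of 0.0 that is included and an upper limit of 1000.0 that is excluded.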

7.3. Other Built-in Types

7.3.1. URIs

URIs can be expressed as ODIN data in the usual way found on the web, and follow the standard syntax from [rfc_3986]. Examples of URIs in ODIN:

Encoding of special characters in URIs follows the IETF RFC 3986, as described in the section File Encoding.

7.3.2. Coded Terms

Coded terms are ubiquitous in medical and clinical information, and are likely to become so in most other industries, as ontologically-based information systems and the 'semantic web' emerge. The logical structure of a coded term is simple: it consists of an identifier of a terminology (with optional version), and an identifier of a code within that terminology. The ODIN string representation is of the following form:

    [terminology_id::code]

where terminology_id is an alphanumeric name, optionally followed by a version in parentheses, and code is a string. The allowed characters in each part are described in the grammar.

Examples from clinical data:

    [icd10AM::F60.1]            -- from ICD10AM
    [snomed_ct::2004950]        -- from snomed-ct
    [snomed_ct(3.1)::2004950]   -- from snomed-ct v 3.1
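Splitting a coded term into its parts can be sketched as follows in Python. The names `TermCode` and `parse_term` are hypothetical, and the character classes used for the version and the code are deliberately permissive so that the dotted examples above are accepted; that permissiveness is an assumption of this sketch, not a statement of the normative grammar.

```python
import re
from typing import NamedTuple, Optional


class TermCode(NamedTuple):
    terminology_id: str
    version: Optional[str]   # None when no version is given
    code: str


# '[' terminology_id [ '(' version ')' ] '::' code ']'
_TERM = re.compile(r"\[([\w-]+)(?:\(([^()\[\]]+)\))?::([^\[\]]+)\]")


def parse_term(text):
    """Split a coded term such as [snomed_ct(3.1)::2004950] into its parts."""
    m = _TERM.fullmatch(text)
    if m is None:
        raise ValueError(f"not a coded term: {text!r}")
    return TermCode(*m.groups())
```

For instance, `parse_term("[snomed_ct(3.1)::2004950]")` separates the terminology identifier `snomed_ct`, the version `3.1` and the code `2004950`.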

7.4. Lists of Built-in Types

Data of any primitive type can occur singly or in lists, which are shown as comma-separated lists of items, all of the same type, such as in the following examples:

    "cyan", "magenta", "yellow", "black"    -- printer's colours
    1, 1, 2, 3, 5                           -- first 5 fibonacci numbers
    08:02, 08:35, 09:10                     -- set of train times

No assumption is made in the syntax about whether a list represents a set, a list or some other kind of sequence - such semantics must be taken from an underlying information model.

Lists which happen to have only one datum are indicated by using a comma followed by a list continuation marker of three dots, i.e. "...", e.g.:

    "en", ...       -- languages
    "icd10", ...    -- terminologies
    [at0200], ...

White space may be freely used or avoided in lists, i.e. the following two lists are identical:

    1, 1, 2, 3
    1,1,2,3
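The list conventions of this section (comma separation, optional white space, and the continuation marker for single-item lists) can be illustrated with a short Python sketch. It returns the items as raw text without interpreting their types, does not handle double quotes embedded in strings, and the name `split_list` is hypothetical.

```python
import re

# An item is either a double-quoted string or a run of characters that
# contains neither commas nor white space.
_ITEM = re.compile(r'"[^"]*"|[^,\s]+')


def split_list(text):
    """Split the text of an ODIN list into its items (as raw strings)."""
    items = _ITEM.findall(text)
    if items and items[-1] == "...":
        items.pop()  # the continuation marker is not a datum
    return items
```

Under this sketch, `split_list('"en", ...')` gives a list with the single item `"en"`, and lists that differ only in white space give the same result.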

8. Path Syntax

8.1. Semantics

The general form of the path syntax is as follows (see syntax section below for full specification):

path            =     ['/'] path_segment { '/' path_segment } ;
path_segment    =     attr_name [ '[' object_id ']' ] ;

Essentially, ODIN paths consist of segments separated by slash ('/') characters, where each segment is an attribute name with an optional object identifier predicate, delimited by brackets ('[]').

Paths are absolute or relative with respect to the document in which they are mentioned. Absolute paths commence with an initial slash ('/') character.

A typical ODIN path used to refer to a node in an ODIN text is as follows.


In the following sections, paths are shown for all the ODIN data examples.

8.2. Relationship with W3C Xpath

The ODIN path syntax is semantically a subset of the Xpath query language, with a few syntactic shortcuts to reduce the verbosity of the most common cases. Xpath differentiates between "children" and "attributes" sub-items of an object due to the difference in XML between Elements (true sub-objects) and Attributes (tag-embedded primitive values). In ODIN, as with any pure object formalism, there is no such distinction, and all subparts of any object are referenced in the manner of Xpath children; in particular, in the Xpath abbreviated syntax, the key child:: does not need to be used.

ODIN does not distinguish attributes from children, and also assumes the node_id attribute. Thus, the following expressions are legal for cADL structures:

items[1]            -- the first member of 'items'
items["systolic"]   -- the member of 'items' with meaning 'systolic'
items["at0001"]     -- the member of 'items' with node id 'at0001'

The Xpath equivalents are:

items[1]                                -- the first member of 'items'
items[@key = 'systolic']                -- the member of 'items' with key "systolic"
items[@archetype_node_id = 'at0001']    -- the member of 'items' with archetype_node_id attribute 'at0001'

9. Plug-in Syntaxes

Using the ODIN syntax, any object structure can be serialised. In some cases, the requirement is to express some part of the structure in an abstract syntax, rather than in the more literal serialised object form of ODIN. ODIN provides for this possibility by allowing the value of any object (i.e. what appears between any matching pair of <> delimiters) to be expressed in some other syntax, known as a "plug-in" syntax. Plug-in syntaxes are indicated in ODIN in a similar way to typed objects, i.e. by the use of the syntax type in parentheses preceding the <> block. For a plug-in section, the <> delimiters are modified to <# #>, to allow for easier parser design, and easier recognition of such blocks by human readers. The general form is as follows:

attr_name = (syntax) <#
    ...
#>

The following example illustrates a cADL plug-in section in an archetype, which is itself an ODIN document:

definition = (cadl) <#
    ENTRY[at0000] ∈ { -- blood pressure measurement
        name ∈ { -- any synonym of BP
            CODED_TEXT ∈ {
                code ∈ {
                    CODE_PHRASE ∈ {[ac0001]}
                }
            }
        }
    }
#>

Clearly, many plug-in syntaxes might one day be used within ODIN data; there is no guarantee that every ODIN parser will support them. The general approach to parsing should be to use plug-in parsers, i.e. to obtain a parser for a plug-in syntax that can be built into the existing parser framework.
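The plug-in parser approach can be sketched in Python as a registry that maps a syntax name to a parsing function. This is an illustration only: the names `PLUGIN` and `parse_plugins` are hypothetical, the section is located with a regular expression rather than a real ODIN parser, and a plug-in body that itself contains the characters `#>` is not handled.

```python
import re

# attr_name = (syntax) <# ... #>
PLUGIN = re.compile(
    r"(?P<attr>[a-z]\w*)\s*=\s*\((?P<syntax>\w+)\)\s*<#(?P<body>.*?)#>",
    re.DOTALL,
)


def parse_plugins(odin_text, parsers):
    """Hand each plug-in section to the parser registered for its syntax.

    parsers -- dict mapping a syntax name (e.g. 'cadl') to a callable that
               takes the section text and returns a parsed result.
    """
    results = {}
    for m in PLUGIN.finditer(odin_text):
        syntax = m.group("syntax")
        if syntax not in parsers:
            raise LookupError(f"no plug-in parser registered for {syntax!r}")
        results[m.group("attr")] = parsers[syntax](m.group("body"))
    return results


text = "definition = (cadl) <#\n    ENTRY[at0000] ∈ { }\n#>"
# A stand-in 'parser' that only trims the section; a real one would parse cADL.
parsed = parse_plugins(text, {"cadl": lambda body: body.strip()})
```

A parser that does not support a given syntax fails explicitly here, rather than silently skipping the section.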

Appendix A: Relationship with other Syntaxes

A.1. XML

A common question about ODIN is why it is needed, when there is already XML? To start with, this question highlights the widespread misconception about XML, namely that because it can be read by a text editor, it is intended for humans. In fact, XML is designed for machine processing, and is textual to guarantee its interoperability, not its readability. Realistic examples of XML (e.g. XML-schema instance, OWL-RDF ontologies) are generally unreadable for humans. ODIN is on the other hand designed as a human-writable and readable formalism that is also machine processable; it may be thought of as an abstract syntax for object-oriented data. ODIN also differs from XML by:

  • providing a more comprehensive set of leaf data types, including intervals of numerics and date/time types, and lists of all primitive types;

  • adhering to object-oriented semantics, particularly for container types, which XML schema languages generally do not;

  • not using the confusing XML notion of 'attributes' and 'elements' to represent what are essentially object properties;

  • requiring roughly half the space of the equivalent XML.

This does not prevent ODIN documents being converted to XML, and indeed the conversion to XML instance is rather straightforward.

A.1.1. Expression of ODIN in XML

The ODIN syntax maps relatively easily to XML instance. It is important to realise that developers using XML often develop different mappings for object-oriented data, due to the fact that XML does not have systematic object-oriented semantics. This is particularly the case where containers such as lists and sets, e.g. 'employees: List<Person>', are mapped to XML; many implementors have to invent additional tags such as 'employee' to make the mapping appear visually correct. The particular mapping chosen here is designed to be a faithful reflection of the semantics of the object-oriented data, and does not try to take into account the visual aesthetics of the XML. The result is that Xpath expressions are the same for ODIN and XML, and also correspond to what one would expect based on an underlying object model.

The main elements of the mapping are as follows.

Single Attributes

Single attribute nodes map to tagged nodes of the same name.

Container Attributes

Container attribute nodes map to a series of tagged nodes of the same name, each with the XML attribute 'id' set to the ODIN key. For example, the ODIN:

subjects = <
    ["philosophy:plato"] = <
        name = <"philosophy">
    >
    ["philosophy:kant"] = <
        name = <"philosophy">
    >
>

maps to the XML:

<subjects id="philosophy:plato">
    <name> philosophy </name>
</subjects>
<subjects id="philosophy:kant">
    <name> philosophy </name>
</subjects>

This guarantees that the path subjects[@id='philosophy:plato']/name navigates to the same element in both ODIN and the XML.
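The container mapping can be tried out with the Python standard library. The sketch below assumes the ODIN data is already available as a nested dictionary, and it adds a wrapper element named `data` only because an XML document needs a single root; both the wrapper and the function name `container_to_xml` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory form of the ODIN 'subjects' container above.
subjects = {
    "philosophy:plato": {"name": "philosophy"},
    "philosophy:kant": {"name": "philosophy"},
}


def container_to_xml(parent, attr_name, container):
    """Emit one element per container member, named after the attribute,
    with the ODIN key carried in the XML attribute 'id'."""
    for key, obj in container.items():
        node = ET.SubElement(parent, attr_name, {"id": key})
        for child_name, value in obj.items():
            ET.SubElement(node, child_name).text = value


root = ET.Element("data")  # wrapper element, an assumption of this sketch
container_to_xml(root, "subjects", subjects)

# The same path expression as in the text, evaluated against the XML.
plato_name = root.find("subjects[@id='philosophy:plato']/name")
```

The path used in `root.find` is the one given above, which is what makes the mapping faithful to the ODIN structure.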

Nested Container Attributes

Nested container attribute nodes map to a series of tagged nodes of the same name, each with the XML attribute 'id' set to the ODIN key. For example, consider an object structure defined by the signature countries:Hash<Hash<Hotel,String>,String>. An instance of this in ODIN looks as follows:

countries = <
    ["spain"] = <
        ["hotels"] = <...>
        ["attractions"] = <...>
    >
    ["egypt"] = <
        ["hotels"] = <...>
        ["attractions"] = <...>
    >
>

This can be mapped to XML in which the synthesised element tag "_items" and the attribute "key" are used:

<countries key="spain">
    <_items key="hotels"> ... </_items>
    <_items key="attractions"> ... </_items>
</countries>
<countries key="egypt">
    <_items key="hotels"> ... </_items>
    <_items key="attractions"> ... </_items>
</countries>

In this case, the ODIN path countries["spain"]/["hotels"] will be transformed to the Xpath countries[@key="spain"]/_items[@key="hotels"] in order to navigate to the same element.
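The path transformation just described can be sketched as a small Python function. It is a simplification under stated assumptions: keys are double-quoted strings that contain neither '/' nor '"', and the function name `to_xpath` and its parameters `key_attr` and `items_tag` are hypothetical.

```python
import re

# A segment is an optional attribute name followed by an optional ["key"].
_SEGMENT = re.compile(r'(?P<attr>[A-Za-z_]\w*)?(?:\["(?P<key>[^"]*)"\])?')


def to_xpath(odin_path, key_attr="key", items_tag="_items"):
    """Rewrite an ODIN path as an Xpath over the XML mapping.

    A segment with no attribute name (a nested container member) is given
    the synthesised element tag, '_items' by default.
    """
    out = []
    for seg in odin_path.split("/"):
        if seg == "":
            out.append("")  # preserves the leading '/' of an absolute path
            continue
        m = _SEGMENT.fullmatch(seg)
        if m is None:
            raise ValueError(f"bad path segment: {seg!r}")
        tag = m.group("attr") or items_tag
        key = m.group("key")
        out.append(tag if key is None else f'{tag}[@{key_attr}="{key}"]')
    return "/".join(out)


xpath = to_xpath('countries["spain"]/["hotels"]')
```

Passing `key_attr="id"` gives the form used for simple containers, where the ODIN key is carried in the 'id' attribute.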

Type Names

Type names map to XML 'type' attributes e.g. the ODIN:

destinations = <
    ["seville"] = (TOURIST_DESTINATION) <
        profile = (DESTINATION_PROFILE) <...>
        hotels = <
            ["gran sevilla"] = (HISTORIC_HOTEL) <...>
        >
    >
>

maps to:

<destinations id="seville" rm:type="TOURIST_DESTINATION">
    <profile rm:type="DESTINATION_PROFILE"> ... </profile>
    <hotels id="gran sevilla" rm:type="HISTORIC_HOTEL"> ... </hotels>
</destinations>


A.2. JSON

The JavaScript Object Notation (JSON) was designed with the aim of representing JavaScript data objects in a programming language independent way, primarily for use with the web and JavaScript. The majority of use was for small fragments, although in more recent years it is starting to be used for more complex data representation tasks, for example with REST web services.

A.2.1. Leaf types

ODIN has more terminal types than JSON, including the date/time types, and the Interval types.

Date/time types would typically be mapped to and from Strings containing ISO8601 syntax dates and times.

The interval is a built-in ODIN type that would need to be explicitly expanded into a JSON structure, with an assumed model of the parts of the Interval. For this purpose, the following model is recommended as a basis for constructing the JSON equivalent:

class Interval <T: Ordered> {
    T lower;
    T upper;
    Boolean lower_included;
    Boolean upper_included;
}

A.2.2. Typing

ODIN supports optional type markers, which are not available with JSON. In a conversion situation these would need to be converted to an explicit structure.
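The two conversions described in this appendix, expanding an interval and making a type marker explicit, might look as follows in Python. This is a sketch under assumptions: the member name `_type` used to carry the type marker is not defined by ODIN or JSON, and the attribute names `name` and `price_range` are invented for the example.

```python
import json


def interval_to_json(lower, upper, lower_included=True, upper_included=True):
    """Expand an interval into the four fields of the recommended model."""
    return {
        "lower": lower,
        "upper": upper,
        "lower_included": lower_included,
        "upper_included": upper_included,
    }


def typed_object_to_json(type_name, attributes, type_key="_type"):
    """Carry an ODIN type marker as an ordinary member.

    The member name '_type' is an assumption of this sketch.
    """
    return {type_key: type_name, **attributes}


# (HISTORIC_HOTEL) < name = <"gran sevilla"> price_range = <|0.0..<1000.0|> >
hotel = typed_object_to_json("HISTORIC_HOTEL", {
    "name": "gran sevilla",
    "price_range": interval_to_json(0.0, 1000.0, upper_included=False),
})
text = json.dumps(hotel)
```

A consumer of the JSON has to know both conventions in advance, which is the cost of the conversion noted above.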

Appendix B: Syntax Specification

The grammar and lexical specification for the standard ODIN syntax is shown below in ANTLR4 form.  

//	description: Antlr4 grammar for Object Data Instance Notation (ODIN)
//	author:      Thomas Beale
//	support:     openEHR Specifications PR tracker
//	copyright:   Copyright (c) 2015 openEHR Foundation
//	license:     Apache 2.0 License

grammar odin;
import odin_values;

// -------------------------- Parse Rules --------------------------

odin_text :
      attr_vals
    | object_value_block
    ;

attr_vals : ( attr_val ';'? )+ ;

attr_val : attribute_id '=' object_block ;

object_block :
      object_value_block
    | object_reference_block
    ;

object_value_block : ( '(' type_id ')' )? '<' ( primitive_object | attr_vals? | keyed_object* ) '>' ;

keyed_object : '[' primitive_value ']' '=' object_block ; // TODO: probably should limit to String and Integer?

// ------ leaf types ------

primitive_object :
      primitive_value
    | primitive_list_value
    | primitive_interval_value
    ;

primitive_value :
      string_value
    | integer_value
    | real_value
    | boolean_value
    | character_value
    | term_code_value
    | date_value
    | time_value
    | date_time_value
    | duration_value
    | uri_value
    ;

primitive_list_value :
      string_list_value
    | integer_list_value
    | real_list_value
    | boolean_list_value
    | character_list_value
    | term_code_list_value
    | date_list_value
    | time_list_value
    | date_time_list_value
    | duration_list_value
    ;

primitive_interval_value :
      integer_interval_value
    | real_interval_value
    | date_interval_value
    | time_interval_value
    | date_time_interval_value
    | duration_interval_value
    ;

object_reference_block : '<' odin_path_list '>' ;

odin_path_list     : odin_path ( ( ',' odin_path )+ | SYM_LIST_CONTINUE )? ;
odin_path          : '/' | odin_path_segment+ ;
odin_path_segment  : '/' odin_path_element ;
odin_path_element  : attribute_id ( '[' ( STRING | INTEGER ) ']' )? ;

The following grammar defines ODIN terminal value syntax, and can be imported by any parser needing the ODIN values.

// grammar defining ODIN terminal value types, including atoms, lists and intervals

grammar odin_values;
import base_patterns;

string_value : STRING ;
string_list_value : string_value ( ( ',' string_value )+ | ',' SYM_LIST_CONTINUE ) ;

integer_value : ( '+' | '-' )? INTEGER ;
integer_list_value : integer_value ( ( ',' integer_value )+ | ',' SYM_LIST_CONTINUE ) ;
integer_interval_value :
      '|' SYM_GT? integer_value SYM_INTERVAL_SEP SYM_LT? integer_value '|'
    | '|' relop? integer_value '|'
    ;
integer_interval_list_value : integer_interval_value ( ( ',' integer_interval_value )+ | ',' SYM_LIST_CONTINUE ) ;

real_value : ( '+' | '-' )? REAL ;
real_list_value : real_value ( ( ',' real_value )+ | ',' SYM_LIST_CONTINUE ) ;
real_interval_value :
      '|' SYM_GT? real_value SYM_INTERVAL_SEP SYM_LT? real_value '|'
    | '|' relop? real_value '|'
    ;
real_interval_list_value : real_interval_value ( ( ',' real_interval_value )+ | ',' SYM_LIST_CONTINUE ) ;

boolean_value : SYM_TRUE | SYM_FALSE ;
boolean_list_value : boolean_value ( ( ',' boolean_value )+ | ',' SYM_LIST_CONTINUE ) ;

character_value : CHARACTER ;
character_list_value : character_value ( ( ',' character_value )+ | ',' SYM_LIST_CONTINUE ) ;

date_value : ISO8601_DATE ;
date_list_value : date_value ( ( ',' date_value )+ | ',' SYM_LIST_CONTINUE ) ;
date_interval_value :
      '|' SYM_GT? date_value SYM_INTERVAL_SEP SYM_LT? date_value '|'
    | '|' relop? date_value '|'
    ;
date_interval_list_value : date_interval_value ( ( ',' date_interval_value )+ | ',' SYM_LIST_CONTINUE ) ;

time_value : ISO8601_TIME ;
time_list_value : time_value ( ( ',' time_value )+ | ',' SYM_LIST_CONTINUE ) ;
time_interval_value :
      '|' SYM_GT? time_value SYM_INTERVAL_SEP SYM_LT? time_value '|'
    | '|' relop? time_value '|'
    ;
time_interval_list_value : time_interval_value ( ( ',' time_interval_value )+ | ',' SYM_LIST_CONTINUE ) ;

date_time_value : ISO8601_DATE_TIME ;
date_time_list_value : date_time_value ( ( ',' date_time_value )+ | ',' SYM_LIST_CONTINUE ) ;
date_time_interval_value :
      '|' SYM_GT? date_time_value SYM_INTERVAL_SEP SYM_LT? date_time_value '|'
    | '|' relop? date_time_value '|'
    ;
date_time_interval_list_value : date_time_interval_value ( ( ',' date_time_interval_value )+ | ',' SYM_LIST_CONTINUE ) ;

duration_value : ISO8601_DURATION ;
duration_list_value : duration_value ( ( ',' duration_value )+ | ',' SYM_LIST_CONTINUE ) ;
duration_interval_value :
      '|' SYM_GT? duration_value SYM_INTERVAL_SEP SYM_LT? duration_value '|'
    | '|' relop? duration_value '|'
    ;
duration_interval_list_value : duration_interval_value ( ( ',' duration_interval_value )+ | ',' SYM_LIST_CONTINUE ) ;

term_code_value : TERM_CODE_REF ;
term_code_list_value : term_code_value ( ( ',' term_code_value )+ | ',' SYM_LIST_CONTINUE ) ;

uri_value : URI ;

relop : SYM_GT | SYM_LT | SYM_LE | SYM_GE ;

The following grammar defines syntax of generic base patterns.

//  General purpose patterns used in all openEHR parser and lexer tools

grammar base_patterns;

// -------------------------- Parse Rules --------------------------

type_id      : ALPHA_UC_ID ( '<' type_id ( ',' type_id )* '>' )? ;
attribute_id : ALPHA_LC_ID ;
identifier   : ALPHA_UC_ID | ALPHA_LC_ID ;


// -------------------------- Lexer patterns --------------------------

// ---------- symbols ----------

SYM_GT : '>' ;
SYM_LT : '<' ;
SYM_LE : '<=' ;
SYM_GE : '>=' ;
SYM_NE : '/=' | '!=' ;
SYM_EQ : '=' ;


// ---------- whitespace & comments ----------

WS         : [ \t\r]+    -> skip ;
LINE       : '\n'        -> skip ;     // increment line count
H_CMT_LINE : '--------' '-'*? '\n'  ;  // special type of comment for splitting template overlays
CMT_LINE   : '--' .*? '\n'  -> skip ;  // (increment line count)

// ---------- ISO8601 Date/Time values ----------

// TODO: consider adding non-standard but unambiguous patterns like YEAR '-' ( MONTH | '??' ) '-' ( DAY | '??' )
ISO8601_DATE      : YEAR '-' MONTH ( '-' DAY )? ;
ISO8601_TIME      : HOUR ':' MINUTE ( ':' SECOND ( ',' INTEGER )?)? ( TIMEZONE )? ;
ISO8601_DATE_TIME : YEAR '-' MONTH '-' DAY 'T' HOUR (':' MINUTE (':' SECOND ( ',' DIGIT+ )?)?)? ( TIMEZONE )? ;
fragment TIMEZONE : 'Z' | ('+'|'-') HOUR_MIN ;   // hour offset, e.g. `+0930`, or else literal `Z` indicating +0000.
fragment YEAR     : [1-9][0-9]* ;
fragment MONTH    : ( [0][0-9] | [1][0-2] ) ;    // month in year
fragment DAY      : ( [012][0-9] | [3][0-2] ) ;  // day in month
fragment HOUR     : ( [01]?[0-9] | [2][0-3] ) ;  // hour in 24 hour clock
fragment MINUTE   : [0-5][0-9] ;                 // minutes
fragment HOUR_MIN : ( [01]?[0-9] | [2][0-3] ) [0-5][0-9] ;  // hour / minutes quad digit pattern
fragment SECOND   : [0-5][0-9] ;                 // seconds

// ISO8601 DURATION PnYnMnWnDTnnHnnMnn.nnnS
// here we allow a deviation from the standard to allow weeks to be mixed in with the rest since this commonly occurs in medicine
// TODO: the following will incorrectly match just 'P'
ISO8601_DURATION : 'P' (DIGIT+ [yY])? (DIGIT+ [mM])? (DIGIT+ [wW])? (DIGIT+[dD])? ('T' (DIGIT+[hH])? (DIGIT+[mM])? (DIGIT+ ('.'DIGIT+)?[sS])?)? ;

// ------------------- special word symbols --------------
SYM_TRUE  : [Tt][Rr][Uu][Ee] ;
SYM_FALSE : [Ff][Aa][Ll][Ss][Ee] ;

// ---------------------- Identifiers ---------------------

VERSION_ID          : DIGIT+ '.' DIGIT+ '.' DIGIT+ ( ( '-rc' | '-alpha' ) ( '.' DIGIT+ )? )? ;

// --------------------- composed primitive types -------------------

TERM_CODE_REF : '[' NAME_CHAR+ ( '(' NAME_CHAR+ ')' )? '::' NAME_CHAR+ ']' ;  // e.g. [ICD10AM(1998)::F23]; [ISO_639-1::en]

// URIs - simple recogniser based on and
fragment URI_AUTHORITY : ( URI_USER '@' )? URI_HOST ( ':' NATURAL )? ;
fragment URI_PATH   : ( '/' URI_XPALPHA+ )+ ;
fragment URI_QUERY  : URI_XALPHA+ ( '+' URI_XALPHA+ )* ;

fragment IPV6_LITERAL : HEX_QUAD (':' HEX_QUAD )* '::' HEX_QUAD (':' HEX_QUAD )* ;

fragment URI_XPALPHA : URI_XALPHA | '+' ;
fragment URI_SAFE   : [$@.&_-] ;
fragment URI_EXTRA  : [!*"'()] ;
fragment URI_RESERVED : [=;/#?: ] ;

fragment NATURAL  : [1-9][0-9]* ;

// According to IETF[RFC 1034] and[RFC 1035],
// as clarified by[RFC 2181] (section 11)
fragment NAMESPACE : LABEL ('.' LABEL)+ ;


ALPHA_UC_ID : ALPHA_UCHAR WORD_CHAR* ;           // used for type ids
ALPHA_LC_ID : ALPHA_LCHAR WORD_CHAR* ;           // used for attribute / method ids

// --------------------- atomic primitive types -------------------

fragment E_SUFFIX : [eE][+-]? DIGIT+ ;

STRING : '"' STRING_CHAR*? '"' ;
fragment STRING_CHAR : ~["\\] | ESCAPE_SEQ | UTF8CHAR ; // strings can be multi-line

CHARACTER : '\'' CHAR '\'' ;
fragment CHAR : ~['\\\r\n] | ESCAPE_SEQ | UTF8CHAR  ;

fragment ESCAPE_SEQ: '\\' ['"?abfnrtv\\] ;

// ------------------- character fragments ------------------

fragment NAME_CHAR     : WORD_CHAR | '-' ;
fragment WORD_CHAR     : ALPHANUM_CHAR | '_' ;

fragment ALPHA_CHAR  : [a-zA-Z] ;
fragment ALPHA_UCHAR : [A-Z] ;
fragment ALPHA_LCHAR : [a-z] ;

fragment DIGIT     : [0-9] ;
fragment HEX_DIGIT : [0-9a-fA-F] ;



  1. [Anderson_1996] Ross Anderson. Security in Clinical Information Systems. Available at

  2. [Baretto_2005] Barretto S A. Designing Guideline-based Workflow-Integrated Electronic Health Records. 2005. PhD dissertation, University of South Australia. Available at

  3. [Beale_2000] Beale T. Archetypes: Constraint-based Domain Models for Future-proof Information Systems. 2000. Available at .

  4. [Beale_2002] Beale T. Archetypes: Constraint-based Domain Models for Future-proof Information Systems. Eleventh OOPSLA Workshop on Behavioral Semantics: Serving the Customer (Seattle, Washington, USA, November 4, 2002). Edited by Kenneth Baclawski and Haim Kilov. Northeastern University, Boston, 2002, pp. 16-32. Available at .

  5. [Beale_Heard_2007] Beale T, Heard S. An Ontology-based Model of Clinical Information. 2007. pp760-764 Proceedings MedInfo 2007, K. Kuhn et al. (Eds), IOS Publishing 2007. See

  6. [Booch_1994] Booch G. Object-Oriented Analysis and Design with applications. 2nd ed. Benjamin/Cummings 1994.

  7. [Browne_2005] Browne E D. Workflow Modelling of Coordinated Inter-Health-Provider Care Plans. 2005. PhD dissertation, University of South Australia. Available at

  8. [Cimino_1997] Cimino J J. Desiderata for Controlled Medical vocabularies in the Twenty-First Century. IMIA WG6 Conference, Jacksonville, Florida, Jan 19-22, 1997.

  9. [Eiffel] Meyer B. Eiffel the Language (2nd Ed). Prentice Hall, 1992.

  10. [Elstein_1987] Elstein AS, Shulman LS, Sprafka SA. Medical problem solving: an analysis of clinical reasoning. Cambridge, MA: Harvard University Press 1987.

  11. [Elstein_Schwarz_2002] Elstein AS, Schwarz A. Evidence base of clinical diagnosis: Clinical problem solving and diagnostic decision making: selective review of the cognitive literature. BMJ 2002;324;729-732.

  12. [Fowler_1997] Fowler M. Analysis Patterns: Reusable Object Models. Addison Wesley 1997

  13. [Fowler_Scott_2000] Fowler M, Scott K. UML Distilled (2nd Ed.). Addison Wesley Longman 2000.

  14. [Gray_reuter_1993] Gray J, Reuter A. Transaction Processing Concepts and Techniques. Morgan Kaufmann 1993.

  15. [Hein_2002] Hein J L. Discrete Structures, Logic and Computability (2nd Ed). Jones and Bartlett 2002.

  16. [Hnìtynka_2004] Hnìtynka P, Plášil F. Distributed Versioning Model for MOF. Proceedings of WISICT 2004, Cancun, Mexico, A volume in the ACM international conference proceedings series, published by Computer Science Press, Trinity College Dublin Ireland, 2004.

  17. [Ingram_1995] Ingram D. The Good European Health Record Project. Laires, Laderia Christensen, Eds. Health in the New Communications Age. Amsterdam: IOS Press; 1995; pp. 66-74.

  18. [Kifer_Lausen_Wu_1995] Kifer M, Lausen G, Wu J. Logical Foundations of Object-Oriented and Frame-Based Languages. JACM May 1995. See

  19. [Kilov_1994] Kilov H, Ross J. Information Modelling - an object-oriented approach. Prentice Hall 1994.

  20. [Maier_2000] Maier M. Architecting Principles for Systems-of-Systems. Technical Report, University of Alabama in Huntsville. 2000. Available at

  21. [Martin] Martin P. Translations between UML, OWL, KIF and the WebKB-2 languages (For-Taxonomy, Frame-CG, Formalized English). May/June 2003. Available at as at Aug 2004.

  22. [Meyer_OOSC2] Meyer B. Object-oriented Software Construction, 2nd Ed. Prentice Hall 1997

  23. [Müller_2003] Müller R. Event-oriented Dynamic Adaptation of Workflows: Model, Architecture, and Implementation. 2003. PhD dissertation, University of Leipzig. Available at

  24. [Object_Z] Smith G. The Object Z Specification Language. Kluwer Academic Publishers 2000. See .

  25. [Rector_1994] Rector A L, Nowlan W A, Kay S. Foundations for an Electronic Medical Record. The IMIA Yearbook of Medical Informatics 1992 (Eds. van Bemmel J, McRay A). Stuttgart Schattauer 1994.

  26. [Rector] Rector A L. Clinical terminology: why is it so hard? Methods Inf Med. 1999 Dec;38(4-5):239-52. Available at .

  27. [Richards_1998] Richards E G. Mapping Time - The Calendar and its History. Oxford University Press 1998.

  28. [Sowa_2000] Sowa J F. Knowledge Representation: Logical, Philosophical, and Computational Foundations. 2000, Brooks/Cole, California.

  29. [Sottile_1999] Sottile P.A., Ferrara F.M., Grimson W., Kalra D., and Scherrer J.R. The holistic healthcare information system. Toward an Electronic Health Record Europe. 1999. Nov 1999; 259-266.

  30. [Van_de_Velde_Degoulet_2003] Van de Velde R, Degoulet P. Clinical Information Systems: A Component-Based Approach. 2003. Springer-Verlag New York.

  31. [Weed_1969] Weed LL. Medical records, medical education and patient care. 6 ed. Chicago: Year Book Medical Publishers Inc. 1969.



  1. [bfo] Institute for Formal Ontology and Medical Information Science (IFOMIS). Basic Formal Ontology (BFO). .

  2. [FMA] .

  3. [Horrocks_owl] Patel-Schneider P, Horrocks I, Hayes P. OWL Web Ontology Language Semantics and Abstract Syntax. See .

  4. [IAO] Information Artefact Ontology. .

  5. [OBO] The Open Biological and Biomedical Ontologies. See .

  6. [OGMS] Ontology for General Medical Science (OGMS). .


  1. [cov_contra] Wikipedia. Covariance and contravariance. See .

e-Health Standards

  1. [ENV_13606-1] ENV 13606-1 - Electronic healthcare record communication - Part 1: Extended architecture. CEN/TC 251 Health Informatics Technical Committee.

  2. [ENV_13606-2] ENV 13606-2 - Electronic healthcare record communication - Part 2: Domain term list. CEN/TC 251 Health Informatics Technical Committee.

  3. [ENV_13606-3] ENV 13606-3 - Electronic healthcare record communication - Part 3: Distribution rules. CEN/TC 251 Health Informatics Technical Committee.

  4. [ENV_13606-4] ENV 13606-4 - Electronic Healthcare Record Communication standard Part 4: Messages for the exchange of information. CEN/TC 251 Health Informatics Technical Committee.

  5. [Corbamed_PIDS] Object Management Group. Person Identification Service. March 1999.

  6. [Corbamed_LQS] Object Management Group. Lexicon Query Service. March 1999.

  7. [HL7v3_ballot2] HL7 International. HL7 version 3 2nd Ballot specification. Available at

  8. [HL7v3_data_types] Schadow G, Biron P. HL7 version 3 deliverable: Version 3 Data Types. (2nd ballot 2002 version).

  9. [hl7_v3_rim] HL7. HL7 v3 RIM. See .

  10. [ICD10AM] WHO / ACCD. International Classification of Diseases, 10th Revision, Australian Modifications. See

  11. [IHTSDO] International Health Terminology Standards Development Organisation (IHTSDO).


  13. [NLM_UML_list] National Library of Medicine. UMLS Terminologies List.

  14. [SNOMED_CT] IHTSDO. Systematised Nomenclature of Medicine. See

  15. [WHO_ICD] World Health Organisation (WHO). International Classification of Diseases (ICD). See: .

  16. [ISO_18308] Schloeffel P. (Editor). Requirements for an Electronic Health Record Reference Architecture. (ISO TC 215/SC N; ISO/WD 18308). International Standards Organisation, Australia, 2002.

  17. [ISO_20514] ISO. The Integrated Care EHR. See .

  18. [UCUM] Schadow G, McDonald C J. The Unified Code for Units of Measure, Version 1.4 2000. Regenstrief Institute for Health Care, Indianapolis. See

e-Health Projects

  1. [CIMI] Clinical Information Modelling Initiative (CIMI) Project. See

  2. [EHCR_supA_14] Dixon R, Grubb P A, Lloyd D, and Kalra D. Consolidated List of Requirements. EHCR Support Action Deliverable 1.4. European Commission DGXIII, Brussels; May 2001. 59pp. Available from

  3. [EHCR_supA_35] Dixon R, Grubb P, Lloyd D. EHCR Support Action Deliverable 3.5: "Final Recommendations to CEN for future work". Oct 2000. Available at

  4. [EHCR_supA_24] Dixon R, Grubb P, Lloyd D. EHCR Support Action Deliverable 2.4 "Guidelines on Interpretation and implementation of CEN EHCRA". Oct 2000. Available at

  5. [ Lloyd D, et al. EHCR Support Action Deliverable 3.1&3.2 “Interim Report to CEN”. July 1998. Available at

  6. [GEHR_del_4] Deliverable 4: GEHR Requirements for Clinical Comprehensiveness. GEHR Project 1992

  7. [GEHR_del_7] Deliverable 7: Clinical Functional Specifications. GEHR Project 1993

  8. [GEHR_del_8] Deliverable 8: Ethical and legal Requirements of GEHR Architecture and Systems. GEHR Project 1994

  9. [GEHR_del_19_20_24] Deliverable 19,20,24: GEHR Architecture. GEHR Project 30/6/1995

  10. [GeHR_AUS] Heard S, Beale T. The Good Electronic Health Record (GeHR) (Australia). See .

  11. [GeHR_Aus_gpcg] Heard S. GEHR Project Australia, GPCG Trial. Available at

  12. [GeHR_Aus_req] Beale T, Heard S. GEHR Technical Requirements. See

  13. [Synapses_req_A] Kalra D. (Editor). The Synapses User Requirements and Functional Specification (Part A). EU Telematics Application Programme, Brussels; 1996; The Synapses Project: Deliverable USER 1.1.1a. 6 chapters, 176 pages.

  14. [Synapses_req_B] Grimson W. and Groth T. (Editors). The Synapses User Requirements and Functional Specification (Part B). EU Telematics Application Programme, Brussels; 1996; The Synapses Project: Deliverable USER 1.1.1b.

  15. [Synapses_odp] Kalra D. (Editor). Synapses ODP Information Viewpoint. EU Telematics Application Programme, Brussels; 1998; The Synapses Project: Final Deliverable. 10 chapters, 64 pages.

  16. [synex] University College London. SynEx project. .

General Standards

  1. [OCL] The Object Constraint Language 2.0. Object Management Group (OMG). Available at .

  2. [IANA] IANA.

  3. [IEEE_828] IEEE. IEEE 828-2005: standard for Software Configuration Management Plans.

  4. [ISO_8601] ISO 8601 standard describing formats for representing times, dates, and durations. See e.g. and

  5. [ISO_2788] ISO. ISO 2788 Guide to Establishment and development of monolingual thesauri.

  6. [ISO_5964] ISO. ISO 5964 Guide to Establishment and development of multilingual thesauri.

  7. [Perl_regex] Perl Regular Expressions. Available at .

  8. [Protege] Stanford University. See .

  9. [rfc_2396] Berners-Lee T. Universal Resource Identifiers in WWW. Available at This is a World-Wide Web RFC for global identification of resources. In current use on the web, e.g. by Mosaic, Netscape and similar tools. See for a starting point on URIs.

  10. [rfc_2440] RFC 2440: OpenPGP Message Format. See and

  11. [rfc_3986] RFC 3986: Uniform Resource Identifier (URI): Generic Syntax. IETF. See .

  12. [rfc_4122] RFC 4122: A Universally Unique IDentifier (UUID) URN Namespace. IETF. See .

  13. [rfc_2781] IETF. RFC 2781: UTF-16, an encoding of ISO 10646. See

  14. [rfc_5646] IETF. RFC 5646. Available at

  15. [sem_ver] Semantic Versioning. .

  16. [Xpath] W3C XPath 1.0 specification. 1999. Available at

  17. [uri_syntax] Uniform Resource Identifier (URI): Generic Syntax, Internet proposed standard. January 2005. see .

  18. [w3c_owl] W3C. OWL - the Web Ontology Language. See .

  19. [w3c_xpath] W3C. XML Path Language. See .


  1. [Template_Designer] Template Designer. Ocean Informatics.

openEHR Resources

  1. [openehr_18308] The openEHR Foundation. Conformance of openEHR architecture to ISO TS 18308, "Requirements for EHR Architectures". See

  2. [openEHR_ADL_workbench] The openEHR Foundation. The openEHR ADL Workbench. .

  3. [openehr_am_overview] The openEHR Foundation. The openEHR Archetypes Technical Overview. See

  4. [openehr_am_adl14] The openEHR Foundation. Archetype Definition Language 1.4 (ADL1.4). Available at

  5. [openehr_am_aom14] The openEHR Foundation. Archetype Object Model 1.4 (AOM1.4). Available at

  6. [openehr_am_adl2] The openEHR Foundation. Archetype Definition Language 2 (ADL2). Available at

  7. [openehr_am_aom2] The openEHR Foundation. Archetype Object Model 2 (AOM2). Available at

  8. [openehr_am_identification] The openEHR Foundation. Archetype Identification specification. Available at

  9. [openehr_am_def_pri] The openEHR Foundation. Archetype Definitions and Principles. (deprecated) Available at

  10. [openehr_am_sys] The openEHR Foundation. The Archetype System. (deprecated) Available at

  11. [openehr_am_oap] The openEHR Foundation. The openEHR Archetype Profile. .

  12. [openehr_CKM] The openEHR Clinical Knowledge Manager (CKM). See

  13. [openehr_odin] The openEHR Foundation. Object Data Instance Notation (ODIN). Available at

  14. [openehr_overview] The openEHR Foundation. The openEHR Architecture Overview. See

  15. [openehr_query_aql] The openEHR Foundation. The openEHR Archetype Query Language (AQL). See

  16. [openehr_rm_data_types] openEHR. Data Types Information Model. See

  17. [openehr_rm_data_structures] openEHR. Data Structures Information Model. See

  18. [openehr_rm_common] openEHR. Common Information Model. See

  19. [openehr_rm_ehr] The openEHR Foundation. The EHR Information Model.

  20. [openehr_rm_ehr_extract] The openEHR Foundation. The EHR Extract Information Model.

  21. [openehr_rm_integration] The openEHR Foundation. The Integration Information Model.

  22. [openehr_rm_support] openEHR. Support Information Model. See

  23. [openehr_terminology] The openEHR Foundation. The openEHR Terminology.

  24. [openehr_terminology_resources] The openEHR Foundation. The openEHR Terminology project (GitHub)