openEHR 2.x RM proposals - lower information model

Introduction

This page describes changes proposed to the openEHR Release 1.0.2 Reference Model (RM) in response to the many lessons learned over the years since its publication. The issues driving the changes are recorded on the Jira SPECPR issue tracker.

There are two possible flavours of proposals here. The first is for changes that have acceptable impact on the growing number of openEHR-based production systems and data. The second is for 'ideal' next generation models that don't necessarily take account of impact on existing data and systems. However, even 'blue-sky' suggestions need to be aware of the 'community memory' that exists, including people's current understanding of names and design ideas within the openEHR models. Is creating a completely new Reference Model still 'openEHR'? We need to be clear on what we understand as being an 'openEHR RM' versus something else, such as the 13606 RM, also based on the openEHR archetype design concept.

For the purpose of clarity, we suggest that this page and its children address only the 'openEHR Reference Model', not other reference models that simply use the archetype-based methodology.


Current state - Release 1.0.2

Models

The current data structure models are shown below.



Problems

Problems / irritations with the above models appear to include the following:

  • few archetypes use anything but ITEM_TREE because it appears that one 'can never know' if some more detail will be needed later
    • TB: but what about things like Apgar, Barthel etc - surely they are linear lists?
    • also, it appears that some form of 'table' structure is still needed
  • ITEM_TREE, ITEM_LIST etc cannot be nested inside each other arbitrarily.
  • the structures complicate the software unnecessarily, without adding much value (this would clearly be true if no/limited use is being made of ITEM_LIST, ITEM_SINGLE) - [question: by "use is being made", do you mean use of the class methods in software or use of the structuring possibilities? The structuring possibilities will remain if a structure_type variable is used.]
    • Depending on how you write and divide/distribute software functionality, having ITEM_STRUCTURE subclasses may just complicate class structure and not add any value at all in server/backend/query code and storage. Storing the same structure/presentation info in a structure_type variable will still give GUI code what it needs for validation and presentation but a handful of classes less to implement and maintain e.g. on the server side. Some implementations (or parts of implementations) handle openEHR structures mainly as documents, not objects, thus only stored attributes, not object methods, are used - in those cases the methods of ITEM_STRUCTURE subclasses bring absolutely no value and a structure_type variable would be easier to handle than having to store or infer object type info.
    • When learning and presenting openEHR, there will be fewer classes and one level of nesting less to consider, making the design less cluttered. 
    • "Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away." - Antoine de Saint-Exupery
  • archetype paths are made longer and more complex ... MORE INFO REQUIRED - WHAT'S THE PROBLEM?
    • Paths are used e.g. in AQL queries - shortening or simplifying paths makes queries easier to read, write and understand.
    • Having fewer nesting levels to traverse in hierarchical database backends (e.g. network-DBs and XML-DBs) or ORM mapping frameworks when fetching data from queries would likely improve performance.
    • Shorter paths also mean less to parse and translate for the query processing software, but without measuring the impact it is not possible to say whether this matters much for performance in practice.
  • a clear solution to the pizza problem (multi-value items & UI) is needed
  • add a type that is a mixture of CLUSTER and ELEMENT, i.e. has a value and also children, to allow for the fractal nature of data, with a 'summary' value, plus underlying detail
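The structure_type idea raised in the list above can be illustrated with a minimal Python sketch. It assumes a plain string marker with the hypothetical values "tree", "list" and "single" standing in for the ITEM_TREE, ITEM_LIST and ITEM_SINGLE classes; none of these names or values come from the openEHR specifications.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Element:
    name: str
    value: Optional[object] = None


@dataclass
class Cluster:
    name: str
    # Hypothetical marker replacing the ITEM_TREE / ITEM_LIST / ITEM_SINGLE classes.
    structure_type: str = "tree"
    items: List[Union["Cluster", Element]] = field(default_factory=list)


def conforms(cluster: Cluster) -> bool:
    """Check the items of a cluster against its declared structure_type."""
    only_elements = all(isinstance(i, Element) for i in cluster.items)
    if cluster.structure_type == "list":
        return only_elements
    if cluster.structure_type == "single":
        return only_elements and len(cluster.items) == 1
    if cluster.structure_type == "tree":
        return True
    return False  # unknown marker value


# A linear score such as Apgar fits the 'list' marker.
apgar = Cluster("Apgar score", "list", [Element("Heart rate", 2), Element("Breathing", 1)])
# A nested CLUSTER violates the 'list' marker.
nested = Cluster("Bad list", "list", [Cluster("Detail")])
```

The same validation logic that the ITEM_STRUCTURE subclasses carry today still has to live somewhere; here it sits in one function rather than in several classes.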

Below, various simplified models are proposed, each with an impact analysis.


Candidate A - make ITEM_STRUCTURE inherit from CLUSTER

Proposal - Thomas Beale

Status

under construction

Design concept

  • Keep ITEM_STRUCTURE and children, but just make them a variety of CLUSTER. Affect the rest of the model as little as possible.

Changes

  • ITEM_STRUCTURE now inherits from CLUSTER.
  • all static declarations in the remainder of the RM for ITEM_STRUCTURE changed to CLUSTER. In theory they should stay ITEM_STRUCTURE, but the problem is that with no static declarations anywhere for CLUSTER or ITEM, CLUSTER archetypes have no place to go, and the CLUSTER type is not detected by my current 'type closure' detecting algorithm. This should probably be changed.
  • the types ITEM_TREE, ITEM_SINGLE, ITEM_TABLE, ITEM_LIST could be kept as programming types for manipulating the specific kinds of data structure - they provide a formalisation of the respective constraints on contents (i.e. ITEM_LIST can only contain ELEMENTs, and so on)
  • Interior nodes of an ITEM_TREE (aka CLUSTER) can now be other ITEM_STRUCTURE subtypes.
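The changes listed above can be sketched in Python as follows. This is an illustrative simplification only: the class and method names mirror the RM types, but the constructor signatures and the add() method are assumptions made for the example.

```python
class Item:
    def __init__(self, name):
        self.name = name


class Element(Item):
    def __init__(self, name, value=None):
        super().__init__(name)
        self.value = value


class Cluster(Item):
    def __init__(self, name, items=None):
        super().__init__(name)
        self.items = []
        for item in items or []:
            self.add(item)  # subclasses may override add() to restrict content

    def add(self, item):
        self.items.append(item)


class ItemStructure(Cluster):
    """Candidate A: ITEM_STRUCTURE is now a kind of CLUSTER."""


class ItemTree(ItemStructure):
    """Accepts any ITEM, including other ITEM_STRUCTURE subtypes."""


class ItemList(ItemStructure):
    """Formalises the constraint that a list contains only ELEMENTs."""

    def add(self, item):
        if not isinstance(item, Element):
            raise TypeError("ITEM_LIST may only contain ELEMENTs")
        super().add(item)


# An ITEM_LIST used as an interior node of an ITEM_TREE.
scores = ItemList("Apgar components", [Element("Heart rate", 2)])
tree = ItemTree("Examination", [Element("Comment", "normal"), scores])
```

Because every ITEM_STRUCTURE subtype is a CLUSTER here, code that only understands CLUSTER and ELEMENT can still walk the whole tree.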

Diagram

Impact Analysis

Component

Impact

On RM

 

On existing archetypes

 

On archetype tooling

 

On existing RM-1.0.2 based software

 

On existing RM 1.0.2 data

 

Discussion

The structure_type attribute of the CLUSTER class is slightly redundant with respect to the ITEM_STRUCTURE descendant types, but makes sense in terms of backward compatibility with existing data. A system that already has ITEM_STRUCTURE + subtypes, + existing instances of those types might be changed to only create CLUSTER-based data in the future, where only the structure_type attribute was used to mark the intended logical structure of a given CLUSTER subtree. Assuming this attribute is used for anything but 'tree', then the result is software that has to implement the same logic as the original ITEM_STRUCTURE descendants, but without having any explicit types to which to attach it.

The second obvious comment one can make on the above model is that ITEM_STRUCTURE is technically redundant (i.e. if building such a model from scratch, it would not be needed). We have left it in here, so that existing static declarations of type ITEM_STRUCTURE in the Release 1.0.2 openEHR RM will remain valid. Getting rid of it would require changing such static references to CLUSTER.


Candidate A.1 - Add VALUE_CLUSTER, Remove ITEM_STRUCTURE types

Proposal - Thomas Beale / Ian McNicoll

Status

This particular model uses 'diamond' multiple inheritance, and is not intended for a real proposal, since most languages don't support this, and it isn't really necessary anyway. Here it is used just to illustrate the concept.

Design Concept

In this model, a new class is added that combines CLUSTER and ELEMENT. This reflects the fractal nature of reality. Initially you think you have just an ELEMENT, but later on, people want to start recording more fine detail. In the other direction, information users often want a 'summary' data point for a collection of details. No ITEM_STRUCTURE classes are included at all.

This model is not intended as a 'final solution', just to show what is needed (a CLUSTER-with-value idea), and one way to model it. The technical needs we are trying to meet here are:

  • retain CLUSTER and ELEMENT classes, since they remain useful, are already defined, and map cleanly to 13606-1
  • support efficient, shortest possible path to a summary data item (e.g. 'smoking status' = Smoker)
  • ensure that if the summary item is added at runtime, e.g. by converting a CLUSTER to the new CLUSTER+value type, the paths of the underlying detailed items don't change
  • ensure that if an ELEMENT is converted to a CLUSTER+value at runtime, the path of the ELEMENT.value does not change
  • make it so that a specialised archetype can convert a CLUSTER or ELEMENT to a CLUSTER+value type

Changes

A new VALUE_CLUSTER, inheriting from ELEMENT and CLUSTER, provides the semantics of both: a node which can itself have a value (like an ELEMENT), but may still have substructure. Because it inherits from both CLUSTER and ELEMENT, VALUE_CLUSTER could be substituted at runtime wherever either of these two is currently specified in the RM or archetypes. The downside of this model is that there is no way to force a node to be just an ELEMENT or CLUSTER, since the new type is always substitutable.
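Python happens to support diamond inheritance, so the concept can be shown directly. The sketch below is illustrative only; the keyword-argument constructors are an assumption made so that the cooperative super() calls work, and are not part of any openEHR model.

```python
class Item:
    def __init__(self, name, **kwargs):
        super().__init__(**kwargs)
        self.name = name


class Element(Item):
    def __init__(self, value=None, **kwargs):
        super().__init__(**kwargs)
        self.value = value


class Cluster(Item):
    def __init__(self, items=None, **kwargs):
        super().__init__(**kwargs)
        self.items = list(items or [])


class ValueCluster(Element, Cluster):
    """Has a value like an ELEMENT and children like a CLUSTER."""


# A summary value with underlying detail.
smoking = ValueCluster(
    name="Smoking status",
    value="Smoker",
    items=[Element(name="Cigarettes per day", value=20)],
)
```

An instance of ValueCluster passes an isinstance check against both Element and Cluster, which is exactly the substitutability, and the downside, described above.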

Diagram

Impact Analysis

Component

Impact

On RM

 

On existing archetypes

 

On archetype tooling

 

On existing RM-1.0.2 based software

 

On existing RM 1.0.2 data

 

Discussion

Questions/thoughts from Erik Sundvall:

  • The above VALUE_CLUSTER suggestion is an interesting change, and if flexibility is what is sought, then perhaps the simplification can be taken even further...
  • Now the current ITEM+ELEMENT+CLUSTER follows the composite design pattern (see c2 wiki and wikipedia http://en.wikipedia.org/wiki/Composite_pattern). But since there are not many common operations/methods shared by ELEMENTs and CLUSTERs (except the ones already in PATHABLE/LOCATABLE), perhaps the composite design pattern is not needed/helpful in this part of the openEHR structure. (Also see discussions at CompositeConsideredHarmful and maybe this.)
  • If the contents of both ELEMENT and CLUSTER are pushed up to ITEM then we get the same functionality as proposed in VALUE_CLUSTER, but with fewer classes. (ITEM_STRUCTURE will not be needed, see the "Middle and Lower IM"-suggestion further down on this page, but perhaps ITEM_STRUCTURE would be a better name than ITEM for this new super-ITEM with VALUE_CLUSTER capabilities). And one (debatable) way of looking at the ITEM/ITEM_STRUCTURE family of classes is to consider them as being just for structuring and naming nodes internally in a hierarchy, and to consider the DATA_VALUE classes to be the real leaves. (Yes, debatable...)
  • Perhaps what is mentioned as a "downside" above (not being able to force ELEMENT or CLUSTER) is achievable (if wanted) by archetyping a new super-ITEM to have 0 items (forcing ELEMENT functionality) or 1..* items (forcing CLUSTER functionality)? Also, perhaps "value" can be archetyped as disallowed if you really want to force value-less CLUSTER behaviour.
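The last point, forcing ELEMENT-like or CLUSTER-like behaviour on a single super-ITEM by constraint, might look roughly like the sketch below. ItemConstraint and its fields are hypothetical names invented for this illustration; real archetype constraints would be expressed in ADL rather than in code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    # A single 'super-ITEM' carrying both a value and child items.
    name: str
    value: Optional[object] = None
    items: List["Item"] = field(default_factory=list)


@dataclass
class ItemConstraint:
    # Hypothetical archetype-level constraint; max_items=None means unbounded.
    min_items: int = 0
    max_items: Optional[int] = None
    value_allowed: bool = True

    def accepts(self, item: Item) -> bool:
        count = len(item.items)
        if count < self.min_items:
            return False
        if self.max_items is not None and count > self.max_items:
            return False
        if item.value is not None and not self.value_allowed:
            return False
        return True


# 0 items: behaves like an ELEMENT.
ELEMENT_LIKE = ItemConstraint(min_items=0, max_items=0)
# 1..* items and no value: behaves like a value-less CLUSTER.
CLUSTER_LIKE = ItemConstraint(min_items=1, value_allowed=False)

pulse = Item("Pulse", 72)
vitals = Item("Vitals", items=[pulse])
```

An unconstrained ItemConstraint() accepts both shapes, as well as the mixed value-plus-children case.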

Candidate A.2 - Modify CLUSTER to have local value

Proposal - Thomas Beale / Ian McNicoll

Status

Under development

Design Concept

The design intent of this solution is the same as for Candidate A.1 above. However, in this version, we want to add a value attribute to the CLUSTER class. Since ELEMENT already has this, we can move it to ITEM, the common parent. The technical needs we are trying to meet here are:

  • retain CLUSTER and ELEMENT classes, since they remain useful, are already defined, and map cleanly to 13606-1
  • support efficient, shortest possible path to a summary data item (e.g. 'smoking status' = Smoker)
  • ensure that if the summary item is added at runtime, e.g. by populating the new CLUSTER.value attribute, the paths of the underlying detailed items don't change
  • ensure that if an ELEMENT is changed to a CLUSTER at archetype design time, the path of the value attribute does not change; this means that AQL queries are preserved.

Changes

The value properties from ELEMENT are moved to ITEM.
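The path-stability goal can be demonstrated with a small sketch. The resolver below handles only a simplified path syntax, and the at-codes are hypothetical; it is meant to show the principle, not how AQL or archetype paths are actually processed.

```python
class Item:
    def __init__(self, node_id, value=None):
        self.node_id = node_id
        self.value = value  # Candidate A.2: 'value' lives on ITEM


class Element(Item):
    pass


class Cluster(Item):
    def __init__(self, node_id, value=None, items=None):
        super().__init__(node_id, value)
        self.items = list(items or [])


def resolve(root, path):
    """Resolve a simplified path such as '/items[at0002]/value'.

    Only the node id in brackets is used for matching; the attribute
    name before the bracket is assumed to be 'items'.
    """
    node = root
    for segment in path.strip("/").split("/"):
        if segment == "value":
            return node.value
        node_id = segment[segment.index("[") + 1 : -1]
        node = next(i for i in node.items if i.node_id == node_id)
    return node


PATH = "/items[at0002]/value"

# Original archetype: at0002 is a plain ELEMENT.
before = Cluster("at0001", items=[Element("at0002", "Smoker")])
# Refined archetype: at0002 becomes a CLUSTER with a value and detail below it.
after = Cluster("at0001", items=[
    Cluster("at0002", "Smoker", items=[Element("at0003", 20)]),
])
```

The same PATH yields the summary value in both structures, while the added detail is reachable under a longer path.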

Diagram

The following shows the adjusted CLUSTER/ELEMENT part of the model.

Impact Analysis

Component

Impact

On RM

 

On existing archetypes

 

On archetype tooling

 

On existing RM-1.0.2 based software

 

On existing RM 1.0.2 data

 

Discussion

xxx


Candidate A.3 - Integrated model 1 - preserve current archetypes

Proposal - Thomas Beale

Status

Under development

Design Concept

The design integrates Candidate A (ITEM_STRUCTURE becomes a child of CLUSTER) and A.2 (ELEMENT.value & null_flavour move to ITEM). The effects of this should be as follows:

  • with respect to CLUSTER/ELEMENT, same as for A.2, i.e. CLUSTERs now get optional value & null_flavour as well
  • ITEM_STRUCTURE is retained because it is used ubiquitously in the openEHR RM and archetypes; therefore the current archetypes will not break.

Changes

  • The value properties from ELEMENT are moved to ITEM.
  • ITEM_STRUCTURE becomes child of CLUSTER
  • DATA_STRUCTURE class removed

Diagram

The following shows the result.


Candidate A.4 - Make ITEM the focal 'data structure' class

Proposal - Thomas Beale

Status

Under development

Design Concept

This version assumes that where ITEM_STRUCTURE is referenced in the model, we will now just use ITEM.

Note that the ITEM_STRUCTURE + subclasses could in theory be moved to another part of the spec, to do with implementation (I would have made the classes another colour here if the tool had allowed it). I do think they will help implementers when non-tree data structures are encoded as CLUSTER / ELEMENT hierarchies, because with no guidance they will all invent their own structures, and the data will be a mess. Standardised rules for encoding tables and lists as CLUSTER / ELEMENT trees will directly influence how archetyping tools represent structures like table (of various kinds) and list that may be presented in the UI of a modelling tool.

Changes

In addition to Candidate A.3 changes:

  • Change references to ITEM_STRUCTURE in the RM to ITEM
  • Optionally remove ITEM_STRUCTURE (it is shown as retained here)
  • keep ITEM_STRUCTURE descendants, providing a standardised programming interface to tree, list, table etc arrangements of CLUSTER/ELEMENTs
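As an illustration of what a standardised encoding of a table as a CLUSTER / ELEMENT hierarchy could look like, the sketch below uses one CLUSTER per row and one ELEMENT per cell. This particular encoding rule and the function names are assumptions made for the example; the page does not define the rule.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Element:
    name: str
    value: Optional[object] = None


@dataclass
class Cluster:
    name: str
    items: List[Union["Cluster", Element]] = field(default_factory=list)


def encode_table(name, columns, rows):
    """Encode rows as one CLUSTER per row, each holding one ELEMENT per column."""
    table = Cluster(name)
    for index, row in enumerate(rows, start=1):
        cells = [Element(col, cell) for col, cell in zip(columns, row)]
        table.items.append(Cluster(f"row {index}", cells))
    return table


def decode_table(table):
    """Recover the column names and row values from the encoding above."""
    columns = [cell.name for cell in table.items[0].items] if table.items else []
    rows = [[cell.value for cell in row.items] for row in table.items]
    return columns, rows


columns = ["Eye", "Acuity"]
rows = [["Left", "6/6"], ["Right", "6/9"]]
acuity = encode_table("Visual acuity", columns, rows)
```

If every implementer used the same rule, a table written by one system could be decoded by another without guessing.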

Impact

This will break all RM-based software, most openEHR archetypes today, and is not directly compatible with existing openEHR data. However the costs may be reasonable:

  • although the RM will break, the semantics of ITEM and ITEM_STRUCTURE are not that different, and the changes should generally be simplifications / removal;
  • archetypes could be automatically processed to make the change. Almost all real archetypes use ITEM_TREE, which has the 'items' attribute which is the same as for CLUSTER.
  • existing data would either have to be migrated to the new form (assessment required) or converted on the fly to the new form during querying.
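A conversion of the kind mentioned in the list above might be sketched as follows for data held as nested dictionaries. The "_type" key and the sample record are assumptions for illustration; only ITEM_TREE is handled, since it shares the 'items' attribute with CLUSTER, and the other ITEM_STRUCTURE subtypes would need their own rules.

```python
def migrate(node):
    """Return a copy of a dict/list structure with ITEM_TREE nodes retyped as CLUSTER."""
    if isinstance(node, list):
        return [migrate(child) for child in node]
    if not isinstance(node, dict):
        return node  # leaf values are returned unchanged
    converted = {key: migrate(child) for key, child in node.items()}
    if converted.get("_type") == "ITEM_TREE":
        converted["_type"] = "CLUSTER"
    return converted


# Hypothetical fragment of stored data.
event = {
    "_type": "POINT_EVENT",
    "data": {
        "_type": "ITEM_TREE",
        "items": [
            {"_type": "ELEMENT", "name": "Systolic", "value": 120},
            {"_type": "CLUSTER", "name": "Detail", "items": []},
        ],
    },
}
migrated = migrate(event)
```

Because the function builds new dictionaries and lists, it could be applied either as a one-off migration or on the fly at query time without touching the stored original.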

Diagram

The following shows the result.


Candidate B - Remove ITEM_STRUCTURE

Proposal - Pablo Pazos

Status

under construction

Design concept

  • Remove ITEM_STRUCTURE and use ITEM for structures without losing meaning/semantics/modeling capabilities.

Changes

  • Removed ITEM_STRUCTURE and children.
  • Added attribute structure_type:CODE_PHRASE to CLUSTER (as in 13606 model)
  • Added method is_root() to ITEM
  • ITEM inherits from DATA_STRUCTURE
  • Added backwards relationship "parent" from ITEM to CLUSTER
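The changes above can be sketched as the following Python classes. This is a simplification under stated assumptions: structure_type is shown as a plain string rather than a CODE_PHRASE, and the add() helper that maintains the parent link is invented for the example.

```python
class DataStructure:
    """Placeholder for the RM DATA_STRUCTURE class."""


class Item(DataStructure):
    def __init__(self, name):
        self.name = name
        self.parent = None  # backwards relationship to the containing CLUSTER

    def is_root(self):
        return self.parent is None


class Element(Item):
    def __init__(self, name, value=None):
        super().__init__(name)
        self.value = value


class Cluster(Item):
    def __init__(self, name, structure_type="tree"):
        super().__init__(name)
        # A CODE_PHRASE in the proposal; a plain string in this sketch.
        self.structure_type = structure_type
        self.items = []

    def add(self, item):
        """Attach a child and record this cluster as its parent."""
        item.parent = self
        self.items.append(item)
        return item


root = Cluster("Blood pressure", structure_type="list")
systolic = root.add(Element("Systolic", 120))
```

Here root.is_root() is true and systolic.is_root() is false, so code can tell where a structure begins without a dedicated ITEM_STRUCTURE class.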

Diagram

I have the source of this diagram if anyone wants it, it's a .dia file (http://live.gnome.org/Dia)

Impact Analysis

Component                              Impact
On RM                                  RM change
On existing archetypes                 RM change
On archetype tooling                   RM change
On existing RM-1.0.2 based software    RM change
On existing RM 1.0.2 data              transformation needed


Candidate C - simplification and class renaming for easier explanation and implementation

Proposal - Erik Sundvall

Status

Now updated to include the suggested "Candidate A.2" ITEM/CLUSTER/ELEMENT change.

Design Concept

Thanks to archetyping, the model could actually be simpler than the 1.0.2 spec without losing any significant expressiveness. The intention is primarily to make learning and usage simpler for archetype authors, but hopefully also for implementers. Below is an initial suggestion based on some previous mail threads.

Changes

See comments in diagram.

Diagram

'UML' image above produced by pasting the "diagram sourcecode" below to http://yuml.me/diagram/scruffy/class/draw2 (initially by Erik Sundvall)

The yellow stuff is what I guess could be in a 13606-1(a?) "healthcare a-specific" update and the rest in a new 13606-6 or 13606-1b healthcare-specific part.

I have likely missed some details (and did not have time to add datatypes to all attributes, but they are in the openEHR specs).

Impact Analysis

Component 

Impact 

On RM 

 

On existing archetypes

 

On archetype tooling

 

On existing RM-1.0.2 based software

 

On existing RM 1.0.2 data

 

Discussion

xxx

  1. Conceptually/semantically, alternatives A and B are equivalent; in B the ITEM_STRUCTURE and subclasses are implicit in the model but are present in the specs as rules/constraints on ITEM/ELEMENT/CLUSTER.

  2. Yes, indeed. Now these two proposals look identical after Thomas changed the UML class diagram of the proposal A =)

    Why do History and ItemStructure inherit from DataStructure? What's the value of the abstract class ItemStructure? Wouldn't it be better to have History descend directly from Locatable? This way, we can get rid of the abstract class DataStructure, which doesn't seem to do a lot.

  3. The new A.2 (Modify CLUSTER to have local value) by Tom looks very promising since it will keep query paths stable even when refinements in new archetypes are made. Good thinking!

    I have now modified Candidate C to include/subsume A.2 

    If adding a structure_type attribute to CLUSTER (or possibly ITEM) the whole ITEM_STRUCTURE family can become an optional (or implementation specific) thing available for those that need the methods offered by ITEM_TABLE and its siblings. They do not contribute sufficiently to be mandatory in an openEHR implementation.

    Since we are talking 2.0 not 1.x, I also think the entry classes should be simplified as outlined in Candidate C. (The names and usages of EVALUATION and ADMIN_ENTRY seem to have confused people during archetyping without the separation adding much value. The word "evaluation" can still be used in an archetype name if that kind of marker is wanted.)

  4. Hi Erik,

    "since it will keep query paths stable even when refinements in new archetypes are made." - that was the exact aim of the proposal. Very often when creating an archetype we are faced with doubt about how much granularity might be needed at any particular node. The current RM forces us either to over-simplify and risk a later path-breaking change, or to create perhaps unnecessary structure which is never used. The changes should allow us to defer the decision until the requirements are clear. I also think this pattern may have some value in the context of complex post-coordination binding.

    I have some sympathy with simplification of the ENTRY classes and I would certainly be happy to accept the structural fusion of EVALUATION and ADMIN_ENTRY (plus losing protocol as an attribute). However, we know from other clinical modelling environments that it is helpful to be able to categorise entries. I suppose what I am suggesting is a looser coupling between the structure and ontology, so that we can continue to classify an archetype as an EVALUATION without having to make that firm structural or even ontological commitment. We choose the structure that fits, then label it ontologically (which may change). This is not dissimilar to what is being proposed for the sub-classes of ITEM_STRUCTURE, which will become intended/suggested design patterns.

    I am not sure how far we can push this e.g to bridge the similar difficulties with OBSERVATION/EVALUATION where dates are a particularly difficult issue.

    Ian

  5. It seems like A.4 picks up the main points I had regarding ITEM_STRUCTURE provided that ITEM_STRUCTURE and its siblings can be made an _optional_ "standardised programming interface" used only when needed.

    Switching to 2.0 together with the 13606 and CIMI discussions will be good timing for the RM-breaking changes. There is likely less openEHR-based real patient data needing re-formatting now than there will be in the future :-) and, as stated by Tom, the conversions can likely be automated.

  6. Ian, I don't think we need to "push this e.g to bridge the similar difficulties with OBSERVATION/EVALUATION", especially if what used to be EVALUATION gets called e.g. CARE_ENTRY instead. A simple observation-like note without any detailed timing can then be modeled as a CARE_ENTRY, but when you need structured timing you turn to the OBSERVATION class. OBSERVATION could then be called CARE_ENTRY_WITH_STRUCTURED_TIMING, but that is plain ugly and I guess most such things are observations of some kind anyway - thus I think we can keep the OBSERVATION name.

    What did you mean by "losing protocol as an attribute"?

  7. I would not be in favour of making ADMIN_ENTRY and EVALUATION the same thing, just because their data structures are (today) the same. The intent in such distinctions is to a) give developers something to hang on to, because the programming of Administrative data objects will almost certainly differ from that for Evaluations and b) to make it easier to understand what the data are intended to be. Today we can do an AQL query for all ADMIN_ENTRYs in a time period and have an instant administrative view of a patient hospital stay. With no ADMIN_ENTRY, you can't write an ADMIN_ENTRY archetype. You could obviously write many different kinds of 'administrative' archetypes (presumably based on some generic ENTRY class), but now your queries get more complex in trying to find them all. So then you create a parent 'admin' archetype .... so now you are back to square one, with all your 'real' admin archetypes having to inherit from this archetype. Or we could just provide ADMIN_ENTRY in the model.

    All this could obviously be achieved by coding, but all we are doing in that case is moving the ontology definition problem to another place... where will the ontology of clinical information types be agreed? OGMS? Not a chance. SNOMED CT? Its idea of information types is pretty hopeless. Maybe there is some other place, but I think for the moment it is a good compromise just to do it in openEHR.

  8. So the real requirement is to be able to differentiate administrative data from other data (in queries and possibly GUIs) at the RM-level without having to look at any archetype info? 

    In that case perhaps an optional attribute like "entry_type" (a CODE_PHRASE?) in the ABSTRACT_CARE_ENTRY might be a marker just as useful as using class names as markers - it would certainly be queryable in AQL. An optional marker attribute (initially only used to mark administrative entries) would also allow for new or changed "ontological" perspectives on entry types later without RM-redesign. In figure 18 of the AO http://www.openehr.org/releases/1.0.2/html/architecture/overview/Output/design_of_ehr.html#1165607 a rich ontology of 17 entry types is presented. As an implementer I am very glad those 17 entry types are not all different marker classes with a majority having identical attributes. The archetyping approach allows us to have a small stable RM without too many separate classes. It is not obvious why any entry class that does not need extra object attributes should be a separate class. The reason to join several ontological entry types (currently under the EVALUATION class) could also be applied to ADMIN_ENTRY.

    If the "entry_type" marker is added in the ABSTRACT_CARE_ENTRY, then also INSTRUCTION+ACTION entry combinations could be reused for some administrative purposes where the instruction state machine would be of help.

    P.s. A side effect of the current practice of having marker classnames instead of marker attributes is that compact serialization formats in some cases will need to store class type (that could otherwise have been inferred) if the class attribute signatures are identical for several classes that may be expected at the same nesting level.

  9. Regarding the proposal A.2, would it be really needed in that case to maintain the ELEMENT and CLUSTER classes at all? Now there would be little difference between the concept of a "data holder" class and a "container class" that can also hold a data value. Those classes do not add new semantics or support different kinds of information and we could just talk of the ITEM class.

    Obviously, keeping them maintains compatibility with past archetypes, but if the idea is to allow future archetypes to evolve seamlessly when they need more granularity, the incompatibility will be back: not in the paths, which will be maintained as has been said, but in the archetype structure. We will have to change the archetype definition from using an ELEMENT to using a CLUSTER.

  10. @Erik I like this idea. Disconnecting the useful, but tricky, requirement to create an ontology of clinical record types, from the equally tricky but absolute need to create re-useable clinical record structures is worth serious consideration. I am not sure how far we can push the idea through all the ENTRY sub-classes but I like the principle.

    @David I can see the attraction of what you are suggesting, but I think we would still need to be able to define a particular node as fixed and, as you have said, we would have some significant issues with backwards compatibility. Perhaps worth looking at in the future, but this time around I would prefer to retain ELEMENT and see how the new construct works out in practice.

  11. @David: there is not much difference remaining in this class model it is true - but ELEMENT is still a terminal node type, and CLUSTER is not. But we also have to think of downstream software in both tooling and runtime - it is likely to have more differences for ELEMENT versus CLUSTER, e.g. to do with displaying nodes detected as terminal nodes, path processing can detect true 'terminal paths' and so on. This won't be visible in specification level models like these above, but if we throw out the distinct classes, I think it will create havoc for developers downstream.

  12. @Erik: rather than 'entry_type (a CODE-PHRASE)', wouldn't it be more useful to call it associated_structure, which would link to a structure archetype defined in a template? Then it could be a code_phrase, a drop-down, a widget or whatever where the ENTRY is used in a GUI, or an ADMIN_ENTRY.

  13. Hi!

    I think the discussion and referenced paper at http://wolandscat.net/2012/09/03/ontologies-in-health-ready-for-prime-time-iao-versus-openehr/ make it interesting to look more closely at decoupling "ontology class" from "software implementation class" by adding something like "entry_type" to ABSTRACT_CARE_ENTRY (Candidate C above). I mentioned this in my comment from Mar 27 above and Ian supported it in his comment Mar 30 above.

    That entry_type could point to IAO or whatever the ontology of preference will become.  Multiple implementation classes with identical structure used as "markers" is a path I'd like to avoid if an attribute will serve just as well as a marker. Attributes play nicely in many (de)serialization, storage and querying technologies, as opposed to OO-marker-classes that often introduce a bit more trouble.

    @Colin, June 21: I'm not sure I understand what you mean. Is it not the structure_type in the CLUSTER class you are talking about? The entry_type was intended more for ontological than structural description.
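The entry_type marker discussed in comments 8 and 13 can be illustrated with a minimal sketch. The CareEntry class, the string marker values and the archetype ids below are all hypothetical; in the proposal the marker would possibly be a CODE_PHRASE pointing into an ontology of choice.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CareEntry:
    archetype_id: str
    # Hypothetical optional marker; unset entries carry no ontological label.
    entry_type: Optional[str] = None


def entries_of_type(entries: List[CareEntry], entry_type: str) -> List[CareEntry]:
    """Select entries by marker attribute instead of by class name."""
    return [e for e in entries if e.entry_type == entry_type]


record = [
    CareEntry("example.admission.v1", "admin"),
    CareEntry("example.problem_diagnosis.v1", "evaluation"),
    CareEntry("example.discharge.v1", "admin"),
    CareEntry("example.note.v1"),
]
admin_view = entries_of_type(record, "admin")
```

The administrative view of a stay is still a single filter, but the label can change later without changing the class of the stored object.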