Problem statement
In the specifications of IHTSDO, and indeed in any other documents talking about terminologies, including academic papers, as far as I know, there is no standard way of referring to coded terms in a way that is unambiguous. In some IHTSDO documents there is a convention of using bar characters ('|'). So far I have not seen any explicit convention for this, so I would like to propose a) that a convention be written and documented and b) an actual convention.
Why is this important? Because it is often difficult in written texts, including specifications and papers, to know whether the author is using words, e.g. 'Discharge Summary' to mean the general notion in the real world, or a coded 'concept' in some terminology.
Proposal
Based on what I have seen previously in SNOMED related documents, I would like to suggest a convention such as the following:
| reference to | syntax | example |
|---|---|---|
| code and term (terminology already known in context) | code\|term\| | J01\|Acute sinusitis\| (in ICD10), 36971009\|sinusitis\| (in SNOMED CT) |
| term only (terminology already known in context) | \|term\| | \|Acute sinusitis\|, \|sinusitis\| |
| code, term and terminology | terminology_id::code\|term\| | ICD10::J01\|Acute sinusitis\|, SNOMEDCT::36971009\|sinusitis\| |
| code, term and terminology, including release | terminology_id(release_id)::code\|term\| | SNOMEDCT(20090731)::36971009\|sinusitis\| |
| term and terminology (with or without release) | terminology_id(release_id)::\|term\| | ICD10::\|Acute sinusitis\|, SNOMEDCT(20090731)::\|sinusitis\| |
| term within hierarchy | \|parent/parent/ ..... /term\| (or maybe '^' or '>' might be safer than '/') | \|sinusitis / sinusitis frontal\|, \|Reference set / Language\|, SNOMEDCT::\|Reference set / Language\| etc |
| terminology and code | terminology_id(release_id)::code | SNOMEDCT::36971009, SNOMEDCT(20090731)::36971009 |
This syntax allows us to write sentences like:
"The following patient transfer concepts can be used in the national discharge summary content framework: |discharge summary|, |referral|, etc."
It also allows us to do useful things like compare two such strings and determine if they are from the same terminology, but different releases. This brings us to the thorny issue of human-readable terminology identifiers.
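As a rough illustration of that comparison, here is a minimal Python sketch. The helper names and the regular expression are assumptions made for this example, not part of the proposal; in particular it assumes terminology ids start with a letter and that a missing release counts as "different" from an explicit one.

```python
import re

# Hypothetical pattern for the "terminology_id(release_id)::" prefix.
# Assumption: terminology ids start with a letter and contain no '(' or ':'.
_PREFIX = re.compile(
    r"^(?P<terminology>[A-Za-z][A-Za-z0-9_.-]*)(?:\((?P<release>[^()]*)\))?::"
)


def terminology_and_release(ref):
    """Return (terminology_id, release_id); either may be None."""
    m = _PREFIX.match(ref)
    if m is None:
        return (None, None)
    return (m.group("terminology"), m.group("release"))


def same_terminology_different_release(a, b):
    """True if both references name the same terminology but not the same release."""
    term_a, rel_a = terminology_and_release(a)
    term_b, rel_b = terminology_and_release(b)
    return term_a is not None and term_a == term_b and rel_a != rel_b


print(same_terminology_different_release(
    "SNOMEDCT(20090731)::36971009|sinusitis|",
    "SNOMEDCT(20100131)::36971009|sinusitis|",
))
```

Whether an unspecified release should really be treated as different from a specified one is a policy choice that the convention would need to settle.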
Syntax Definition
A set of production rules for the above syntax is:
    ref              ::= terminology_part code_part
                       | code_part
    terminology_part ::= terminology_id release_part '::'
    release_part     ::= -- nothing is ok
                       | '(' release_id ')'
    code_part        ::= concept_id term_part
                       | concept_id
                       | term_part
    term_part        ::= '|' text '|'
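To check that the rules hang together, they can be transcribed into a single regular expression. The following Python sketch is one possible reading, not a reference implementation: the grammar leaves `terminology_id`, `release_id`, `concept_id` and `text` undefined, so the character classes used here are assumptions, as are the names `TermRef` and `parse_ref`.

```python
import re
from typing import NamedTuple, Optional


class TermRef(NamedTuple):
    terminology_id: Optional[str]
    release_id: Optional[str]
    concept_id: Optional[str]
    term: Optional[str]


# Assumed lexical rules (not given in the grammar):
#   terminology_id: a letter, then letters, digits, '_', '.' or '-'
#   release_id:     anything except parentheses
#   concept_id:     anything except '|', ':', parentheses and whitespace
#   text:           anything except '|'
_REF = re.compile(
    r"""
    ^
    (?:                                     # terminology_part, optional
        (?P<terminology_id>[A-Za-z][\w.-]*)
        (?:\((?P<release_id>[^()]+)\))?     # release_part, optional
        ::
    )?
    (?P<concept_id>[^|:()\s]+)?             # concept_id, optional
    (?:\|(?P<term>[^|]+)\|)?                # term_part, optional
    $
    """,
    re.VERBOSE,
)


def parse_ref(text: str) -> TermRef:
    """Parse a coded-term reference; code_part needs a concept_id, a term, or both."""
    m = _REF.match(text.strip())
    if m is None or (m.group("concept_id") is None and m.group("term") is None):
        raise ValueError(f"not a valid coded-term reference: {text!r}")
    return TermRef(**m.groupdict())


print(parse_ref("SNOMEDCT(20090731)::36971009|sinusitis|"))
print(parse_ref("J01|Acute sinusitis|"))
```

Under these assumptions a hierarchical term such as `|sinusitis / sinusitis frontal|` is accepted as plain text; splitting it on the path separator would be a further step once the separator character is agreed.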
I had not thought about it, but it is an excellent point. It would currently be hard to get a standard, since the release ids are created by the publishers, not by IHTSDO or anyone independent. But if we could assume ISO 8601 date strings like 20090101 or 2009-01-01, then we should consider using the ISO 8601 form for an open-ended range, i.e. 20090101/. If we use the openEHR syntax, which is the same as the proposal above, i.e. parentheses around the release id, then the trailing '/' should be safe from being swallowed by unwary parsers. Also, since '2009' is a legal 8601 string (which I really detest - if the specifiers had bothered to write any parsers, they would have realised that 2009???? would be much safer, but that's another argument), you can say what you want with: SNOMEDCT(2009/)::3434211009.
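A minimal sketch of how a tool might test an actual release against such a release specifier is shown below, in Python. It assumes release ids really are ISO 8601 style dates, that a trailing '/' means "this date or later", and that a reduced-precision specifier such as '2009' without the slash matches any release in that year; the function name `release_satisfies` is made up for the example.

```python
def _date_digits(value: str) -> str:
    """Strip hyphens and check for YYYY, YYYYMM or YYYYMMDD digits."""
    digits = value.replace("-", "")
    if not (digits.isascii() and digits.isdigit() and len(digits) in (4, 6, 8)):
        raise ValueError(f"not a recognised date string: {value!r}")
    return digits


def release_satisfies(release_spec: str, actual_release: str) -> bool:
    """Check a concrete release date against a release specifier.

    Assumed specifier forms:
      "20090731"            exact release
      "2009"                any release in 2009 (prefix match)
      "2009/", "20090101/"  open-ended: that date or any later one
    """
    actual = _date_digits(actual_release)
    if len(actual) != 8:
        raise ValueError("actual release must be a full date")
    if release_spec.endswith("/"):
        start = _date_digits(release_spec[:-1])
        # Pad reduced-precision dates to the first day they can denote.
        start += {4: "0101", 6: "01", 8: ""}[len(start)]
        return actual >= start  # same-length digit strings compare chronologically
    return actual.startswith(_date_digits(release_spec))


print(release_satisfies("2009/", "20090731"))
print(release_satisfies("2009/", "20080731"))
```

This only works if every publisher's release id can be mapped to a date, which is exactly the assumption in question here.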
Hi Thomas,
This short note is a sort of alert, not a proposal.
The concept of a 'release' and a 'release_ID' presupposes a number of things, many or all of which may turn out to be tractable.
The distributed working envisaged in the future world of, say, Web2 does not lend itself so well to the centralised distribution model. So while SNOMED CT may formerly have been the 'international core release 20100131', there may be tools which allow 'release' by agencies other than the central agency, and for these the ID of their release may refer to some combination of (in the SNOMED CT world of RF2, say) a set of 'Modules'.
The second assumption is that all terminology schemes routinely issue a releaseID; this is worth testing.
regards
Certainly the distributed release part is correct; I wonder if IHTSDO has thought about standardising release identifiers for precisely the reasons discussed above. I will bring it up in the Technical Committee meeting next week in Copenhagen.
Thomas, did you get an answer to this question?
My understanding is that a release has an effectiveTime and its contents can be characterised by the set of modules it comprises, and that a Module Dependency Reference Set can (should) be used to do so; it identifies the modules with a moduleId and effectiveTime. Thus, I believe, you should be able to characterise a release by the refsetId and effectiveTime of the relevant Module Dependency Reference Set.
Good stuff Thomas - I think something like this is needed. A lot of the NEHTA and IHTSDO docs I've seen would benefit from a consistent usage pattern like this.
One thing that I've been grappling with is specifying releases in terminology expressions. In particular, how to specify the minimum release in which a concept was introduced. I realise this is not applicable to the particular use case you have here - but was wondering if you had any thoughts about how it might be done in a way that is also consistent with this grammar.
So for instance, I want to talk about a medication that was introduced in SNOMED in 2009. However, I don't want to pin the expression to the exact SNOMED release of 2009 as that may not be available.
So maybe
SNOMEDCT(2009+)::3434211009
Thoughts??