Archetype Query Language Description
Skip to end of metadata
Go to start of metadata

Overview

Developer

Ocean Informatics Team: Chunlan Ma MD, Heath Frankel, Thomas Beale

Existing Query Languages

Currently, the available query languages, such as SQL, XQuery, or Object-Oriented Query Language, have dependencies on particular system data structure and working environment. Users must know the persistence data structure of an EHR in order to write an appropriate query. The query statement cannot be used by other systems which have different data store. Consequently, none of the these languages meet the aforementioned requirements and none of these can be used directly as the query language required by integrated care EHRs

What is AQL?

Archetype Query Language (AQL) is a declarative query language developed specifically for expressing queries used for searching and retrieving the clinical data found in archetype-based EHRs. It is applied to the openEHR EHR Reference Model (RM) and the openEHR clinical archetypes, but the syntax is independent of applications, programming languages, system environment, and storage models. The minimum requirement for data to be querying with AQL (including with archetype structures and terminology) is for the data to be marked at a fine granularity with the appropriate archetype codes and terminology codes. This may be native openEHR-structured data, or legacy system data to which the relevant data markers (mainly archetype paths and terminology codes) have been added. Unlike other query languages, such as SQL or XQuery, AQL expresses the queries at the archetype level, i.e. semantic level, other than at the data instance level. This is the key in achieving sharing queries across system boundaries or enterprise boundaries.

AQL has the following distinctive features:

  1. the utilization of openEHR archetype path syntax in AQL. openEHR path syntax is used to locate clinical statements and data values within them using Archetypes. This path syntax is used to represent the query criteria and returned results. It allows setting query criteria using archetype and node identifiers, data values within the archetypes, and class attributes defined within the openEHR RM. It also allows the returned results to be top-level archetyped RM objects, data items within the archetypes or RM attribute values.
  2. the utilization of containment mechanisms to indicate the data hierarchy and constrain the source data to which the query is applied.
  3. the utilization of ADL-like operator syntaxes, such as matches, exists, in, negation. 
  4. Neutral expression syntax. AQL does not have any dependencies on the underlying RM of the archetypes. It is neutral to system implementation and environment.
  5. Supporting queries with logical time-based data rollback.

AQL has some other features which can be found from other query languages:

  1. Supporting naming returned results.
  2. Supporting query criteria parameters.
  3. Supporting arithmetic operations (such as count, addition, subtraction, multiplication, and division), relational operations (>, >=, =, !=, <=, <) and Boolean operations (or, and, xor, not).
  4. Supporting some functions that are supported in XQuery, such as current-date().
  5. Users can specify their preference on the retrieved data, such as ordering preferences, or total number of retrieved results.
  6. Supporting queries for individual clinical subjects at the point of care, administrative purposes and clinical research purposes.

Structure & Example

Like SQL, AQL has five clauses: SELECT, FROM, WHERE, ORDER BY, and TIMEWINDOW. The SELECT clause specifies the data elements to be returned. The FROM clause specifies the result source and the corresponding containment criteria. The WHERE clause specifies data value criteria within the result source. The ORDER BY clause indicates the data items used to order the returned result set. TIMEWINDOW clause is to constrain the query to data that was available in the system within the specified time criteria.

Here is an example of AQL statement. This statement returns all blood pressure values where systolic value is greater or equal to 140 or diastolic value is greater or equals to 90 within a specified EHR. AQL syntax is a synthesis of SQL structural syntax and openEHR path syntax.

AQL Syntax Description

Overview

AQL has SELECT, FROM, WHERE and ORDER BY  clauses. openEHR path syntax is used to locate any node or data values within archetypes. The SELECT clause uses openEHR path syntax to indicate expected archetypes, elements, or data values that are need to be returned. FROM clause uses containment mechanisms indicating the scope of the query data source. WHERE clause utilises archetype path to set query criteria on archetypes or any node within the archetypes. The following sections introduce the AQL syntax in detail.

Reserved words and characters

 AQL has the following reserved keywords and characters:

  • select, from, where, order by, top, and, or
  • contains, max, now, current-date, current-date-time
  • timewindow (need to be confirmed)
  • matches, in, exists, not in
  • '"' and ''': double quote characters and single quote characters are used to delimit string values;
  • '|': bar characters are used to delimit intervals;
  • []: brackets are used to delimit coded terms, archetype id values or openEHR reference model class attribute values.
  • {}: brackets are used with MATCHES delimiting matches criteria
  • <, >, =, %, !, $

openEHR path syntax

The openEHR path syntax has been described in a detail (see the overview specification). AQL utilises the basic openEHR path syntax to represent both coarse-grained and fine-grained archetype nodes and openEHR class instances and attributes. The syntax details are not repeated here. A set of openEHR path examples are provided.

openEHR path examples - archetype path

Archetype path is the path referring to any nodes within an archetype. The picture below is an ADL workbench screen shot of apgar observation node map. The table below shows some examples.

Archetype RM type

Archetype node name

Path syntax

OBSERVATION

Apgar score

/

OBSERVATION

1 minute

/data[at0002]/events[at0003]

OBSERVATION

Heart rate

/data[at0002]/events[at0003]/data[at0001]/items[at0005]

openEHR path examples - path to RM class attributes

Another type of openEHR path is the path pointing to an attribute of openEHR Reference Model classes. The examples shown below uses the attributes from COMPOSITION class.

OpenEHR RM attribute name

Path syntax

COMPOSITION.category

/category

COMPOSITION.context.start_time

/context/start_time

COMPOSITION.uid.value

/uid/value

AQL variables

AQL variables are used to refer a specific archetype or openEHR class defined within an AQL statement. It has following featueres:

  • AQL variable names must be defined in an AQL FROM clause.
  • A variable name must be unique within an AQL statement.
  • Not all classes or archetypes defined within an AQL statement need a variable name. It is required only when other clauses within the AQL statement need to refer to the class or archetype defined in the FROM clause. For instance, the example doesn't define a variable name for EHR class. It has a variable name - c - indicating the composition archetype with 'openEHR-EHR-COMPOSITION.encounter.v1' as the archetype id, and varialbe name - o - referring to the observation archetype with archetype id 'openEHR-EHR-OBSERVATION.blood_pressure.v1'. Variable name o is used by both WHERE clause and SELECT clause.
  • Variable names are used by other clauses within the AQL statement along with openEHR path to form a complete AQL path - Identified path. Refer back to Section 2.2. example, o/data[at0001/events[at0006]/data[at0003]/items[at0004]/value/value] points to the systolic value node of the observation with 'openEHR-EHR-OBSERVATION.blood_pressure.v1'.
  • Except for the reserved words and characters, and space, any letters, Alphanumeric, or underscore can be used for a variable name. It is not case sensitive. Digit numbers are not recommenced to be used at the beginning of a variable name even though no restriction on this at the current stage.

Parameters

AQL syntax supports parameterisation. AQL parameters are used to indicate criteria, which will be substituted with values at run time. Supporting parameterisation is one of the mechanisms used to realise sharing AQL statements within a system or across system boundaries.

The parameters can be EHR specific (such as the parameter of EHR identifier or EHR creation date time), archetype specific (such as an archetype identifier, or archetype constraints), or template specific (such as a template identifier or template constraints).

When parameters are needed

Parameters are needed when the same AQL query statement is used with different criteria values. This AQL example is to return all abnormal blood pressure values for a single specific EHR. This query has a parameter $ehrUid. This parameter will be substituted by a specific EHR.ehr_id value at run time. Consequently, this query can be reused for all EHRs either within an EHR system or by other EHR systems (assuming these EHR systems support AQL).

Where and how parameters are resolved

AQL query parameters can be resolved at application level, or EHR system level. It depends on what the query parameters are used for and the design/implementation of the system or components. Query parameters would be normally resolved outside of a query engine. 

If a query needs to be reusable across different EHR systems, any query parameters normally need to be registered in these EHR systems so that they can be resolved with real values from each environment. A typical example of this type of query parameter is $ehrUid. If the query is only used within an application, then the query parameters would be resolved by the application, such as the parameter of healthcare facility identifier or template identifier. 

There are no specific guidelines on how to resolve query parameters. Generally speaking, a parameter name is used as a key (or a key is associated with a parameter name) and the key needs to be unique within the boundary where the parameters are resolved. The EHR system or application needs to have the API functions to get the real value with a given parameter name or key.

Parameter syntax 

A parameter always starts with a dollar sign $ followed by the parameter name, e.g. $ehrUid. The parameter name can consist of letters, digits and underscores _. It cannot have spaces and it cannot be an AQL reserved word.

A parameter can be used for any criteria values within an AQL statement, e.g.:

  • within a predicate

    [$archetypeId], [at0003, $nameValue], [ehr_id/value=$ehrId]
    
  • outside a predicate

    o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value > $systolicCriteria
    

Predicates

AQL has three types of predicates: the standard predicate, the archetype predicate, and the node predicate.

Standard predicate

Standard predicates always have a left operand, operator and right operand, e.g. [ehr_id/value='123456'].

  • The left operand is normally an openEHR path, such as ehr_id/value, name/value.
  • The right operand is normally a criterion value or a parameter, such as '123456', $ehrUid. It can also be an openEHR path (based on the BNF), but we do not have an example of this situation yet.
  • The operator can be one of the following:

    >, >=, =, <, <=, !=
    

Archetype predicate

An archetype predicate is a shortcut of a standard predicate, i.e. the predicate does not have the left operand and operator. It only has an archetype id, e.g. [openEHR-EHR-COMPOSITION.encounter.v1]. The archetype predicate is a specific type of query criterion indicating what archetype instances are relevant to this query. It is used to scope the the data source from which the query expected data is to be retrieved. Therefore, an archetype predicate is only used within an AQL FROM clause, for example,

FROM EHR [ehr_id/value='1234'] CONTAINS COMPOSITION c [openEHR-EHR-COMPOSITION.encounter.v1] CONTAINS
OBSERVATION o [openEHR-EHR-OBSERVATION.blood_pressure.v1]

Node predicate

A node predicate is also a shortcut of a standard predicate. It has the following forms:

  • Containing an archetype node id (known as atcode) only, e.g.:

    [at0002]
    
  • Containing an archetype node id and a name value/criterion, e.g.:

    [at0002 and name/value=$nameValue]
    [at0002 and name/value='real name value']
    
  • Containing an archetype node id and a shortcut of a name/value criterion, e.g.:

    [at0002, $nameValue]
    [at0002, 'real name value']
    
  • The above three forms are the most common node predicates. A more advanced form is to include a general criterion instead of the name/value criterion within the predicate. The general criterion consists of left operand, operator, and right operand, e.g.:

    [at0002 and value/defining_code/terminology_id/value=$terminologyId]
    

A node predicate defines criteria on fine-grained data. It is only used within an identified path.

Operators

Basic operators

The table below shows the basic AQL operators, meaning and example.

Operator

Meaning

Example

=

Equal

name/value = $nameValue

>

Greater than

o/data[at0001]/.../data[at0003]/items[at0004]/value/value >140

>=

Greater than or equal to

o/data[at0001]/..../data[at0003]/items[at0004]/value/value >=140

<

Smaller than

o/data[at0001]/.../data[at0003]/items[at0004]/value/value <160

<=

Smaller than or equal to

o/data[at0001]/.../data[at0003]/items[at0004]/value/value <=160

!=

not equals to

c/archetype_details/template_id/value != ''

Advanced operators

matches

matches operator is used in WHERE clause. This operator needs left operand and right operand. Left operand is an AQL identified path. Right operand is enclosed within curly braces. Right operand of matches operator has three forms:

  1. type of constraint ADL (cADL): this type of right operand can be a value list or a value range. Value range is not supported in the current AQL grammar. Below is an example with matches a string value list:

  2. URI: can be either terminology URI or EHR URI, or some other URIS. Only terminology URI is supported in the current AQL grammar. An example with matches a terminology URI is shown below:

  3. embedded AQL query: means an AQL query statement, i.e. embedded query, within the curly braces. Embedded query is not supported in the current AQL grammar. It looks like:

EXISTS

EXISTS is a unary operator, which is always followed by a single operand. The single operand is an AQL identified path. This operator return boolean results: true means the data associated with the specified path exists, false otherwise. EXISTS expression is a single identified expression. The latest grammar only allows this operator in WHERE clause.It may be required as part of openEHR path in future.

The example below means retrieving the identifiers of the EHRs which do not have discharge admin entry instance.

This example below means from a particular EHR, retrieving all composition uid values and the contained instruction instances if the instruction instances have links with particular EHR_URI value.

NOT

NOT is a unary boolean operator and it is always followed by either a single identified expression or boolean identified expression. It returns boolean results: true means the followed expression (single or boolean expression) is false.

The example below uses NOT operator followed by a single identified expression

The example below uses NOT operator followed by a boolean identified expression.

The above example is equivallent to the two expressions shown below:

These advanced operators are not yet supported by the grammar. The operator syntax is borrowed from ADL specifications. These are proposed to improve the richness and flexibility of AQL so that AQL syntax supports more complicated query scenarios.

AQL identified paths

Usage

Identified paths are used to locate data items within an archetype RM class. Except for FROM clause, an identified path could appears in any clauses of an AQL statement. For example, it can be used to indicate the data to be returned in a SELECT clause, or the data item on which query criteria are applied in a WHERE clause.

Syntax

AQL identified path has the following forms:

  • consisting an AQL variable name defined within the FROM clause, followed by an openEHR path, e.g.

    o/data[at0001]/.../data[at0003]/items[at0004]/value/value
    
  • consisting an AQL variable name followed by a predicate, e.g.

    o[name/value=$nameValue]
    
  • consisting an AQL variable name followed by a predicate and an openEHR path, e.g.

    o[name/value=$nameValue]/data[at0001]/.../data[at0003]/items[at0004]/value/value
    

Leaf Data

Primitive Types

String Data

All strings are enclosed in double quotes or single quotes. It doesn't support line breaks.

TODO list: 

  • allows double quotes and single quotes in the string.
Integer data

 Integers are represented as numbers, such as 1, 2, 365. Commas or periods for breaking long numbers are not allowed. Hex integer is not supported.

Real data

 Real numbers are the numbers with a decimal, such as 3.1415926. Commas or periods for breaking long numbers are not allowed. Hexadecimal is not supported.

Boolean data

 Boolean values can be indicated as true or false. It is case-insensitive.

Dates and Times

No matter the date/date time is complete or partial, it is allowed in AQL as long as it is in ISO8601 date/date time format. Quotes (either single quotes or double quotes) are required to enclose the date/date time value. In AQL grammar, it is treated as a date value if the value is a complete calendar date represented in ISO8601 extended format and if single quotes are used. If the value is in other ISO8601 date/date time format, AQL treats it as a string value, but the AQL query engine processes it differently from a normal string value. It is a temporary solution. In future, a date/date time value should be identified as date/date time value if it is a valid ISO8601 date/date time. Examples are as below:

Other Built-in Types 

URI

URI leaf data is enclosed within curly braces after "matches" operator. URI can be expressed as a normal URI described in  http://www.ietf.org/rfc/rfc3986.txt. URI is not case sensitive. In AQL, a URI can be either terminology URI or EHR URI.

Terminology URI 

A terminology URI consists of the following components:

  1. terminology: the URI schemes value;
  2. terminology service: the URI authority value, such as SNOMED-CT;
  3. terminology function name: the URI path, e.g. "hierarchy" is the function name in the example shown below;
  4. argument values required by the terminology functions: URI queries;

This is an example of a terminology URI:

EHR URI

to be continued... 

Value List

Value list leaf data is enclosed within curly braces after matches operator. Each item is separated by comma if there are multiple items in the list. Value list can be string value list, date time value list, integer value list, and real value list. Quotes are required for string value list and date time value list.

A value list is only used in AQL WHERE clause when the criteria is to match one item of the list. The relationships among these value list items are "OR".

Some examples of value list:

Value Range

It is not supported in the current AQL grammar.

Value range leaf data is enclosed within curly braces after matches operator. It is only used in AQL WHERE clause to specify a criteria within a range. The syntax is similar as intervals of ordered primitive types detailed in Section 4.5.2, adl.pdf. Some examples:

Query structure

Overview

AQL structure has been briefly introduced in Section 2.2. This section describes the syntax in more formal detail.

AQL structure has the following clauses and these clauses must be listed in the same order as the list below in an AQL statement. An AQL statement must contain SELECT and FROM clauses. 

  • SELECT (mandatory)
  • FROM (mandatory)
  • WHERE (optional)
  • ORDER BY (optional)
  • TIMEWINDOW (optional)

FROM

The FROM clause utilises class expressions and a set of containment criteria to specify the data source from which the query required data is to be retrieved.  Its function is similar as the FROM clause of an SQL expression.

FROM Syntax

 A simple FROM clause consists of three parts: keyword - FROM, class expression and/or containment constraints, e.g.

FROM EHR e [ehr_id/value=$ehrId] CONTAINS COMPOSITION c[openEHR-EHR-COMPOSITION.report.v1]
Class expressions

Two examples of a class expression are shown below:

EHR e [ehr_id/value=$ehrId]                                          \\EHR class, class identifier/variable, and a standard predicate
COMPOSITION c[openEHR-EHR-COMPOSITION.report.v1]   \\COMPOSITION class, class identifier/variable, and an archetype predicate


Class expressions are used for two purposes:

  1. indicating the constraints on RM classes so as to scope the data source for the query. For instance, EHR e[ehr_id/value='123456'] indicates that the required data must be from a specific EHR with ehr_id value '123456'; while "COMPOSITION c[openEHR-EHR-COMPOSITION.report.v1] " indicates the required data must be from or must be associated with a composition instance with archetype id - openEHR-EHR-COMPOSITION.report.v1.
  2. defining a RM class variable that may be used by other clauses to indicate the required data or data items on which query criteria are applied. The example below uses the class expression to define a variable e which is used by the SELECT clause indicating all relevant ehr_id values are retrieved, and a variable c used by WHERE clause indicating that the query criteria is set on the composition template id value.

    SELECT e/ehr_id/value
    FROM EHR e CONTAINS COMPOSITION c[openEHR-EHR-COMPOSITION.referral.v1]
    WHERE c/archetype_details/template_id/value=$templateId
    

Class expressions syntax include three parts. A class expression must have part one and at least one of part two or part three.

  1. part one (mandatory): openEHR RM class name, such as EHR, COMPOSITION, OBSERVATION etc.
  2. part two (optional): AQL variable name
  3. part three (optional): a standard predicate or an archetype predicate.
Containment

Since archetypes are in hierarchical structure, AQL has a containment constraint which specifies the hierarchical relationships between parent and child data items. FROM clause utilises this hierarchical constraint along with class expression to determine the data source to which the AQL query is applied.

The syntax of containment constraint is very simple: using keyword CONTAINS between two class expressions. Left class expression is the the parent object of the right class expression, e.g.

EHR e CONTAINS COMPOSITION c [openEHR-EHR-COMPOSITION.referral.v1]


Boolean operators (AND, OR, NOT) and parentheses are used when multiple containment constrains are required, e.g.

EHR e CONTAINS COMPOSITION c [openEHR-EHR-COMPOSITION.referral.v1] AND
COMPOSITION c1 [openEHR-EHR-COMPOSITION.report.v1]
EHRR e CONTAINS COMPOSITION c [openEHR-EHR-COMPOSITION.referral.v1] CONTAINS
(OBSERVATION o openEHR-EHR-OBSERVATION-laboratory-hba1c.v1 AND
OBSERVATION o1 openEHR-EHR-OBSERVATION-laboratory-glucose.v1)

WHERE

An AQL WHERE clause is used to represent further criteria applied to the data items within the objects declared in the FROM clause. A WHERE clause expresses the query criteria that cannot be represented in other AQL clauses, such as criteria on archetype id, composition committal date time, and the criteria on in which order the returned results should be listed.

Syntax

The WHERE clause syntax has the following parts (in order): keyword WHERE and identified expression(s). Boolean operator (AND, OR, NOT) and parenthesis can be used to represent multiple identified expressions. Examples:

WHERE c/name/value=$nameValue AND c/archetype_details/template_id/value=$templateId
WHERE (c/name/value=$nameValue OR c/archetype_details/template_id/value=$templateId) AND
o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value >= 140
Identified expression

Identified expression specifies the criteria within WHERE clause. It consists of left operand, operator and right operand.

  1. Left operand is normally an identified path,
  2. Operator is normally the basic operator,
  3. Right operand is the criteria value, which normally is primitive type, such as string, integer, boolean, double, or float. When it is a string value, single quotation marks or double quotation marks are required. Right operand can also be a parameter or an identified path as well.

Examples:

  • Left operand is an identified path and right operand is a real criteria value - primitive type:

    o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value >= 140
    c/archetype_details/template_id/value='health_encounter'
    
  • Left operand is an identified path and right operand is a parameter:

    c/archetype_details/template_id/value=$templateParameter
    
  • Both left operand and right operand are an identified path (this is an advanced feature):

    o/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value >
    o1/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value
    

SELECT

A SELECT clause specifies what data is to be retrieved by the AQL query. The data can be any types from openEHR RM and any primitive data types. In addition, SELECT clause uses TOP to indicate the number of result setS that should be returned and name alias to rename the retrieved data.

The SELECT syntax always starts with keyword SELECT, followed by TOP (optional), and identified path(s) or variable name(s) defined in the FROM clause. Each variable name or an identified path may have a name alias renaming the associated data.

If variable name(s) is(are) required, the full object of the type associated with the variable is retrieved, such as a COMPOSITION, an OBSERVATION object etc. If identified paths are required, the path-associated data items are returned. Multiple identifiers or identified paths are separated using a comma.

Some examples are shown below.

Example 1: retrieve all compositions' name value, context start time and composer name from a specific EHR.

SELECT c/name/value AS Name, c/context/start_time AS date_time, c/composer/name AS Composer
 FROM EHR e[ehr_id/value=$ehrUid] CONTAINS COMPOSITION c

Example 2: Retrieve all composition objects of a specific EHR.

SELECT c
FROM EHR e[ehr_id/value=$ehrUid] CONTAINS COMPOSITION c
TOP

TOP syntax was borrowed from SQL language for representing the number of result sets that should be returned by the AQL query. It uses BACKWARD and FORWARD to indicate the direction where to start to get the number of results to be returned.

It starts with keyword TOP, followed by an integer number and/or the direction (i.e. BACKWARD, FORWARD), e.g.

SELECT TOP 10 c/name/value AS Name, c/context/start_time AS date_time, c/composer/name AS Composer
FROM EHR e[ehr_id/value=$ehrUid] CONTAINS COMPOSITION c
Name alias

Like SQL, AQL supports using name alias to rename the retrieved data. It starts with keyword - AS, followed by the name which conforms to the syntax rule of AQL variable. 

ORDER BY

The ORDER BY clause is used to sort  the returned results. It starts with the keyword ORDER BY, followed by an identified path and the keyword DESC, DESCENDING, ASCE, or ASCENDING, e.g.

ORDER BY c/name/value

TIMEWINDOW 

In addition to FROM clause, TIMEWINDOW helps to scope the data source from which the required data is retrieved. TIMEWINDOW is an addition query clause used in AQL to constrain the query to data that was available in the system (also know the data committal time) within the specified time criteria. This supports a timebased logical system rollback allowing a query to be executed as though it was performed at that specified time, which is essential for medico-legal reporting.

It starts with the keyword - TIMEWINDOW, and followed by a string compatible with the ISO 8601 representation of time interval.

The first example below constrains the query source to data committed to the system before 2006-01-01.

TIMEWINDOW /2006-01-01

The second example constrains the query source to data committed within the period of two years before 2006-01-01.

TIMEWINDOW P2Y/2006-01-01

Result structure

The result structure of an AQL query is described here as part of the EHR service definition.

How to write an AQL query statement manually

Ocean Informatics has implemented an AQL query builder which can be used to generate AQL query statements based on archetypes automatically. We strongly recommend to use this tool to generate an AQL query statement because it can be very complicated. This section is for the people who either do not have the access to AQL query builder or for users who want to have more understanding on AQL so that they can edit an existing AQL query manually.

Query scenario 

An example is used to illustrate how to write an AQL statement. The query scenario is:

"Get all abnormal blood pressure values that are recorded in a health encounter for a specific patient."

Step 1. Write FROM clause

A FROM clause is to scope the data source for the query. Normally the first step is to shape FROM clause, which has EHR class expression and archetype class expressions.

  • EHR class expression. You need to determine whether this query is applied to a single EHR or all EHRs. The latter is called population queries. If it is for all EHRs, you don't need to specify ehr_id/value in FROM clause. Otherwise, you need to either specify a specific value or use a query parameter for ehr_id/value in FROM clause. For this particular scenario, it is for a single EHR. Parameter called ehrUid is used for the ehr_id/value, I would have this AQL query statement:

    FROM EHR [ehr_id/value=$ehrUid]
    
  • Archetype expressions.
    • To write archetype expressions in the FROM clause, we need to determine what archetypes are required in the scenario. Since archetypes are used to represent clinical concepts. The easy way to identify archetypes from a scenario is to identify clinical concepts mentioned in the scenario. Each clinical concept is generally associated with one archetype. Two clinical concepts are mentioned in the aforementioned scenario: blood pressure and health encounter. These concepts are used to identify two archetypes used in query: an observation archetype with archetype id - openEHR-EHR-OBSERVATION.blood_pressure.v1, and a composition archetype with archetype id - openEHR-EHR-COMPOSITION.encounter.v1.
    • Determine whether we need a variable name for each archetype class expression or not. We may need one if the reference to the archetype class expression is required by other AQL clauses. At the moment, I don't put any variable names for the archetype expressions. The two archetype expressions are:

      COMPOSITION [openEHR-EHR-COMPOSITION.encounter.v1]
      OBSERVATION [openEHR-EHR-OBSERVATION.blood_pressure.v1]
      
  • Containment expression. openEHR RM is used to identify the hierarchical relationships among the found archetypes so as to define the containment expression. For example, a composition archetype is the parent of an observation archetype, so the AQL query would look like:

    FROM EHR [ehr_id/value=$ehrUid] CONTAINS COMPOSITION [openEHR-EHR-COMPOSITION.encounter.v1]
    CONTAINS OBSERVATION [openEHR-EHR-OBSERVATION.blood_pressure.v1]
    

Step 2. Write WHERE clause

WHERE clause represents all query criteria that cannot be represented in the FROM clause. To write WHERE clause expression, you need to:

  1. Find the criteria. The criteria required by the scenario are abnormal blood pressure values. Different guidelines may have different definitions for abnormal blood pressure values. Here we interpret that abnormal blood pressure means 1) the systolic pressure value is greater than or equal to 140; OR 2) diastolic pressure value is greater than or equal to 90.
  2. Write an identified expression for criterion 1).
    1. Write the Identified pathfor systolic data value. We need a class variable name as the reference to the blood pressure class expression defined in the FROM clause, so a variable name - obs - is added into the FROM clause. A path to systolic data value is also required to form the identified path. Two tools can be used to get the path: openEHR ADL WORKBENCHor Ocean Template Designer. The path to the systolic data value is /data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value.  
    2. Get appropriate operator: '>=' is chosen for criterion 1.
    3. Get the criteria value - 140. The query statement including the identified expression for criterion 1) is shown below:

      FROM EHR [ehr_id/value=$ehrUid] CONTAINS COMPOSITION [openEHR-EHR-COMPOSITION.encounter.v1]
      CONTAINS OBSERVATION obs [openEHR-EHR-OBSERVATION.blood_pressure.v1]
      WHERE obs/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value >= 140
      
  3. Write an identified expression for criterion 2). Following the previous instruction to write the identified expression for criterion 2) which is shown as:

    obs/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/value >= 90
    
  4. Find the appropriate boolean operator to join criterion 1) and 2) - OR. The query statement looks like:

    FROM EHR [ehr_id/value=$ehrUid] CONTAINS COMPOSITION [openEHR-EHR-COMPOSITION.encounter.v1]
    CONTAINS OBSERVATION obs [openEHR-EHR-OBSERVATION.blood_pressure.v1]
    WHERE obs/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/value >= 140 OR
    obs/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/value >= 90
    

Step 3. Write SELECT clause

Writing the SELECT clause needs to:

  1. Find what data is required. The aforementioned scenario requires abnormal blood pressure values are returned, including both systolic pressure and diastolic pressure.
  2. Write the appropriate identified path to the required item (variable name is required if the class object is required). Multiple identified path is separated using comma. For this particular scenario, two identified paths are found for systolic and diastolic data value by following Step 2. The completed query statement looks like:

    SELECT
    obs/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude,
    obs/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/magnitude
    
    FROM EHR [ehr_id/value=$ehrUid] CONTAINS COMPOSITION [openEHR-EHR-COMPOSITION.encounter.v1]
    CONTAINS OBSERVATION obs [openEHR-EHR-OBSERVATION.blood_pressure.v1]
    
    WHERE obs/data[at0001]/events[at0006]/data[at0003]/items[at0004]/value/magnitude>= 140 OR
    obs/data[at0001]/events[at0006]/data[at0003]/items[at0005]/value/magnitude>= 90
    

Current status

AQL features not yet supported in the Ocean Informatics AQL Parser

This is the list of all features that are not yet supported in the AQL Parser implemented by Ocean Informatics (Nov 2012):

  • TOP
  • XOR. Currently AND, OR, NOT, EXISTS are supported in WHERE clause.
  • All functions are not supported.
  • Embedded query is not supported
  • ORDER BY
  • TIMEWINDOW

AQL features that are proposed, but not finalised

Arithmetic functions

 It has been proposed that AQL would support some basic arithmetic functions, such as addition (plus) , subtraction (minus) . Multiplication, and division may be supported as well. We do not have any scenario requiring for these two functions yet.

Most cases, it is required that the left operand and the right operand must be the same type in order to do the calculation, e.g. numeric types. openEHR RM supports addition and subtraction for the types of: DV_QUANTITY, DV_COUNT, DV_INTERVAL, DV_PROPORTION, DV_DATE_TIME, DV_DATE, and DV_TIME. Again, it is required that the left and right operands must be the same type. However, one exception to this is that openEHR RM also supports subtraction between all subtypes of DV_TEMPORAL and type of DV_DURATION. AQL would support these arithmetic functions that openEHR RM supports.

The example below shows a scenario which requires that a composition report (c1) is issued in the last year. It illustrates that a type of ISO 8601 date string (current-date()) subtracts a type of ISO 8601 interval string (P1Y).

c1/context/other_context/items[at0006]/items[at0013]/value > current-date() - PIY

Other functions

It is proposed that AQL may also support other functions, such as:

  • current-date(): a build-in function returning the current date value in ISO date string format. 
  • current-date-time(): a build-in function returning the current date time value in ISO date/time string format. 
  • max: a build-in function returning the max value out of an expression.

Further discussions

Ocean products have been using AQL to search and retrieve openEHR-conformed EHR data for couple of years. The major usages of AQL are:

  • Retrieving candidate compositions with given criteria on EHR identifier and composition meta data, such as composition type (i.e. archetype id), healthcare facility, event start time, composer and so on.
  • Retrieving EHR data from a single EHR for dedicated purposes, such as risk factor calculations 
  • Reporting, in which case AQL is used to retrieve EHR data that are cross multiple EHRs, i.e. population AQL query. 
  • to be continued...

The existing AQL grammar and syntax maybe further enhanced in the following areas:

  • Reduce the length of the query statement. The use of archetype path in AQL query makes the query lengthy and hard to read. Local variables with meaningful names that are assigned with path can be used to reduce the length of the query as well as improve readability of the query. One example is shown below:

  • To be continued
  • No labels
  1. Can someone explain the use case for me where we need words like; 'data', 'items' and 'value' in the query? It seems to me that the nodeids will take care of location within a given archetype.  I'm sure that your experieince with this so far has shown a need but I just don't see it.

    Thanks,

    Tim

  2. Hi Tim,

    One reason of having data, or items in the query is because it is part of openEHR archetype path. Another reason is that it has some meanings in the query. For example, "SELECT observation/data" means a result set with openEHR HISTORY objects would be returned.

  3. I can't find any examples of what the result of an AQL query should look like. Is it a set of rows as per SQL query results?

    For example, what would the result of the query shown here look like? And lets say the query was modified to add stuff from the "Postural change" event? As there can be many of these such events per Blood Pressure, and there can also be many "any events" per Blood Pressure, how would the results look? An SQL-type query result would give a "cross product" (eg: if the query found a Blood Pressure instance that contained 3 "Any" events and 4 "Postural Change" events, then 12 rows would be returned). But I assume this is not what is intended to happen?

    Or is the result in dADL?

    Thanks,

    John Ryan-Brown

  4. Hi John,

    Thanks for asking this question.

    At the moment, the structure of AQL query results are not standardized yet in the AQL specifications. There are ongoing discussions on this. Two basic principles are that 1) the results representation should be neutral to system environment; 2) the structure should be flexible enough to support any required query results representation, e.g results may be structured using relational tables, or represented in a hierarchical structure.

    Ocean AQL query engine represents the query results using a generic ResultSet, which has similar structure as a table, e.g. it has columns and rows properties. Each row has a list of objects. The objects can be either primitive types or openEHR RM objects. The query SELECT clause statements determine the object types of each field in each row.

    How to design an AQL query is a state of art. For different data requirement, you may design AQL query in a different style. For instance, if you only want to get a list of systolic and diastolic, you may choose to specify the paths pointing to systolic and diastolic data value in AQL SELECT clause. However, if you want more context information around the blood pressure, such as patient's position, exercise level, or device information, or the example that you gave in your message, then in stead of specifying all paths pointing to each required data or each required event, it would be better to use AQL query to retrieve all blood pressure observation data instances and then have further processes from there.

    Hope it helps. I am looking forward to more feedbacks or discussions around this area.

    Thanks,

    Chunlan

  5. I asked the same question as John a while ago on the technical-mailing list: http://www.openehr.org/mailarchives/openehr-technical/msg03950.html The thread showed this to still be an open issue. What I was after there was the use case of detailed quering of a specific EHR system, that would allow e.g. GUI-components to be reused between different backend systems if the backend systems were cleverly designed. (An option to get responses in JSON would of course also be nice in this use case.)

    The other use case of requesting complete extracts is partly drafted in http://www.openehr.org/releases/1.0.2/architecture/rm/ehr_extract_im.pdf and I suppose the response format in an XML setting could follow what is available as XSD at http://www.openehr.org/releases/1.0.2/its/XML-schema/index.html

    // Erik Sundvall

  6. Hi,
    I want to ask Will AQL queries retrieve data from ADL?What is the data source of querying ?

    Regards
    Shelly

  7. Hi Shelley

    AQL retrieves data from EHR repositories which follow the openEHR specification. A simple way to think about this if you are used to SQL is that the Archetypes are like tables (the structures that contain data) and the archetype paths are like column labels. So you will get the same data back from the same path each time.

    AQL will return objects which are displayed according to their ToString function so

    •  if you ask for a dv_quantity with the heart rate you will get 100/min (the magnitude and the units)
    • if you ask for a dv_quantity\magnitude you will get 100 and
    • if you ask for a dv_quantity\units you will get /min

    I hope that helps (it is an over simplification)

  8. QUERY regarding AQL

    Hi,
    I have three questions:
    1.HOw to get access to Archetype Query Language(AQL ) Query builder? Does it requires licence?
    2. When will the specification document for AQL released?
    3.Is AQL tested on real time EHR data? If yes, from where can I see the data instances?

    Thanks and Regards
    Shelly

  9. Hi Shelly,

    Re question 1: we only have an AQL builder prototype developed. This prototype only can be used to generate simple queries. It is part of Ocean Template Designer.

    Re question 3: AQL has been used in real time data. Our products uses data table like format to represent AQL query results which can be adapted to other formats as required. In the real system, one adapted type of the AQL result is csv file.

  10. Dear Dr Chunlan,

    Thanks for the reply.

    1)As mentioned that AQL does not depend on RM of archetypes but if we see the 'FROM' clause of AQL we have a containement mechanism , which specifies the data source hierarchy. The hierarchy that EHR consists of COMPOSITIONS may be organised by FOLDER, COMPOSITION consists of ENTRY may be organised by SECTIONS,ENTRY consists of ELEMENTS is contained within RM.The healthcare professional cannot write the AQL query until he knows this RM hierarchy.So, the knowledge of underlying RM hierarchical structure is required, as in SQL we need to know the schema structure to query the database. Please elaborate.  

    2) As mentioned in examples(in FROM clause), the query scenario for Blood Pressure query requires the COMPOSITION category archetype named as encounter whereas the Query scenario for Body mass index requires the archetype named as report from COMPOSITION category, I think that we can use the report archetype for Blood pressure query and then the BP archteype from OBSERVATION category. How will user decide which archetype will be used from COMPOSITION category whether report or encounter ? As in the template the BP may be included in both the report or in the enconter.

    Regards,

    Shelly 

  11. Dear Shelly,

    Thanks for your questions.

    Re 1): Precisely the argument should be that AQL doesn't ( and shouldn't) have dependencies on a particular RM, but it is inevitable to use the archetypes underlying RM in the query. All archetype paths actually use the attribute names which are from the corresponding RM, but health professionals don't need to worry about it because it is generated by the tool. Similarly, the FROM clause containment also can be auto-generated from a query builder. Sharing archetypes means sharing the same underlying RM. These happen at the semantic level, which is theoretically different from sharing a low level persistence data model, e.g. relational database schema. The whole point of AQL is to assist data interoperability at semantic level.

    Re 2): Sure, BP can be in different types of compositions. If users don't care which compositions the retrieved BP values come from, the composition archetype id predicate can be removed from the FROM clause, the query would look like:

    SELECT bp/...

    FROM EHR e [ehr_id/value=$ehrUid] CONTAINS COMPOSITION c CONTAINS OBSERVATION bp[openEHR-EHR-OBSERVATION.blood_pressure.v1]

    Hope it helps (smile) ! More questions or discussions are welcome.

    Best regards,

    Chunlan

  12. Greetings,

    The text says that the advanced operators are not supported by grammar, but the latest version (EQL_v0.6.grm) seems to contatin "exists", "not" and "matches" 

    Does that line need updating?

  13. Regarding Archetype Predicate, is it possible to specify a partial match on the archetype id?

    e.g. [openEHR-EHR-OBSERVATION.blood_pressure.*] (matching all blood pressure versions)

    or

    [openEHR-EHR-OBSERVATION.blood_pressure*] (matching all possible descendants of the blood pressure archetypes).

    Note that the latter would not work once we relax the rule about specialization in the Archetype Identification scheme. I assume that we probably would need to introduce some special purpose keyword to express inheritance related predicates. One possibility is to introduce a new "extends" operator and use it as part of the predicate, e.g. [extends openEHR-EHR-OBSERVATION.blood_pressure.v1]

  14. Since it is a subsumption concept, I would suggest in the future supporting the IHTSDO constraint syntax << operator e.g. [<< openEHR-EHR-OBSERVATION.blood_pressure.v1]. Then there is no worry about lexical matching. See the middle of this page for a bit of detail to do with slots - http://www.openehr.org/wiki/display/spec/Towards+a+definition+of+%27slot%27+semantics - its essentially the same matching concept.

  15. I agree. To be exact, we probably should include the namespace as well so the archetype predicate would look like this:

    [<< org.openehr.clinical::openEHR-EHR-OBSERVATION.blood_pressure.v1]

  16. Yes, has to be the case of course. If the namespace were 'org.openehr.clinical' or some other global sort of namespace, we can think of [<< org.openehr.clinical::openEHR-EHR-OBSERVATION.blood_pressure.v1] as more or less being like [<< openEHR-EHR-OBSERVATION.blood_pressure.v1], since in that namespace, 'blood_pressure' should be a globally agreed concept. The same argument works down any hierarchy of namespaces, where e.g. some cancer plan archetype in a national Swedish namspace would be treated as 'the' cancer plan model for everyone in Sweden.

  17. Hi,

    What is the current status of support from the arithmetic operations addition, subtraction, multiplication, and division in AQL?

    In this page the section "What is AQL?" writes:
    "[...]
    AQL has some other features which can be found from other query languages: [...]
    3. Supporting arithmetic operations (such as count, addition, subtraction, multiplication, and division), [...]"

    But the section "Current status" writes:
    "[...] It has been proposed that AQL would support some basic arithmetic functions, such as addition +, subtraction -. Multiplication, and division may be supported as well. [...]"

    These pieces of information seem to be contradicting for me.

    Greetings,
    Mikael

  18. These functions will certainly be supported, but some questions remain on the syntax, because there are implications for performance. If the 'result' of a query (i.e. the Select bit) asks for something like a Count(), or a Max() etc, then we generally want these to be server-side processed, if the source data is multiple patients, or multiple Compositions. However, it may be that in some cases the function should be client-side processed, e.g. if it is on a small number of Compositions that have already been accessed and cached on the client. There are some thoughts that these two cases should be distinguished syntactically.

    In addition, the where clause needs to support the usual predicate logic syntax to enable value comparisons. This part already works for operators like +, -, *, /, but not yet for 'functions' like Count() and so on. If we enable a function like Count() in the Where part, the syntax needs to be compatible with how it appears in the Select part. This is the issue that needs resolution.

  19. Hi Seref, thanks. I will keep the wiki page updated.

    Rong, re your questions of supporting partial match in archetype predicate, we are supporting it, but in RegEx fashion. I will think about Thomas's suggestions, which simplify the syntax for some situations. I will also update the wiki with the partial match of archetype id.

  20. Hello,

    I have questions if current version of AQL Parser implemented by Ocean Informatics supports AND,OR, NOT, EXISTS in Where clause?

    Because I have problem to write an AQL query with these features.

    Thanks

    Michal