A Semantic Web model for the integration of multiple models.
Never complain, never explain
| Field | Value |
|---|---|
| IRI | |
| Preferred Label | Supermodel |
| Definition | A Supermodel is an enterprise data model that provides a structure for the creation of models for specialised datasets that allow for their uniqueness to be preserved but also provide for their deep integration. |
| Created | 2021-12-10 |
| Modified | 2025-11-14 |
| Issued | - |
| Creator | |
| Publisher | This model is not yet officially adopted by a government organisation. |
| Provenance | This model was created in response to the needs of several projects that require both specialised models for different datasets and their integration. The projects were: Geoscience Australia’s Sites, Samples Surveys modelling and Linked Data systems upgrade in 2022, FSDF operationalisation 2022, Queensland Spatial Information Addressing & Cadastral modelling 2022 - 2025, the Indigenous Data Network catalogue modelling, 2021-2023, the Geological Survey of WA’s Knowledge Graph, 2025 and the EIA Demonstrator, 2025. |
| Status | Stable |
| Version | |
| Code Repository | |
| License | |
| Copyright | © Nicholas J. Car, 2021-2024, KurrawongAI, 2025 |
| Machine-readable form (RDF) | |
A Supermodel is an enterprise data model that provides a structure for the creation of models for specialised datasets that allow for their uniqueness to be preserved but also provide for their deep integration.
This model contains two parts: a Methodology and a Model which describe how to approach supermodel modelling and how to formulate technical models, respectively.
I always say, even I don’t wake up looking like Cindy Crawford
The Supermodel approach to modelling provides a structure to the methods of creation and technical formulation of sets of models that both represent specialised datasets and allow them to be integrated. It does this by ensuring that the modelling of the specialised datasets, while implementing custom model elements for their unique content, also reuses model elements from background models and controlled terms from vocabularies and reference datasets, as much as possible.
An informal overview of the GSWA Supermodel and its various parts.
The figure above shows the parts of the Geological Survey of Western Australia's Supermodel of their geological and resource sector regulation data. It shows models for specialised geological datasets such as Boreholes, Samples and Mining Authorisation as well as more generic Background Models such as SOSA (for observations and measurements), GeoSPARQL (for spatial features) and so on. It also indicates some of the vocabularies of controlled terms used by both Foreground and Background Models, such as Minerals and Commodities.
The complete documentation for the GSWA Supermodel is online.
EXAMPLE
Say you have two environmental datasets that you want to integrate: Field Measurements of Trees and Aerial Vegetation Heights. To integrate them, you will need to find as many integration points as possible between them. Spatial location is clearly an easy integration point, as both imagery and measurements can be located; however, we need more than this. We also need integration on aspects of the tree and vegetation height information to get commensurate data. For this, we need a model of metrology (measurement science) to tease out the procedures of measurement, the properties of things observed and the units of measure. We can use SOSA as a background model, as it provides defined relations for these things and indicates how similar and different procedures can be identified.
Using SOSA, we can record the Observed Properties, Procedures and Results of Observations and link individual datasets to different instances of them. Even though the Procedures and other aspects of the data will be different, we’ve achieved a common data pattern integration point.
With this pattern applied to the datasets, we may be able to relate values in one to the other and therefore "deeply" integrate them.
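The pattern in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Observation` record, the dataset names, the feature identifier `plot-1` and all values are invented here, and the field names merely echo SOSA's observed property, procedure and result notions rather than implementing SOSA itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One observation shaped after the SOSA-style pattern described above."""
    dataset: str            # which dataset the observation comes from
    feature: str            # the thing observed (hypothetical identifier)
    observed_property: str  # what was measured
    procedure: str          # how it was measured
    result: float           # the measured value
    unit: str               # unit of measure


# Hypothetical sample records, invented for illustration only
observations = [
    Observation("field-trees", "plot-1", "vegetation-height", "clinometer", 18.2, "m"),
    Observation("aerial-heights", "plot-1", "vegetation-height", "lidar", 17.6, "m"),
    Observation("field-trees", "plot-1", "trunk-diameter", "tape-measure", 0.41, "m"),
]


def integration_points(obs):
    """Group observations by (feature, observed property) and keep only the
    groups that more than one dataset contributes to."""
    groups = {}
    for o in obs:
        groups.setdefault((o.feature, o.observed_property), []).append(o)
    return {
        key: members
        for key, members in groups.items()
        if len({m.dataset for m in members}) > 1
    }


shared = integration_points(observations)
for (feature, prop), members in shared.items():
    for m in members:
        print(feature, prop, m.dataset, m.procedure, m.result, m.unit)
```

Although the procedures differ, both datasets describe the same observed property of the same feature, so their values can be compared. The trunk diameter record has no counterpart in the second dataset and so yields no integration point.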
All the formal relations, model roles and the Model given here are conceptualised as part of the formal, machine-readable Internet of distributed resources known as the Semantic Web. The Semantic Web is a model-driven yet flexible way of representing concepts and the relations between them that acts as a giant, distributed database across the Internet.
Semantic Web resources (data, models and vocabularies) are the best form of digital resource known for reuse, since they require all elements to be strongly defined, with nothing left to interpretation, and they are optimised for distribution and referencing.
While many useful models are not Semantic Web models, all models can be realised as such. The Semantic Web formulation of models and data in the Web Ontology Language is necessary for supermodels so that many existing models can be used and so that things created for a particular supermodel can themselves be maximally reused.
Elements in a Supermodel must be identified and related in a formal (precise, machine-readable, model-defined) way, which requires modelling of the supermodel content according to the Model section below. This enables more than the generally organised arrangement of information that an informal guide provides: it allows automated data validation, dependency mapping and aggregation of supermodels into even larger supermodels. These benefits grow as supermodels increase in breadth and scale of data.
Critical to creating formal relations between supermodel elements are element identification, declarations of dependence and conformance claims. How to establish these is detailed in the Methodology section below.
One aspect of formal supermodel element descriptions is the role that each element plays within a supermodel's scope. A particular model may represent the data within the datasets to be integrated (the Foreground Model role), may be used by other models as a reference (the Background Model role), or perhaps both (see the GSWA overview figure at the top).
This section describes how to create a supermodel. In outline:
- Identify Datasets
- Identify Dataset Models
  - assume one will be implemented unless all elements of the model can be modelled using Background Models or other Datasets' Foreground Models
- Extract controlled terms
  - within the datasets and models, and term these Vocabularies
- Look for modelling patterns
  - within the Foreground Models, taken from general-purpose models
- Identify general-purpose models
  - as Background Models, for those that supply the modelling patterns used in Foreground Models
- Formalise Relations
  - the dependence of Foreground Models on Background Models, roles and so on, as per the Model section
This step can be straightforward: it may be well known which datasets are within scope for a supermodel.
Datasets may be formulated according to any logic that makes sense to the supermodel creators. Some options are:
- by governance grouping
  - consider all the data governed as a distinct part as a dataset
- by subdomain
  - consider all the data within a subdomain of the supermodel as a dataset

These usually provide better criteria for delimiting datasets than time or space, since those dimensions of data can be modelled within a larger dataset.
For each dataset, a formal and separate model must be either created or borrowed.
Assume that each dataset will have a Foreground Model, but expect and aim to deprecate any you can in favour of shared models and Background Model reuse, as this leads to better integration.
Lookup tables in databases, taxonomies and flat lists of controlled terms need to be extracted from datasets and their models and placed into stand-alone vocabularies. This allows their elements to be reused across datasets, creating more integration points.
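A minimal sketch of this extraction step follows. Everything in it is assumed for illustration: the rows, the `commodity` column, the `https://example.com/` namespace and the helper names are hypothetical and not taken from any real dataset.

```python
# Hypothetical rows from two datasets that both carry a controlled term column
boreholes = [
    {"id": "bh-1", "commodity": "Gold"},
    {"id": "bh-2", "commodity": "Iron Ore"},
]
authorisations = [
    {"id": "ma-1", "commodity": "Gold"},
    {"id": "ma-2", "commodity": "Lithium"},
]

BASE = "https://example.com/vocab/commodities/"  # assumed vocabulary namespace


def slug(label):
    """Turn a term label into an IRI-friendly token."""
    return label.strip().lower().replace(" ", "-")


def extract_vocabulary(column, *datasets):
    """Collect the distinct terms used in `column` across all datasets into
    one stand-alone vocabulary, mapping concept IRI -> label."""
    labels = sorted({row[column] for rows in datasets for row in rows})
    return {BASE + slug(label): label for label in labels}


def link_rows(rows, column):
    """Replace the literal term in each row with its vocabulary concept IRI."""
    return [{**row, column: BASE + slug(row[column])} for row in rows]


vocabulary = extract_vocabulary("commodity", boreholes, authorisations)
linked_boreholes = link_rows(boreholes, "commodity")
linked_authorisations = link_rows(authorisations, "commodity")
```

Once both datasets point at the same concept IRI for "Gold", that shared concept acts as an integration point between them.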
This is the hardest step.
It is critical to look within the models for each dataset and define small modelling patterns that can be reused in them. This pattern reuse teases out aspects of the data that can be linked across different datasets, and so turned into integration points, by identifying parts that can be placed into vocabularies, common classes of information and so on.
It can be very hard to understand what makes for a sensible pattern without extensive knowledge of many reference models within a domain relevant to the supermodel at hand.
Some approaches that may assist here are:
- searching across standards bodies' models
  - the World Wide Web Consortium's fundamental models for observations & measurement, time, organisations, statistical data cubes
- looking within the various domain portals
- using schema.org
  - schema.org is a well-known, simple, general-purpose model
Searching for words in your domain of interest in the above locations will lead on to further resources elsewhere, as part of the Semantic Web.
All general-purpose models that have been reused, in whole or in part, by Foreground Models must be included as Background Models within the supermodel.
Once the above steps have been carried out, a formal description of the supermodel must be made, according to the Model below.
This means defining a Supermodel object and all the Foreground & Background Models and Vocabularies within it, as well as the role of each and the relations between them.
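One way to find the Background Models is to list the elements each Foreground Model reuses and group them by the model they come from. The sketch below assumes that the prefix of a reused element is enough to identify its source model; the Foreground Model names and the reused elements listed are hypothetical.

```python
# Hypothetical record of elements that each Foreground Model reuses
reused_elements = {
    "Boreholes Model": ["sosa:Observation", "sosa:Sample", "geo:Feature"],
    "Samples Model": ["sosa:Sample", "schema:name"],
}


def background_models(reuse):
    """Map each reused model (identified by prefix) to the set of
    Foreground Models that depend on it."""
    found = {}
    for foreground, elements in reuse.items():
        for element in elements:
            prefix = element.split(":", 1)[0]
            found.setdefault(prefix, set()).add(foreground)
    return found


dependants = background_models(reused_elements)
for prefix in sorted(dependants):
    print(prefix, "is a Background Model for", sorted(dependants[prefix]))
```

Every key of `dependants` is a candidate Background Model, and the values record which dependence relations would need to be formalised.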
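As a rough sketch of what producing such a formal description can look like, the helper below writes Turtle text for a supermodel, one dataset and one model. The `describe` function and the `https://example.com/` IRIs are hypothetical; the classes and predicates are those that appear in the Queensland example later in this document. No string escaping is attempted, so names containing quotation marks are out of scope.

```python
def describe(iri, rdf_type, name, relations=None):
    """Return a Turtle description of one supermodel element.

    `relations` maps a prefixed predicate to a list of target IRIs.
    """
    lines = [f"<{iri}>", f"    a {rdf_type} ;", f'    schema:name "{name}" ;']
    for predicate, targets in (relations or {}).items():
        objects = " ,\n        ".join(f"<{t}>" for t in targets)
        lines.append(f"    {predicate}\n        {objects} ;")
    lines.append(".")
    return "\n".join(lines)


# Hypothetical IRIs, for illustration only
supermodel = describe(
    "https://example.com/supermodel",
    "prof:Profile",
    "Example Supermodel",
    {"schema:hasPart": ["https://example.com/dataset/trees"]},
)
dataset = describe(
    "https://example.com/dataset/trees",
    "schema:Dataset",
    "Field Measurements of Trees",
    {"dcterms:conformsTo": ["https://example.com/def/trees"]},
)
model = describe(
    "https://example.com/def/trees",
    "owl:Ontology",
    "Tree Measurements Model",
    {"prof:isProfileOf": ["http://www.w3.org/ns/sosa/"]},
)

print("\n".join([supermodel, dataset, model]))
```

In practice an RDF library would normally be used instead of string formatting; plain strings are used here only to keep the sketch free of dependencies.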
We don’t wake up for less than $10,000 a day
The Model for a Supermodel is summarised as follows:
- all elements of the supermodel are modelled as instances of well-known Semantic Web model classes
- relations between elements are modelled using a fixed set of predicates
In addition to those main points, supermodel elements should be documented with basic annotation predicates such as schema:name.
The figure below shows expected relationships between expected classes.
Defined relationships between Supermodel element classes
The prefixes in the table below abbreviate the IRIs used to identify elements within this model.
| Prefix | Namespace | Description |
|---|---|---|
| | | Supermodel |
| | | Dublin Core Terms |
| | | Generic examples |
| | | OGC GeoSPARQL |
| | | Web Ontology Language ontology |
| | | Profiles Vocabulary |
| | | RDF Schema ontology |
| | | Sensor, Observation, Sample, and Actuator ontology |
| | | schema.org model |
| | | Simple Knowledge Organization System (SKOS) ontology |
| | | XML Schema Definitions ontology |
An example prefixed Semantic Web object is the SKOS model's Concept Scheme class, used to represent taxonomies (vocabularies) of concepts. Its full IRI is http://www.w3.org/2004/02/skos/core#ConceptScheme and its prefixed form is skos:ConceptScheme.
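The relationship between prefixed and full forms can be sketched as a simple lookup. The `skos` namespace is the one given in the sentence above; the other entries are the namespaces published by their respective standards and are listed here as assumptions, since the values used by this model are not shown in this text. The function names are hypothetical.

```python
# Prefix -> namespace lookup; only `skos` is confirmed by the text above,
# the rest are the standards' published namespaces (assumed here)
PREFIXES = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "dcterms": "http://purl.org/dc/terms/",
    "prof": "http://www.w3.org/ns/dx/prof/",
}


def expand(prefixed):
    """Expand a prefixed name such as 'skos:ConceptScheme' to a full IRI."""
    prefix, sep, local = prefixed.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {prefixed!r}")
    return PREFIXES[prefix] + local


def compact(iri):
    """Shorten a full IRI to prefixed form, or return it unchanged if no
    known namespace matches. Longer namespaces are tried first."""
    for prefix, namespace in sorted(PREFIXES.items(), key=lambda kv: -len(kv[1])):
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return iri


full = expand("skos:ConceptScheme")
short = compact(full)
print(full, short)
```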
An example of a real supermodel instance is Queensland Spatial Information's Queensland Spatial Information Supermodel, online at https://linked.data.gov.au/def/qsi-supermodel. Abbreviated for clarity by hiding the vocabularies and some profile relations, it is as follows:
The source data for the image above, in the Turtle format, is as follows:
```turtle
# Standard namespace declarations, added so the snippet parses on its own
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX prof: <http://www.w3.org/ns/dx/prof/>
PREFIX schema: <https://schema.org/>

<https://linked.data.gov.au/def/qsi-supermodel>
    a prof:Profile ;
    schema:name "Queensland Spatial Information Supermodel" ;
    schema:hasPart
        <https://linked.data.gov.au/dataset/qld-addr> ,
        <https://linked.data.gov.au/dataset/qld-place-names> ,
        <https://linked.data.gov.au/dataset/qld-road-names> ;
.

<https://linked.data.gov.au/dataset/qld-addr>
    a schema:Dataset ;
    schema:name "Queensland Addresses Dataset" ;
    dcterms:conformsTo <https://linked.data.gov.au/def/addr> ;
.

<https://linked.data.gov.au/dataset/qld-place-names>
    a schema:Dataset ;
    schema:name "Queensland Place Names Dataset" ;
    dcterms:conformsTo <https://linked.data.gov.au/def/gn> ;
.

<https://linked.data.gov.au/dataset/qld-road-names>
    a schema:Dataset ;
    schema:name "Queensland Road Names Dataset" ;
    dcterms:conformsTo <https://linked.data.gov.au/def/roads> ;
.

<https://linked.data.gov.au/def/addr>
    a owl:Ontology ;
    schema:name "Address Model" ;
    prof:isProfileOf
        <https://linked.data.gov.au/def/cn> ,
        <http://www.opengis.net/doc/IS/geosparql/1.1> ;
.

<https://linked.data.gov.au/def/gn>
    a owl:Ontology ;
    schema:name "Geographical Names Model" ;
    prof:isProfileOf
        <https://linked.data.gov.au/def/cn> ,
        <http://www.opengis.net/doc/IS/geosparql/1.1> ;
.

<https://linked.data.gov.au/def/roads>
    a owl:Ontology ;
    schema:name "Road Names Model" ;
    prof:isProfileOf
        <https://linked.data.gov.au/def/cn> ,
        <http://www.opengis.net/doc/IS/geosparql/1.1> ;
.

<https://linked.data.gov.au/def/cad>
    a owl:Ontology ;
    schema:name "Cadastre Model" ;
    prof:isProfileOf
        <https://linked.data.gov.au/def/cn> ,
        <https://linked.data.gov.au/def/csdm> ;
.

<https://linked.data.gov.au/def/csdm>
    a owl:Ontology ;
    schema:name "Cadastral Survey Data Model" ;
    prof:isProfileOf <http://www.opengis.net/doc/IS/geosparql/1.1> ;
.

<https://linked.data.gov.au/def/cn>
    a owl:Ontology ;
    schema:name "Compound Names Model" ;
    prof:isProfileOf
        <https://linked.data.gov.au/def/lifecycle> ,
        <https://schema.org> ;
.

<https://schema.org>
    a owl:Ontology ;
    schema:name "schema.org Model" ;
    prof:isProfileOf <http://www.w3.org/2002/07/owl> ;
.

<http://www.opengis.net/doc/IS/geosparql/1.1>
    a owl:Ontology ;
    schema:name "GeoSPARQL Model" ;
    prof:isProfileOf <http://www.w3.org/2002/07/owl> ;
.

<https://linked.data.gov.au/def/lifecycle>
    a owl:Ontology ;
    schema:name "Lifecycle Model" ;
    prof:isProfileOf <http://www.w3.org/2002/07/owl> ;
.

<http://www.w3.org/2002/07/owl>
    a owl:Ontology ;
    schema:name "Web Ontology Language Model" ;
.
```

Use instances of schema:Dataset for elements with the role Dataset.
The definition of this class is taken from [SDO].
```turtle
<https://linked.data.gov.au/dataset/qld-addr>
    a schema:Dataset ;
    schema:name "Queensland Addresses Dataset" ;
    dcterms:conformsTo <https://linked.data.gov.au/def/addr> ;
.
```

Use an instance of prof:Profile for the supermodel.
The definition of this class is taken from [PROF].
Use instances of owl:Ontology for elements with the role Model.
The definition of this class is taken from [OWL].
Use instances of skos:ConceptScheme for elements with the role Vocabulary.
The definition of this class is taken from [SKOS].
Use instances of owl:Class for class definitions for elements with the role Model.
The definition of this class is taken from [OWL].
Use instances of sh:Shape for graph pattern matching elements within resources with the role Validator.
The definition of this class is taken from [SH].
Use instances of skos:Concept for elements with the role Vocabulary.
The definition of this class is taken from [SKOS].
The definition of this predicate is taken from [DCTERMS].
Dependence should be indicated whenever one model reuses any elements from another.
The definition of this predicate is taken from [PROF].
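Because dependence is stated formally, it can be followed automatically. The sketch below transcribes the prof:isProfileOf relations from the Queensland example above into a plain dictionary, using short labels in place of the full IRIs (the labels and the function name are introduced here for brevity), and computes everything a model depends on, directly or indirectly.

```python
# prof:isProfileOf relations from the Queensland example, with each IRI
# abbreviated to a short label for readability
IS_PROFILE_OF = {
    "addr": ["cn", "geosparql"],
    "gn": ["cn", "geosparql"],
    "roads": ["cn", "geosparql"],
    "cad": ["cn", "csdm"],
    "csdm": ["geosparql"],
    "cn": ["lifecycle", "schema.org"],
    "schema.org": ["owl"],
    "geosparql": ["owl"],
    "lifecycle": ["owl"],
    "owl": [],
}


def all_dependencies(model, graph=IS_PROFILE_OF):
    """Return every model that `model` depends on, following
    isProfileOf links transitively."""
    seen = set()
    stack = list(graph.get(model, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, []))
    return seen


addr_deps = sorted(all_dependencies("addr"))
cad_deps = sorted(all_dependencies("cad"))
print(addr_deps)
print(cad_deps)
```

The same traversal run in reverse would show which Foreground Models are affected when a Background Model changes.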
Use the predicate rdfs:member to indicate that:
The definition of this predicate is taken from [RDFS].
Use the predicate rdf:type to indicate that an object is an instance of a Class.
The definition of this predicate is taken from [RDF].
Use the predicate prof:hasResource to indicate that a Profile contains an Ontology or a Concept Scheme.
The definition of this predicate is taken from [PROF].
NOTE: `sm:targets` is a superproperty of sh:targetNode, sh:targetClass, sh:targetSubjectsOf and sh:targetObjectsOf.
The definition of this predicate is taken from this specification.
Use the predicate dcterms:requires to indicate that a Shape needs a Concept or a Concept Scheme in a constraint on a Class.
The definition of this predicate is taken from [DCTERMS].
Use the predicate skos:inScheme to indicate that a Concept is within a Concept Scheme.
The definition of this predicate is taken from [SKOS].
A validator is supplied with this model that can be used to test conformance claims of data to it. The validator implements a number of rules ensuring that Classes and Predicates are formulated correctly and that roles are assigned appropriately. The validator is online at:
Tools such as Zazuko’s SHACL Playground or KurrawongAI’s SHACL Validator can be used with the validator file above to test supermodel data online, or the pySHACL package can be used to validate data within Python code.
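To give a feel for the kind of rule such a validator can enforce, the sketch below checks one hypothetical rule in plain Python: every element typed as schema:Dataset should state the model it conforms to. This is not the supplied validator and not SHACL; the rule, the triples and the function name are assumptions made for illustration only.

```python
# Hypothetical supermodel data as (subject, predicate, object) triples
triples = [
    ("ex:qld-addr", "rdf:type", "schema:Dataset"),
    ("ex:qld-addr", "dcterms:conformsTo", "ex:addr-model"),
    ("ex:qld-roads", "rdf:type", "schema:Dataset"),
    ("ex:addr-model", "rdf:type", "owl:Ontology"),
]


def datasets_without_model(data):
    """Return datasets that make no dcterms:conformsTo claim (assumed rule)."""
    datasets = {s for s, p, o in data if p == "rdf:type" and o == "schema:Dataset"}
    conforming = {s for s, p, o in data if p == "dcterms:conformsTo"}
    return sorted(datasets - conforming)


violations = datasets_without_model(triples)
print("conforms:", not violations, "violations:", violations)
```

A SHACL validator expresses rules like this declaratively as shapes, so the same check can be run by any SHACL engine rather than by hand-written code.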
These are the roles that elements within a supermodel can play. They cannot be interpreted as classes of object because roles for datasets, models and so on may vary between supermodel instances and thus are not an eternal type for that object.
The roles are a vocabulary of concepts in this hierarchy:
This roles vocabulary is available in RDF at https://linked.data.gov.au/def/supermodel/roles.ttl.
Acting as a managed aggregation of foreground or reference Data.
Acting as a collection of foreground Data.
Acting as a controlled collection of terms, usually with a hierarchy but sometimes just a list, used as Reference Dataset by Datasets.
This is a child term of Reference Dataset.
In general in the Semantic Web, vocabularies are indistinguishable from simple Models but within supermodels, vocabularies should be created as lists or hierarchies of terms used to classify or categorise other objects, not as collections of complex objects.
See the VocPub profile as a model of classification vocabularies that can be followed for supermodels.
Acting as an abstract information object that organises elements of data and standardises how they relate to one another and to the properties of real-world entities.
Acting as the definitional object of the Supermodel and container of all its elements.
Acting as a Model for a Dataset of Foreground Data.
This is a child term of Model.
Acting as a Model for a Dataset of Reference Data.
This is a child term of Model.
This is a child term of Model.
Acting as a collection of discrete or continuous values that convey information, describing quantity, quality, facts, statistics or other basic units of meaning.
Acting as Data that is the main subject of the supermodel that may or may not also be used as Reference Data.
Acting as Data that is not the main subject of the supermodel but is linked to by such data.
There are 3 billion women in the world who don’t look like supermodels and only 8 that do
- [DCTERMS] DCMI Usage Board, DCMI Metadata Terms, RDF Vocabulary (20 January 2020). https://www.dublincore.org/specifications/dublin-core/dcmi-terms/
- [OWL] World Wide Web Consortium, OWL 2 Web Ontology Language, W3C Recommendation (11 December 2012). https://www.w3.org/TR/owl2-overview/
- [PROF] World Wide Web Consortium, The Profiles Vocabulary, W3C Working Group Note (18 December 2019). https://www.w3.org/TR/dx-prof/
- [RDF] World Wide Web Consortium, RDF 1.1 Concepts and Abstract Syntax, W3C Recommendation (25 February 2014). http://www.w3.org/TR/rdf11-concepts/
- [RDFS] World Wide Web Consortium, RDF Schema 1.1, W3C Recommendation (25 February 2014). http://www.w3.org/TR/rdf11-schema/
- [SDO] schema.org Consortium, schema.org, Ontology (14 November 2025). https://schema.org
- [SH] World Wide Web Consortium, Shapes Constraint Language (SHACL), Ontology (20 July 2017). https://www.w3.org/TR/shacl/
- [SKOS] World Wide Web Consortium, SKOS Simple Knowledge Organization System Reference, W3C Working Group Note (19 August 2009). https://www.w3.org/TR/skos-reference/