Playing FAIR

Mark Wharton, IOTICS’ co-founder & inventor, looks at how FAIR principles enable Digital Twins to operate in ecosystems

Educational 15 Oct 2021 by Mark Wharton

FAIR data principles were first expounded by a consortium of leading scientists and organisations
in Scientific Data magazine in 2016. Their goal was to ensure that scientific data sets could be found
and used by machines, with minimal human intervention. FAIR stands for Findable, Accessible,
Interoperable and Reusable.

Five years later, how can these principles help to create cooperative, data-sharing ecosystems up
and down supply chains – whether in complex operational landscapes such as transport, or in
enabling servitisation models in multi-party ownership scenarios such as utilities? In parallel we have seen the
evolution of the digital twin. While digital twins mean different things to different people, the widely
accepted interpretation is a digital data twin, where the twin is a coupled virtual representation of
something in the real world, not merely its shadow or model.

One of the first uses of digital twins was to make a ‘single pane of glass’ abstraction of an asset –
which could be a thing, a person or a place. Data for assets is inevitably stored in incompatible ways
in disparate systems for multiple, unrelated purposes. So, by focusing access to all this data through
the digital twin, you create a single point where authorised people, applications – and even other
twins – can go to find out about the asset, its current state and even subscribe to its updates. A big
advantage of this model is that the owner of the asset stays in control of what is seen through the
‘pane of glass’ and who’s allowed to look at it.

This approach works for purpose-built applications that have knowledge about a twin’s data built
into their logic. For example, they may know that the dimensions of the asset are recorded in the
twin as centimetres and the weight in kilograms. But this means that the interpretation of the data
must be programmed in by people; it’s then fixed to that type of twin, and it’s difficult for the parties
with whom you want to share the data to understand it and gain maximum value from it.

This approach only partially addresses the problem of multiple sources; it has merely moved the
interpretation of the data into application logic. Imagine instead that the application is data-centric,
in that it reacts to data and metadata – data about data. In our example, the weight data would ‘say’
it was in kilograms and the application would use this to interpret the data and respond accordingly.
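A minimal Python sketch of this data-centric idea – the field names and unit table below are invented for illustration, not an IOTICS API:

```python
# A reading carries metadata describing itself; the application never
# hard-codes what the numbers mean, it reacts to the metadata.
reading = {"value": 1250.0, "property": "weight", "unit": "kg"}

# Unit conversions known to the application, keyed by the metadata value.
TO_KG = {"kg": 1.0, "g": 0.001, "lb": 0.45359237}

def weight_in_kg(r):
    """Interpret a reading using its own metadata rather than
    assumptions baked into the application's logic."""
    return r["value"] * TO_KG[r["unit"]]

print(weight_in_kg(reading))                                            # 1250.0
print(weight_in_kg({"value": 500, "property": "weight", "unit": "g"}))  # 0.5
```

The point is that the same function handles weights recorded in any declared unit, because the data describes itself.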

Now imagine how the ‘browser’ model that we use every day to interact with the web would work
with digital twin data. You could search for twins and allow the application to react to metadata and
display the twin data in the most appropriate way. You could show data from more than one twin
and compare them. You could write code snippets to do this automatically. You could even take data
from multiple twins, run synthesising algorithms on it and publish the results as more twins.
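As a rough Python sketch, the ‘take data from multiple twins, synthesise, publish the result as another twin’ step might look like this – the twin records and their fields are hypothetical stand-ins, not a real API:

```python
# Hypothetical twins found by a search: each exposes metadata
# (label, property, unit) alongside its latest data value.
found = [
    {"label": "Sensor A", "property": "temperature", "unit": "C", "latest": 61.5},
    {"label": "Sensor B", "property": "temperature", "unit": "C", "latest": 58.5},
]

def synthesise(twins):
    """Run a simple synthesising algorithm (here, an average) over data
    from multiple twins and publish the result as another twin."""
    avg = sum(t["latest"] for t in twins) / len(twins)
    return {
        "label": "Average temperature",
        "property": "temperature",
        "unit": twins[0]["unit"],  # assumes the units already agree
        "latest": avg,
    }

new_twin = synthesise(found)
print(new_twin["latest"])  # 60.0
```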

But this is only possible if digital twins are made FAIR. ‘Search’ implies that the twins are
made Findable by their creator. ‘Choose’ implies that the twins are Accessible – if you’re authorised.
‘React and Compare’ imply that the data received is understandable and hence Interoperable.
Code snippets, synthesis and algorithms all imply that the data can be used for purposes other than
those it was originally created for – hence Reusable.

Making data FAIR

In the internet world, HTML is used to ‘mark-up’ data to tell browsers how to render it. For example,
tags like <table>, <tr> and <td> tell the browser that this is tabular data. The FAIR principles don’t
stipulate what method is to be used to specify metadata, but they do say that “(Meta)data use a
formal, accessible, shared and broadly applicable language for knowledge representation”.
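To illustrate – here in Python rather than an actual RDF serialisation, since the principles don’t mandate one – metadata in such a language boils down to subject–predicate–object statements. The identifiers and vocabulary below are illustrative, not a real ontology:

```python
# Metadata expressed as subject-predicate-object triples, in the
# spirit of RDF. The URIs and prefixes are invented for illustration.
triples = [
    ("urn:twin:engine-42", "rdf:type",      "ex:DieselEngine"),
    ("urn:twin:engine-42", "ex:weight",     "1250"),
    ("urn:twin:engine-42", "ex:weightUnit", "unit:KiloGM"),
    ("urn:twin:engine-42", "rdfs:label",    "Engine 42"),
]

def describe(subject, predicate):
    """Look up what the metadata says about a subject."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(describe("urn:twin:engine-42", "ex:weightUnit"))  # ['unit:KiloGM']
```

Because the statements use shared vocabularies rather than application-specific field names, any machine that understands the vocabulary can interpret them.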

There aren’t many such languages. RDF (the Resource Description Framework) and the Semantic
Web technologies are the de facto standards. But the digital twin browser application we imagined
is not the end of the story. The originators of the FAIR principles had autonomous machine
interoperability as one of their goals.

If we apply this thinking to digital twins in an ecosystem, twins’ agents – the behavioural part of a
twin – could search for twins near or related to them, interact with their data and then maybe drop
the connection when they’ve moved on. For example, the twin of a train could search for nearby
twins of pollen count data as it was moving, because pollen clogs the filters when the engine is
running. Twins of engines on the train would know when they were running and update their
metadata to reflect that they have been affected. Service engineers can look at the twins of the
engines in the train to see when the filters need to be changed.
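A hedged sketch of that agent behaviour in Python – the twin records, the `nearby_twins` helper and the ‘nearness’ rule are all invented for illustration:

```python
# Illustrative twins of local pollen-count data the train might discover.
pollen_twins = [
    {"location": "Reading", "pollen_count": 120},  # grains/m^3, made up
    {"location": "Swindon", "pollen_count": 450},
]

def nearby_twins(location, twins):
    """'Findable': the train's agent searches for pollen twins near it.
    Here 'near' is just an exact location match, for simplicity."""
    return [t for t in twins if t["location"] == location]

def update_filter_exposure(engine, location):
    """While the engine is running, accumulate pollen exposure in its
    metadata so service engineers can see when filters need changing."""
    if engine["running"]:
        for t in nearby_twins(location, pollen_twins):
            engine["filter_exposure"] += t["pollen_count"]
    return engine

engine = {"id": "engine-1", "running": True, "filter_exposure": 0}
for stop in ["Reading", "Swindon"]:
    update_filter_exposure(engine, stop)

print(engine["filter_exposure"])  # 570
```

The agent drops each pollen twin once the train has moved on; only the accumulated exposure persists in the engine twin’s metadata.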

FAIR data principles build on the foundations of each other. You can’t reuse if you can’t
interoperate; you can’t interoperate if the data is inaccessible; you can’t try to access data if you
can’t find it.

Saying that the FAIR principles are for scientific datasets is like saying Amazon is for books. Five
years ago, the originators of the FAIR principles must have had a good idea that their principles
applied to many things – including digital twin ecosystems. The best principles work like that. They
give you yardsticks and guidance but don’t limit where you apply them.
