Introduction to the FAIR organisational maturity model

Version: 1.0 Date: 2024-04-19

1 Why a FAIR organisational maturity model? Problem statement
2 FAIR: What do we mean by Findable, Accessible, Interoperable and Reusable?
3 Who worked on this model? FAIR implementation project, Best Practice Working group
4 Guiding Questions for the creation of the FAIR maturity model
5 Who is the FAIR maturity model for?
6 Intended use of the FAIR maturity model
7 Building a FAIR maturity model - guidelines

Why a FAIR organisational maturity model? Problem statement

Organizations can be at very different stages of implementing the FAIR data principles at any given time. This variance makes it difficult for leadership, among others, to assess, qualify, measure and manage progress towards FAIR implementation. Benchmarking across organizations, or even within one department, can be very hard. Organizations spend significant time clarifying their situation, defining possible actions for desired outcomes, and road mapping. This further complicates identifying the stakeholders, resources, and internal and external partners required to implement the FAIR data principles and produce the desired outcomes.

While there are multiple FAIR data maturity models and metrics, there is no simple, agreed-upon, sector-wide maturity assessment model for implementing the FAIR data principles at the organizational level in the life science sector.

FAIR: What do we mean by Findable, Accessible, Interoperable and Reusable?

We refer to the FAIR data principles.

"Findable" refers to the ease and methods with which data can be located and identified by both humans and machines. Findability involves ensuring that data sets are described with rich metadata, including information about their content, context, and conditions of use. This metadata should be standardized and easily accessible, allowing users to search for and discover relevant datasets using various search tools and platforms. Additionally, each dataset should be assigned a unique and persistent identifier, such as a Digital Object Identifier (DOI), to enable unambiguous identification and citation. By making data findable, humans and machines can efficiently locate and access the information they need, promoting collaboration, reproducibility, and the reuse of data across different disciplines and research projects.
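As a rough illustration of the points above, a dataset's metadata record can be checked for the elements that support findability. The sketch below is only one possible reading: the field names, the example record and the identifier are hypothetical placeholders, not part of the FAIR data principles or of this model.

```python
# Hypothetical minimal set of metadata fields supporting findability:
# a persistent identifier plus content, context and conditions of use.
REQUIRED_FIELDS = ("identifier", "title", "description", "conditions_of_use")


def missing_findability_fields(record):
    """Return the required fields that are absent or empty in a metadata record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Illustrative record; the identifier is a placeholder, not a real DOI.
record = {
    "identifier": "doi:10.1234/example-dataset",
    "title": "Example assay results",
    "description": "Illustrative description of content and context.",
    "conditions_of_use": "",  # left empty on purpose
}

print(missing_findability_fields(record))
```

Here the check reports the empty conditions-of-use field, which is the kind of gap a metadata review would flag.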

"Accessible" refers to the ability of individuals or IT systems to obtain and retrieve data once it has been located. Access involves ensuring that data is available for use by both humans and machines, subject to appropriate controls and permissions. This includes implementing mechanisms for authentication, authorization, and data security to regulate who (or what) can access the data and under what conditions. Accessible data should be available in a standard format and through standardized and accessible protocols to facilitate seamless retrieval and use. Additionally, access encompasses providing clear and transparent information about how data can be accessed, including any usage restrictions or licensing terms. Ultimately, ensuring access to data is one of the enablers for its reuse across various domains.

"Interoperable" refers to the ability of different systems, tools, and datasets to work together seamlessly. Interoperability involves structuring data in a standardized format using standardized or at least commonly accepted vocabularies, ontologies, and data models, enabling it to be integrated, exchanged, and combined with other datasets. This standardized approach allows data to be interpreted and processed consistently across various platforms and applications, regardless of their underlying technologies or environments. The exchange and integration of data from diverse sources enables more comprehensive analyses, insights, and collaborations. Interoperable data supports reproducibility, scalability, and the efficient reuse of information across different research domains and disciplines.

"Reusable" refers to the suitability of data for use in different contexts and by different stakeholders, both humans and machines. Reusability involves providing clear and comprehensive documentation, including metadata, about the data's content, structure, and usage permissions. This documentation should be easily understandable and accessible to facilitate the effective reuse of the data by others. Additionally, data should be formatted in a standardized and interoperable manner, allowing it to be integrated with other datasets and analyzed using various tools and methods. By making data reusable, researchers can efficiently leverage existing datasets for new analyses and investigations, accelerating scientific progress and innovation. Moreover, clear usage licenses and permissions should accompany the data to specify how it can be reused, ensuring legal and ethical compliance.

Who worked on this model? FAIR implementation project, Best Practice Working group

In February 2023, the Best Practice Working group of the Pistoia Alliance’s FAIR implementation project started creating a cross-sector, organizational-level maturity model for FAIR implementation stages in the life sciences. This model is intended to assist decision makers in evaluating the stage of a given organization (or department) in terms of FAIR maturity, assessing the options for achieving higher maturity levels, and identifying the resources that may be required to do so.

Guiding Questions for the creation of the FAIR maturity model

Here are some examples of the guiding questions used when drafting the model:

How do we establish a common shared understanding of the stage at which a given organization finds itself along a plausible FAIR implementation journey?
Where and how do we initiate a FAIR implementation journey?
What key road mapping stages can we recognise as a sector-wide group, i.e., the observable situations, based on the group experience?
What hurdles and benefits can we harvest along the way?
Which components, tools, and resources exist that we can refer to regarding FAIR implementation maturity?
What could one do at a given stage to improve and move to the next one?
How can we simplify and streamline communication to align different stakeholders?

Who is the FAIR maturity model for?

Stakeholders and intended user groups for the FAIR Maturity Matrix include:

Leadership and Managers of life-science organizations: those responsible for resource allocation and budgeting, even if they have yet to gain expertise in the FAIR data principles.
FAIR Data Experts: individuals well-versed in the intricacies of FAIR data principles who play a crucial role in guiding and executing the implementation process.
Implementation Partners: Contract Research Organizations (CROs), service providers, consulting firms and academic partners, which contribute essential support and expertise.
Regulatory Authorities and Funding Agencies: regulatory bodies whose involvement ensures alignment with compliance standards and regulations, fostering a comprehensive and compliant FAIR data implementation.

Intended use of the FAIR maturity model

The Pistoia FAIR maturity model intends to provide an actionable tool for (self-)assessment of an organization's implementation of the FAIR data principles.

The model is descriptive rather than prescriptive. It should enable multiple stakeholders to reach similar conclusions based on observations of a specific organization at a given time. Other data management frameworks exist; the reader can also refer to the EDMC’s DCAM, DAMA’s DMBoK and NIST’s Research Data Framework (RDaF).

The structure of this first model is a matrix. Each matrix element should be read in relation to its nearest neighbors along the “maturity axis” and to all the elements at the same level along the “dimensions axis”. That should, in turn, indicate concrete directions for improvement.
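The way a matrix element relates to its neighbors can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the model's actual content: the level and dimension names are hypothetical placeholders, and the helper functions are invented for the example.

```python
from dataclasses import dataclass

# Hypothetical placeholders; the model itself defines the real levels and dimensions.
LEVELS = ["level 1", "level 2", "level 3"]
DIMENSIONS = ["dimension A", "dimension B"]


@dataclass(frozen=True)
class Cell:
    dimension: str
    level: int  # index into LEVELS


def maturity_neighbours(cell):
    """Cells directly below and above on the maturity axis, within the same dimension."""
    return [
        Cell(cell.dimension, level)
        for level in (cell.level - 1, cell.level + 1)
        if 0 <= level < len(LEVELS)
    ]


def same_level_cells(cell):
    """All other cells at the same maturity level, across the dimensions axis."""
    return [Cell(d, cell.level) for d in DIMENSIONS if d != cell.dimension]


current = Cell("dimension A", 1)
print(maturity_neighbours(current))  # the lower and higher level in dimension A
print(same_level_cells(current))     # the same level in the other dimensions
```

Reading a cell against the next level up in its own dimension suggests what to improve, while the cells at the same level in other dimensions show whether the organization is progressing evenly.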

Should an organization aim at the highest possible maturity level? Not necessarily. That depends on the organization's goals, business, and use cases.

Building a FAIR maturity model - guidelines

The aim is to provide an actionable (self-)assessment tool. The model should capture as much of the required complexity as possible without excessive detail. It should extract general features that describe maturity qualitatively, yet with sufficient accuracy to enable stakeholder alignment.

Descriptive rather than prescriptive, the model should enable multiple stakeholders to reach similar conclusions based on observation of a specific organization at a given point in time. Each organization is a complex entity in a complex ecosystem. If an organization concludes that it has a case to improve, given its capability, needs, means and strategic intention, it may require actions specific to its situation.

The document should be authoritative and created by practitioners who have agreed on its content. It should refer to and align with existing frameworks, whether public or generated in private organizations.

The model should be as self-consistent and actionable as possible. The content of each element should be interpreted in the context of lower and higher maturity levels and of the different dimensions.

As much as possible, we use commonly understood and ideally referenced terminology, provide examples of relevant practices, cases or implementations, and refer to known definitions.

The intention is to provide an initial instrument to the FAIR community. It is unlikely the first version will comply with the FAIR data principles. This model is not, and will likely never be, perfect, but it will hopefully be “good enough” to enable better and more effective implementation of FAIR. It should be updated and improved in subsequent iterations.