Why a FAIR Maturity model? Problem statement
Organizations can be at very different stages in implementing FAIR data principles at a given time. This variance presents challenges, among others, for their leadership to assess, qualify, measure and manage progress towards FAIR implementation. Benchmarking across organizations or even within one department can be very hard. Organizations must spend significant time clarifying situations, defining possible actions for desired outcomes, and road mapping, which further complicates the identification of stakeholders, resources needed, and internal and external partners to implement FAIR data principles to produce desired outcomes.
While there are multiple FAIR data maturity models and metrics, there is no simple, agreed-upon, sector-wide maturity assessment model for implementing the FAIR data principles at the organizational level in the life science sector.
FAIR: What do we mean by Findable, Accessible, Interoperable and Reusable?
We refer to the FAIR data principles.
"Findable" refers to the ease and methods with which data can be located and identified by both humans and machines. Findability involves ensuring that data sets are described with rich metadata, including information about their content, context, and conditions of use. This metadata should be standardized and easily accessible, allowing users to search for and discover relevant datasets using various search tools and platforms. Additionally, each dataset should be assigned a unique and persistent identifier, such as a Digital Object Identifier (DOI), to enable unambiguous identification and citation. By making data findable, humans and machines can efficiently locate and access the information they need, promoting collaboration, reproducibility, and the reuse of data across different disciplines and research projects.
"Accessible" refers to the ability of individuals or IT systems to obtain and retrieve data once it has been located. Access involves ensuring that data is available for use by both humans and machines, subject to appropriate controls and permissions. This includes implementing mechanisms for authentication, authorization, and data security to regulate who (or what) can access the data and under what conditions. Accessible data should be available in a standard format and through standardized and accessible protocols to facilitate seamless retrieval and use. Additionally, access encompasses providing clear and transparent information about how data can be accessed, including any usage restrictions or licensing terms. Ultimately, ensuring access to data is one of the enablers for its reuse across various domains.
"Interoperable" refers to the ability of different systems, tools, and datasets to work together seamlessly. Interoperability involves structuring data in a standardized format using standardized or at least commonly accepted vocabularies, ontologies, and data models, enabling it to be integrated, exchanged, and combined with other datasets. This standardized approach allows data to be interpreted and processed consistently across various platforms and applications, regardless of their underlying technologies or environments. The exchange and integration of data from diverse sources enables more comprehensive analyses, insights, and collaborations. Interoperable data supports reproducibility, scalability, and the efficient reuse of information across different research domains and disciplines.
"Reusable" refers to the suitability of data for use in different contexts and by different stakeholders, both humans and machines. Reusability involves providing clear and comprehensive documentation, including metadata, about the data's content, structure, and usage permissions. This documentation should be easily understandable and accessible to facilitate the effective reuse of the data by others. Additionally, data should be formatted in a standardized and interoperable manner, allowing it to be integrated with other datasets and analyzed using various tools and methods. By making data reusable, researchers can efficiently leverage existing datasets for new analyses and investigations, accelerating scientific progress and innovation. Moreover, clear usage licenses and permissions should accompany the data to specify how it can be reused, ensuring legal and ethical compliance.
Who worked on this model? FAIR implementation project, Working group Best Practices
In February 2023, the Working group Best Practices of the Pistoia Alliance’s FAIR implementation project started creating a cross-sector, organizational-level maturity model for FAIR implementation stages in the life sciences. The model is intended to help decision makers evaluate the stage of a given organization (or department) in terms of FAIR maturity, assess the options to achieve higher maturity levels and identify relevant resources.
Guiding Questions for the creation of the FAIR maturity model
Here are some examples of the guiding questions used when drafting the model:
How do we establish a common shared understanding of the stage at which a given organization finds itself along a plausible FAIR implementation journey?
Where and how do we initiate a FAIR implementation journey?
What key road mapping stages can we recognise as a sector-wide group, i.e., the observable situations?
What hurdles and benefits can we harvest along the way?
Which components, tools, and resources exist that we can refer to regarding FAIR implementation maturity?
What could one do at a given stage to improve and move to the next one?
How can we simplify and streamline communication to align different stakeholders?
Who is the FAIR maturity model for?
Stakeholders and intended user groups for the FAIR Maturity Matrix include:
Leadership and Managers of life-science organizations: those responsible for resource allocation and budgeting, even if they are not yet experts in FAIR data principles.
FAIR Data Experts: individuals well-versed in the intricacies of FAIR data principles who play a crucial role in guiding and executing the implementation process.
Implementation Partners: This encompasses Contract Research Organizations (CROs), service providers, and consulting firms, which contribute essential support and expertise.
National Regulatory Authorities and Funding Agencies: regulatory bodies whose involvement ensures alignment with compliance standards and regulations, fostering a comprehensive and compliant FAIR data implementation.
Intended use of the FAIR maturity model
The Pistoia FAIR maturity model intends to provide an actionable tool for (self-)assessment of an organization's implementation of the FAIR data principles.
The model is descriptive rather than prescriptive. It should enable multiple stakeholders to reach similar conclusions based on observations of a specific organization at a given time. Other data management frameworks exist; the reader can also refer to the EDM Council’s DCAM, DAMA’s DMBoK and NIST’s Research Data Framework (RDaF).
The structure of this first model is a matrix. We see each matrix element in relation to its nearest neighbors along the “maturity axis” and to all the elements at the same level, along the “dimensions axis”. That should, in turn, indicate concrete directions for improvement.
Should an organization aim at the highest possible maturity level? Not necessarily. That depends on the organization's goals, business, and use cases.
Building a FAIR maturity model - guidelines
The aim is to provide an actionable (self-)assessment tool. The model should capture as much of the required complexity as possible without excessive detail: it should extract general features that describe maturity qualitatively, yet with sufficient accuracy to enable stakeholder alignment.
Descriptive rather than prescriptive, the model should enable multiple stakeholders to reach similar conclusions based on observation of a given, specific organization at a given point in time. Each organization is a complex entity in a complex ecosystem. It may take specific action to improve if it concludes that it has a case to do so, given its capabilities, needs, means and strategic intention.
The document should be authoritative and created by practitioners who have agreed on its content. It should refer to and align with existing frameworks, whether public or generated in private organizations.
The model should be as self-consistent and actionable as possible: the content of each element should be interpreted in the context of lower and higher maturity levels and of the other dimensions. The structure of the first model is a matrix. We see each matrix element in relation not only to its nearest neighbors along the “maturity axis” but also to all the elements at the same level, along the “dimensions axis”. That should, in turn, indicate concrete directions for improvement.
As much as possible, we use commonly understood and ideally referenced terminology. We provide examples of relevant practices, cases or implementations as much as possible. Where possible, we use references to known definitions.
The intention is to provide an initial instrument to the FAIR community. It is unlikely the first version will comply with the FAIR data principles. This model is not, and will likely never be, perfect, but it could hopefully be “good enough” to enable better and more effective implementation of FAIR. It should be updated and improved in subsequent iterations.
FAIR maturity matrix: Dimensions
The working group identified 7 dimensions to articulate the FAIR maturity model. The dimensions are not hierarchical; each provides a different and complementary perspective on a given facet of a complex environment.
Dimension | Key notions |
FAIR data | Data, metadata, data products, cf existing models |
FAIR leadership | Types of leadership necessary for FAIR implementation |
FAIR strategy | Now, the next FAIR stage and how to get there |
FAIR roles | What kind of (human) roles are necessary for implementation of FAIR data principles |
Processes for FAIR | Which processes must we explicitly implement |
FAIR knowledge | What needs to be known (by humans and machines) for FAIR implementation |
FAIR tools and infrastructures | From persistent identifiers to controlled vocabularies to semantic models |
Table - The 7 dimensions of the FAIR Maturity Matrix.
Combining all these factors is required to describe a given maturity level. It is also possible for organizations to reach various levels at a given point depending on the granularity (e.g. ecosystem/ enterprise/sector/department, etc.) considered. It is also possible for maturity to be not completely in sync for all dimensions. While FAIR implementation journeys may be similar, they are very context-dependent: the various dimensions intend to provide a broad frame to describe the situation accurately.
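As an illustration of how an assessment across the seven dimensions could be recorded, the following is a minimal sketch in Python. The dimension names come from the table above; the function name `summarize`, the returned keys and the example levels are assumptions made for illustration and are not part of the model.

```python
# Dimension names are taken from the table above.
DIMENSIONS = [
    "FAIR data",
    "FAIR leadership",
    "FAIR strategy",
    "FAIR roles",
    "Processes for FAIR",
    "FAIR knowledge",
    "FAIR tools and infrastructures",
]


def summarize(profile):
    """Summarize a per-dimension assessment (hypothetical helper).

    `profile` maps each dimension name to an observed level from 0 to 5.
    """
    missing = [d for d in DIMENSIONS if d not in profile]
    if missing:
        raise ValueError(f"unassessed dimensions: {missing}")
    for name in DIMENSIONS:
        if profile[name] not in range(6):
            raise ValueError(f"level for {name!r} must be an integer from 0 to 5")

    levels = [profile[d] for d in DIMENSIONS]
    lowest, highest = min(levels), max(levels)
    return {
        "lowest": lowest,
        "highest": highest,
        # True only when every dimension sits at the same level.
        "in_sync": lowest == highest,
        # Dimensions at the lowest level, reported only when levels differ.
        "lagging": [d for d in DIMENSIONS if profile[d] == lowest and lowest < highest],
    }


# Invented example: a department at level 2 in most dimensions.
department = {name: 2 for name in DIMENSIONS}
department["FAIR leadership"] = 1
department["FAIR tools and infrastructures"] = 3

print(summarize(department))
```

A summary of this kind only shows where the dimensions are out of sync; deciding whether that matters remains a judgment for the organization.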
FAIR data
This dimension concerns metadata, data, and data products. Several frameworks to evaluate the maturity of FAIR data exist publicly and in private organizational environments. The intention of this dimension is to provide a qualitative pragmatic indication, not to replace any of the existing models. Especially at the initial levels, it should help identify the FAIR data principles in focus when starting a FAIR data implementation journey. This dimension closely connects to FAIR tools, infrastructures, processes, and roles.
FAIR leadership
This dimension deals with the types and levels of leadership required to implement FAIR principles in a life science organization. Ultimately, leadership "owns" the vision of FAIR implementation, or the “why”. Leadership roles are also necessary to ensure that strategies can be defined, enabled, implemented, and executed. Leadership ensures the resources (financial, time, priority) are set and available. Ultimately, leadership, at various levels inside and outside the boundaries of the organization, is accountable for implementing FAIR data principles. This requires a sufficiently deep understanding of the FAIR principles, of the costs (time, financial, opportunity) associated with data practices that are not FAIR, and of the skills needed to implement FAIR data principles.
FAIR strategy
Strategies are frameworks for making decisions related to FAIR implementation, from business case to capability-building to running operations. After setting the "why" of the FAIR implementation journey and answering "what will success look like?", the strategy addresses "how, and with which priorities, will the organization evolve given its current status?" This dimension is also concerned with deciding what not to do (e.g., only some data may need to be FAIR, “FAIR Enough”), identifying metrics, the organizational sectors (beyond R&D) involved and the cultural change required.
FAIR roles
We address the roles required in an organization to implement FAIR principles in this dimension. What are the jobs to do? Who would ensure that happens? Who are the people responsible for FAIR implementation? These include project roles that build capabilities and operational roles that maintain FAIR processes.
Processes for FAIR
Implementing the FAIR data principles requires underlying processes connecting the necessary metadata, data, tools, roles and knowledge. Some of these processes may be implicit in the early stages of FAIR data implementation. They become more explicit as maturity grows and, at the highest maturity levels, so ubiquitous that they are transparent.
FAIR knowledge
The FAIR Knowledge dimension concerns the factual, conceptual, and procedural knowledge required for FAIR implementation at the various stages. Knowledge is often associated with human expert roles and connects with the “FAIR roles” dimension.
FAIR tools and infrastructures
This dimension concerns the essential information technology tools and infrastructures needed to implement FAIR data principles. It is closely related to the “FAIR data” dimension of the maturity model. The close relationship between FAIR implementation and semantic web technologies underscores the significance of the tools required to implement the FAIR principles. The dimension also considers additional frameworks that can enhance the implementation of FAIR data principles and broaden the scope towards a comprehensive approach.
The FAIR tools and infrastructure dimension has an indicator for each element of FAIR. This indicator helps show the contribution the tool or infrastructure makes to the maturity level:
Findability: The tool or infrastructure component contributes to, enhances or enriches Findability.
Accessibility: The tool or infrastructure component contributes to, enhances or enriches Accessibility.
Reusability: The tool or infrastructure component contributes to, enhances or enriches Reusability.
Interoperability: The tool or infrastructure component contributes to, enhances or enriches Interoperability.
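The per-element indicator could be recorded, for example, as a set of flags on each tool or infrastructure component. The sketch below is a hedged illustration only: the class `Contribution`, the helper `uncovered`, the inventory and the assignment of tools to FAIR elements are all invented for the example and are not taken from the matrix.

```python
from enum import Flag, auto


class Contribution(Flag):
    """The four FAIR elements a tool or infrastructure component may support."""
    FINDABILITY = auto()
    ACCESSIBILITY = auto()
    INTEROPERABILITY = auto()
    REUSABILITY = auto()


# Hypothetical inventory; the indicator values below are assumptions.
TOOLS = {
    "persistent identifier service": Contribution.FINDABILITY,
    "metadata catalogue": Contribution.FINDABILITY | Contribution.ACCESSIBILITY,
    "controlled vocabulary service": Contribution.INTEROPERABILITY | Contribution.REUSABILITY,
}


def uncovered(tools):
    """Return the names of FAIR elements that no listed tool contributes to."""
    covered = Contribution(0)
    for contribution in tools.values():
        covered |= contribution
    # An empty intersection is falsy, so these elements are not yet supported.
    return [element.name for element in Contribution if not (element & covered)]


print(uncovered(TOOLS))
```

Listing the elements without any supporting tool is one simple way to read the indicators; it says nothing about how well a covered element is actually served.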
FAIR maturity matrix: maturity levels
The working group decided to use 6 maturity levels, in alignment with other FAIR data frameworks, such as the FAIRplus data maturity model. The levels are labeled L0 to L5. The team also gave each level a nickname to help remember the levels in relation to one another. (Humans are not machines.) The AstraZeneca team members introduced the “marketplace” metaphors.
Level | Nickname | Marketplace metaphor |
0 | “life is unFAIR” | “Junkyard” |
1 | "Started the FAIR journey" | "Flea market" |
2 | "Getting FAIR" | "Street Market" |
3 | "Pretty FAIR" | "Specialized Local Markets" |
4 | "Really FAIR" | "Hyper Market" |
5 | "FAIRest of them all" | "Digital Online Store" |
Table - The 6 levels of the FAIR Maturity Matrix model.
It is important to note that most capabilities and features in the levels are cumulative: usually, Level N encompasses the capabilities developed in Level N-1. There are some exceptions; for example, in the “Processes for FAIR” dimension, “FAIRification” of historical data does not exist at low levels, appears at intermediate levels and is reduced or may even disappear at the highest levels.
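The cumulative reading of the levels, together with the exception for retrospective “FAIRification”, can be sketched as follows. The level nicknames and metaphors come from the table above; the capability names and the levels at which they start or stop are assumptions chosen only to illustrate the idea.

```python
# Level number -> (nickname, marketplace metaphor), as in the table above.
LEVELS = {
    0: ("life is unFAIR", "Junkyard"),
    1: ("Started the FAIR journey", "Flea market"),
    2: ("Getting FAIR", "Street Market"),
    3: ("Pretty FAIR", "Specialized Local Markets"),
    4: ("Really FAIR", "Hyper Market"),
    5: ("FAIRest of them all", "Digital Online Store"),
}

# Hypothetical capabilities: (name, first level, last level or None).
# A last level of None means the capability is cumulative and persists.
CAPABILITIES = [
    ("FAIR awareness", 1, None),
    ("persistent identifiers", 2, None),
    # Exception to the cumulative rule: assumed here to stop after level 4.
    ("retrospective FAIRification", 1, 4),
]


def capabilities_at(level):
    """Return the capability names expected at a given maturity level."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    return [
        name
        for name, first, last in CAPABILITIES
        if first <= level and (last is None or level <= last)
    ]


for level, (nickname, metaphor) in LEVELS.items():
    print(level, nickname, metaphor, capabilities_at(level))
```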
Level 0: “life is unFAIR”
Level | Nickname | Marketplace metaphor | Key Features | Picture |
0 | “life is unFAIR” | “Junkyard” | Lack of FAIR awareness, possibly acquiring awareness. |
Table - Level 0: “life is unFAIR”
Level 0: “life is unFAIR” summary
Data silos and inconsistency impede accessibility and integration, resembling a "wild west" scenario. Limited awareness and engagement hinder the adoption of FAIR principles, with minimal leadership involvement and a lack of strategic focus. There is no formal FAIR strategy, which results in reactive application and an absence of structured pathways. Resistance and cultural barriers impede FAIR adoption, accompanied by a lack of FAIR-related knowledge. The organization lacks FAIR processes, and implicit processes may divert efforts from FAIR implementation.
Additionally, a significant deficit in tools and infrastructure for FAIR data management contributes to unstructured data capture. A missing inventory of licenses and access policies further complicates FAIR compliance efforts. Overall, the organization has yet to start and possibly resists the initiation of a FAIR data implementation journey. https://pistoiaalliance.atlassian.net/wiki/x/AQChy
Level 1: "Started the FAIR journey"
Level | Nickname | Marketplace metaphor | Key Features | Picture |
1 | "Started the FAIR journey" | "Flea market" | FAIR Awareness started; first pilots for implementation may appear. |
Table - Level 1: "Started the FAIR journey"
Level 1: "Started the FAIR journey" summary
At this stage, data is siloed and may reside in a shared data platform. Data has heterogeneous characteristics, requiring specialized and specific technical knowledge for access and interpretation. Leadership awareness of FAIR emerges, fostering visionary proposals for FAIR implementation and some pilots. Some strategic plans begin to take shape. Roles such as curators and semantic experts surface, with external experts engaged for training. The organization recognizes the potential business value of FAIR data, prompting considerations for retrospective implementation and specific initiatives of retrospective “FAIRification”. Discussions on metadata centralization and reference data alignment emerge, indicating a shift toward systematic approaches. Awareness among stakeholders expands through workshops. Initial plans for tooling and infrastructure emerge. Pragmatic progress involves enhancing existing IT practices. The organization focuses on centralizing metadata and implementing findability measures, marking an initial, if localized, commitment to embedding FAIR principles in some organizational processes. https://pistoiaalliance.atlassian.net/wiki/x/CACey
Level 2: "Getting FAIR"
Level | Nickname | Marketplace metaphor | Features | Picture |
2 | "Getting FAIR" | "Street Market" | FAIR Pilots for implementation are in place |
Table - Level 2: "Getting FAIR"
Level 2: "Getting FAIR" summary
The organization initiates processes that conform data to local models in a shared data platform and progresses to system-level controls. These processes include tools for metadata, controlled vocabularies, and persistent identifiers. Data is more “findable” thanks to unique identifiers. Emerging metadata and controlled vocabularies make it possible to access data with less specialized knowledge. Leadership awareness grows, initiating initial FAIR projects and forming champions within the company. Vision, strategy, and role development follow, integrating FAIR as a key element in the broader data strategy. Designated roles emerge, fostering prototypes and showcasing value. Formal training frameworks are structured, and pilot projects and governance considerations accompany the initiation of culture change processes. Informal communities of practice begin to form, facilitating knowledge exchange. Infrastructure for proofs of concept (PoCs) is established, leading to organization-wide plans, RFPs, and evaluations of FAIR tools and profiles. At this stage of the journey, there is an increasing commitment to FAIR data principles, encompassing leadership engagement, role development, and systematic infrastructure implementation. https://pistoiaalliance.atlassian.net/wiki/x/BIDFy
Level 3: "Pretty FAIR"
Level | Nickname | Marketplace metaphor | Features | Picture |
3 | "Pretty FAIR" | “Specialized Local Markets” | FAIR Transition to good and best practice |
Table - Level 3: "Pretty FAIR"
Level 3: "Pretty FAIR" summary
FAIR data sets, adhering to domain-level models with controls on data access, increasingly appear. Machine interpretation begins to be demonstrated locally (e.g., in a department). Leadership still plays a crucial role in setting expectations for FAIR in project budgets and establishing organizational metrics related to FAIR. A refined vision and strategy exist, with an organization-wide supported plan for FAIR implementation. Key roles, such as data standard experts and curators, emerge. A cultural shift towards a data-driven approach begins to appear. Formal training is integrated into organizational practices, and FAIR practices become ingrained in workflows, at least in some functions or departments. The organization fosters broader communities of knowledge and practice and establishes domain knowledge expertise within each key department, establishing processes for FAIR data management. FAIR data generation and interaction mechanisms are conceptually defined. Budget and human-resources capacity is allocated for organization-wide FAIR delivery, using COTS (commercial off-the-shelf) or “standard” tools when possible. At this stage of the journey, the commitment to FAIR data principles shows increasing outcomes and impact, at least at the local level in the organization. https://pistoiaalliance.atlassian.net/wiki/x/BADDy
Level 4: "Really FAIR"
Level | Nickname | Marketplace metaphor | Features | Picture |
4 | "Really FAIR" | "Hyper Market" | FAIR Operational, best practice known at the time of writing. Internal organizational focus. Emerging cross-company focus. |
Table - Level 4: "Really FAIR"
Level 4: "Really FAIR" summary
This level describes the combination of best practices currently known to have been achieved at an organizational level.
FAIR principles are pervasive across departments. Data, metadata, and identifiers conform to cross-domain standards, enabling enterprise-level interoperability, and globally unique, persistent and resolvable identifiers (GUPRIs) are consistently implemented. Leadership mandates FAIR budgets in all data projects and actively engages in the broader FAIR community. A comprehensive FAIR data strategy encompasses centralized and federated data, is backed by metrics and is integrated into governance processes. Roles such as data standard experts and citizen data scientists are pivotal. Formalized training programs cover diverse roles, fostering a culture of proficiency. The organization demonstrates impact areas, engages in external leadership, and uses open-source tools. FAIR practices are embedded in workflows, emphasizing continuous improvement, impact measurement, and adherence to standards. Cross-community collaboration and communities of practice are established, highlighting shared learnings and real-world experiences. Business benefits from previous FAIR data implementation pilots are recognized, providing a qualitative framework and evaluation metrics for further initiatives. The organization utilizes automated tools, defined interaction mechanisms, and a registry of FAIRification tools, showcasing a commitment to advanced FAIR data management practices. https://pistoiaalliance.atlassian.net/wiki/x/EoDAy
Level 5: "FAIRest of them all"
Level | Nickname | Marketplace metaphor | Features | Picture |
5 | "FAIRest of them all" | "Digital Online Store" | Largely aspirational: while conceivable, it still needs practical realization. Cross-organization standards and Interoperability. |
Table - Level 5: "FAIRest of them all"
Level 5: "FAIRest of them all" summary
This level describes aspirational practices that are currently considered achievable by an entire ecosystem at the sector (life-sciences) level.
At this stage, FAIR data is the norm and self-describing (FAIR) digital objects are ubiquitous for key data assets. Machine actionability and automated operations are possible, including machine-enabled AI and semantic solutions that act directly on data sets without human interpretation. The organization prioritizes adherence to FAIR data principles as a strategic and operational objective. It operates at an enterprise level, managing data granularly, focusing on data governance and master data management. It also operates at the ecosystem level, maintaining a critical set of interoperability resources with other ecosystem players (pharma, CROs, solutions providers, standardization bodies, regulatory bodies, industry associations, and academia). The organization adopts a minimal set of cross-organization standards, platforms and tools designed to automate the creation of FAIR data, ensuring provenance capture and adherence to FAIR principles. The organization promotes an organization-wide understanding of FAIR data, emphasizing training, defined roles, and recognition for FAIR data work. FAIR processes are integrated across all data processes, emphasizing value throughout the lifecycle. The organization actively shares learnings and engages in a cross-organization FAIR community of practice.
On the other hand, for most data citizens, FAIR is transparent and embedded in daily practice. Benefits are visible for key stakeholders, and a pervasive data-centric culture resists adopting application-centric solutions. The organization acts as a community leader, encouraging organization-wide adoption of FAIR data practices and platforms. At this stage, complementary organizations in the ecosystem provide time, cost and qualitative benefits resulting from interoperability and data reuse. https://pistoiaalliance.atlassian.net/wiki/x/AgDEy
https://pistoiaalliance.atlassian.net/wiki/x/DoCZy
https://pistoiaalliance.atlassian.net/wiki/x/AoCay
https://pistoiaalliance.atlassian.net/wiki/x/AQChy
https://pistoiaalliance.atlassian.net/wiki/x/CACey
https://pistoiaalliance.atlassian.net/wiki/x/BIDFy
https://pistoiaalliance.atlassian.net/wiki/x/BADDy
https://pistoiaalliance.atlassian.net/wiki/x/EoDAy
https://pistoiaalliance.atlassian.net/wiki/x/AgDEy
https://pistoiaalliance.atlassian.net/wiki/x/gICuy
https://pistoiaalliance.atlassian.net/wiki/x/HwC6y