Towards a FAIR Return On Investment Methodology  V1.0
FAIR return on investment (ROI) tool methodology V1.0 © 2024 by Pistoia Alliance is licensed under CC BY 4.0
Version: 1.0 Date: 2024-03
Contributors
Name  Affiliation at time of writing 
Valerie Morel  Ontoforce 
Hans Constandt  FAQIR foundation 
Giovanni Nisato  Pistoia Alliance 
Reviewers
Name  Affiliation at time of writing 
Gentiana Spahiu Pina  Pfizer 
   
Executive summary
The document presents a first step towards a FAIR Return on Investment (ROI) methodology, starting with financial modeling of pharmaceutical development projects. This work in progress is addressed to stakeholders in the life sciences industry involved in organizational efficiency and data management, and to investors and analysts interested in understanding the financial implications and ROI associated with FAIR data implementation.
The material is derived from work executed by the Vlerick Management School students and supported by Ontoforce in 2019. It provides a systematic starting point to calculate the baseline ROI of R&D and quantify the value of FAIR implementation investments through variations of the baseline.
The document presents an ROI model that combines a decision-tree approach and a Discounted Cash Flow model, focusing on enhancing R&D pipeline productivity through the ‘pharmaceutical value equation’, which balances R&D efficiency and effectiveness. The model is designed to balance sophistication and simplicity, with flexible input parameters allowing customization based on industry averages or project-specific data. Financial, R&D, and market assumptions are incorporated, ensuring adaptability to diverse scenarios.
In terms of core assumptions about FAIR benefits, the model currently quantifies the impact of implementing FAIR data principles mainly by accounting for efficiency gains, time reduction, and cost savings. For example, it assumes reduced search time for data and estimates the associated cost reductions due to increased efficiency.
In terms of outcome metrics, Return on Investment (ROI) is defined as the increase in Net Present Value resulting from FAIR implementation actions, minus the associated costs. The 'Wins per Day' metric calculates the value of savings per day, emphasizing the impact on project value over time.
The document provides a systematic approach to calculate ROI and quantify the value of investments. Acknowledging the need for continuous refinement and updates to reflect evolving industry dynamics, the recommendations for further work include updating key references, updating and simplifying the calculator tool, connecting ROI methods to other FAIR resources such as the FAIR Maturity Matrix, and illustrating the ROI tool usage in the frame of FAIR implementation use cases.
1. Introduction
How can the implementation of FAIR data principles improve the efficacy and efficiency of life-science organizations, especially in pharmaceutical product development? This is a core question for leadership of pharmaceutical companies and organizations in their ecosystem. In order to address it, another question looms in the background: is it worth it? Return on investment (ROI) tools are commonly used to assess the business cases of projects and operational transformations. However, these models are internal, proprietary core assets, and their outcomes depend on parameters and assumptions that are often not openly shared. How can we then assess the ROI of FAIR data implementation in ways that can be understood, shared and quantified, at least at a high abstraction level, by different stakeholders? It would be beneficial to have an openly accessible ROI model that is as simple but as realistic as possible, providing a relevant, shared methodology to assess the ROI of FAIR implementation projects. The purpose of this document is to introduce such a method.
This document is largely derived from work done in 2019 at Ontoforce and the Vlerick Business School and provides a foundation for ROI considerations in pharmaceutical companies. The discussion and outlook section provides indications on how this methodology could be revisited and updated moving forward to take into account important recent developments.
2. Methodology  FAIR ROI
The FAIR ROI model is based on a decision-tree perspective and a Discounted Cash Flow model. The outcomes are placed in context through the 'pharmaceutical value equation', which describes how adopting FAIR data principles is expected to affect R&D pipeline productivity. Each of these items is detailed further below.
2.1 Pharmaceutical value equation method
Increasing R&D productivity continues to be the pharmaceutical industry’s primary challenge. R&D productivity can be simply defined as the relationship between the value (medical and commercial) created by a new medicine and the investment required to generate that medicine (Paul et al., 2010). According to Paul et al., R&D productivity can be separated into two dimensions (Figure 1):
R&D efficiency: represents the ability of an R&D system to translate inputs (e.g., ideas, investments, effort) into defined outputs, generally over a specified period of time. Or, simply stated, inputs lead to outputs.
R&D effectiveness: the ability of the R&D system to produce outputs with certain intended and desired qualities. Or, simply stated, outputs lead to outcomes.
Figure 1. Dimensions of R&D Productivity. Source: Paul et al., 2010
To express this productivity relationship, Paul et al. (2010) developed a 'pharmaceutical value equation' (Figure 2). According to the authors, it includes the key elements that determine both the efficiency and effectiveness of the drug discovery and development process for any given pipeline.
Figure 2. Pharmaceutical Value Equation. Source: Paul et al., 2010
Whereby:
P = R&D productivity
WIP = work in process
p(TS) = probability of technical success
V = value
CT = cycle time
C = cost
Hence, in order to improve R&D productivity, one should try to increase WIP, p(TS), and V without substantially increasing CT or C.
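Written out with the symbols defined above, the relationship described by Paul et al. (2010) takes the following proportional form. It is given here as a transcription for convenience, with p(TS) denoting the probability of technical success:

```latex
P \;\propto\; \frac{\mathrm{WIP} \times p(\mathrm{TS}) \times V}{\mathrm{CT} \times C}
```

The numerator collects the factors to be increased (WIP, p(TS), V) and the denominator the factors to be contained (CT, C), which is the reading used in the rest of this document.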
2.2 Methods for valuation
2.2.1 Decision-Tree Perspective
Pharmaceutical R&D can be considered as a sequence of research stages, each defined by three major (project-specific) parameters: cycle time (CT), cost (C), and probability of technical success (pTS). Whether a project progresses to the next stage depends on how well the drug performs in the current stage. When a (lead) molecule does not meet the requirements (e.g., its therapeutic window is too narrow), investment stops and the project does not proceed to further stages.
Hence, pharmaceutical R&D follows a typical decision tree process (Figure 3), offering flexibility at each stage for managers to decide whether to continue or abandon the R&D project. The literature also describes this as the way in which pharmaceutical companies choose to value their projects (Shockley et al., 2002).
Figure 3. R&D Decision Tree Framework. Source: Shockley et al., 2002
2.2.2 Discounted Cash Flow Method
Still considered the gold standard in capital budgeting and valuation, the discounted cash flow (DCF) methodology was used to develop a model for valuing a pharmaceutical R&D project. Following the decision tree rationale of the previous section, the model extends a conventional DCF with a commonly used refinement: it accounts for the (cumulative) probabilities of a pharmaceutical R&D project proceeding through the subsequent stages to eventual market launch. Key to performing a DCF analysis is the determination of future free cash flows (FCFs), which can be approximated by applying the formula in Figure 4. The FCF may be estimated as a function of Earnings Before Interest and Taxes (EBIT), the tax rate, Depreciation and Amortization (D&A), Capital Expenditures (CAPEX) and net Operating Working Capital Requirement (DWCR).
Figure 4. Formula to estimate free cash flows (FCFs). Source: Shockley et al., 2002
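For reference, the conventional free-cash-flow approximation built from the quantities listed above can be written as follows. This is the standard textbook form and is assumed, not confirmed, to correspond to Figure 4. Here t denotes the tax rate, and ΔWCR the year-on-year change in the working capital requirement (the quantity abbreviated DWCR in the text):

```latex
\mathrm{FCF} = \mathrm{EBIT} \times (1 - t) + \mathrm{D\&A} - \mathrm{CAPEX} - \Delta\mathrm{WCR}
```

The term EBIT × (1 − t) is the net operating profit after tax (NOPAT).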
This approach allows the net present value (NPV) to be calculated for whichever stage of the decision tree the R&D project currently occupies, and hence for whichever phase transitions, each with its intrinsic probability, remain before product launch. This in turn makes it possible to determine at which stage(s) applying FAIR data principles has the most significant impact on valuation, and at which stages future efforts to increase that impact could focus.
As illustrated by the decision tree (Figure 3), R&D itself involves only cash outflows (i.e., costs). Cash inflows are obtained only once the drug receives market approval. As a result, valuation of pharmaceutical R&D projects focusing on challenging therapeutic indications (e.g., Alzheimer's disease; Calcoen et al., 2015) often results in a negative NPV, especially when the projects are still in early stages (e.g., Hit-to-Lead), despite the tremendous market potential in the final stages of the decision tree (Brous, 2011; Shockley et al., 2003).
3. Model Design and Input Assumptions
3.1 General Model Design
A key objective was to design a balanced model: sophisticated enough to yield reasonable estimates of Return on Investment (ROI) and Wins per Day, but not overly complex. The model was implemented in an Excel spreadsheet (see Figures 5 and 6).
Stage Type  Target-to-Hit  Hit-to-Lead  Lead Optimization  Preclinical  Phase I  Phase II  Phase III  Registration  Commercialization
Cycle Time (years)  0,83  1,21  2,00  1,1  1,9  2,6  2,8  1,4  
pTS  72%  61%  75%  70%  58%  35%  66%  87%  98%
Costs (mio)  $1  $3  $10  $6  $18  $45  $151  $40  $120
Free Cash Flow (mio)  $0  $0  $0  $0  $0  $0  $0  $0  $223
NPV @ Stage (mio)  -$14  -$20  -$34  -$40  -$55  -$80  -$134  $35  $101
Figure 5. Pharmaceutical R&D stages (base case). Source: ONTOFORCE Calculations
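The stage values in Figure 5 can be approximated with a simple backward induction over the decision tree. The sketch below is an illustration and not the ONTOFORCE spreadsheet. It assumes the recursion NPV(stage) = -cost + pTS × NPV(next stage) / (1 + r)^(cycle time), a 12% discount rate, and an NPV of USD 101 mio at commercialization taken from the figure. The function and variable names are hypothetical.

```python
# Backward induction over the R&D decision tree (illustrative sketch).
# ASSUMPTION: NPV_stage = -cost + pTS * NPV_next / (1 + r) ** cycle_time.
# This recursion is inferred for illustration; it is not taken from the source model.

STAGES = [
    # (name, cycle time in years, pTS, cost in USD mio), values read from Figure 5
    ("Target-to-Hit", 0.83, 0.72, 1.0),
    ("Hit-to-Lead", 1.21, 0.61, 3.0),
    ("Lead Optimization", 2.00, 0.75, 10.0),
    ("Preclinical", 1.1, 0.70, 6.0),
    ("Phase I", 1.9, 0.58, 18.0),
    ("Phase II", 2.6, 0.35, 45.0),
    ("Phase III", 2.8, 0.66, 151.0),
    ("Registration", 1.4, 0.87, 40.0),
]


def stage_npvs(stages, npv_at_commercialization, discount_rate):
    """Return {stage name: NPV at the start of that stage}, working backwards."""
    npvs = {}
    npv_next = npv_at_commercialization
    for name, cycle_time, pts, cost in reversed(stages):
        # Pay the stage cost now; receive the next stage's NPV only with
        # probability pTS, discounted over the stage's cycle time.
        npv_next = -cost + pts * npv_next / (1.0 + discount_rate) ** cycle_time
        npvs[name] = npv_next
    return npvs


npvs = stage_npvs(STAGES, npv_at_commercialization=101.0, discount_rate=0.12)
for name, *_ in STAGES:
    print(f"{name:18s} {npvs[name]:8.1f}")
```

Under these assumptions the results agree in magnitude with the NPV row of Figure 5 to within about USD 1.5 mio. The recursion yields negative values from Target-to-Hit through Phase III and a positive value at Registration, in line with the observation in Section 2.2.2 that early-stage NPVs are often negative.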
DCF Commercialized Drug  Launch  + 1 year  + 1 year  + 1 year  + 1 year  + 1 year  + 1 year 
Patent Year  14,0  15,0  16,0  17,0  18,0  19,0  20,0 
Commercial Life  0,1  1,1  2,1  3,1  4,1  5,1  6,1 
Annual Market Size (mio)  $800  $800  $800  $800  $800  $800  $800 
@ Launch  $84 
Market Penetration Rate  25%  30%  36%  43%  52%  60%  60% 
Revenue  $21  $240  $288  $346  $415  $480  $480 
COGS  $11  $122  $146  $176  $211  $244  $244 
SG&A  $7  $80  $96  $115  $138  $160  $160 
Depreciation  $20  $20  $20  $20  $20  $20  $20 
EBIT  -$16  $19  $26  $36  $47  $57  $57 
Taxes  -$3  $4  $5  $7  $9  $11  $11 
NOPAT  -$13  $15  $21  $29  $38  $46  $46 
WCR  $2  $24  $29  $35  $41  $48  $48 
Change WCR  $2  $22  $5  $6  $7  $7  $0 
FCF  $4  $13  $36  $43  $50  $59  $66 
Value @ Launch  $4  $11  $28  $30  $31  $33  $32 
Terminal Value @ Launch  $53 
Total Value @ Launch  $223 
Figure 6. DCF spreadsheet (base case). Source: ONTOFORCE Calculations
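The row logic of Figure 6 can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the original spreadsheet: it uses the rounded percentages from Figure 7, treats taxes on negative EBIT as a credit, takes FCF as NOPAT plus depreciation minus the change in WCR (no separate CAPEX line, as in the figure), and leaves out the terminal value. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FinancialAssumptions:
    # Rounded industry-average inputs as listed in Figure 7
    cogs_pct: float = 0.51       # COGS as a share of sales
    sga_pct: float = 0.33        # SG&A as a share of sales
    wcr_pct: float = 0.10        # working capital requirement as a share of sales
    tax_rate: float = 0.19
    discount_rate: float = 0.12


def free_cash_flows(revenues, depreciation, a):
    """Yearly FCF = NOPAT + depreciation - change in WCR (ASSUMED row logic)."""
    flows = []
    prev_wcr = 0.0
    for revenue in revenues:
        ebit = revenue * (1.0 - a.cogs_pct - a.sga_pct) - depreciation
        taxes = ebit * a.tax_rate          # negative EBIT gives a tax credit here
        nopat = ebit - taxes
        wcr = revenue * a.wcr_pct
        flows.append(nopat + depreciation - (wcr - prev_wcr))
        prev_wcr = wcr
    return flows


def present_value(cash_flows, years_after_launch, discount_rate):
    """Discount each cash flow back to the launch date and sum."""
    return sum(cf / (1.0 + discount_rate) ** t
               for cf, t in zip(cash_flows, years_after_launch))


assumptions = FinancialAssumptions()
revenues = [21, 240, 288, 346, 415, 480, 480]        # USD mio, Revenue row of Figure 6
commercial_life = [0.1, 1.1, 2.1, 3.1, 4.1, 5.1, 6.1]  # Commercial Life row of Figure 6

fcfs = free_cash_flows(revenues, depreciation=20.0, a=assumptions)
pv_explicit = present_value(fcfs, commercial_life, assumptions.discount_rate)
print([round(f, 1) for f in fcfs], round(pv_explicit, 1))
```

With these inputs the yearly FCFs land within about USD 1 mio of the FCF row in Figure 6. The discounted sum covers only the explicit years, so it is lower than the total value at launch, which also includes the terminal value.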
A technique that makes the model considerably more accessible to non-finance professionals is to use relative instead of absolute input parameters for as many assumptions as possible. A simple example illustrates the reasoning. Business development managers might be unable to come up with a good estimate for the cost of goods sold (COGS) in absolute dollar amounts. However, they will most likely be able to provide a more accurate estimate when COGS is expressed as a relative amount (e.g., percentage of sales). For illustrative purposes, Figure 7 depicts the relative input parameters used for the financial assumptions of the model, which are discussed in more detail in Section 3.3.
Financial Assumptions  
COGS (% of sales)  51% 
SG&A (% of sales)  33% 
WCR (% of sales)  10% 
Tax Rate  19% 
Product Launch Cost (% of peak sales)  25% 
Post-Patent Annual Growth Rate  -50% 
Annual Discount Rate  12% 
Figure 7. Financial input parameters (industry averages value set as standard).
Source: ONTOFORCE Calculations
The strategy for developing a user-friendly model was to integrate a high level of flexibility. This was considered crucial, as most, if not all, of the input parameters are very much project-specific, which has been confirmed both by literature (Calcoen et al., 2015) and by discussions with industry experts and ONTOFORCE customers. Hence, the input parameters are preset with industry average values (based on consolidated literature data) and can be customized with project-specific and/or in-house company data, if available, to obtain more accurate outcomes. A drop-down box lets the user select the desired model input (Figure 8).
Figure 8. Drop-down box to select standard (literature) or customized (in-house) values.
Source: ONTOFORCE Model
3.2 R&D Assumptions
As mentioned in Section 3.1, one of the objectives is to obtain accurate industry-average values for each of the parameters implemented in the model, as these are used as standard input. For the R&D assumptions, we focused on the three most important parameters that characterize the different R&D stages: CT, C, and pTS (cf. Section 2). To obtain these values, thorough literature research was conducted, focusing on top-tier journals (i.e., A1), along with insights provided by industry experts. After data collection from several sources, the consolidated average value and standard deviation were calculated for each of the three R&D parameters at each R&D stage (Figure 9). These averages also allowed us to calculate the estimated number of high-potential molecules required per stage in order to achieve one molecule at product launch (Figure 10). The result is within the same range as an earlier report (Paul et al., 2010). However, the calculations imply that a slightly higher number of molecules is required based on the most recent R&D parameter values. This finding is plausible, as several reports in the literature describe a decline in R&D productivity despite higher levels of investment (DiMasi et al., 2016; Carter et al., 2016; Scannell et al., 2012).
Overview R&D Pipeline Specs  Cycle time (years) Mean  Cycle time (years) STDEV  pTS Mean  pTS STDEV  Costs (mio) Mean  Costs (mio) STDEV 
Target-to-Hit  0,8  0,2  72%  12%  $1,0  n/a 
Hit-to-Lead  1,2  0,4  61%  14%  $2,5  n/a 
Lead Optimization  2,0  n/a  75%  15%  $10,0  n/a 
Preclinical  1,1  0,3  70%  10%  $5,5  $0,9 
Phase I  1,9  0,5  58%  10%  $18,2  $4,9 
Phase II  2,6  0,3  35%  7%  $45,4  $8,9 
Phase III  2,8  0,6  66%  11%  $150,9  $6,0 
Registration  1,4  0,1  87%  6%  $40,0  $0,0 
n/a = only 1 data point (no standard deviation available) 
Figure 9. Calculated industry averages for each R&D parameter per stage.
Source: ONTOFORCE Calculations
Figure 10. R&D pipeline productivity based on updated literature averages.
Source: ONTOFORCE calculations.
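The number of molecules required per stage follows directly from the stage success probabilities: to end up with one launched product, a stage needs one divided by the cumulative pTS from that stage through Registration. A minimal sketch, using the mean pTS values from Figure 9 and hypothetical function names, is shown below. It is not claimed to reproduce Figure 10 exactly.

```python
# Mean probability of technical success per stage, from Figure 9
PTS_BY_STAGE = {
    "Target-to-Hit": 0.72,
    "Hit-to-Lead": 0.61,
    "Lead Optimization": 0.75,
    "Preclinical": 0.70,
    "Phase I": 0.58,
    "Phase II": 0.35,
    "Phase III": 0.66,
    "Registration": 0.87,
}


def molecules_required(pts_by_stage, launches=1.0):
    """Molecules needed at the start of each stage to obtain `launches` products."""
    required = {}
    needed = launches
    # Walk backwards from Registration: each stage must start with enough
    # molecules to cover its own attrition and that of all later stages.
    for stage in reversed(list(pts_by_stage)):
        needed /= pts_by_stage[stage]
        required[stage] = needed
    return required


required = molecules_required(PTS_BY_STAGE)
for stage in PTS_BY_STAGE:
    print(f"{stage:18s} {required[stage]:6.1f}")
```

With these averages, roughly 37 molecules would have to enter Target-to-Hit for one launch. Replacing the dictionary with in-house pTS values gives the project-specific equivalent.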
3.3 Financial Assumptions
In line with the model design strategy (cf. Section 3.1), percentage-based input parameters have been used as much as possible. With regard to the financial assumptions of the model, the following parameters were considered to be highly significant in determining both ROI and Wins per Day: discount rate, COGS as percentage of sales, selling, general and administrative expense (SG&A) as percentage of sales, working capital requirement (WCR) as percentage of sales, tax rate, product launch cost as percentage of peak sales and growth rate after patent expiration (Figure 7).
Annual reports of several pharmaceutical and biotechnology companies were analyzed to ground the model's assumptions on cost and tax structure. The percentages reported by the individual companies were averaged and incorporated into the Discounted Cash Flow model. The averages found for COGS and SG&A were 50.8% and 33.2% of sales, respectively. For WCR, the literature value of 10% was used. Lastly, the effective tax rate was assumed to be 19.3%, based on an average calculation.
3.4 Market Assumptions
The value of an R&D project depends heavily on the revenues that can be obtained once the drug receives market approval. Hence, in-depth market research is required to assess whether or not to engage in highly innovative (bio)pharmaceutical endeavors. As observed in literature (Brous, 2011; Shockley et al., 2003), companies look at several indicators that will affect annual revenues. One is the expected annual market size (i.e., hypothetical revenues if a company had 100% market share). It is determined by the target population's size, treatment duration, posology, and more.
Another indicator often investigated is expected market penetration: the extent to which a company can capture the market size, and hence the level of market share it expects to achieve. This depends on the competitive environment, geographical location of the target population, and political/regulatory environment, among others. A company cannot always capture the full extent of its expected market share upon launch. Instead, it often expects to penetrate the market gradually over time before reaching peak sales. It is in the company's best interest to strive for a fast market penetration growth rate, as market share can decline rapidly once the patent of the drug has expired. This especially holds true for small molecule drugs, which often involve a straightforward chemical synthesis that generic drug companies can easily execute. Experts referred to revenues declining by over 50% annually after patent expiration. Similar scenarios are less common for biologicals.
The main reason is the more complicated manufacturing process, which means generic drug companies take significantly longer to obtain a biosimilar with quality standards equal to those of the original drug manufacturer. This is an incentive for pharma and biotech companies to focus on and/or shift towards biologicals rather than small molecule drugs, as the NPV of the former is usually higher. However, this incentive has not yet fully materialized in practice (Carter et al., 2016). Realistic proxies for these market assumptions were obtained through discussions with business development managers of several pharmaceutical companies and are summarized in Figure 11. As mentioned in previous sections, these assumptions are again highly project-specific. Hence, we made it possible for the model to be customized with in-house data if available.
Market Assumptions  
Annual Market Size (mio)  $800 
Market Penetration Rate @ Launch  25% 
Annual Market Penetration Growth  20% 
Maximum Market Penetration  60% 
Figure 11. Overview of standard market input parameters.
Source: ONTOFORCE Calculations
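The market inputs in Figure 11 translate into a yearly penetration schedule. The sketch below assumes that penetration grows by the annual growth percentage each year, compounding from the launch value, until it reaches the maximum. This growth rule is an assumption for illustration, and the function name is hypothetical.

```python
def penetration_schedule(at_launch, annual_growth, maximum, n_years):
    """Market penetration per year: compound growth from launch, capped at the maximum."""
    rates = []
    rate = at_launch
    for _ in range(n_years):
        rates.append(min(rate, maximum))
        rate *= 1.0 + annual_growth
    return rates


# Standard inputs from Figure 11
schedule = penetration_schedule(at_launch=0.25, annual_growth=0.20,
                                maximum=0.60, n_years=7)
annual_market_size = 800.0  # USD mio
full_year_revenues = [annual_market_size * r for r in schedule]

print([f"{r:.0%}" for r in schedule])
print([round(v) for v in full_year_revenues])
```

Under this rule the schedule reads 25%, 30%, 36%, 43%, 52%, 60%, 60%, which is the Market Penetration Rate row of Figure 6. The full-year revenues from the second year onwards also match that figure; the launch-year revenue there is lower because it covers only part of a year.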
4. FAIR Impact Assumptions
To quantify the impact of implementing FAIR data principles, a model developed by ONTOFORCE has been used. In its original form, the model aimed to calculate the impact and ROI of implementing a semantic search platform, and more specifically, the DISQOVER platform, on a drug development pipeline. More details on the impact assumptions calculated by ONTOFORCE can be obtained by contacting the company.
Core assumptions for FAIR impact include:
It is assumed that a researcher, for example a scientist in an R&D function, spends 15-20% of their time searching for data. This stems from primary research and interviews conducted with users of the DISQOVER platform.
Additionally, further discussions indicated that roughly three databases must be scanned in order to find reliable information.
The FAIR ROI model takes an efficiency gain in searching of 25% as a parameter. User surveys revealed that DISQOVER offers a 67% reduction in search time and thus offers cycle time reduction in each of the phases of drug discovery. (NB: These were not the only impacts, as additional cost reductions not related to time reduction were identified.)
It is assumed that employees performing the searches are salaried, which implies that reducing the time required also reduces the pay allocated to those employees for this task. The FAIR ROI model provides parameters to calculate the costs related to employees searching.
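The assumptions listed above can be combined into a rough estimate of the yearly salary cost freed up by faster searching. The sketch below is only an illustration: the formula (salary × share of time spent searching × efficiency gain × headcount) is a simplification, and the salary and headcount figures are hypothetical placeholders, not values from the source.

```python
def annual_search_cost_saving(annual_salary, search_share, efficiency_gain,
                              n_researchers=1):
    """Salary cost per year no longer spent on searching (simplified estimate)."""
    if not 0.0 <= search_share <= 1.0 or not 0.0 <= efficiency_gain <= 1.0:
        raise ValueError("search_share and efficiency_gain must be between 0 and 1")
    return annual_salary * search_share * efficiency_gain * n_researchers


# HYPOTHETICAL inputs: a fully loaded salary of 100,000 per researcher, 50 researchers
salary = 100_000.0
team_size = 50

low = annual_search_cost_saving(salary, 0.15, 0.25, team_size)   # 15% of time searching
high = annual_search_cost_saving(salary, 0.20, 0.25, team_size)  # 20% of time searching
print(f"Estimated yearly saving: {low:,.0f} to {high:,.0f}")
```

The 15-20% search share and the 25% efficiency gain are the parameters stated above. Cycle time effects and the other cost reductions mentioned in the text are not covered by this estimate.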
5. Defining outcome metrics
5.1 Defining Return on Investment (ROI)
To assess the true potential of adopting FAIR data principles, it was necessary to identify the best way to define ROI in this particular setting, i.e., what is considered the 'return' and how is the 'investment' defined?
As the model aims at calculating NPVs of R&D projects, it was decided to maintain that NPV perspective for defining ROI. As mentioned in previous sections, the absolute NPV of R&D projects is often negative, especially in early development stages (Brous, 2011; Shockley et al., 2003). Hence, instead of looking at the absolute value of these NPVs, the 'return' was defined as the increase in NPV. This can easily be calculated by subtracting the NPV of the base case (i.e., without FAIR) from the NPV of the chosen scenario. In a certain sense, this provides a 'relative' quantification and hence seems plausible to use for ROI, which is itself a widely used relative financial parameter, not an absolute one (like NPV).
The final challenge in defining the ROI was determining the investment a company must make to reach the associated gain: the FAIR implementation cost. This can include, for example, the annual license and setup fees for FAIR implementation tools (e.g., DISQOVER at Ontoforce), and it is set as a user-definable parameter in the FAIR ROI tool. The following formula was used in the original ONTOFORCE ROI tool (Figure 12).
where:
ΔNPV = NPV FAIR case - NPV base case
FAIR cost = PV of annual fee per project
PV = CF / g (perpetuity)
Figure 12. ROI formula applied for ONTOFORCE. Source: ONTOFORCE
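A minimal sketch of how these quantities could be combined is shown below. It assumes the conventional ratio form of ROI, i.e. the NPV increase net of the FAIR cost, divided by the FAIR cost. This ratio form is an assumption for illustration and is not confirmed to be the exact expression of Figure 12. The perpetuity divisor is passed in as a parameter (`rate`), corresponding to the divisor written as g above. All function names and example numbers are hypothetical.

```python
def perpetuity_pv(annual_cash_flow, rate):
    """Present value of a perpetual annual cash flow: CF / rate."""
    if rate <= 0:
        raise ValueError("rate must be positive for a perpetuity")
    return annual_cash_flow / rate


def fair_roi(npv_fair_case, npv_base_case, fair_cost):
    """ASSUMED ratio form: (increase in NPV - FAIR cost) / FAIR cost."""
    if fair_cost <= 0:
        raise ValueError("fair_cost must be positive")
    delta_npv = npv_fair_case - npv_base_case
    return (delta_npv - fair_cost) / fair_cost


# HYPOTHETICAL example, all amounts in USD mio
cost = perpetuity_pv(annual_cash_flow=0.06, rate=0.12)  # PV of the annual fee per project
roi = fair_roi(npv_fair_case=-12.0, npv_base_case=-14.0, fair_cost=cost)
print(f"FAIR cost: {cost:.2f} mio, ROI: {roi:.0%}")
```

Because the return is defined as a difference of NPVs, the calculation still works when both the base case and the FAIR case have a negative NPV, as in this example.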
5.2 Defining Wins per Day
The final part of the model calculates the value of savings per day (also known as 'wins per day') that could be made by incorporating and using FAIR data principles in a given company's pipeline. The best way of illustrating the effectiveness is by looking at the “Wins per day” metric:
Figure 13. Definition of the “Wins per day” metric (e.g., USD/day).
This metric was chosen because a company should quantify the impact on its pipeline through the impact on the project's value, not simply through costs saved in the moment. This is far more informative, as it applies the time value of money and connects the money earned or saved in a particular phase to the phases that follow in the pipeline.
6. Discussion and outlook
The content of this document is derived from a report executed by the Vlerick Management School students and supported by Ontoforce in 2019. It provides a systematic starting point to calculate the ROI and quantify the value of FAIR implementation investments.
Having agreed guiding principles, methodologies, and assumptions that can be tailored to specific cases will help as a starting point for the discussion on the value and the ROI created by FAIR projects and initiatives.
However, since there are continuous developments in the dynamic world of biopharmaceutical and life-science organizations, the following actions are recommended to further improve the FAIR ROI calculator methodology:
Refresh the research done in 2019 to confirm which assumptions and methodologies are still relevant today and what needs to be modified. This may include the use of probability-adjusted DCF to calculate ROI, the “rocket model” and phases of drug discovery compared to how life sciences work today, with updated cycle times and probabilities of technical success. Financial assumptions on COGS, SG&A, WCR, tax rates, product launch cost, growth rates, and market penetration rates can be tailored, and an update of the nominal values may help.
Update, simplify and disseminate the FAIR ROI (Excel) calculator. This will likely involve a deep dive into the 2019 ROI calculator spreadsheet by a Pistoia Alliance working group to create an updated version. Keeping in mind the targeted user group, the calculator tool should be simplified as much as possible to improve usability. Finally, the group should develop and disseminate an open-access version of the ROI calculator tool.
Connect the ROI methods to other FAIR resources, including the FAIR Maturity Matrix. Clarify further the cost of FAIR implementation, for example by splitting FAIR costs into data, technology and people (or roles) dimensions. Other dimensions from Pistoia’s FAIR Maturity Matrix may also be considered.
Showcase the ROI of FAIR implementation initiatives. This could be done, for example, by documenting and sharing a set of use cases from the FAIR community in which the ROI calculator is applied to real-world scenarios. Ultimately, it is important to test the model and its assumptions, and to check whether the envisaged ROI scenarios correspond to real-life FAIR implementation cases and how those cases delivered against the expected ROI outcomes.
7. Abbreviations and Symbols
C: cost
CAPEX: capital expenditure
COGS: cost of goods sold
CF: cash flow
CT: cycle time
D&A: depreciation and amortization
DWCR: net Operating Working Capital Requirement
DCF: discounted cash flow
EBIT: earnings before interest and tax
FCF: free cash flow
g: (terminal) growth rate
NOPAT: net operating profit after tax
NPV: net present value
pTS: probability of technical success
PV: present value
R&D: research and development
ROI: return on investment
SG&A: selling, general and administrative expense
STDEV: standard deviation
V: value
WC: working capital
WCR: working capital requirement
WIP: work in process
8. References
Brous, P. A. (2011). Valuing an Early‐Stage Biotechnology Investment as a Rainbow Option. Journal of Applied Corporate Finance, 23(2), 94-103.
Calcoen, D., Elias, L., & Yu, X. (2015). What does it take to produce a breakthrough drug? Nature Reviews Drug Discovery, 14(3), 161-162.
Carter, P. H., Berndt, E. R., DiMasi, J. A., & Trusheim, M. (2016). Investigating investment in biopharmaceutical R&D.
DiMasi, J. A., Grabowski, H. G., & Hansen, R. W. (2016). Innovation in the pharmaceutical industry: new estimates of R&D costs. Journal of Health Economics, 47, 20-33.
FDA (2018). The Drug Development Process. Retrieved from: https://www.fda.gov/ForPatients/Approvals/Drugs/default.htm.
Hall, W., Shadbolt, N., & Berners-Lee, T. (2006). The Semantic Web Revisited. IEEE Intelligent Systems, IEEE Computer Society, 96-101.
Paul, S. M., Mytelka, D. S., Dunwiddie, C. T., Persinger, C. C., Munos, B. H., Lindborg, S. R., & Schacht, A. L. (2010). How to improve R&D productivity: the pharmaceutical industry's grand challenge. Nature reviews Drug discovery, 9(3), 203.
Scannell, J. W., Blanckley, A., Boldon, H., & Warrington, B. (2012). Diagnosing the decline in pharmaceutical R&D efficiency. Nature reviews Drug discovery, 11(3), 191.
Shockley Jr, R. L., Curtis, S., Jafari, J., & Tibbs, K. (2002). The option value of an early‐stage biotechnology investment. Journal of Applied Corporate Finance, 15(2), 44-55.