Decision Log

  1. Definition of success for the overall DataFAIRy project: to be able to answer the business questions collected during the team meetings on February 11th and 14th, 2020. These business questions are use cases that the completed DataFAIRy tool will address. On February 28th, 2020, we agreed to limit the pilot to only those questions where both Priority = 1 and Feasibility = 1. There are 14 such questions.
  2. Limit curation of new (not yet curated) papers in the pilot project to save effort
  3. Limit curation of non-open-access papers in the pilot project to save costs
  4. Curation of external data items in the pilot project will be limited to URL links to the appropriate data sources. Data will not be copied, only linked to.
  5. Build the data model on top of an already existing data model, such as one used in ChEMBL, PubChem, or CDD BioAssay Express. Determine which one fits our business questions better. Supplement the data model as needed.
  6. The embargo process shall not apply to pilot assays, to help engage others
  7. Issue an RFI/RFP to vendors to gather market information (technologies, rates, etc.)
  8. Hire qualified curators. Curators should be compensated at competitive rates: this aligns their interests with the project and provides us with pricing information
  9. Have a separate QC step in the curation work cycle
  10. Use a public resource such as ChEMBL or PubChem for long-term storage and presentation of the enhanced assay annotation. Do not rely on vendor-supplied infrastructure for this
  11. Do not track or annotate mutations in the protein targets in the pilot project
  12. As one source of assays and papers for annotation in the pilot project, use the queries for EGFRs developed by Thomas
  13. Run a very small-scale manual annotation experiment to assess the fit of the proposed BAO classes and to generate ideas for the QC process. Use Excel for this experiment; do not use the BioAssay Express tool. (This is now complete.)
  14. Group related assays (for instance, assays that belong to the same panel) together, and introduce a category in the data model for such grouping
  15. Record lessons learned from the pilot project as a white paper.
  16. Invite vendors with already mature technologies to the RFP. Do not fund technology development. Do not allow vendors to iterate and improve their technologies (at our expense).
  17. The pilot project should have a fixed scope and a fixed duration. Extra scope is permissible if time allows.
  18. Require deposition of results into PubChem or ChEMBL as the key deliverable in the pilot project.
  19. In the next project phase after the POC, focus on vendor assay panels (similar to Group 2 in the POC). The reason is that academic papers often quote commercial assays.
  20. Version assay annotations.
  21. In the POC, deposit to PubChem only those assay annotations that have passed QC by the team.
  22. Deposit the resulting standard for the assay annotation (“minimal information standard”) into fairsharing.org
  23. Publish a paper and make a public announcement about the POC only after assays are deposited to PubChem.
  24. Do not publish annotators’ emails and names
  25. Deposit into PubChem all 100 of the QCed assay descriptions
  26. July 18th, 2022: open an RFP for the annotation labor services
  27. July 18th, 2022: do not modify the assay template now; instead, sort the assay protocols into those that fit the template, and those that do not (optionally, note the fields that do not fit)
  28. February 17th, 2023: pick additional assays for the next batch before renewing the BioHarmony Annotator software license
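
Decisions 14, 20, and 21 together imply a small amount of structure in the data model: a grouping category for related assays, a version on each annotation, and a QC gate before deposition. The following is only an illustrative sketch in Python, not the project's actual data model. All names (`AssayAnnotation`, `panel_id`, `qc_passed`, `revise`, `group_by_panel`, `depositable`) are hypothetical. It also assumes, beyond what the log states, that a revised annotation must pass QC again and that only the latest version of an assay is a candidate for deposition.

```python
from collections import defaultdict
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class AssayAnnotation:
    """Hypothetical record for one version of one assay annotation."""
    assay_id: str
    fields: dict                      # annotation content, e.g. {"target": "EGFR"}
    version: int = 1                  # decision 20: annotations are versioned
    panel_id: Optional[str] = None    # decision 14: grouping of related assays
    qc_passed: bool = False           # decision 21: QC gate before deposition


def revise(annotation, **changes):
    """Return a new version with updated fields.

    Assumption: any revision resets the QC flag, so the new version
    has to be reviewed again before it can be deposited.
    """
    return replace(
        annotation,
        fields={**annotation.fields, **changes},
        version=annotation.version + 1,
        qc_passed=False,
    )


def group_by_panel(annotations):
    """Map each panel id to the sorted, de-duplicated assay ids in it."""
    panels = defaultdict(set)
    for a in annotations:
        if a.panel_id is not None:
            panels[a.panel_id].add(a.assay_id)
    return {panel: sorted(ids) for panel, ids in panels.items()}


def depositable(annotations):
    """Keep the latest version of each assay, and only if it passed QC."""
    latest = {}
    for a in annotations:
        current = latest.get(a.assay_id)
        if current is None or a.version > current.version:
            latest[a.assay_id] = a
    return [a for a in latest.values() if a.qc_passed]
```

Making each version an immutable record keeps earlier versions intact when an annotation is revised, which is one simple way to satisfy the versioning decision. A different policy, such as depositing the most recent QC-passed version rather than the most recent version overall, would only require changing `depositable`.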