Issue Log

This list will change quickly as issues are resolved and their outcomes are moved into the Decisions Log:

  1. Must the pilot project answer ALL 14 high-priority questions to be called a success? If not, which questions should we focus on, or what percentage of questions is enough for success?
  2. Is the list of terms with which we want to supplement the existing terms/ontologies final? Can we add terms iteratively as we create annotations, in case we miss some in the early analysis?
  3. Need to define the process for quality control (QC) in both the full project and the pilot. In the pilot, would a systematic random sample of 1% of assays be sufficient for QC?
  4. What conditions must assay annotations meet to be considered finished and ready for release?
  5. Need the definitive list of assays to be used in the pilot. What about an iterative approach to naming assays?
  6. Need to agree on the process for naming assays for annotation in the full-scale project.
  7. The technical capabilities, prices, and terms and conditions of annotation vendors are not fully known.
  8. The technical conditions for depositing enhanced assay annotations into ChEMBL are not known.
  9. Other external groups (e.g. Medicines Discovery Catapult, Druggable Genome Project, Medicines for Malaria) are not yet engaged, and the scope of any such engagement is not known.
  10. About 90% of papers with assay descriptions are not open access, so we would need to negotiate with publishers.
  11. In papers that are formally associated with assays in public databases such as ChEMBL and PubChem, the assay annotations may in reality be missing, extremely shallow, or mere references to other papers or commercial assay kits.
  12. Annotation labor is much more expensive than initially expected. Solution: open an RFP for labor services.
  13. Despite the 1.5 million assay records in PubChem, we are failing to nominate interesting assays for annotation quickly, which is delaying the project.
  14. The process for uploading assays to PubChem is very slow. Solution: ask PubChem about options for formatting the assay metadata, to avoid having to re-format from the already available formats.
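
To make issue 3 more concrete, here is a minimal sketch of what a systematic random sample of 1% of assays could look like: choose a random starting position within the first 100 assays, then take every 100th assay after it. The function name, the assay identifiers, and the pilot size of 2,500 assays are illustrative assumptions only; none of them come from the project plan, and the actual QC procedure is still to be decided.

```python
import random


def systematic_sample(items, fraction, seed=None):
    """Systematic random sample: a random start, then every k-th item.

    Hypothetical helper for illustration; not an agreed QC procedure.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not items:
        return []
    step = round(1 / fraction)  # sampling interval k, e.g. 100 for 1%
    rng = random.Random(seed)   # a fixed seed makes the draw reproducible
    # Random start within the first interval (or within the list if it is shorter).
    start = rng.randrange(min(step, len(items)))
    return items[start::step]


# Assumed example: 2,500 placeholder assay IDs (not real ChEMBL or PubChem identifiers).
assay_ids = [f"ASSAY-{i:05d}" for i in range(1, 2501)]
qc_sample = systematic_sample(assay_ids, fraction=0.01, seed=42)
print(len(qc_sample), "assays selected for QC")
```

Recording the seed alongside the QC results would allow the same sample to be re-drawn later. Whether 1% is sufficient remains the open question in issue 3; the sketch only shows how such a sample would be drawn.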