Standards that enable a marketplace for AI/ML data sets

Interested? Who to contact:

@Vladimir Makarov

The innovation team: projectenquiries@pistoiaalliance.org

The problem:

There is a desire to share, distribute, or exchange the data sets used to create and validate AI and ML models. Doing so may accelerate the pace of scientific discovery.

The proposal:

Develop a set of standards for referencing AI and ML data sets useful in the biomedical sciences, and in drug discovery in particular. In addition to data standards, it may be necessary to create standards for reporting biomedically relevant AI/ML models, along with a mechanism that enables remote training and validation of models by taking the data to the model or the model to the data, whichever is technically and legally advisable.
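
To make the idea more concrete, the sketch below shows one way a data set reference record and the "data to model" versus "model to data" decision might look. It is purely illustrative: the class names, field names, and the single `may_leave_host` rule are assumptions made for this example and are not part of any agreed standard.

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionMode(Enum):
    """Where training or validation takes place (hypothetical naming)."""
    DATA_TO_MODEL = "data_to_model"  # the data set is sent to where the model runs
    MODEL_TO_DATA = "model_to_data"  # the model is sent to where the data resides


@dataclass(frozen=True)
class DatasetReference:
    """Hypothetical reference record; every field is an illustrative assumption."""
    identifier: str       # persistent identifier for the data set
    title: str            # human-readable name
    domain: str           # scientific area, e.g. "drug discovery"
    license: str          # terms under which the data may be used
    may_leave_host: bool  # whether legal/technical terms allow copying the data out


def choose_execution_mode(ref: DatasetReference) -> ExecutionMode:
    """Send the model to the data whenever the data may not leave its host."""
    if ref.may_leave_host:
        return ExecutionMode.DATA_TO_MODEL
    return ExecutionMode.MODEL_TO_DATA


# Usage with made-up example values.
restricted = DatasetReference(
    identifier="example:0001",
    title="Example assay results",
    domain="drug discovery",
    license="restricted",
    may_leave_host=False,
)
print(choose_execution_mode(restricted).value)
```

A real standard would need a much richer set of fields and a more nuanced decision than one boolean; the point here is only that a shared, machine-readable reference lets a marketplace decide automatically how a model and a data set can be brought together.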

This proposal dovetails with the already recorded Minimum Information About an AI/ML Model and the Registry of Medically-Relevant Artificial Intelligence Models proposals.

Competitors/Substitutes:

Other data marketplaces, such as Kaggle, exist and may be well ahead. However, they do not provide model sharing or validation environments.

This project proposal has a sister proposal: https://pistoiaalliance.atlassian.net/wiki/spaces/IC/pages/1876197404