DS7: Where do I store predictions?

Where to store inference results, or predictions, comes up very often in our discussions with data scientists who want to make a model widely accessible across the organization, or who want to monitor the performance of the model at scale.

  • Predictions are data, and any solution in place to store predictions should follow the key FAIR principles (Findable, Accessible, Interoperable, Reusable)

  • There are two ‘extreme’ approaches that can be taken to store predictions for ML models deployed in an organization:

    • A centralized store where predictions generated by ML models are collected and made available to other products or applications

    • Distributed storage of the predictions across the different expert systems that add value to key business activities (similar to a data mesh approach)

  • It is important to think about the metadata associated with each prediction (date, ID and version of the model used to generate it, etc.). This can be particularly critical in cases where full traceability is required for regulatory purposes

  • As with any modern data or software solution, identity and access management is an important consideration when storing predictions

  • The performance of these systems is also critical to ensure that data can be moved at scale
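
A minimal sketch of the centralized-store idea, including per-prediction metadata for traceability. This assumes a simple SQLite-backed table; the table name, column names, and model identifiers are illustrative assumptions, not a prescribed schema.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical centralized prediction store (in-memory for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE predictions (
        prediction_id INTEGER PRIMARY KEY AUTOINCREMENT,
        entity_id     TEXT NOT NULL,  -- what the prediction is about
        model_id      TEXT NOT NULL,  -- which model produced it
        model_version TEXT NOT NULL,  -- exact version, for traceability
        predicted_at  TEXT NOT NULL,  -- ISO-8601 UTC timestamp
        prediction    REAL NOT NULL
    )
""")

def store_prediction(entity_id, model_id, model_version, value):
    """Persist one prediction together with its metadata."""
    conn.execute(
        "INSERT INTO predictions "
        "(entity_id, model_id, model_version, predicted_at, prediction) "
        "VALUES (?, ?, ?, ?, ?)",
        (entity_id, model_id, model_version,
         datetime.now(timezone.utc).isoformat(), value),
    )
    conn.commit()

store_prediction("customer-42", "churn-model", "1.3.0", 0.87)

# Traceability query: which model and version produced this prediction?
row = conn.execute(
    "SELECT model_id, model_version FROM predictions WHERE entity_id = ?",
    ("customer-42",),
).fetchone()
print(row)  # → ('churn-model', '1.3.0')
```

In a production setting the same record shape would live in a shared database or feature store rather than SQLite, with identity and access management enforced at that layer.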