DS7: Where do I store predictions?
Where to store inference outputs, or predictions, comes up very often in our discussions with data scientists who want to make a model widely accessible across the organization, or who want to monitor the performance of a model at scale.
Predictions are data, and any solution put in place to store them should follow the key FAIR principles (Findable, Accessible, Interoperable, Reusable).
There are two ‘extreme’ approaches that can be taken to store predictions for ML models deployed in an organization:
- A centralized store where predictions generated by ML models are kept and made available to other products or applications
- Distributed storage of predictions across the different expert systems that add value to key business activities (similar to a data mesh approach)
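The centralized option can be sketched as a single append-only table that every application writes to and reads from. The names below (`Prediction`, `PredictionStore`, `save`, `query`) are illustrative assumptions, not a real product API; in practice the store would be a shared database or warehouse table rather than an in-memory list.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Prediction:
    entity_id: str        # the business entity the prediction is about
    value: float          # the predicted value or score
    model_id: str         # which model produced it
    model_version: str    # which version of that model
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

class PredictionStore:
    """In-memory stand-in for a shared, centralized prediction table."""

    def __init__(self) -> None:
        self._rows: list[Prediction] = []

    def save(self, pred: Prediction) -> None:
        # Append-only: predictions are immutable facts once written.
        self._rows.append(pred)

    def query(self, model_id: str) -> list[Prediction]:
        # Any consuming application can retrieve predictions by model.
        return [p for p in self._rows if p.model_id == model_id]
```

In the distributed (data-mesh-like) variant, each expert system would own a store with this same shape, publishing its predictions as a data product instead of writing to one shared table.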
It is important to think about the metadata associated with each prediction (date, id and version of the model used to generate it, etc.). This can be particularly critical in cases where full traceability is required for regulatory purposes.
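One way to make each stored prediction traceable is to wrap it in an audit record carrying that metadata. The field names below are assumptions for illustration; the hashing of canonicalized inputs is one possible technique for linking a prediction back to the exact feature values that produced it.

```python
import hashlib
import json
from datetime import datetime, timezone

def audit_record(features: dict, prediction: float,
                 model_id: str, model_version: str) -> dict:
    """Build a hypothetical traceable record for one prediction."""
    # Canonical JSON (sorted keys) so the same inputs always hash the same.
    canonical = json.dumps(features, sort_keys=True)
    return {
        "prediction": prediction,
        "model_id": model_id,
        "model_version": model_version,
        "created_at": datetime.now(timezone.utc).isoformat(),
        # Hash of the inputs lets an auditor verify which feature values
        # produced this prediction without re-storing every feature.
        "input_hash": hashlib.sha256(canonical.encode()).hexdigest(),
    }
```

Storing the input hash alongside model id and version means a regulator's question "which model, on which data, produced this value?" can be answered from the store alone.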
As with any modern data or software solution, identity and access management (IAM) is an important consideration when storing predictions.
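In practice, access control would be delegated to the organization's IAM system rather than hand-rolled; the sketch below only illustrates the idea of gating prediction reads on a caller's roles. The role names are invented examples.

```python
# Assumed example roles permitted to read from the prediction store.
ALLOWED_READER_ROLES = {"analyst", "ml-engineer"}

def can_read_predictions(user_roles: set[str]) -> bool:
    """Return True if any of the caller's roles grants read access."""
    return bool(ALLOWED_READER_ROLES & user_roles)
```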
The performance of the systems involved is also critical, to make sure that prediction data can be moved at scale.
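A common way to move predictions at scale is to write them in batches rather than row by row, trading one round-trip per record for one per batch. The `bulk_write` callable below is a hypothetical stand-in for whatever bulk-insert mechanism the target store provides (e.g. a database driver's `executemany`).

```python
from itertools import islice

def write_in_batches(predictions, bulk_write, batch_size=1000):
    """Write an iterable of predictions via bulk_write, batch_size at a time.

    Returns the total number of records written.
    """
    it = iter(predictions)
    written = 0
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return written
        bulk_write(batch)  # one round-trip per batch instead of per row
        written += len(batch)
```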