or: Toward a Machine Learning Deployment Environment.
Nowadays, the big names in machine learning have their own data science analysis environments and in-production machine learning execution environments. The others have a mishmash of custom-made parts, or are lucky enough that an existing commercially available machine learning environment fits their needs. There are several data science environments commercially available; Gartner lists the best-known players (although new ones pop up every week) in its Magic Quadrant for Data Science and Machine-Learning Platforms. However, most (if not all) of those platforms suffer from a limitation which might prevent some industries from adopting them: they start from the premise that everything will be executed on a single cloud (whether public or private). Let’s see why this might not hold for every use case.
Some machine learning models might need to be executed remotely. Think, for example, of the autonomous vehicle industry: latency and security prevent execution in a cloud (unless that cloud is onboard the vehicle). Some industrial use cases might require models to be executed in an edge-computing or fog-computing fashion to satisfy latency requirements. Data sensitivity in some industries may require some algorithms to be executed on customer equipment. There are many more reasons why you may want to execute your model somewhere other than the cloud where you did the data science analysis.
As said before, most commercially available offerings do not cater to that requirement. And it is not a trivial thing that one can slap on top of an existing solution as a simple feature. In some cases, allowing such a distributed and heterogeneous analysis and deployment environment has profound implications. Let’s look at some of the considerations.
First, one must recognize that there is a distinction between the machine learning model and the complete use case to be covered, or, as some like to call it, the AI. A machine learning model is simply given a set of data and gives back an “answer”. It could be a classification task, a regression or prediction task, etc., but this is where a machine learning model stops. To get value from that model, one must wrap it in a complete use case; some call that an AI. How do you reliably acquire the data it requires? How do you present or act on the answer given by the model? Those questions, and many more, need to be answered by a machine learning deployment environment.
With that in mind, one of the first things required to deploy a full use case is access to data. In most industries, the sources of data are limited (databases, web queries, CSV files, log files, …) and the way to handle them is repetitive, i.e. once I have figured out a way to do database queries, most of my code will look the same the next time, except for the query itself. As such, data access should be facilitated by a machine learning deployment environment, which should provide “data connectors” that can be configured as needed and deployed where the data is available.
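To make the “data connector” idea a bit more concrete, here is a minimal sketch in Python. Everything in it (the DataConnector interface, the CsvConnector and SqliteConnector classes, the build_connector helper) is a hypothetical name chosen for illustration, not part of any existing platform. The point is only that the reading logic is written once and each use case supplies its own configuration, such as the query.

```python
import csv
import sqlite3
from dataclasses import dataclass
from typing import Any, Dict, Iterator, Protocol, TextIO


class DataConnector(Protocol):
    """Hypothetical interface: anything that can yield records as dicts."""

    def read(self) -> Iterator[Dict[str, Any]]: ...


@dataclass
class CsvConnector:
    source: TextIO  # an already opened text stream (file, socket wrapper, ...)

    def read(self) -> Iterator[Dict[str, Any]]:
        # DictReader uses the first line of the stream as column names.
        yield from csv.DictReader(self.source)


@dataclass
class SqliteConnector:
    connection: sqlite3.Connection
    query: str  # the only part that changes from one use case to the next

    def read(self) -> Iterator[Dict[str, Any]]:
        cursor = self.connection.execute(self.query)
        columns = [column[0] for column in cursor.description]
        for row in cursor:
            yield dict(zip(columns, row))


# Registry of available connector types (illustrative, not exhaustive).
CONNECTORS = {"csv": CsvConnector, "sqlite": SqliteConnector}


def build_connector(kind: str, **settings: Any) -> DataConnector:
    """Create a connector from a configuration entry."""
    return CONNECTORS[kind](**settings)
```

A use case would then only carry a configuration such as build_connector("sqlite", connection=conn, query="SELECT sensor, value FROM readings"), and the same connector code could be deployed wherever the data lives.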
Once you have access to data, you will need “rules” as to when the machine learning model needs to be executed: is it once a day, on request, …? Again, there are many possibilities (although when you start thinking about it, a lot of them are the same), but expressing those “rules” should be facilitated by the deployment environment so that you don’t have to write a new “data dispatcher” for every use case, but simply configure a generic one.
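As an illustration of what “configure a generic one” could mean, here is a small sketch of a dispatcher driven by a rule object. The names DispatchRule and DataDispatcher, and the two kinds of rules shown (time-based and volume-based, plus an explicit on-request trigger), are assumptions I am making for the example; a real environment would likely support more. The current time is passed in explicitly to keep the sketch easy to test.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class DispatchRule:
    """Hypothetical rule configuration; leave a field as None to disable it."""

    every_seconds: Optional[float] = None  # time-based trigger
    batch_size: Optional[int] = None       # volume-based trigger


@dataclass
class DataDispatcher:
    rule: DispatchRule
    run_model: Callable[[List[dict]], None]  # called with the buffered records
    _buffer: List[dict] = field(default_factory=list)
    _last_run: float = 0.0

    def submit(self, record: dict, now: float) -> None:
        """Buffer a record and dispatch if the configured rule says so."""
        self._buffer.append(record)
        due_by_time = (
            self.rule.every_seconds is not None
            and now - self._last_run >= self.rule.every_seconds
        )
        due_by_size = (
            self.rule.batch_size is not None
            and len(self._buffer) >= self.rule.batch_size
        )
        if due_by_time or due_by_size:
            self._dispatch(now)

    def request(self, now: float) -> None:
        """On-request execution: dispatch whatever is buffered right now."""
        self._dispatch(now)

    def _dispatch(self, now: float) -> None:
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self.run_model(batch)
        self._last_run = now
```

Switching a use case from “every two records” to “every ten seconds” then becomes a change from DispatchRule(batch_size=2) to DispatchRule(every_seconds=10), with no new dispatcher code.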
Now we have data and we are ready to call a model, right? Not so fast. Although some think of data preparation as part of the model, I would like to consider it as an intermediary step. Why, you ask? Simply because data preparation is a deterministic step where no learning should be involved, and because in many cases you will significantly reduce the size of the data in that step, and that reduced data is what you might want to store to monitor the model behavior. But I’ll come to this later. For now, just consider that there might be a need for “data reduction”, and that this one cannot be generic. You can think of it as a pre-model which formats the data so that your model can use it directly. The deployment environment should facilitate the packaging of such a component and provide a way to easily deploy it (again, anywhere it needs to be).
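Here is one possible shape for such a “data reduction” step, as a sketch only. I am assuming, purely for illustration, a use case where a window of raw sensor readings is summarized into a handful of features; the function name reduce_window and the chosen features are hypothetical. What matters is that the step is deterministic (same input, same output, nothing learned) and that its output is much smaller than its input.

```python
from statistics import mean
from typing import Dict, List


def reduce_window(readings: List[float]) -> Dict[str, float]:
    """Summarize a window of raw readings into a few model-ready features.

    Deterministic by construction: no randomness and no fitted parameters.
    """
    if not readings:
        raise ValueError("cannot reduce an empty window")
    return {
        "count": float(len(readings)),
        "mean": mean(readings),
        "min": min(readings),
        "max": max(readings),
    }
```

A window of thousands of readings ends up as four numbers, which is also a far more reasonable thing to store when you later want to monitor the model.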
We are now ready for the machine learning execution! You already produced a model from your data science activities, and this model needs to be called. As with “data reduction”, the deployment environment should facilitate the packaging and the deployment of the “model execution” component.
Those of you who have been through the loops of creating models certainly have a question: but how did you train that model? So yes, we might need a “model training” component, which also depends on the model itself. A deployment environment should also facilitate the use and deployment of a training component. However, this raises another important question: where does the data used for training come from? And what if the model drifts, is no longer accurate and needs retraining? You will need data… So another required component is a “data sampling” component. I say data sampling because you may not need all the data; maybe a sample of it is sufficient. This can be provided by the model execution environment and configured per use case. Remember the discussion about data reduction earlier? Well, it might be wise to store only samples coming from the reduced data… You may also want to store the associated prediction made by the model.
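One way a “data sampling” component could keep a bounded sample from an unbounded stream is reservoir sampling (the classic “Algorithm R”). This is my choice for the sketch, not something the deployment environment prescribes, and the ReservoirSampler class and its capacity setting are hypothetical names. Each stored item pairs the reduced data with the prediction the model made for it.

```python
import random
from typing import Any, List, Optional, Tuple


class ReservoirSampler:
    """Keep a uniform random sample of at most `capacity` items from a stream."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        self.capacity = capacity  # configured per use case
        self.samples: List[Tuple[Any, Any]] = []
        self._seen = 0
        self._rng = random.Random(seed)

    def offer(self, reduced_data: Any, prediction: Any) -> None:
        """Present one (reduced data, prediction) pair to the sampler."""
        self._seen += 1
        item = (reduced_data, prediction)
        if len(self.samples) < self.capacity:
            # Reservoir not full yet: keep everything.
            self.samples.append(item)
            return
        # Replace a random slot with probability capacity / items seen so far.
        slot = self._rng.randrange(self._seen)
        if slot < self.capacity:
            self.samples[slot] = item
```

With this approach the storage cost is fixed by the configuration, whatever the volume of data flowing through the use case.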
At any rate, you will need a “sample database”, which will need to be configured with proper retention policies on a per-use-case basis (unless you want to keep that data for eternity).
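A per-use-case retention policy could look like the following sketch. I am using SQLite from the Python standard library only because it keeps the example self-contained; the table layout, the function names and the idea of expressing retention as a maximum age in seconds are all assumptions made for illustration.

```python
import sqlite3
from typing import Dict


def open_sample_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (and create if needed) a hypothetical sample database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples "
        "(use_case TEXT, stored_at REAL, payload TEXT)"
    )
    return conn


def store_sample(conn: sqlite3.Connection, use_case: str,
                 payload: str, now: float) -> None:
    conn.execute("INSERT INTO samples VALUES (?, ?, ?)", (use_case, now, payload))


def apply_retention(conn: sqlite3.Connection,
                    max_age_seconds: Dict[str, float], now: float) -> int:
    """Delete samples older than the maximum age configured for their use case.

    Use cases without an entry in `max_age_seconds` are kept untouched.
    Returns the number of deleted rows.
    """
    deleted = 0
    for use_case, max_age in max_age_seconds.items():
        cursor = conn.execute(
            "DELETE FROM samples WHERE use_case = ? AND stored_at < ?",
            (use_case, now - max_age),
        )
        deleted += cursor.rowcount
    conn.commit()
    return deleted
```

Running apply_retention periodically with a configuration such as {"churn": 50.0} would then keep the “churn” samples for fifty seconds and leave every other use case alone.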
As we said, models can drift, so data ops teams will have to monitor that model/use case. To facilitate that, a “model monitoring” component should be available which takes its cues from the execution environment itself, but also from the sample database, which means that you will need a way to configure which values are to be watched.
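As a rough sketch of what “configuring the values to be watched” might look like, the snippet below compares the recent mean of each watched value against a baseline. A shift in the mean is a deliberately crude drift signal that I picked to keep the example short; real monitoring would likely use richer statistics. The WatchedValue and check_drift names are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class WatchedValue:
    """Hypothetical monitoring configuration for one value."""

    name: str             # key to look up in each stored sample
    baseline_mean: float  # what was observed when the model was trained
    tolerance: float      # maximum accepted absolute deviation of the mean


def check_drift(watch_list: List[WatchedValue],
                recent_samples: List[Dict[str, float]]) -> List[str]:
    """Return the names of watched values whose recent mean left the tolerance."""
    drifted = []
    for watched in watch_list:
        values = [s[watched.name] for s in recent_samples if watched.name in s]
        if not values:
            continue  # nothing recorded for this value, nothing to judge
        if abs(mean(values) - watched.baseline_mean) > watched.tolerance:
            drifted.append(watched.name)
    return drifted
```

The samples fed to check_drift would come from the sample database, so the same stored data serves both retraining and monitoring.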
Those cover the most basic components required, but more may be needed. If you are to deploy this environment in a distributed and heterogeneous fashion, you will need some “information transfer” mechanism or component to exchange information between different domains in a secure and easy fashion.
You will also need a model orchestrator which will take care of scaling all those parts in or out as needed. And what about model life-cycle management, canary deployment or A/B testing… you see, there is even more to consider there.
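To give a feel for the canary deployment part, here is a minimal routing sketch. It assumes, for illustration only, that each request carries an identifier and that a fixed fraction of requests should go to the new model version; hashing the identifier keeps the decision stable for a given request. The choose_model function and the "stable" and "canary" labels are hypothetical.

```python
import hashlib


def choose_model(request_id: str, canary_fraction: float) -> str:
    """Route a request to the "canary" or the "stable" model version.

    The same request_id always gets the same answer, and roughly
    `canary_fraction` of all identifiers end up on the canary.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")    # integer in [0, 2**64)
    threshold = int(canary_fraction * 2**64)      # share of buckets for the canary
    return "canary" if bucket < threshold else "stable"
```

Raising canary_fraction step by step from 0.0 to 1.0 gives a progressive rollout; keeping it at 0.5 and comparing the outcomes of both versions is one simple way to run an A/B test.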
One thing to notice is that even at this stage, you only have the model “answer” … you still need to use it in a way which is useful for your use case. Maybe it is a dashboard, maybe it is used to actuate some process… the story simply does not end here.
For my friends at Ericsson, you can find way more information in the memorandum and architecture document I wrote on the subject: “Toward a Machine Learning Deployment Environment”. For the rest of you folks, if you are in the process of establishing such an environment, I hope those few thoughts can help you out.
Cover photo by Frans Van Heerden at Pexels.