Where the F**k do I execute my model?

or: Toward a Machine Learning Deployment Environment.

Nowadays, the big names in machine learning have their own data science analysis environments and in-production machine learning execution environments. The others either have a mishmash of custom-made parts, or are lucky enough that an existing commercially available machine learning environment fits their needs. There are several data science environments commercially available; Gartner mentions the best-known players (although new ones pop up every week) in its Magic Quadrant for Data Science and Machine-Learning Platforms. However, most (if not all) of those platforms suffer from a limitation which might prevent some industries from adopting them: they start from the premise that they will execute everything on a single cloud (whether public or private). Let's see why this might not hold for every use case.

Some machine learning models might need to be executed remotely. Think, for example, of the autonomous vehicle industry: latency and security prevent execution in a cloud (unless that cloud is onboard the vehicle). Some industrial use cases might require models to be executed in an edge-computing or fog-computing fashion to satisfy latency requirements. Data sensitivity in some industries may require executing some algorithms on customer equipment. There are many more reasons why you may want to execute your model somewhere other than the cloud where you did the data science analysis.

As said before, most commercially available offerings do not cater to that requirement. And it is not something one can simply slap on top of an existing solution as a feature: allowing such a distributed and heterogeneous analysis and deployment environment has, in some cases, profound implications. Let's look at some of the considerations.

First, one must recognize the distinction between the machine learning model and the complete use case to be covered, or as some would like to call it, the AI. A machine learning model is simply given a set of data and gives back an "answer". It could be a classification task, a regression or prediction task, etc., but this is where a machine learning model stops. To get value from that model, one must wrap it in a complete use case; some would call that an AI. How do you reliably acquire the data it requires? How do you present or act on the answer given by the model? Those questions, and many more, need to be answered by a machine learning deployment environment.

Recognizing this, one of the first things required to deploy a full use case is access to data. In most industries, the sources of data are limited (databases, web queries, CSV files, log files, …) and the way to handle them is repetitive, i.e. once I have figured out a way to do database queries, the next time most of my code will look the same, except for the query itself. As such, data access should be facilitated by a machine learning deployment environment, which should provide "data connectors" that can be configured for the needs at hand and deployed where the data is available.
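To make the idea concrete, here is a minimal sketch of what such a configurable data connector could look like. The class names and config keys are purely illustrative, not taken from any existing platform; the point is that only the configuration changes between use cases, not the code.

```python
import csv
from abc import ABC, abstractmethod

class DataConnector(ABC):
    """Generic connector: configured once, reused across use cases."""
    def __init__(self, config: dict):
        self.config = config

    @abstractmethod
    def fetch(self, query: str = "") -> list[dict]:
        """Return rows of data from the configured source."""

class CsvConnector(DataConnector):
    """Reads rows from a CSV file; 'query' is unused for this source."""
    def fetch(self, query: str = "") -> list[dict]:
        with open(self.config["path"], newline="") as f:
            return list(csv.DictReader(f))
```

A database connector would subclass the same interface, so the surrounding deployment machinery never cares which source it is talking to.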

Once you have access to data, you will need "rules" as to when the machine learning model needs to be executed: is it once a day, on request, …? Again, there are many possibilities (although when you start thinking about it, a lot of them are the same), but expressing those "rules" should be facilitated by the deployment environment so that you don't have to rewrite a new "data dispatcher" for every use case, but simply configure a generic one.
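A generic dispatcher could be sketched along these lines; the rule format is an assumption made up for the example, but it shows how "once a day" and "on request" collapse into configuration of one component rather than new code.

```python
import sched
import time

def make_dispatcher(rule: dict, fetch, run_model):
    """Generic 'data dispatcher': configured, not rewritten, per use case.

    Illustrative rule formats:
        {"trigger": "on_request"}
        {"trigger": "interval", "seconds": 86400}   # once a day
    """
    if rule["trigger"] == "on_request":
        # Caller invokes the returned function whenever a request arrives.
        return lambda: run_model(fetch())
    if rule["trigger"] == "interval":
        def loop():
            s = sched.scheduler(time.time, time.sleep)
            def tick():
                run_model(fetch())
                s.enter(rule["seconds"], 1, tick)  # reschedule itself
            s.enter(rule["seconds"], 1, tick)
            s.run()
        return loop
    raise ValueError(f"unknown trigger {rule['trigger']!r}")
```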

Now we have data and we are ready to call a model, right? Not so fast. Although some think of data preparation as part of the model, I would rather consider it an intermediary step. Why, you may ask? Simply because data preparation is a deterministic step where there should be no learning involved, and because in many cases you will significantly reduce the size of the data in that step, data that you might want to store to monitor the model behavior. But I'll come back to this later. For now, just consider that there might be a need for "data reduction", and this one cannot be generic. You can think of it as a pre-model which formats the data in a way your model is ready to use. The deployment environment should facilitate the packaging of such components and provide ways to easily deploy them (again, anywhere they need to be).
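As an illustration, a deterministic data-reduction step could be as simple as compressing a window of raw readings into a handful of features. The specific features here are just an example; the point is that the step is deterministic, involves no learning, and produces something much smaller than its input, which makes it cheap to store later for monitoring.

```python
def reduce_window(samples: list[float]) -> dict:
    """Deterministic 'data reduction': compress a raw window of
    readings into the few features the model actually consumes."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n  # population variance
    return {"mean": mean, "var": var, "min": min(samples), "max": max(samples)}
```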

We are now ready for the machine learning execution! You already produced a model from your data science activities, and this model needs to be called. As for the "data reduction", the deployment environment should facilitate the packaging and deployment of the "model execution".

For those who have been through the loops of creating models, you certainly have the question: but how was that model trained? So yes, we might need a "model training" component, which is also dependent on the model itself. A deployment environment should facilitate the use and deployment of a training component as well. However, this raises another important question: where does the training data come from? And what if the model drifts, is no longer accurate and needs re-training? You will need data… So, another required component is a "data sampling" component. I say data sampling because you may not need all the data; maybe a sample of it is sufficient. This can be something provided by the model execution environment and configured per use case. You remember the discussion about data reduction earlier? Well, it might be wise to store only samples coming from reduced data… You may also want to store the associated prediction made by the model.
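A minimal sketch of such a data sampler, assuming a simple uniform sampling rate configured per use case (the class and field names are hypothetical). Note that it keeps the reduced input together with the prediction, per the discussion above:

```python
import random

class DataSampler:
    """Keeps a configurable fraction of (reduced) inputs together with
    the model's prediction, for later retraining and drift analysis."""
    def __init__(self, rate: float, store: list, rng=None):
        self.rate = rate          # e.g. 0.01 keeps roughly 1% of traffic
        self.store = store        # stand-in for the sample database
        self.rng = rng or random.Random()

    def maybe_keep(self, reduced_input: dict, prediction) -> None:
        if self.rng.random() < self.rate:
            self.store.append({"input": reduced_input,
                               "prediction": prediction})
```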

At any rate, you will need a "sample database", which will need to be configured with proper retention policies on a per-use-case basis (unless you want to keep that data for eternity).

As we said, models can drift, so data ops teams will have to monitor each model/use case. To facilitate that, a "model monitoring" component should be available, taking cues from the execution environment itself but also from the sample database, which means you will need a way to configure which values to watch.
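As a deliberately crude illustration, a monitoring check on one watched value could compare its recent average against a baseline; a real drift detector would use proper statistical tests, but the configuration idea is the same:

```python
def drifted(baseline: list[float], recent: list[float],
            tolerance: float = 0.2) -> bool:
    """Flag drift when the recent mean of a watched value (a feature
    from the sample database, or a prediction) moves away from the
    baseline mean by more than `tolerance`, relative to the baseline."""
    base = sum(baseline) / len(baseline)
    cur = sum(recent) / len(recent)
    return abs(cur - base) > tolerance * abs(base)
```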

Those cover the most basic components required, but more may be needed. If you are to deploy this environment in a distributed and heterogeneous fashion, you will need some "information transfer" mechanism or component to exchange information securely and easily between different domains.

Machine Learning Execution Environment Overview.

You will also need a model orchestrator which will take care of scaling all those parts in or out as needed. And what about model life-cycle management, canary deployment or A/B testing… you see, there is even more to consider there.

One thing to notice is that even at this stage, you only have the model “answer” … you still need to use it in a way which is useful for your use case. Maybe it is a dashboard, maybe it is used to actuate some process… the story simply does not end here.

For my friends at Ericsson, you can find way more information in the memorandum and architecture document I wrote on the subject: “Toward a Machine Learning Deployment Environment”. For the rest of you folks, if you are in the process of establishing such an environment, I hope those few thoughts can help you out.

Cover photo by Frans Van Heerden at Pexels.


Grow your own moustache!

It is done! I have finished my Master's studies. The defence was held on April 26 and the thesis was published today. For those inclined to read this kind of literature, here is my thesis: A Scalable Heterogeneous Software Architecture for the Telecommunication Cloud. It describes an actor-model-based framework which can be deployed on any cloud and written in any programming language.

Let me tell you, it is a big weight off my shoulders. In retrospect, I would do it again. However, I would not take two courses in the same session, and I would have started writing the thesis and verifying it with my thesis director earlier. But now it is done. On the bright side, I think what I learned in the process really helped improve the end result of the research done by my team.

I could not resist long and had to learn something new. Something had been on my backlog for a while and I decided to give it a try. But let me put in a little bit of context here. Four or five years ago I went to a course about innovation in Stockholm. One of the exercises went as follows: in teams of two, we had to point randomly in magazines to pick pictures and sentences, and give sense to them. I don't recall each individual element, but I think we came up with a sentence going like this: "You have to grow your own moustache". I still recall that sentence because, out of the randomness of the sentences and images we picked, we ended up with such a profound revelation!

It might not look like it, but "growing your own moustache" is a really good metaphor for a lot of things in life. I will just show one of them. Like following a course and learning, growing a moustache is a decision you have to make. Once that decision is made, it will take time; you cannot have it grown overnight. Two people won't grow the same moustache, and it won't grow at the same pace. When you learn, you might struggle more than someone else, but in the end, no matter the struggle, what you have learned is personal; what you retain depends on your background, just as how the moustache grows. Everyone will get their own. There are things you can do to shape it the way you want, but some things you cannot control or change.

That being said, this exercise and many more went a long way toward starting a friendship. The course I am following now comes from the advice of that friend, Andreas S. He told me about a book which guides you through the process of building a computer: you start from Nand gates and build a computer from them, then an OS, a language and eventually the Tetris game. It happens that the authors of the book made that course available on Coursera: Build a Modern Computer from First Principles: From Nand to Tetris. It is a two-part course and the first part is available now; I finished it. Last week I completed the assignment for week five, where you have to build a CPU and Memory and assemble them into a computer. This week I completed the sixth week's assignment: writing an assembler for that computer. It is all simulation, but you know it could work for real if you had the patience to build it physically, since in the previous weeks we built every element leading to this, from the Nand gate. Two transistors and a resistor and you have a physical implementation of a Nand gate. You would need a s**t load of them to build an actual physical version of the computer, but you get the full understanding with the course. By the way, someone did build such a computer from individual transistors. You can get a view of it in this video.
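The "everything from Nand" idea is easy to demonstrate in a few lines of code. Here is a toy simulation, in the spirit of the course's early weeks (the course itself uses an HDL, so this Python rendering is mine): Nand is the only primitive, and the other basic gates are derived from it alone.

```python
def nand(a: int, b: int) -> int:
    """The single primitive: outputs 1 unless both inputs are 1."""
    return 0 if (a and b) else 1

# Every other basic gate derives from nand alone.
def not_(a):    return nand(a, a)
def and_(a, b): return not_(nand(a, b))
def or_(a, b):  return nand(not_(a), not_(b))
def xor_(a, b):
    m = nand(a, b)
    return nand(nand(a, m), nand(b, m))
```

From there the course climbs the same way: gates into adders and an ALU, flip-flops into registers and RAM, and eventually a full CPU.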

When I did my bachelor's degree (in electrical engineering) 25 years ago, I covered a lot of what is shown in this course, but still some pieces were missing. We built/simulated logic gates and from there went to registers and an ALU, but we didn't assemble them into a CPU and a computer. Other courses showed us assembly and compilers, but it was not linked in a coherent chain. This course brings you from the basic Nand gate up to writing a Tetris game, with all the steps in between. You can be a perfectly good software engineer without knowing how a computer is built, but there is a lot you can gain by understanding it. Making it yourself ensures you have a deep understanding of the whole process. I recommend that course to everyone. Thanks Andreas!

Cloud-based Telecom Software Architecture

It has been two years since I started this journey of defining a software architecture for the telecom domain in the cloud. There are challenges in providing a telecom platform in the cloud that you will not generally find in the IT world. Two years ago my team was tasked to look into the future and see to what extent cloud programming principles could be applied to the telecom world. We started from a blank sheet and explored the potential by building a small proof-of-concept prototype. Last year Ericsson unveiled its Software Model. You can find a video on YouTube on the Ericsson Software Model, and you can also read the Ericsson Software Model press release. We looked at how our approach supports that model, as well as taking it as a form of requirement specification of what a telecom cloud platform should provide.

The Ericsson Software Model promotes a number of concepts which we looked at while developing our cloud-based software architecture:

Virtualization: The Ericsson press release states that "network evolution is increasingly driven through new software functionality with application virtualization". With our research we wanted to go further and look at it from the angle of an application natively built for the cloud, not solely virtualized. Thus the software architecture we propose is built for the cloud. Since, from the point of view of a telecom vendor, telecom applications should be deliverable on any operator-owned cloud platform or mix of platforms, it became obvious that a hyper-heterogeneous cloud architecture, enabling software to be deployed independently of the platform without rework, was a must.

Upgrades: The Ericsson press release also mentions the importance of easy, regular upgrades. Building our research approach as a cloud platform, we looked at how far we could go with this. We came up with principles that allow the upgrade of network software for a single subscriber, or even for a single service instance of a subscriber. Thus we could upgrade a network in a staged fashion while keeping an eye on key performance indicators, and instantaneously go back to a previous version if anything goes wrong. The high granularity of our upgrade approach means less risk while upgrading, which enables more regular upgrades.
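To illustrate the idea (this is not our actual implementation, and all names and the KPI threshold are made up for the example), per-subscriber staged upgrades with instant rollback can be pictured as a version router:

```python
class VersionRouter:
    """Sketch of per-subscriber staged upgrades: route a configurable
    subset of subscribers to the new software version, and fall back
    to the old one instantly if a KPI check fails."""
    def __init__(self, old, new, canary_ids: set):
        self.old, self.new = old, new      # two callable software versions
        self.canary_ids = canary_ids       # subscribers on the new version
        self.rolled_back = False

    def handle(self, subscriber_id, request):
        use_new = subscriber_id in self.canary_ids and not self.rolled_back
        return (self.new if use_new else self.old)(request)

    def report_kpi(self, error_rate: float, threshold: float = 0.01):
        # Instantaneous rollback: all later calls use the old version.
        if error_rate > threshold:
            self.rolled_back = True
```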

Better Resource Utilisation: The Ericsson Software Model video states better resource utilisation as a driver for the future. Virtualization helps in that matter; however, since the scalability increments would still be in the thousands of subscribers at a time, it limits how small a network function can become. In the context of the Networked Society, where billions of devices are connected, it becomes obvious that the usage patterns will be varied and not obviously predictable. Our highly granular approach allows for efficient resource utilisation since software processes are instantiated only as needed, anywhere in the cloud of available resources. Hence free resources can be used for any service scenario and are never tied to a specific one.

Performance: The Ericsson press release states that the Ericsson Software Model "builds on the software performance benefits by making it simpler and faster for operators". Since our research approach is based on a cloud deployment, it became obvious we needed to address availability. As such, we built into our architecture ways to allow the application to cope with availability issues, being able to be deployed or migrated anywhere in the cloud if servers crash or become unavailable. These facilities are built to be transparent to the application, thus making it simpler for the application developer and consequently for the operator as well.

Predictability: The Ericsson Software Model video states the need for predictability. Our approach also allows for a high degree of observability which is important to provide a predictable network and eventually to have a self-healing system.

First to Market: The Ericsson Software Model video states the need for operators to quickly make new functionality available to their customers. For this to be possible we need to look at the development cycle from the start: developers need to build their software based on software independency concepts if they are to deliver it as quickly as possible. Software independency is built into our approach, and we think this is a necessity if a platform is to allow rapid development.

There would be much more to say, as some more principles made it into the architecture, but if you think those are important principles I invite you to read the two papers we submitted to the S2CT.org conference, which should shortly be published. We defined a software architecture and built a proof of concept which address those business needs and could become a software platform for future telecom products. The papers describe a high-level view of the architecture as well as some results from the proof of concept.

Preliminary versions of the papers are available on arXiv.org: Hyper Heterogeneous Cloud-based IMS Software Architecture: A Proof-of-Concept and Empirical Analysis and Micro Service Cloud Computing Pattern for Next Generation Networks.