What you should know before using Machine Learning in production


  • Use open source technologies for model training, deployment and fairness
  • Automate the end-to-end ML lifecycle with a machine learning pipeline
  • The choice of pipeline technology depends on the ML scenario and the persona using it
  • MLflow is a versatile open source platform for managing the end-to-end machine learning lifecycle

What do you need to know before putting machine learning projects into production? There are four aspects of Machine Learning Operations, or MLOps, that everyone should be aware of first. These can help data scientists and engineers overcome limitations in the machine learning lifecycle, and even turn those limitations into opportunities.

The Need for MLOps

MLOps is important for several reasons. First, machine learning models rely on huge amounts of data, and it is very difficult for data scientists and engineers to track it all. It is also challenging to keep track of the various parameters that can be changed in a machine learning model; sometimes small changes can make a huge difference in the results you get. You also need to take into account the features the model works with: feature engineering is an important part of the machine learning lifecycle and can have a major impact on model accuracy.

Once in production, monitoring a machine learning model isn’t really the same as monitoring other types of software, such as web apps, and debugging a machine learning model is complicated. Models use real-world data to make their predictions, and real-world data can change over time.

As it changes, it is important that you track the performance of your model and update your model as needed. This means you have to keep track of new data changes and make sure the model learns from them.

I’m going to discuss four key aspects before implementing machine learning in production: MLOps capabilities, open source integration, machine learning pipelines, and MLflow.

MLOps Capabilities

There are many different MLOps capabilities to consider before deploying to production. First is the ability to build reproducible machine learning pipelines. Machine learning pipelines allow you to define repeatable and reusable steps for your data preparation, training, and scoring processes. These steps should include building a reusable software environment for training and deploying models, as well as the ability to register, package, and deploy models from anywhere. Using pipelines allows you to frequently update models or roll out new models alongside your other AI applications and services.
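To make the idea of repeatable, reusable steps concrete, here is a minimal, dependency-free sketch of a pipeline in which each stage is a pure function, so the same input always produces the same output. The step names and the trivial "model" are illustrative only, not a specific library's API.

```python
# A minimal sketch of a reproducible ML pipeline: each step is a
# named, pure function, so the whole flow is repeatable end to end.

def prepare(data):
    """Normalize raw values to the range [0, 1]."""
    lo, hi = min(data), max(data)
    return [(x - lo) / (hi - lo) for x in data]

def train(features):
    """'Train' a trivial model: just the mean of the features."""
    return sum(features) / len(features)

def score(model, features):
    """Score each feature as its distance from the model's mean."""
    return [abs(x - model) for x in features]

def run_pipeline(raw):
    """Chain the steps in a fixed, reusable order."""
    features = prepare(raw)
    model = train(features)
    return score(model, features)

print(run_pipeline([2.0, 4.0, 6.0]))  # → [0.5, 0.0, 0.5]
```

Because every step is deterministic and the order is fixed, rerunning the pipeline on the same data reproduces the same result, which is exactly the property you want before swapping in real data preparation and training code.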

You also need to track the associated metadata needed to use the model and capture governance data for the end-to-end machine learning lifecycle. In the latter case, lineage information may include, for example, who published the model, why changes were made at some point, or when different models were deployed or used in production.

It is also important to be informed and alerted about events in the machine learning lifecycle: for example, experiment completion, model registration, model deployment, and data drift detection. You also need to monitor machine learning applications for operational and ML-related issues. Here it is important for data scientists to be able to compare model inputs at training time versus inference time, explore model-specific metrics, and configure monitoring and alerts for the machine learning infrastructure.
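As a sketch of what comparing training-time and inference-time inputs can look like, the following dependency-free example flags data drift when a feature's mean in production moves too far from its mean at training time. The threshold and the data are illustrative; real systems use richer statistical tests.

```python
# Illustrative drift check (not a specific library's API): compare a
# feature's distribution at training time versus inference time and
# flag drift when the mean shifts beyond a chosen threshold.

def detect_drift(train_values, live_values, threshold=0.5):
    """Return True when the live mean drifts too far from training."""
    train_mean = sum(train_values) / len(train_values)
    live_mean = sum(live_values) / len(live_values)
    return abs(live_mean - train_mean) > threshold

training_feature = [1.0, 1.2, 0.9, 1.1]  # values seen during training
recent_feature = [1.9, 2.1, 2.0, 1.8]    # values seen in production

if detect_drift(training_feature, recent_feature):
    print("data drift detected: consider retraining")
```

In practice a check like this would run on a schedule over recent inference inputs, raising the alert events described above so the team knows when retraining is due.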

Open Source Integration

Another aspect that you should know before implementing machine learning in production is open source integration. Here, there are three different open source technologies that are of utmost importance. First, there are open source training frameworks, which are great for accelerating your machine learning solutions. Then there are open source frameworks for model interpretability and fairness. Finally, there are open source tools for model deployment.

There are many different open source training frameworks. Three of the most popular are PyTorch, TensorFlow, and Ray. PyTorch is an end-to-end machine learning framework that includes TorchServe, an easy-to-use tool for large-scale deployment of PyTorch models. PyTorch also has mobile deployment support and cloud platform support. Finally, PyTorch has C++ frontend support: a pure C++ interface to PyTorch that follows the design and architecture of the Python frontend.
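For readers new to PyTorch, here is a minimal training sketch, assuming the torch package is installed: a single linear layer fit to the toy relationship y = 2x. The data, learning rate, and iteration count are illustrative only.

```python
# Minimal PyTorch training loop: fit y = 2x with one linear layer.
import torch
from torch import nn

torch.manual_seed(0)
x = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
y = 2 * x

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.MSELoss()

for _ in range(500):
    optimizer.zero_grad()        # reset accumulated gradients
    loss = loss_fn(model(x), y)  # forward pass and loss
    loss.backward()              # backpropagate
    optimizer.step()             # update weights

# After training, the prediction for x = 5 should be close to 10.
print(model(torch.tensor([[5.0]])).item())
```

The same model could later be served with TorchServe or exported for deployment, which is where the MLOps concerns discussed here come back in.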

TensorFlow is another end-to-end machine learning framework that is very popular in the industry. As for MLOps, it has a feature called TensorFlow Extended (TFX) which is an end-to-end platform for data preparation, training, validation and deploying machine learning models in large production environments. A TFX pipeline is a sequence of components specifically designed for scalable and high performance machine learning tasks.

Ray is an open source framework for distributed computing that includes several useful libraries: Tune, RLlib, Train, and Datasets. Tune is great for tuning hyperparameters. RLlib is used for training reinforcement learning (RL) models. Train is for distributed deep learning. Datasets is for distributed data loading. Ray has two additional libraries, Serve and Workflows, which are useful for deploying machine learning models and distributed apps to production.

For creating interpretable and unbiased models, two useful frameworks are InterpretML and Fairlearn. InterpretML is an open source package that includes several machine learning interpretability techniques. With this package, you can train interpretable glass-box models and also explain black-box systems. In addition, it helps you understand the global behavior of your model, or the reasons behind individual predictions.

Fairlearn is a Python package that provides metrics to assess which groups are negatively affected by a model, and to compare multiple models in terms of fairness and accuracy. It also supports several mitigation algorithms, covering different fairness definitions, to reduce unfairness in a variety of AI and machine learning tasks.
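To make the kind of metric Fairlearn reports concrete, here is a dependency-free, hand-rolled sketch of one common fairness metric, demographic parity difference: the gap between groups in the rate of favorable predictions. The data and group labels are purely illustrative, and this is not Fairlearn's own API.

```python
# Hand-rolled sketch of one fairness metric a library like Fairlearn
# can report: demographic parity difference, the gap between groups
# in the rate of favorable (positive) predictions.

def selection_rate(predictions):
    return sum(predictions) / len(predictions)

def demographic_parity_difference(predictions, groups):
    """Max gap in favorable-prediction rate across groups."""
    by_group = {}
    for pred, group in zip(predictions, groups):
        by_group.setdefault(group, []).append(pred)
    rates = [selection_rate(p) for p in by_group.values()]
    return max(rates) - min(rates)

preds = [1, 0, 1, 1, 0, 0, 1, 0]              # 1 = favorable outcome
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

# Group "a" is favored 75% of the time, group "b" only 25%.
print(demographic_parity_difference(preds, groups))  # → 0.5
```

A value near zero means the groups receive favorable outcomes at similar rates; a large gap like 0.5 is the kind of signal that would prompt using a mitigation algorithm.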

Our third open source technology is used for model deployment. When working with different frameworks and tools, you need to deploy models according to the needs of each framework. To standardize this process, you can use the ONNX format.

ONNX stands for Open Neural Network Exchange. ONNX is an open source format for machine learning models that supports interoperability between different frameworks. This means you can train a model in one of several popular machine learning frameworks, such as PyTorch or TensorFlow, convert it to ONNX format, and then consume it in a different framework; for example, in ML.NET.

The ONNX Runtime (ORT) represents a machine learning model using a common set of operators, the building blocks of machine learning and deep learning models, which allows the model to run on a variety of hardware and operating systems. ORT optimizes and accelerates machine learning inferencing, which can enable faster customer experiences and lower costs. It supports models from deep learning frameworks such as PyTorch and TensorFlow, as well as classical machine learning libraries such as scikit-learn.

There are many different popular frameworks that support conversion to ONNX. For some of these, such as PyTorch, ONNX format export is built-in. For others, such as TensorFlow or Keras, there are separate installable packages that can handle this conversion. The process is pretty straightforward: First, you need to train the model using any framework that supports export and conversion to ONNX format. You then load and run the model with the ONNX runtime. Lastly, you can tune the performance using different runtime configurations or hardware accelerators.

Machine Learning Pipeline

The third aspect you should know before implementing machine learning in production is how to build a pipeline for your machine learning solution. The first task in the pipeline is data preparation, which includes importing, validating, cleaning, transforming, and normalizing your data.

Next, the pipeline includes training configuration, including parameters, file paths, logging, and reporting. Then there are the actual training and validation tasks which are done in an efficient and repeatable manner. Efficiency can also come from specific data subsets, different hardware, computing resources, distributed processing, and progress monitoring. Finally, there is the deployment phase, which includes versioning, scaling, provisioning, and access control.

Choosing a pipeline technology will depend on your particular needs; typically these fall under one of three scenarios: model orchestration, data orchestration, or code and application orchestration. Each scenario is oriented around a persona, the primary user of the technology, and has a canonical pipeline, the typical workflow for that scenario.

In the model orchestration scenario, the primary persona is a data scientist. The canonical pipeline in this scenario is from data to model. In terms of open source technology options, Kubeflow Pipelines is a popular choice for this scenario.

For the data orchestration scenario, the primary persona is a data engineer, and the canonical pipeline is from data to data. A common open source option for this scenario is Apache Airflow.

Finally, the third scenario is code and application orchestration. Here, the primary persona is an app developer. The canonical pipeline is from code plus model to service. A common open source solution for this scenario is Jenkins.

The figure below shows an example of a pipeline built on Azure Machine Learning. For each step, the Azure Machine Learning service calculates the requirements for hardware compute resources, OS resources such as Docker images, software resources such as Conda, and data input.

The service then determines the dependencies between steps, resulting in a dynamic execution graph. When each step in the execution graph runs, the service configures the required hardware and software environment. Each step also sends logging and monitoring information to its containing experiment object. When a step completes, its outputs become inputs to the next step. Finally, resources that are no longer needed are finalized and released.
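The dependency-resolution idea behind such an execution graph can be sketched in a few lines of dependency-free Python (this is not the Azure Machine Learning API): each step declares its inputs, and a topological sort turns those declarations into a valid run order.

```python
# Sketch of execution-graph scheduling: each step lists the steps it
# depends on, and a topological sort resolves a valid run order.

steps = {
    "prepare": [],            # no dependencies
    "train": ["prepare"],     # needs prepared data
    "evaluate": ["train"],    # needs the trained model
    "deploy": ["evaluate"],   # needs evaluation results
}

def execution_order(graph):
    """Resolve dependencies into a run order (topological sort)."""
    order, done = [], set()

    def visit(step):
        for dep in graph[step]:
            if dep not in done:
                visit(dep)
        if step not in done:
            done.add(step)
            order.append(step)

    for step in graph:
        visit(step)
    return order

print(execution_order(steps))  # → ['prepare', 'train', 'evaluate', 'deploy']
```

A real orchestrator does the same resolution, then additionally provisions the environment for each step and wires one step's outputs into the next step's inputs.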


MLflow

The last tool you should consider before implementing machine learning in production is MLflow. MLflow is an open source platform that manages the end-to-end machine learning lifecycle. It consists of four primary components, each of which plays an important role in this lifecycle.

The first is MLflow Tracking, which records experiments so you can log and compare parameters and results. MLflow runs can be recorded to local files, to a SQLAlchemy-compatible database, or remotely to a tracking server. You can log data for a run using Python, R, Java, or a REST API. MLflow also lets you group runs under experiments, which can be useful for comparing runs that, for example, deal with a particular task.
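To illustrate what a tracking component stores per run, here is a dependency-free, hand-rolled sketch. This is deliberately not the MLflow API itself (MLflow's real calls are functions like mlflow.log_param and mlflow.log_metric, backed by a file store, database, or tracking server); the class and values below are illustrative only.

```python
# Hand-rolled sketch of what experiment tracking records per run.
# (Illustrative only; MLflow's real API is mlflow.log_param,
# mlflow.log_metric, etc.)

class Run:
    def __init__(self, experiment):
        self.experiment = experiment  # runs are grouped by experiment
        self.params = {}
        self.metrics = {}

    def log_param(self, key, value):
        self.params[key] = value

    def log_metric(self, key, value):
        # keep a history so a metric can be compared across time
        self.metrics.setdefault(key, []).append(value)

run = Run(experiment="churn-model")
run.log_param("learning_rate", 0.05)
run.log_metric("accuracy", 0.81)
run.log_metric("accuracy", 0.86)

print(run.experiment, run.params, run.metrics["accuracy"][-1])
```

Grouping many such run records under one experiment is what makes it easy to answer questions like "which parameters produced the best accuracy for this task?"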

Next is MLflow Projects, which packages ML code in a reusable, reproducible form so it can be shared with other data scientists or transferred to a production environment. It specifies a format for packaging data science code based on conventions. In addition, this component includes an API and command-line tools for running projects, making it possible to chain multiple projects together in a workflow.

Then there is MLflow Models, which manages and deploys models from a variety of machine learning libraries to various model serving and inference platforms. It defines a standard format for packaging machine learning models that can be used in a variety of downstream tools; for example, real-time serving through a REST API or batch inference on Apache Spark. Each model is a directory containing arbitrary files, with an MLmodel file at the root of the directory that defines the flavors in which the model can be used.
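As an illustration of this packaging format, here is a sketch of what an MLmodel file for a scikit-learn model might contain. The exact fields vary by library and MLflow version, and the values shown are illustrative only.

```yaml
# Illustrative sketch of an MLmodel file at the root of a model
# directory; exact fields vary by library and MLflow version.
artifact_path: model
flavors:
  python_function:            # generic flavor: load and predict from Python
    loader_module: mlflow.sklearn
    python_version: 3.10.12
  sklearn:                    # native flavor for scikit-learn tools
    pickled_model: model.pkl
    sklearn_version: 1.3.0
```

Each flavor listed tells a downstream tool one way it can load and use the model, which is what lets the same packaged model serve a REST endpoint or run batch inference.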

The final component is the MLflow Model Registry, a centralized model store, set of APIs, and UI for managing the full lifecycle of an MLflow model collaboratively. It provides model lineage, model versioning, stage transitions, and annotations. The Registry is especially valuable if you are looking for a centralized model store and a single set of APIs to manage the entire lifecycle of your machine learning models.


These four aspects, MLOps capabilities, open source integration, machine learning pipelines, and MLflow, can help you create a streamlined and repeatable process for implementing machine learning in production. They give your data scientists the ability to quickly and easily experiment with different models and frameworks. In addition, you can improve the operational processes around your machine learning systems in production, giving you the agility to quickly update your models when real-world data changes over time, and turning that limitation into an opportunity.
