Machine Learning Experiment Tracking Using MLflow

Avikumar Talaviya 22 Jan, 2024 • 7 min read


Machine learning (ML) is expanding rapidly and has applications across many sectors. As ML projects grow more complex, it becomes harder to keep track of the experiments and trials needed to build them. This can cause several problems for data scientists, such as:

  • Loss or duplication of experiments: With many experiments in flight, it is easy to lose track of one or to repeat work that has already been done.
  • Reproducibility of results: It can be hard to replicate an experiment’s findings, which makes troubleshooting and improving the model harder.
  • Lack of transparency: When it is unclear how a model was created, it is difficult to trust its predictions.
Photo by CHUTTERSNAP on Unsplash

Given the above challenges, it is important to have a tool that can track all ML experiments and log their metrics for better reproducibility while enabling collaboration. This blog explores MLflow, an open-source ML experiment tracking and model management tool, with code examples.

Learning Objectives

  • In this article, we aim to get a sound understanding of machine learning experiment tracking and model registry using MLflow.
  • Furthermore, we will learn how ML projects are delivered in a reusable and reproducible way.
  • Lastly, we will learn what an LLM is and why you need to track LLMs during application development.

What is MLflow?

MLflow logo (source: official site)

MLflow is an open-source tool for machine learning experiment tracking and model management that makes ML projects easier to handle. It provides tools to simplify the ML workflow: users can log parameters and metrics, track experiments, and compare and reproduce results. It also simplifies model packaging and deployment.

With MLflow, you can log parameters and metrics during training runs.

# import the mlflow library
import mlflow

# start an MLflow run and log a parameter and a metric to it
with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_metric("accuracy", 0.85)

MLflow also supports model versioning and model management, allowing you to track and organize different versions of your models easily:

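import mlflow.sklearn

# Train and save the model (train_model() is a placeholder for your own training code)
model = train_model()
mlflow.sklearn.save_model(model, "model")

# Load the saved model back from its path.
# load_model() takes no version argument; a specific registered version
# is addressed with a URI such as "models:/<model name>/1".
loaded_model = mlflow.sklearn.load_model("model")

# Use the loaded model for predictions (data is a placeholder for your input features)
predictions = loaded_model.predict(data)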

Additionally, MLflow has a model registry that lets multiple users track, share, and deploy models, which supports collaborative model development.

Beyond experiment tracking and the model registry, MLflow also provides recipes, plugins, and tracking for large language models. Now, we will look at the main components of the MLflow library.

MLflow — Experiment Tracking

Experiment tracking is one of MLflow’s core features and can be used in any ML project. It is a set of APIs and a UI for logging parameters, metrics, code versions, and output files so that runs can be inspected and diagnosed later. MLflow experiment tracking has Python, Java, REST, and R APIs.

Now, let’s look at a code example of MLflow experiment tracking in Python.

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from mlflow.models.signature import infer_signature

# Load and split a sample dataset (Iris stands in here for your own data)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Select the experiment and start an MLflow run inside it
mlflow.set_experiment("My Experiment")
with mlflow.start_run():
    # Log parameters
    mlflow.log_param("n_estimators", 100)
    mlflow.log_param("max_depth", 5)

    # Create and train the model
    model = RandomForestClassifier(n_estimators=100, max_depth=5)
    model.fit(X_train, y_train)

    # Make predictions on the test set
    y_pred = model.predict(X_test)
    signature = infer_signature(X_test, y_pred)

    # Log metrics
    accuracy = accuracy_score(y_test, y_pred)
    mlflow.log_metric("accuracy", accuracy)

    # Save the model together with its inferred signature
    mlflow.sklearn.save_model(model, "model", signature=signature)

# The MLflow run is closed automatically when the with block exits

In the above code, we import modules from MLflow and scikit-learn to track a model experiment. After that, we load a sample dataset and use the MLflow tracking APIs. We use the start_run(), log_param(), log_metric(), and save_model() functions to run the experiment and record it under an experiment called “My Experiment.”

Apart from this, MLflow also supports automatic logging of parameters and metrics without explicitly calling each tracking function. Call mlflow.autolog() before your training code to log parameters, metrics, and artifacts automatically.

MLflow — Model registry

Model registry illustration (source: Databricks)

The model registry is a centralized model store with a set of APIs and a UI for managing models collaboratively across the complete MLOps workflow.

It provides model lineage, model registration, model versioning, and stage transitions, all available from a single UI or through a set of APIs.

Let’s look at the MLflow model registry UI in the screenshot below.

MLflow UI screenshot

The above screenshot shows saved model artifacts in the MLflow UI, along with the ‘Register Model’ button, which registers a model in the model registry. Once the model is registered, the model registry UI page shows its version, timestamp, and stage. (Refer to the screenshot below for more information.)

MLflow model registry UI

As discussed earlier, apart from the UI workflow, MLflow supports an API workflow for storing models in the model registry and updating their stage and version.

# Log the sklearn model and register as version 1

The above code logs the model and registers it if no model with that name exists yet. If the model name already exists, it creates a new version of the model. There are other ways to register models with the MLflow library; I highly recommend reading the official documentation for them.

MLflow — Projects

Another component of MLflow is MLflow Projects, which packages data science code in a reusable and reproducible way for any member of a data team.

A project file consists of the project name, entry points, and environment information, which specifies the dependencies and other configuration needed to run the project. MLflow supports environments such as Conda, virtual environments, and Docker images.

In a nutshell, the MLflow project file contains the following elements:

  • Project name
  • Environment file
  • Entry points

Let’s look at an example of an MLflow project file.

# name of the project
name: My Project

python_env: python_env.yaml
# or
# conda_env: my_env.yaml
# or
# docker_env:
#    image:  mlflow-docker-example

# write the entry points
# (the script names train.py and validate.py are illustrative)
entry_points:
  main:
    parameters:
      data_file: path
      regularization: {type: float, default: 0.1}
    command: "python train.py -r {regularization} {data_file}"
  validate:
    parameters:
      data_file: path
    command: "python validate.py {data_file}"
The above file shows the project name, the name of the environment config file, and the entry points that MLflow can invoke when the project runs.

Here’s an example of a python_env.yaml environment file:

# Python version required to run the project.
python: "3.8.15"
# Dependencies required to build packages. This field is optional.
build_dependencies:
  - pip
  - setuptools
  - wheel==0.37.1
# Dependencies required to run the project.
dependencies:
  - mlflow==2.3
  - scikit-learn==1.0.2

MLflow — LLM Tracking

LLMs have spread through the technology industry faster than almost anything in recent times. With the rise of LLM-powered applications, developers are increasingly adopting LLMs into their workflows, creating the need to track and manage such models during development.

What are LLMs?

Large language models are neural network models built on the transformer architecture, with trainable parameters numbering in the billions. Such models can perform a wide range of natural language processing tasks, such as text generation, translation, and question-answering, with high levels of fluency and coherence.

Why do we need LLM Tracking?

Unlike classical machine learning models, LLMs require prompts to be monitored in order to evaluate performance and find the best production model. LLMs have many inference parameters, such as top_k and temperature, and multiple evaluation metrics. Different models under different parameter settings produce different results for the same queries. Hence, it is important to monitor them to identify the best-performing LLM.

MLflow LLM tracking APIs are used to log and monitor the behavior of LLMs. They log the inputs and prompts submitted to an LLM and the outputs it returns. MLflow also provides a comprehensive UI to view and analyze the results. To learn more about the LLM tracking APIs, I recommend the official documentation.


Conclusion

MLflow is an effective and comprehensive platform for managing machine learning workflows and experiments, with features such as model management and support for various machine learning libraries. With its four main components (experiment tracking, model registry, projects, and LLM tracking), MLflow provides an end-to-end solution for managing and deploying machine learning models.

Key Takeaways

Let’s look at the key learnings from the article.

  1. Machine learning experiment tracking lets data scientists and ML engineers track and log a model’s parameters and metrics easily.
  2. The model registry stores and manages ML models in a centralized repository.
  3. MLflow projects simplify packaging and deploying machine learning code, which makes it easier to reproduce results in different environments.

Frequently Asked Questions

Q1: How do you track machine learning experiments in MLflow?

A: MLflow provides experiment tracking, which can be used in any ML project. It is a set of APIs and a UI for logging parameters, metrics, and code versions so that experiments can be tracked seamlessly.

Q2: What is an MLflow experiment?

A: An MLflow experiment groups and stores all related runs under one common experiment title, so that the runs can be compared and the best one identified.

Q3: What is the difference between a run and an experiment in MLflow?

A: An experiment is the parent unit of runs in machine learning experiment tracking, while a run is a collection of parameters, models, metrics, tags, and artifacts related to a single execution of the model training process.

Q4: What is the advantage of MLflow?

A: MLflow is a comprehensive open-source tool for managing and tracking machine learning models. Its UI and its wide range of components, such as experiment tracking, the model registry, and projects, are among its major advantages.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion. 
