MLflow

warning

WIP / PLACEHOLDER FILE

Why use MLflow

AI Platform allows teams to run their notebooks and pipelines using an MLflow plug-in.

MLflow is an open source platform for managing the end-to-end machine learning lifecycle. It tackles four primary functions:

  • Tracking experiments to record and compare parameters and results (MLflow Tracking).
  • Packaging ML code in a reusable, reproducible form in order to share with other data scientists or transfer to production (MLflow Projects).
  • Managing and deploying models from a variety of ML libraries to a variety of model serving and inference platforms (MLflow Models).
  • Providing a central model store to collaboratively manage the full lifecycle of an MLflow Model, including model versioning, stage transitions, and annotations (MLflow Model Registry).

MLflow is library-agnostic. You can use it with any machine learning library and in any programming language, since all functions are accessible through a REST API and CLI. For convenience, the project also includes a Python API, R API, and Java API.
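For orientation, here is a minimal sketch of the tracking API. It is not specific to AI Platform: the parameter and metric names are made up, and without a configured tracking URI the run goes to MLflow's default local store.

    import mlflow

    # Start a run and record a parameter, a metric, and a small text artifact.
    with mlflow.start_run():
        mlflow.log_param("learning_rate", 0.01)   # hypothetical hyperparameter
        mlflow.log_metric("rmse", 0.73)           # hypothetical result
        mlflow.log_text("hello from MLflow", "notes.txt")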

In the following sections, you can find instructions for using MLflow in AI Platform.


MLflow server setup

To automatically set up your own MLflow server, follow these steps:

  1. Create a storage account in the Azure Portal. After creating the account, collect the following values:

    • Azure Storage Account
    • Container Name
    • Azure Storage key
      If the storage account is new, generate the key in the Azure Portal by going to Storage account > Security + networking > Access keys. Copy the key and the storage account name for use in the following steps.
  2. Go to the AI Platform dashboard.

  3. In the side menu, go to Applications and then do the following:

    1. Select Create MLflow to open the settings dialog box.

      MLflowMenu Create MLflow server settings dialog box.

    2. Enter the corresponding values (the storage account name, container name, and key collected in step 1).

    3. Click Create. This will spin up the relevant resources in the background for you.

    4. Click Refresh.

      info

      After refreshing, you will see information about your MLflow server. This includes the internal URL, which you will use when tracking experiments, and the external URL, which directs you to the UI. The UI is also available from the MLflow tab in the left-hand side menu. A quick connectivity check using the internal URL is sketched after these steps.
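As a quick sanity check, you can point an MLflow client at the internal URL and list the experiments the server knows about. This is a minimal sketch: the URL is a placeholder for the internal URL shown in the dashboard, and search_experiments requires MLflow 2.x (older releases use MlflowClient.list_experiments instead).

    import mlflow

    # Placeholder: replace with the internal URL from Applications > MLflow.
    mlflow.set_tracking_uri("http://<internal-mlflow-url>")

    # List the experiments registered on the server (MLflow 2.x fluent API).
    for exp in mlflow.search_experiments():
        print(exp.experiment_id, exp.name)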


View MLflow tracking UI in browser

In your browser, replace the placeholders and go to the following URL:

  • https://{Cluster_Domain}/mlflow-tracking-front-{project_name}/

For example: https://kubeflow16.internal.aurora.equinor.com/mlflow-tracking-front/

You should see the MLflow Tracking UI in your browser:

MLflowTrackingUI MLflow tracking UI


Track ML experiments using MLflow

  1. In your MLflowExperimentDemo.ipynb notebook, set the following (a script sketch of these settings follows the steps below):
    1. Tracking URI: use the IP address and port values displayed in the Applications > MLflow section of the AI Platform dashboard.
    2. Create/Set an Experiment: change the experiment name.

SetMLflowTrackingURI Tracking URI and experiment in notebook

  2. Run your MLflowExperimentDemo.ipynb notebook.

    After running the experiment, you can see its results in the MLflow UI.

    MLflowExperiment Experiments in MLflow

    Best practice

    Running the experiment directly from the notebook does not record the GitHub repository commit version with the experiment, so it is not recommended for proper experiment tracking. Use it only for temporary testing from notebooks, where lineage to a GitHub code version is not required.

    Follow the remaining steps if you want to track experiments properly, with the GitHub branch commit version logged on each run.

  3. Commit all your latest changes to the GitHub repository so that the code version recorded with your experiments and runs matches what you actually ran.

  4. From the terminal, run the following command (it is defined as the entry point in the MLproject file):

    python MLflowExperimentDemo.py
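The exact contents of MLflowExperimentDemo.py are not reproduced here; the following is a minimal sketch of a tracked run under the same setup, with a placeholder tracking URL and made-up experiment, parameter, and metric names. When the script is executed from a Git working copy, MLflow also tags the run with the current commit (mlflow.source.git.commit), which is what makes this approach preferable to running from the notebook.

    import mlflow

    # Placeholder: internal URL from Applications > MLflow in the dashboard.
    mlflow.set_tracking_uri("http://<internal-mlflow-url>")

    # Creates the experiment if it does not exist, then makes it active.
    mlflow.set_experiment("my-experiment")  # assumed name

    with mlflow.start_run():
        mlflow.log_param("n_estimators", 100)   # hypothetical hyperparameter
        mlflow.log_metric("accuracy", 0.92)     # hypothetical result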

Log artifacts/logs (files/directory) using MLflow

  1. In MLflowArtifactTrackingDemo.ipynb and MLflowLogTrackingDemo.ipynb, set the following:

    1. Tracking URI: use the IP address and port values obtained previously.
    2. Create/Set an Experiment: change the experiment name.
  2. Run your MLflowArtifactTrackingDemo.ipynb and MLflowLogTrackingDemo.ipynb notebooks. After running the notebooks, you can view the artifacts in the MLflow UI.

  3. To log the GitHub commit version properly, generate a Python script from the notebook and specify it as an entry point in the MLproject file. Then run the script from the terminal using that command (see the sketch after this list).

    MLflowArtifacts Artifacts in MLflow experiments
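The demo notebooks are not reproduced here; the following is a minimal sketch of artifact logging with the same building blocks, assuming a placeholder tracking URL and made-up experiment and file names.

    import os
    import mlflow

    mlflow.set_tracking_uri("http://<internal-mlflow-url>")  # placeholder
    mlflow.set_experiment("artifact-demo")                   # assumed name

    with mlflow.start_run():
        # Log a single file as an artifact of the run.
        with open("metrics_report.txt", "w") as f:
            f.write("rmse: 0.73\n")
        mlflow.log_artifact("metrics_report.txt")

        # Log an entire local directory under an artifact sub-path.
        os.makedirs("plots", exist_ok=True)
        mlflow.log_artifacts("plots", artifact_path="plots")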


Log and register Scikit-learn ML model using MLflow

  1. In MLflowModelRegistrationDemo.ipynb, set the following:

    1. Tracking URI: use the IP address and port values obtained previously.
    2. Create/Set an Experiment: change the experiment name.
  2. Run the MLflowModelRegistrationDemo.ipynb notebook. After running the notebook, you can view the scikit-learn model saved and registered in the MLflow UI.

  3. To log the GitHub commit version properly, generate a Python script from the notebook and specify it as an entry point in the MLproject file. Then run the script from the terminal using that command (see the sketch after this list).

    MLflowModelRegistration Artifacts in MLflow experiments
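The demo notebook is not reproduced here; the following is a minimal sketch of logging and registering a scikit-learn model, assuming a placeholder tracking URL and made-up experiment and registered-model names (MLflow 2.x signature of mlflow.sklearn.log_model).

    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    mlflow.set_tracking_uri("http://<internal-mlflow-url>")  # placeholder
    mlflow.set_experiment("model-registration-demo")         # assumed name

    X, y = load_iris(return_X_y=True)
    model = LogisticRegression(max_iter=200).fit(X, y)

    with mlflow.start_run():
        mlflow.log_metric("train_accuracy", model.score(X, y))
        # Log the fitted model as an artifact and register it in the Model
        # Registry; each run creates a new version under this name.
        mlflow.sklearn.log_model(
            model,
            artifact_path="model",
            registered_model_name="IrisLogisticRegression",  # assumed name
        )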


Adding labels and tags to your experiments and models

For an example that covers experiment tracking, model registration, and tagging, run the following notebook: