MLflow

warning

WIP / PLACEHOLDER FILE

Why use MLflow

AI Platform allows teams to run their notebooks and pipelines using an MLflow plug-in.

MLflow is an open source platform for managing the end-to-end machine learning lifecycle. It tackles four primary functions:

  • Tracking experiments to record and compare parameters and results (MLflow Tracking).
  • Packaging ML code in a reusable, reproducible form in order to share with other data scientists or transfer to production (MLflow Projects).
  • Managing and deploying models from a variety of ML libraries to a variety of model serving and inference platforms (MLflow Models).
  • Providing a central model store to collaboratively manage the full lifecycle of an MLflow Model, including model versioning, stage transitions, and annotations (MLflow Model Registry).

MLflow is library-agnostic. You can use it with any machine learning library and in any programming language, since all functions are accessible through a REST API and CLI. For convenience, the project also includes a Python API, R API, and Java API.
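For orientation, here is a minimal sketch of the tracking API. It is not specific to AI Platform: the parameter and metric names are made up, and without a configured tracking URI the run goes to MLflow's default local store.

    import mlflow

    # Start a run and record a parameter, a metric, and a small text artifact.
    with mlflow.start_run():
        mlflow.log_param("learning_rate", 0.01)   # hypothetical hyperparameter
        mlflow.log_metric("rmse", 0.73)           # hypothetical result
        mlflow.log_text("hello from MLflow", "notes.txt")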

In the following sections, you can find instructions for using MLflow in AI Platform.


MLflow server setup

To automatically set up your own MLflow server, follow these steps:

  1. Create a storage account in the Azure Portal. After creating the account, collect the following values:

    • Azure Storage Account
    • Container Name
    • Azure Storage key
      If the storage account is new, generate the key in the Azure Portal by going to Storage account > Security + networking > Access keys. Copy the key and the storage account name for use in the following steps.
  2. Go to the AI Platform dashboard.

  3. In the side menu, go to Applications and then do the following:

    1. Select Create MLflow to open the settings dialog box.

      MLflowMenu Create MLflow server settings dialog box.

    2. Enter the corresponding values (the storage account name, container name, and key collected in step 1).

    3. Click Create. This will spin up the relevant resources in the background for you.

    4. Click Refresh.

      info

      After refreshing, you will see information about your MLflow server. This includes the internal URL, which you will use when tracking experiments, and the external URL, which directs you to the UI. The UI is also available from the MLflow tab in the left-hand side menu. A quick connectivity check using the internal URL is sketched after these steps.
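As a quick sanity check, you can point an MLflow client at the internal URL and list the experiments the server knows about. This is a minimal sketch: the URL is a placeholder for the internal URL shown in the dashboard, and search_experiments requires MLflow 2.x (older releases use MlflowClient.list_experiments instead).

    import mlflow

    # Placeholder: replace with the internal URL from Applications > MLflow.
    mlflow.set_tracking_uri("http://<internal-mlflow-url>")

    # List the experiments registered on the server (MLflow 2.x fluent API).
    for exp in mlflow.search_experiments():
        print(exp.experiment_id, exp.name)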


View MLflow tracking UI in browser

In your browser, replace the placeholders and go to the following URL:

  • https://{Cluster_Domain}/mlflow-tracking-front-{project_name}/

For example: https://kubeflow16.internal.aurora.equinor.com/mlflow-tracking-front/

You should see the MLflow Tracking UI in your browser:

MLflowTrackingUI MLflow tracking UI


Track ML experiments using MLflow

  1. In your MLflowExperimentDemo.ipynb notebook, set the following (a script sketch of these settings follows the steps below):
    1. Tracking URI: use the IP address and port values displayed in the Applications > MLflow section of the AI Platform dashboard.
    2. Create/Set an Experiment: change the experiment name.

SetMLflowTrackingURI Tracking URI and experiment in notebook

  2. Run your MLflowExperimentDemo.ipynb notebook.

    After running the experiment, you can see its results in the MLflow UI.

    MLflowExperiment Experiments in MLflow

    Best practice

    Running the experiment directly from the notebook does not record the GitHub repository commit version with the experiment, so it is not recommended for proper experiment tracking. Use it only for temporary testing from notebooks, where lineage to a GitHub code version is not required.

    Follow the remaining steps if you want to track experiments properly, with the GitHub branch commit version logged on each run.

  3. Commit all your latest changes to the GitHub repository so that the code version recorded with your experiments and runs matches what you actually ran.

  4. From the terminal, run the following command (it is defined as the entry point in the MLproject file):

    python MLflowExperimentDemo.py
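The exact contents of MLflowExperimentDemo.py are not reproduced here; the following is a minimal sketch of a tracked run under the same setup, with a placeholder tracking URL and made-up experiment, parameter, and metric names. When the script is executed from a Git working copy, MLflow also tags the run with the current commit (mlflow.source.git.commit), which is what makes this approach preferable to running from the notebook.

    import mlflow

    # Placeholder: internal URL from Applications > MLflow in the dashboard.
    mlflow.set_tracking_uri("http://<internal-mlflow-url>")

    # Creates the experiment if it does not exist, then makes it active.
    mlflow.set_experiment("my-experiment")  # assumed name

    with mlflow.start_run():
        mlflow.log_param("n_estimators", 100)   # hypothetical hyperparameter
        mlflow.log_metric("accuracy", 0.92)     # hypothetical result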

Log artifacts/logs (files/directory) using MLflow

  1. In MLflowArtifactTrackingDemo.ipynb and MLflowLogTrackingDemo.ipynb, set the following:

    1. Tracking URI: use the IP address and port values obtained previously.
    2. Create/Set an Experiment: change the experiment name.
  2. Run your MLflowArtifactTrackingDemo.ipynb and MLflowLogTrackingDemo.ipynb notebooks. After running the notebooks, you can view the artifacts in the MLflow UI.

  3. To log the GitHub commit version properly, generate a Python script from the notebook and specify it as an entry point in the MLproject file. Then run the script from the terminal using that command (see the sketch after this list).

    MLflowArtifacts Artifacts in MLflow experiments
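The demo notebooks are not reproduced here; the following is a minimal sketch of artifact logging with the same building blocks, assuming a placeholder tracking URL and made-up experiment and file names.

    import os
    import mlflow

    mlflow.set_tracking_uri("http://<internal-mlflow-url>")  # placeholder
    mlflow.set_experiment("artifact-demo")                   # assumed name

    with mlflow.start_run():
        # Log a single file as an artifact of the run.
        with open("metrics_report.txt", "w") as f:
            f.write("rmse: 0.73\n")
        mlflow.log_artifact("metrics_report.txt")

        # Log an entire local directory under an artifact sub-path.
        os.makedirs("plots", exist_ok=True)
        mlflow.log_artifacts("plots", artifact_path="plots")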


Log and register Scikit-learn ML model using MLflow

  1. In MLflowModelRegistrationDemo.ipynb, set the following:

    1. Tracking URI: use the IP address and port values obtained previously.
    2. Create/Set an Experiment: change the experiment name.
  2. Run the MLflowModelRegistrationDemo.ipynb notebook. After running the notebook, you can view the scikit-learn model saved and registered in the MLflow UI.

  3. To log the GitHub commit version properly, generate a Python script from the notebook and specify it as an entry point in the MLproject file. Then run the script from the terminal using that command (see the sketch after this list).

    MLflowModelRegistration Artifacts in MLflow experiments
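The demo notebook is not reproduced here; the following is a minimal sketch of logging and registering a scikit-learn model, assuming a placeholder tracking URL and made-up experiment and registered-model names (MLflow 2.x signature of mlflow.sklearn.log_model).

    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    mlflow.set_tracking_uri("http://<internal-mlflow-url>")  # placeholder
    mlflow.set_experiment("model-registration-demo")         # assumed name

    X, y = load_iris(return_X_y=True)
    model = LogisticRegression(max_iter=200).fit(X, y)

    with mlflow.start_run():
        mlflow.log_metric("train_accuracy", model.score(X, y))
        # Log the fitted model as an artifact and register it in the Model
        # Registry; each run creates a new version under this name.
        mlflow.sklearn.log_model(
            model,
            artifact_path="model",
            registered_model_name="IrisLogisticRegression",  # assumed name
        )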


Adding labels and tags to your experiments and models

For an example that covers experiment tracking, model registration, and tagging, run the following notebook: