Bitnami package for MLflow

MLflow is an open-source platform designed to manage the end-to-end machine learning lifecycle. It allows you to track experiments, package code into reproducible runs, and share and deploy models.

Overview of MLflow

Trademarks: This software listing is packaged by Bitnami. The respective trademarks mentioned in the offering are owned by the respective companies, and use of them does not imply any affiliation or endorsement.

TL;DR

helm install my-release oci://registry-1.docker.io/bitnamicharts/mlflow

Looking to use MLflow in production? Try VMware Tanzu Application Catalog, the commercial edition of the Bitnami catalog.

Introduction

This chart bootstraps an MLflow deployment on a Kubernetes cluster using the Helm package manager.

MLflow is built in Python and integrates fully with the Python ecosystem, so you can use it together with its libraries and main packages.

Bitnami charts can be used with Kubeapps for deployment and management of Helm Charts in clusters.

Prerequisites

  • Kubernetes 1.19+
  • Helm 3.2.0+
  • PV provisioner support in the underlying infrastructure
  • ReadWriteMany volumes for deployment scaling

Installing the Chart

To install the chart with the release name my-release:

helm install my-release oci://REGISTRY_NAME/REPOSITORY_NAME/mlflow

Note: You need to substitute the placeholders REGISTRY_NAME and REPOSITORY_NAME with a reference to your Helm chart registry and repository. For example, in the case of Bitnami, you need to use REGISTRY_NAME=registry-1.docker.io and REPOSITORY_NAME=bitnamicharts.

The command deploys mlflow on the Kubernetes cluster in the default configuration. The Parameters section lists the parameters that can be configured during installation.

Tip: List all releases using helm list

Configuration and installation details

Resource requests and limits

Bitnami charts allow setting resource requests and limits for all containers inside the chart deployment. These are configured via the resources value of each component (check the parameters table). Setting requests is essential for production workloads, and they should be adapted to your specific use case.

To make this process easier, the chart provides the resourcesPreset value, which automatically sets the resources section according to different presets. Check these presets in the bitnami/common chart. However, using resourcesPreset is discouraged in production workloads as it may not fully adapt to your specific needs. Find more information on container resource management in the official Kubernetes documentation.
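
For example, you could set explicit requests and limits for the tracking server at install time. This is a minimal sketch: the tracking.resources parameter is documented in the parameters table below, but the sizes used here are assumptions that you should adapt to your workload.

# Example sizes only; adjust requests/limits to your workload
helm install my-release oci://REGISTRY_NAME/REPOSITORY_NAME/mlflow \
  --set tracking.resources.requests.cpu=250m \
  --set tracking.resources.requests.memory=512Mi \
  --set tracking.resources.limits.cpu=1000m \
  --set tracking.resources.limits.memory=1Gi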

Prometheus metrics

This chart can be integrated with Prometheus by setting tracking.metrics.enabled to true. This exposes MLflow's native Prometheus metrics endpoint in the service, with the necessary annotations for Prometheus to scrape it automatically.
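
For example, to enable the metrics endpoint at install time (a minimal sketch using the tracking.metrics.enabled value described above):

helm install my-release oci://REGISTRY_NAME/REPOSITORY_NAME/mlflow \
  --set tracking.metrics.enabled=true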

Prometheus requirements

A working installation of Prometheus or Prometheus Operator is necessary for the integration to work. Install the Bitnami Prometheus helm chart or the Bitnami Kube Prometheus helm chart to quickly set up a working Prometheus in your cluster.

Integration with Prometheus Operator

The chart can deploy ServiceMonitor objects for integration with Prometheus Operator installations. To do so, set the value tracking.metrics.serviceMonitor.enabled=true. Ensure that the Prometheus Operator CustomResourceDefinitions are installed in the cluster or it will fail with the following error:

no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"

Install the Bitnami Kube Prometheus helm chart to get the necessary CRDs together with the Prometheus Operator.
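
For example, once the CRDs are available, both the metrics endpoint and the ServiceMonitor can be enabled in a single install (a sketch based on the values named above):

helm install my-release oci://REGISTRY_NAME/REPOSITORY_NAME/mlflow \
  --set tracking.metrics.enabled=true \
  --set tracking.metrics.serviceMonitor.enabled=true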

Securing traffic using TLS

MLflow can encrypt communications by setting tracking.tls.enabled=true. The chart allows two configuration options:

  • Provide your own secret using the tracking.tls.certificatesSecret value. Also set the correct name of the certificate files using the tracking.tls.certFilename, tracking.tls.certKeyFilename and tracking.tls.certCAFilename values.
  • Have the chart auto-generate the certificates using tracking.tls.autoGenerated=true.
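
For example, the simplest setup is to let the chart auto-generate the certificates (a minimal sketch using the values listed above):

helm install my-release oci://REGISTRY_NAME/REPOSITORY_NAME/mlflow \
  --set tracking.tls.enabled=true \
  --set tracking.tls.autoGenerated=true
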
Backup and restore

To back up and restore Helm chart deployments on Kubernetes, you need to back up the persistent volumes from the source deployment and attach them to a new deployment using Velero, a Kubernetes backup/restore tool. Find the instructions for using Velero in this guide.
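
As a rough sketch, assuming Velero is already installed in the cluster and the release runs in a hypothetical mlflow namespace, a backup and restore could look like this:

# Back up all resources and persistent volumes in the release namespace
velero backup create mlflow-backup --include-namespaces mlflow

# Restore the backup into the target cluster
velero restore create --from-backup mlflow-backup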

Parameters

Global parameters
| Name | Description | Value |
|------|-------------|-------|
| `global.imageRegistry` | Global Docker image registry | `""` |
| `global.imagePullSecrets` | Global Docker registry secret names as an array | `[]` |
| `global.defaultStorageClass` | Global default StorageClass for Persistent Volume(s) | `""` |
| `global.storageClass` | DEPRECATED: use `global.defaultStorageClass` instead | `""` |
| `global.security.allowInsecureImages` | Allows skipping image verification | `false` |
| `global.compatibility.openshift.adaptSecurityContext` | Adapt the securityContext sections of the deployment to make them compatible with Openshift restricted-v2 SCC: remove runAsUser, runAsGroup and fsGroup and let the platform use their allowed default IDs. Possible values: `auto` (apply if the detected running cluster is Openshift), `force` (perform the adaptation always), `disabled` (do not perform adaptation) | `auto` |

Common parameters
| Name | Description | Value |
|------|-------------|-------|
| `kubeVersion` | Override Kubernetes version | `""` |
| `nameOverride` | String to partially override `common.names.name` | `""` |
| `fullnameOverride` | String to fully override `common.names.fullname` | `""` |
| `namespaceOverride` | String to fully override `common.names.namespace` | `""` |
| `commonLabels` | Labels to add to all deployed objects | `{}` |
| `commonAnnotations` | Annotations to add to all deployed objects | `{}` |
| `clusterDomain` | Kubernetes cluster domain name | `cluster.local` |
| `extraDeploy` | Array of extra objects to deploy with the release | `[]` |
| `diagnosticMode.enabled` | Enable diagnostic mode (all probes will be disabled and the command will be overridden) | `false` |
| `diagnosticMode.command` | Command to override all containers in the deployment | `["sleep"]` |
| `diagnosticMode.args` | Args to override all containers in the deployment | `["infinity"]` |

MLflow common Parameters
| Name | Description | Value |
|------|-------------|-------|
| `image.registry` | mlflow image registry | `REGISTRY_NAME` |
| `image.repository` | mlflow image repository | `REPOSITORY_NAME/mlflow` |
| `image.digest` | mlflow image digest in the way sha256:aa.... Please note this parameter, if set, will override the tag (immutable tags are recommended) | `""` |
| `image.pullPolicy` | mlflow image pull policy | `IfNotPresent` |
| `image.pullSecrets` | mlflow image pull secrets | `[]` |
| `image.debug` | Enable mlflow image debug mode | `false` |
| `gitImage.registry` | Git image registry | `REGISTRY_NAME` |
| `gitImage.repository` | Git image repository | `REPOSITORY_NAME/git` |
| `gitImage.digest` | Git image digest in the way sha256:aa.... Please note this parameter, if set, will override the tag | `""` |
| `gitImage.pullPolicy` | Git image pull policy | `IfNotPresent` |
| `gitImage.pullSecrets` | Specify docker-registry secret names as an array | `[]` |

MLflow Tracking parameters
| Name | Description | Value |
|------|-------------|-------|
| `tracking.enabled` | Enable Tracking server | `true` |
| `tracking.replicaCount` | Number of mlflow replicas to deploy | `1` |
| `tracking.host` | mlflow tracking listening host. Set to `"[::]"` to use IPv6. | `0.0.0.0` |
| `tracking.containerPorts.http` | mlflow HTTP container port | `5000` |
| `tracking.livenessProbe.enabled` | Enable livenessProbe on mlflow containers | `true` |
| `tracking.livenessProbe.initialDelaySeconds` | Initial delay seconds for livenessProbe | `5` |
| `tracking.livenessProbe.periodSeconds` | Period seconds for livenessProbe | `10` |
| `tracking.livenessProbe.timeoutSeconds` | Timeout seconds for livenessProbe | `5` |
| `tracking.livenessProbe.failureThreshold` | Failure threshold for livenessProbe | `5` |
| `tracking.livenessProbe.successThreshold` | Success threshold for livenessProbe | `1` |
| `tracking.readinessProbe.enabled` | Enable readinessProbe on mlflow containers | `true` |
| `tracking.readinessProbe.initialDelaySeconds` | Initial delay seconds for readinessProbe | `5` |
| `tracking.readinessProbe.periodSeconds` | Period seconds for readinessProbe | `10` |
| `tracking.readinessProbe.timeoutSeconds` | Timeout seconds for readinessProbe | `5` |
| `tracking.readinessProbe.failureThreshold` | Failure threshold for readinessProbe | `5` |
| `tracking.readinessProbe.successThreshold` | Success threshold for readinessProbe | `1` |
| `tracking.startupProbe.enabled` | Enable startupProbe on mlflow containers | `false` |
| `tracking.startupProbe.initialDelaySeconds` | Initial delay seconds for startupProbe | `5` |
| `tracking.startupProbe.periodSeconds` | Period seconds for startupProbe | `10` |
| `tracking.startupProbe.timeoutSeconds` | Timeout seconds for startupProbe | `5` |
| `tracking.startupProbe.failureThreshold` | Failure threshold for startupProbe | `5` |
| `tracking.startupProbe.successThreshold` | Success threshold for startupProbe | `1` |
| `tracking.customLivenessProbe` | Custom livenessProbe that overrides the default one | `{}` |
| `tracking.customReadinessProbe` | Custom readinessProbe that overrides the default one | `{}` |
| `tracking.customStartupProbe` | Custom startupProbe that overrides the default one | `{}` |
| `tracking.resourcesPreset` | Set container resources according to one common preset (allowed values: none, nano, micro, small, medium, large, xlarge, 2xlarge). This is ignored if tracking.resources is set (tracking.resources is recommended for production). | `medium` |
| `tracking.resources` | Set container requests and limits for different resources like CPU or memory (essential for production workloads) | `{}` |
| `tracking.podSecurityContext.enabled` | Enabled mlflow pods' Security Context | `true` |
| `tracking.podSecurityContext.fsGroupChangePolicy` | Set filesystem group change policy | `Always` |
| `tracking.podSecurityContext.sysctls` | Set kernel settings using the sysctl interface | `[]` |
| `tracking.podSecurityContext.supplementalGroups` | Set filesystem extra groups | `[]` |
| `tracking.podSecurityContext.fsGroup` | Set mlflow pod's Security Context fsGroup | `1001` |
| `tracking.containerSecurityContext.enabled` | Enabled containers' Security Context | `true` |
| `tracking.containerSecurityContext.seLinuxOptions` | Set SELinux options in container | `{}` |
| `tracking.containerSecurityContext.runAsUser` | Set containers' Security Context runAsUser | |

Note: the README for this chart is longer than DockerHub's 25000-character limit, so it has been trimmed here. The full README can be found at https://github.com/bitnami/charts/blob/main/bitnami/mlflow/README.md

Docker Pull Command

docker pull bitnamicharts/mlflow