Installing Kubeflow on VMware Tanzu Kubernetes Grid Cluster (TKC)
What is Kubeflow?
The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable.
What is Tanzu Kubernetes Grid Cluster?
A TKG cluster (TKC) is VMware's opinionated, production-ready Kubernetes cluster that can run across hybrid, multicloud environments.
Now, let's discuss how to install Kubeflow on TKC.
In this post, I will use Charmed Operators to install Kubeflow on TKC. To learn more about them, see https://charmed-kubeflow.io/docs
Kubeflow Installation Prerequisites
1. Install the Juju client on a Linux server.
$ snap install juju --classic
juju 2.9.11 from Canonical✓ installed
Quick note: “Juju provides easy, intelligent application orchestration on top of Kubernetes”.
Validate that the Juju client was installed successfully.
$ juju help
2. Connect Juju to Tanzu Kubernetes Grid Cluster (TKC)
$ juju add-k8s mytkgcluster --cluster-name=<name of your cluster> \
    --storage=<storage class name>
This operation can be applied to both a copy on this client and to the one on a controller.
No current controller was detected and there are no registered controllers on this client: either bootstrap one or register one.
k8s substrate "<Cluster name>" added as cloud "mytkgcluster" with storage provisioned
by the existing "tanzu-storage-policy" storage class.
You can now bootstrap to this cloud by running 'juju bootstrap mytkgcluster'.
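The two placeholders in the add-k8s command come from your kubeconfig and from the cluster itself. Here is a minimal sketch for looking them up, assuming kubectl is already pointed at the TKC. The helper name show_add_k8s_inputs is made up for this post.

```shell
# Hypothetical helper: print the values needed for `juju add-k8s`.
# Assumes kubectl is installed and its current context is the TKC.
show_add_k8s_inputs() {
    # Cluster names known to the current kubeconfig (for --cluster-name).
    echo "Clusters in kubeconfig:"
    kubectl config get-clusters

    # Storage classes available in the cluster (for --storage).
    echo "Storage classes:"
    kubectl get storageclass
}
```

Call show_add_k8s_inputs and copy the cluster name and storage class you want into the juju add-k8s command.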
3. Create a controller. To operate workloads on a Kubernetes cluster, Juju uses controllers.
$ juju bootstrap mytkgcluster my-tkg-controller
Creating Juju controller "my-tkg-controller" on mytkgcluster
Bootstrap to generic Kubernetes cluster
Fetching Juju Dashboard 0.8.1
Creating k8s resources for controller "controller-my-tkg-controller"
Starting controller pod
Bootstrap agent now started
Contacting Juju controller at 10.110.11.40 to verify accessibility...
Bootstrap complete, controller "my-tkg-controller" is now available in namespace "controller-my-tkg-controller"
Now you can run
    juju add-model <model-name>
to create a new model to deploy k8s workloads.
4. Validate the resources deployed
$ kubectl get all -n controller-my-tkg-controller
NAME                                 READY   STATUS    RESTARTS   AGE
pod/controller-0                     2/2     Running   2          3m8s
pod/modeloperator-696db856f9-xc2nw   1/1     Running   0          2m14s

NAME                         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)     AGE
service/controller-service   ClusterIP   10.110.11.40   <none>        17070/TCP   3m11s
service/modeloperator        ClusterIP   10.102.11.92   <none>        17071/TCP   95s

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/modeloperator   1/1     1            1           95s

NAME                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/modeloperator-696db856f9   1         1         1       2m15s

NAME                          READY   AGE
statefulset.apps/controller   1/1     3m8s
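If you would rather not eyeball the listing, the same check can be scripted. This is a minimal sketch, assuming the controller pod is named controller-0 as in the output above. The helper name wait_for_controller and the 300s default timeout are my own choices, not Juju defaults.

```shell
# Hypothetical helper: block until the Juju controller pod reports Ready.
# $1 = controller namespace
# $2 = timeout (defaults to 300s, an arbitrary value picked for this sketch)
wait_for_controller() {
    ns="$1"
    limit="${2:-300s}"
    # `kubectl wait` returns non-zero if the condition is not met in time.
    kubectl wait --for=condition=Ready pod/controller-0 \
        --namespace "$ns" --timeout="$limit"
}
```

For the controller created in this post, that would be: wait_for_controller controller-my-tkg-controller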
5. Create a model. A model in Juju is a blank canvas where operators will be deployed, and it maps one-to-one to a Kubernetes namespace.
$ juju add-model kubeflow
Added 'kubeflow' model with credential 'mytkgcluster' for user 'admin'
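Because a model corresponds to a namespace, you can confirm the model landed on the cluster by looking for a namespace with the same name. Here is a minimal sketch, assuming kubectl targets the TKC. The helper name model_namespace_exists is hypothetical.

```shell
# Hypothetical helper: check that a Juju model has a matching namespace.
# $1 = model name (this post uses "kubeflow")
model_namespace_exists() {
    model="$1"
    # `kubectl get namespace` exits non-zero when the namespace is missing.
    if kubectl get namespace "$model" >/dev/null 2>&1; then
        echo "namespace '$model' exists"
    else
        echo "namespace '$model' not found"
        return 1
    fi
}
```

Running model_namespace_exists kubeflow right after juju add-model should report that the namespace exists.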
Kubeflow Installation
To deploy the full Kubeflow bundle, you will need at least 50 GB of free disk space, 14 GB of RAM, and 2 CPUs on your machine or VM. In this post, I will show how to deploy kubeflow-lite.
1. Install kubeflow-lite.
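Before deploying, it can help to compare those numbers with what the cluster actually offers. On a TKC the pods run on the worker nodes, so I am assuming the nodes' allocatable CPU and memory are the figures to look at. Here is a minimal sketch with a made-up helper name, show_node_capacity.

```shell
# Hypothetical helper: list allocatable CPU and memory for every node.
# Assumption: node-level capacity is what matters on a TKC.
show_node_capacity() {
    kubectl get nodes \
        -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'
}
```

Disk for the charms comes from the storage class given to juju add-k8s, so this sketch does not try to measure it.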
$ juju deploy cs:kubeflow-lite
Located bundle "kubeflow-lite" in charm-store, revision 54
Located charm "admission-webhook" in charm-store, revision 10
Located charm "argo-controller" in charm-store, revision 51
Located charm "dex-auth" in charm-store, revision 60
Located charm "istio-ingressgateway" in charm-store, revision 20
Located charm "istio-pilot" in charm-store, revision 20
Located charm "jupyter-controller" in charm-store, revision 56
Located charm "jupyter-ui" in charm-store, revision 10
Located charm "kfp-api" in charm-store, revision 12
Located charm "mariadb-k8s" in charm-store, revision 35
Located charm "kfp-persistence" in charm-store, revision 9
Located charm "kfp-schedwf" in charm-store, revision 9
Located charm "kfp-ui" in charm-store, revision 12
Located charm "kfp-viewer" in charm-store, revision 9
Located charm "kfp-viz" in charm-store, revision 8
Located charm "kubeflow-dashboard" in charm-store, revision 56
Located charm "kubeflow-profiles" in charm-store, revision 52
Located charm "kubeflow-volumes" in charm-store, revision 0
Located charm "minio" in charm-store, revision 55
Located charm "mlmd" in charm-store, revision 5
Located charm "oidc-gatekeeper" in charm-store, revision 54
Located charm "pytorch-operator" in charm-store, revision 53
Located charm "seldon-core" in charm-store, revision 50
Located charm "tfjob-operator" in charm-store, revision 1
Executing changes:
- upload charm admission-webhook from charm-store with architecture=amd64
- deploy application admission-webhook from charm-store with 1 unit
  added resource oci-image
- set annotations for admission-webhook
- upload charm argo-controller from charm-store with architecture=amd64
- deploy application argo-controller from charm-store with 1 unit
  added resource oci-image
- set annotations for argo-controller
- upload charm dex-auth from charm-store with architecture=amd64
- deploy application dex-auth from charm-store with 1 unit
  added resource oci-image
- set annotations for dex-auth
- upload charm istio-ingressgateway from charm-store with architecture=amd64
- deploy application istio-ingressgateway from charm-store with 1 unit
  added resource oci-image
- set annotations for istio-ingressgateway
- upload charm istio-pilot from charm-store with architecture=amd64
- deploy application istio-pilot from charm-store with 1 unit
  added resource oci-image
- set annotations for istio-pilot
- upload charm jupyter-controller from charm-store with architecture=amd64
- deploy application jupyter-controller from charm-store with 1 unit
  added resource oci-image
- set annotations for jupyter-controller
- upload charm jupyter-ui from charm-store with architecture=amd64
- deploy application jupyter-ui from charm-store with 1 unit
  added resource oci-image
- set annotations for jupyter-ui
- upload charm kfp-api from charm-store with architecture=amd64
- deploy application kfp-api from charm-store with 1 unit
  added resource oci-image
- set annotations for kfp-api
- upload charm mariadb-k8s from charm-store with architecture=amd64
- deploy application kfp-db from charm-store with 1 unit using mariadb-k8s
- set annotations for kfp-db
- upload charm kfp-persistence from charm-store with architecture=amd64
- deploy application kfp-persistence from charm-store with 1 unit
  added resource oci-image
- set annotations for kfp-persistence
- upload charm kfp-schedwf from charm-store with architecture=amd64
- deploy application kfp-schedwf from charm-store with 1 unit
  added resource oci-image
- set annotations for kfp-schedwf
- upload charm kfp-ui from charm-store with architecture=amd64
- deploy application kfp-ui from charm-store with 1 unit
  added resource oci-image
- set annotations for kfp-ui
- upload charm kfp-viewer from charm-store with architecture=amd64
- deploy application kfp-viewer from charm-store with 1 unit
  added resource oci-image
- set annotations for kfp-viewer
- upload charm kfp-viz from charm-store with architecture=amd64
- deploy application kfp-viz from charm-store with 1 unit
  added resource oci-image
- set annotations for kfp-viz
- upload charm kubeflow-dashboard from charm-store with architecture=amd64
- deploy application kubeflow-dashboard from charm-store with 1 unit
  added resource oci-image
- set annotations for kubeflow-dashboard
- upload charm kubeflow-profiles from charm-store with architecture=amd64
- deploy application kubeflow-profiles from charm-store with 1 unit
  added resource kfam-image
  added resource profile-image
- set annotations for kubeflow-profiles
- upload charm kubeflow-volumes from charm-store with architecture=amd64
- deploy application kubeflow-volumes from charm-store with 1 unit
  added resource oci-image
- set annotations for kubeflow-volumes
- upload charm minio from charm-store with architecture=amd64
- deploy application minio from charm-store with 1 unit
  added resource oci-image
- set annotations for minio
- upload charm mlmd from charm-store with architecture=amd64
- deploy application mlmd from charm-store with 1 unit
  added resource oci-image
- set annotations for mlmd
- upload charm oidc-gatekeeper from charm-store with architecture=amd64
- deploy application oidc-gatekeeper from charm-store with 1 unit
  added resource oci-image
- set annotations for oidc-gatekeeper
- upload charm pytorch-operator from charm-store with architecture=amd64
- deploy application pytorch-operator from charm-store with 1 unit
  added resource oci-image
- set annotations for pytorch-operator
- upload charm seldon-core from charm-store with architecture=amd64
- deploy application seldon-controller-manager from charm-store with 1 unit using seldon-core
  added resource oci-image
- set annotations for seldon-controller-manager
- upload charm tfjob-operator from charm-store with architecture=amd64
- deploy application tfjob-operator from charm-store with 1 unit
  added resource oci-image
- set annotations for tfjob-operator
- add relation argo-controller - minio
- add relation dex-auth:oidc-client - oidc-gatekeeper:oidc-client
- add relation istio-pilot:ingress - dex-auth:ingress
- add relation istio-pilot:ingress - jupyter-ui:ingress
- add relation istio-pilot:ingress - kfp-ui:ingress
- add relation istio-pilot:ingress - kubeflow-dashboard:ingress
- add relation istio-pilot:ingress - kubeflow-volumes:ingress
- add relation istio-pilot:istio-pilot - istio-ingressgateway:istio-pilot
- add relation istio-pilot:ingress - oidc-gatekeeper:ingress
- add relation istio-pilot:ingress-auth - oidc-gatekeeper:ingress-auth
- add relation kfp-api - kfp-db
- add relation kfp-api:kfp-api - kfp-persistence:kfp-api
- add relation kfp-api:kfp-api - kfp-ui:kfp-api
- add relation kfp-api:kfp-viz - kfp-viz:kfp-viz
- add relation kfp-api:object-storage - minio:object-storage
- add relation kfp-ui:object-storage - minio:object-storage
- add relation kubeflow-profiles - kubeflow-dashboard
Deploy of bundle completed.
2. Validate the installation by checking the status of the pods in the kubeflow namespace on the TKC.
3. The deployment takes around 20 minutes. Once it has finished successfully, move on to the next step and access the Kubeflow UI.
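The pod check in step 2 can be turned into a small polling loop so you do not have to re-run the command by hand. This is a minimal sketch, assuming kubectl targets the TKC. The helper name wait_for_pods is hypothetical, and the defaults (80 attempts, 15 seconds apart) are simply my way of covering roughly 20 minutes.

```shell
# Hypothetical helper: succeed once every pod in a namespace is Running
# or Completed.
# $1 = namespace, $2 = max attempts (default 80), $3 = seconds between tries (default 15)
# Note: STATUS "Running" does not guarantee that all containers are Ready.
wait_for_pods() {
    ns="$1"
    attempts="${2:-80}"
    interval="${3:-15}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        pods=$(kubectl get pods -n "$ns" --no-headers 2>/dev/null)
        # Count non-empty lines; column 3 of the listing is STATUS.
        total=$(printf '%s\n' "$pods" | awk 'NF { n++ } END { print n + 0 }')
        pending=$(printf '%s\n' "$pods" \
            | awk 'NF && $3 != "Running" && $3 != "Completed" { n++ } END { print n + 0 }')
        if [ "$total" -gt 0 ] && [ "$pending" -eq 0 ]; then
            echo "all $total pods in '$ns' are up"
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "timed out waiting for pods in '$ns'"
    return 1
}
```

For this post, wait_for_pods kubeflow is enough. Running juju status alongside it shows the same progress from Juju's point of view.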
Accessing Kubeflow UI
I will demonstrate a very simple way to access Kubeflow: port forwarding. Other methods work too.
1. Run port forwarding to access the Kubeflow UI.
2. Open the Kubeflow GUI in a browser using localhost and the forwarded port.
3. Provide the namespace name. This namespace is in Kubeflow, not TKC.
4. Here is your Kubeflow landing page.
5. Kubeflow is now installed successfully, and I am able to access the GUI.
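For the port-forwarding step, here is a minimal sketch of what the command could look like. The helper name forward_kubeflow_ui is made up. The service name and both port numbers are assumptions you must supply yourself, so check them with kubectl get svc -n kubeflow first.

```shell
# Hypothetical helper: forward a local port to a service in the kubeflow
# namespace. Nothing here is taken from the Kubeflow docs; all three
# arguments are values you look up on your own cluster.
# $1 = service name, $2 = local port, $3 = service port
forward_kubeflow_ui() {
    svc="${1:?service name required}"
    local_port="${2:?local port required}"
    remote_port="${3:?service port required}"
    # Runs in the foreground until interrupted with Ctrl+C.
    kubectl port-forward -n kubeflow "svc/$svc" "$local_port:$remote_port"
}
```

The bundle above deploys an istio-ingressgateway application, so a call might look like forward_kubeflow_ui istio-ingressgateway 8080 8080, where both port numbers are placeholders for whatever your service actually exposes. The UI would then be at localhost on the local port you chose.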
If you are a data scientist, this dashboard is for you :)
Uninstall Kubeflow
Kubeflow takes a lot of resources to run, so I uninstalled it. Here is one simple command that you can run too.
$ juju destroy-model kubeflow --destroy-storage
Wait approximately 10 minutes, and Juju will take care of the complete cleanup.
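To confirm the cleanup has finished, you can poll until the model's namespace disappears from the cluster. This is a minimal sketch, assuming the namespace has the same name as the model. The helper name wait_for_namespace_gone is hypothetical, and the defaults (40 attempts, 15 seconds apart) are my way of covering roughly 10 minutes.

```shell
# Hypothetical helper: poll until a namespace no longer exists.
# $1 = namespace, $2 = max attempts (default 40), $3 = seconds between tries (default 15)
# Caveat: a failing `kubectl get` is treated as "gone", so a broken
# connection to the cluster would also look like success.
wait_for_namespace_gone() {
    ns="$1"
    attempts="${2:-40}"
    interval="${3:-15}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if ! kubectl get namespace "$ns" >/dev/null 2>&1; then
            echo "namespace '$ns' is gone"
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "namespace '$ns' still present"
    return 1
}
```

After juju destroy-model, run wait_for_namespace_gone kubeflow. The juju models command should no longer list kubeflow once it returns.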