Enable gVisor based container sandboxing in a Tanzu Kubernetes Grid Cluster


Before I describe the procedure for enabling gVisor on the worker nodes of a TKG cluster, let us understand what container sandboxing is.

Traditional Linux containers are not sandboxes. What does that mean?

Source: https://cloud.google.com/blog/products/identity-security/open-sourcing-gvisor-a-sandboxed-container-runtime

This means that containers running on a worker node make their system calls directly to the host's Linux kernel.

Sandboxed containers with gVisor

Source: https://cloud.google.com/blog/products/identity-security/open-sourcing-gvisor-a-sandboxed-container-runtime

Here, the application running in the container makes its system calls to gVisor, a kernel that runs in user space and implements the Linux system call interface, instead of calling the host kernel directly.

There is a very good article that I found; go through it once to understand more.

Steps to enable gVisor sandboxing in a TKG cluster worker node

  1. By default, a TKG cluster uses containerd as its container runtime, and containerd in turn uses runc as its default low-level runtime. You can validate this by looking at the containerd configuration file on a node.
$ cat /etc/containerd/config.toml
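
If the file is long, it can help to pull out just the runtime handlers it declares. The following is a minimal sketch, assuming the config uses the version 2 CRI plugin layout, with section headers of the form `[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.<name>]`. `list_runtimes` is a hypothetical helper name, not part of containerd.

```shell
# list_runtimes: print the runtime handler names declared in a containerd config.
# Assumption: version 2 CRI plugin section headers. The default path is the one used in this post.
list_runtimes() {
  config="${1:-/etc/containerd/config.toml}"
  # Match headers such as [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  # and skip nested tables such as [...containerd.runtimes.runc.options].
  sed -n 's/^[[:space:]]*\[.*\.containerd\.runtimes\.\([A-Za-z0-9_-]*\)\][[:space:]]*$/\1/p' "$config" | sort -u
}
```

Calling `list_runtimes` with no argument reads /etc/containerd/config.toml and prints one handler name per line. A config that does not spell out its runtimes explicitly prints nothing.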

Now we have to configure containerd to use “containerd-shim-runsc-v1”, but before we do that, we need to install “runsc” and “containerd-shim-runsc-v1”.

Run the following commands on one of the worker nodes

$ set -e
$ ARCH=$(uname -m)
$ URL=https://storage.googleapis.com/gvisor/releases/release/latest/${ARCH}
$ wget ${URL}/runsc ${URL}/runsc.sha512 \
    ${URL}/containerd-shim-runsc-v1 ${URL}/containerd-shim-runsc-v1.sha512
$ sha512sum -c runsc.sha512 \
    -c containerd-shim-runsc-v1.sha512
$ rm -f *.sha512
$ chmod a+rx runsc containerd-shim-runsc-v1
$ sudo mv runsc containerd-shim-runsc-v1 /usr/local/bin

Now runsc is installed on the worker node. Verify this by running the command below, which prints the help options

$ runsc --help
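
containerd needs the shim binary as well as runsc itself, so it may be worth checking both in one go. This is a small sketch; `check_gvisor_binaries` is a hypothetical helper name, and it only checks the PATH of your current shell, which is assumed (not guaranteed) to match the PATH the containerd service runs with.

```shell
# check_gvisor_binaries: confirm both gVisor binaries can be found on PATH.
# Returns non-zero and names each missing binary on stderr otherwise.
check_gvisor_binaries() {
  status=0
  for bin in runsc containerd-shim-runsc-v1; do
    if path=$(command -v "$bin"); then
      echo "found $bin at $path"
    else
      echo "missing $bin" >&2
      status=1
    fi
  done
  return "$status"
}
```

Run `check_gvisor_binaries` on the worker node after the install commands. A non-zero exit status means at least one of the two files did not end up on the PATH.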

2. Now, let's update the containerd configuration file to register the “runsc” runtime.

Update /etc/containerd/config.toml
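
The `tee` command in this step replaces the whole file rather than appending to it, so anything TKG had configured there is lost. A cautious sketch is to keep a timestamped copy first. `backup_file` is a hypothetical helper name, and the `.bak.<timestamp>` suffix is just one possible convention.

```shell
# backup_file: copy a file to <file>.bak.<UTC timestamp>, preserving mode and timestamps.
# Prints the backup path so it can be used for a later restore.
backup_file() {
  src="$1"
  dest="${src}.bak.$(date -u +%Y%m%dT%H%M%SZ)"
  cp -p "$src" "$dest" && echo "$dest"
}
```

Because /etc/containerd/config.toml is owned by root, run `backup_file /etc/containerd/config.toml` from a root shell (for example after `sudo -i`). To roll back, copy the printed path over the config file and restart containerd.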

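$ cat <<EOF | sudo tee /etc/containerd/config.toml
version = 2
[plugins."io.containerd.runtime.v1.linux"]
  shim_debug = true
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runsc]
  runtime_type = "io.containerd.runsc.v1"
EOF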

3. Restart containerd

$ sudo systemctl restart containerd

4. Let's come back to the Kubernetes side now. Log out of the worker node; next we will create some Kubernetes objects.

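5. Create a RuntimeClass. Save the YAML below to a file, e.g. gvisor.yml (on Kubernetes 1.20 and later you can use node.k8s.io/v1 instead; v1beta1 was removed in 1.25).

apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc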

6. Let's apply this using kubectl. RuntimeClass is not a namespace-scoped resource, so there is no need to specify a namespace.

$ kubectl create -f gvisor.yml
runtimeclass.node.k8s.io/gvisor created
$ kubectl get runtimeclass
gvisor runsc 5s
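
The handler of the RuntimeClass has to match the runtime name registered in the containerd config (`runsc` here), otherwise pods using it will fail to start. A small sketch for checking that from a script follows; `runtimeclass_handler` and `expect_handler` are hypothetical helper names, and `kubectl` is assumed to be pointed at the TKG cluster.

```shell
# runtimeclass_handler: print the handler that RuntimeClass $1 maps to.
runtimeclass_handler() {
  kubectl get runtimeclass "$1" -o jsonpath='{.handler}'
}

# expect_handler: succeed only if RuntimeClass $1 maps to handler $2.
expect_handler() {
  actual=$(runtimeclass_handler "$1") || return 1
  [ "$actual" = "$2" ]
}
```

For the object created above, `expect_handler gvisor runsc && echo ok` is one way to use it.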

7. Now it's time to deploy a pod with the newly created runtime class. Since we installed runsc on only one worker node, we will specify the node name too. Save the file as, e.g., pod.yml

apiVersion: v1
kind: Pod
metadata:
  name: nginx-gvisor
spec:
  nodeName: <nodename here>
  runtimeClassName: gvisor
  containers:
  - name: nginx
    image: nginx

8. Let's create the pod

$ kubectl create -f pod.yml

9. In the default namespace, you will see the pod running now.

$ kubectl get pods
nginx-gvisor 1/1 Running 0 47m

10. You can validate on the worker node too: you will find containers there running under runsc
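
A few checks along these lines can confirm the sandbox is really in use. This is a sketch under stated assumptions: the helper names are hypothetical, the `dmesg` check assumes the container image ships `dmesg`, and it relies on gVisor printing its own boot messages (such as "Starting gVisor...") in the sandboxed kernel log.

```shell
# pod_runtime_class: print the runtimeClassName recorded in a pod's spec.
pod_runtime_class() {
  kubectl get pod "$1" -o jsonpath='{.spec.runtimeClassName}'
}

# pod_sees_gvisor: succeed if the kernel log seen inside the pod mentions gVisor.
pod_sees_gvisor() {
  kubectl exec "$1" -- dmesg | grep -qi 'gvisor'
}

# runsc_processes: run on the worker node; list processes whose command line mentions runsc.
# The [r] in the pattern keeps grep from matching its own process.
runsc_processes() {
  ps -eo pid,args | grep '[r]unsc'
}
```

From your workstation, `pod_runtime_class nginx-gvisor` and `pod_sees_gvisor nginx-gvisor && echo sandboxed` cover the Kubernetes side. `runsc_processes` is meant to be run on the worker node itself.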

And we are done here :)

I would also like to mention that this is just my test; I was curious to see whether this works with TKG. You should check with VMware before doing any of this in a production environment.

