Kubernetes Up & Running by Brendan Burns, Joe Beda and Kelsey Hightower

Published 3 April 2019

Summary

Kubernetes has become the default choice for container orchestration and management. It has proven its capacity to solve a wide range of problems around autoscaling, automated deployment without downtime, high availability and speed of iteration. Moreover, the backing it receives from most cloud providers and other big tech actors, as well as the wide community of developers involved, gives it every chance to stick around and get even better in the coming years.

This book is a great way to discover, understand and use the main concepts of Kubernetes while getting used to the specific way of thinking about infrastructure needed to make the most of it. The examples are mostly basic setups, but they provide most of what is needed to get started. The emphasis is more on how to use a Kubernetes cluster to deploy the services or other components you might need rather than on managing the cluster itself. More resources can be found in the Kubernetes documentation.

Detailed Summary

Note: I have updated the YAML file examples taken from the book so that they are valid as of the latest edit of this summary. If you want to experiment locally, my setup with Minikube is described in this GitHub repository.

Chapter 1 - Introduction

Four main reasons to use containers and container APIs like Kubernetes.

Velocity

The underlying principle is that you want to keep shipping new features while maintaining high availability, so that deployments don't cost you anything. The following concepts help with keeping that velocity.

  • Immutable infrastructure: any change is applied by creating a new container image which remains accessible in the exact same state in the registry.
  • Declarative configuration: Kubernetes’ job is to match the desired state described by the configuration. You should not have to care too much about the actions taken to get there.
  • Self-Healing Systems: once the desired state is achieved, Kubernetes will also take action in order to maintain it over time.

Scaling your service and your teams

Thanks to a decoupled architecture, it is much easier to have independent teams and services cooperating through well-defined APIs at different levels:

  • Pods = groups of containers = deployable unit
  • Services adds load balancing, naming and discovery
  • Namespaces provide isolation and access control
  • Ingress objects can combine several services into a single external API.

Scaling is then just a matter of specifying a number of instances, and these instances can even end up using fewer resources than if you had provisioned them manually.

Abstracting your infrastructure

If you take a few extra steps, like not using cloud-managed services and abstracting away specific storage implementations, you should be able to move a Kubernetes cluster from one provider to another, to a self-hosted setup, or to a combination of these, just by providing the existing config files to the new cluster.

Efficiency

Thanks to isolation and automation, several applications can be packed onto the same servers, increasing the usage ratio of the underlying hardware. It also greatly reduces the cost of test environments, as they can simply be set up from the same config as prod. That enables new options to increase velocity while keeping a high level of confidence in the releases.

Chapter 2 - Creating and Running Containers

Images are most commonly built on top of previously existing images by adding layers. For example you take a “system image” of a Linux distribution, add a given version of the JVM to it and then add your Java application. You can then run the application knowing it will always use the same JVM version packaged with it, and also reuse the image containing the JVM for any other application. One thing to keep in mind is that one layer cannot “physically” change a previous one. For example if you remove a file which was created upstream, that file will still count as weight in the final image, it just won’t be accessible in a straightforward way. Once you have the image you want, you would commonly host it on a Docker registry to be able to fetch it from anywhere and then run it. When running the image you can define useful resource limits as shown in this command example.

docker run -d --name <image-name> --publish 8080:8080 \
  --memory 200m --memory-swap 1G --cpu-shares 1024 \
  <image>
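As a rough illustration of the layering described above, a Dockerfile for the JVM example could look like the following sketch (the base image, paths and jar name are assumptions, not taken from the book).

# Start from an existing base image that already provides a JVM (its layers are reused as-is)
FROM openjdk:11-jre-slim
# Add the application on top as a new layer (path and jar name are hypothetical)
COPY target/my-app.jar /app/my-app.jar
# Define how the container runs the application
ENTRYPOINT ["java", "-jar", "/app/my-app.jar"]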

Chapter 3 - Deploying a Kubernetes Cluster

To start with, it is probably better to use a hosted solution from your provider of choice. Locally you can use Minikube, but one limitation is that it only creates a single-node cluster, which prevents most of the reliability promised by Kubernetes.
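For a local experiment, the typical Minikube workflow (assuming Minikube is installed) is simply:

minikube start    # create the local single-node cluster
minikube stop     # stop the VM while keeping its state
minikube delete   # remove the cluster entirely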

Once the cluster is created, you interact with it using the Kubernetes client kubectl. Some useful commands:

kubectl version

Provides both the version of the kubectl tool and of the Kubernetes API server. Backward and forward compatibility is only guaranteed within two minor versions, so try to keep these two reasonably close.

kubectl get componentstatuses

Runs a diagnostic on the cluster; useful to get an idea of the general health status of the cluster.

kubectl get nodes
kubectl describe nodes node-1

Gets the list of nodes and then gives details about one specific node.

kubectl get daemonSets --namespace=kube-system kube-proxy

Gives the list of proxies.

kubectl get deployments --namespace=kube-system kube-dns
kubectl get services --namespace=kube-system kube-dns

Gives the DNS deployment and the service performing load balancing for the DNS.

kubectl get deployments --namespace=kube-system kubernetes-dashboard
kubectl get services --namespace=kube-system kubernetes-dashboard

Gives the dashboard deployment and the service performing load balancing for the dashboard.

To access the UI, run kubectl proxy and open http://localhost:8001/ui in your browser.

Chapter 4 - Common kubectl Commands

By default kubectl interacts with the default namespace; the --namespace=<name> flag allows configuring that per command.

Default namespaces, clusters and users can be defined in contexts.

kubectl config set-context <context-name> \
  --namespace=<name> \
  --user=<user> \
  --cluster=<cluster>
kubectl config use-context <context-name>

Every kind of object in Kubernetes is a resource and can be viewed the same way we saw earlier for nodes.

For a list

kubectl get <resource-name>

For a specific object

kubectl get <resource-name> <object-name>

Useful flags to adjust the output are -o wide | json | yaml and --no-headers. Specific fields can be queried with -o jsonpath --template={.status.podIP}. The --watch flag keeps the get command running and updates the output if the list of objects or the state of some of the objects changes.
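For instance (a sketch, with <pod-name> as a placeholder):

kubectl get pods -o wide
kubectl get pods --no-headers
kubectl get pods <pod-name> -o jsonpath --template={.status.podIP}
kubectl get pods --watch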

For details about an object

kubectl describe <resource-name> <object-name>

To create or update an object, define the desired state in JSON or YAML file and run

kubectl apply -f object.yaml

You can edit a config in an interactive way with

kubectl edit <resource-name> <object-name>

But it should probably be used only for testing purposes, since you lose most of the checks and the change tracking that editing the file in a version control system would provide.

To delete an object

kubectl delete -f obj.yaml

or

kubectl delete <resource-name> <object-name>

But it proceeds without confirmation, so use it carefully! Even more care is needed when deleting all instances of a given resource

kubectl delete <resource-name> --all

To add a label

kubectl label <resource-name> <object-name> <label-name>=<label-value>

To remove a label

kubectl label <resource-name> <object-name> <label-name>-

Debugging commands seem similar to Docker ones

kubectl logs (-f) <pod-name> (-c <container-name>)

The -c flag is useful if several containers run on the same pod. The --previous flag will get the logs from the previous instance of the container. In case of unexpected shutdowns, it can be useful to investigate.

Execute a command in a running container

kubectl exec -it <pod-name> -- bash

Copy files from a container with

kubectl cp <pod-name>:/path/to/file /local/path

Reverse the syntax to copy to the container
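For example, copying in the other direction would look like:

kubectl cp /local/path <pod-name>:/path/to/file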

See kubectl help for more details

Chapter 5 - Pods

A Pod is the atomic unit in Kubernetes: it is a group of Docker containers. Everything in one Pod executes in the same environment; for example the containers share the same IP address, port space and hostname. They will also always run on the same machine. Two containers in different Pods, on the other hand, may end up on different servers as far as you know.

To decide what to put in one Pod, the useful question is

“Will these containers work correctly if they land on different machines?”

If the answer is no, they will have to be in the same Pod; if the answer is yes, you can happily decouple them and put them in different Pods. As a typical example, a database doesn’t need to be in the same Pod as the application using it, you just need to tell the application where the database is located. But if you have several containers needing the same filesystem, they will have to be in the same Pod.

You define Pods in a Pod manifest where you configure the desired state, following the declarative configuration advocated by Kubernetes.

When you submit a Pod manifest to Kubernetes, it gets persisted in etcd storage. The scheduler places Pods which aren’t already scheduled on nodes with enough available resources (favoring different nodes for different instances of the same application for reliability). To manage multiple instances you can submit the same Pod manifest several times or use ReplicaSets, which will be explained later.

Creating a Pod (through a deployment) can be done with

kubectl run <pod-name> --image=<image-name>

But the normal way would be to create a Pod manifest in YAML or JSON and make Kubernetes apply it with

kubectl apply -f config-file.yaml

In order to access a Pod you can forward one of its port to a port of your machine with

kubectl port-forward <pod-name> <localhost-port>:<pod-port>

Kubernetes defines two types of health checks for a Pod; any of them can be httpGet, tcpSocket or even exec. The liveness probe is used to know if a container is healthy. If it fails this probe, the container will be restarted.

The readiness probe is used to know if a container is ready to accept traffic. If one container fails it, it gets removed from the service load balancers.

In a Pod manifest you can define how much resource each container needs and should be given. Two different types of configuration exist, and the Pod’s resources will be the sum of its containers’ resources. The requests are the resources the container should always have: the container will not be placed on a server with less than these resources available. The limits are the maximum resources the container should ever be allocated. If left undefined, the containers running on a machine will be allocated all of the machine’s resources evenly, until more containers are scheduled to run there and request some of the resources.

If you need to persist data from a Pod across container restarts you will need access to some kind of persistent storage. This is done through spec.volumes and spec.containers.volumeMounts in order to map the mounted volume to a path in the container. This can be used for sharing data across containers, persisting a cache which should survive restarts of the container (not of the Pod) with emptyDir, keeping actual persistent data on remote disks so it can be accessed wherever the Pod is scheduled next (with nfs for example), or accessing the host filesystem with hostPath.
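For the cache use case, the volumes section would be as simple as the following sketch (the volume name is a placeholder):

spec:
  volumes:
    - name: "<cache-volume-name>"
      emptyDir: {}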

Based on the previously mentioned concepts the Pod manifest would look something like the following file.

apiVersion: v1
kind: Pod
metadata:
  name: <pod-name>
spec:
  volumes:
    - name: "<volume-name>"
      nfs:
        server: my.nfs.server.local
        path: "/path"
  containers:
    - name: <image-name>
      image: <image>
      ports:
        - containerPort: 8080
          name: http
          protocol: TCP
      resources:
        requests:
          cpu: "500m"
          memory: "128Mi"
        limits:
          cpu: "1000m"
          memory: "256Mi"
      volumeMounts:
        - mountPath: "/path"
          name: "<volume-name>"
      livenessProbe:
        httpGet:
          path: /healthy
          port: 8080
        initialDelaySeconds: 5
        timeoutSeconds: 1
        periodSeconds: 10
        failureThreshold: 3
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        initialDelaySeconds: 30
        timeoutSeconds: 1
        periodSeconds: 10
        failureThreshold: 3

Chapter 6 - Labels and Annotations

Both are very similar key/value pairs. The key is made of an optional prefix (fewer than 253 characters, following DNS subdomain syntax) and a name (fewer than 63 characters, starting and ending with an alphanumeric character and allowing -, _ and .). For labels, the value follows the same restrictions as the key name, while for annotations it is totally free text.

Labels can be queried through selectors and are used to identify objects, while annotations can be used for any kind of data relevant to an object. Some use cases of annotations are passing information to tools and libraries or just storing metadata which doesn’t need to be queried.

Applying a label to a deployment can be done with a command like

kubectl run <pod-name> \
  --image=<image> \
  --replicas=2 \
  --labels="ver=1,app=<app-name>,env=prod"

You can then add the --show-labels flag to a get command

kubectl get deployments --show-labels

To modify labels on a deployment run

kubectl label deployments <deployment-name> "<label-name>=<label-value>"

However, this way the new label would not be propagated to Pods and ReplicaSets created by the deployment.

The -L <label-name> flag on a get command, shows the label as a column.

Remove a label with

kubectl label deployments <deployment-name> "<label-name>-"

To use selectors

kubectl get pods --selector="<label-name>=<label-value>,<label-name>=<label-value>"
kubectl get pods --selector="<label-name> in (<label-value>,<label-value>)"

The , is an AND

kubectl get deployments --selector="<label-name>"

Returns the deployments with this label set, independently of the value.

A YAML syntax would look like

selector:
  matchLabels:
    app: <app-name>
  matchExpressions:
    - {key: ver, operator: In, values: ["1", "2"]}

Annotations are defined in the metadata section of the manifest like this

metadata:
  annotations:
    <annotation-key>: "<annotation-value>"

Two main use cases are:

  • Build, release, or image information that isn’t appropriate for labels (may include a Git hash, timestamp, PR number, etc.).
  • Enable the Deployment object to keep track of ReplicaSets that it is managing for rollouts.

Chapter 7 - Service Discovery

The service discovery is based on the Service objects. A Service is a named label selector. You can create a service by running

kubectl expose deployment <deployment-name>

It will assign a cluster (virtual) IP to the service, which is used by the system to load balance between all the pods identified by the selector. This IP is stable and can thus be used by the Kubernetes DNS service. What changes over time is the set of pods identified by the selector.

The magic here is done by the kube-proxy running on every node and watching for new services via the API server. It then writes iptables rules used to direct the calls to the endpoints of the target service.

The Kubernetes DNS service provides DNS names like <deployment-name>.default.svc.cluster.local.

The list of pods available in a service is maintained based on the responses of the pods on their readiness probe.

To expose a service outside the cluster you need to set spec.type to NodePort; a port will be allocated and every node in the cluster will forward traffic from this port to the service.
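One way to switch an existing service over (a sketch; the service name is a placeholder):

kubectl edit service <service-name>

In the editor, change spec.type to NodePort and save; Kubernetes then picks a port (or uses the one you specify in spec.ports[].nodePort) on every node.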

Integrations with cloud providers allow automatically allocating load balancers by setting spec.type to LoadBalancer. This will expose the service to the outside world.

For every Service, Kubernetes creates an Endpoints object, which contains the list of IPs backing the service at any given point in time.
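You can watch it change as pods come and go, for example:

kubectl get endpoints <service-name> --watch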

Chapter 8 - ReplicaSets

In order to have several pods of the same type running, it would be quite error-prone to have to manage several almost identical config files. The ReplicaSet objects are here exactly to fulfil this task.

One important concept, recurrent in Kubernetes, is that the pods created by a ReplicaSet don’t belong to it. If you delete the ReplicaSet, the pods won’t be deleted with it; they just won’t be monitored any more for matching the desired state. The ReplicaSet keeps track of the pods based on label selectors, similarly to a Service. That allows adopting existing containers when creating a ReplicaSet (rather than having to delete the Pod and re-create it through the ReplicaSet), or quarantining a misbehaving Pod so you can investigate it in detail rather than relying only on logs.

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: <replicaset-name>
spec:
  replicas: 1
  selector:
    matchLabels:
      <label-to-select1-name>: <label-to-select1-value>
  template:
    metadata:
      labels:
        <label1-name>: <label1-value>
        <label2-name>: <label2-value>
    spec:
      containers:
        - name: <container-name>
          image: <image>

Submitting a ReplicaSet to Kubernetes would look like

$ kubectl apply -f config.yaml
replicaset "<replicaset-name>" created

They can then be accessed through the object type rs with the different commands seen earlier.

For a given pod, you can find if it is managed by a ReplicaSet by looking into the annotation kubernetes.io/created-by.
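One way to check (a sketch; the pod name is a placeholder) is to dump the pod and look for that annotation in the metadata:

kubectl get pods <pod-name> -o yaml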

Scaling ReplicaSets can be done with

kubectl scale replicasets <replicaset-name> --replicas=4

But it is much better done by updating the manifest directly, in order to prevent misalignment between the config files and the cluster’s version of the desired state. You would update the ReplicaSet with

$ kubectl apply -f config.yaml
replicaset "<replicaset-name>" configured

Autoscaling (horizontal pod autoscaling) can be configured (but you should then not specify a number of replicas in the ReplicaSet, in order to prevent conflicting behaviours). For it to work you need to have the heapster Pod running on your cluster, as it provides the needed metrics about pods. You should see it in the list returned by

kubectl get pods --namespace=kube-system

You can then create an autoscale with

kubectl autoscale rs <replicaset-name> --min=2 --max=5 --cpu-percent=80

It is another kind of object (accessible through hpa or horizontalpodautoscalers) so it is not strictly coupled with the ReplicaSet.

As mentioned the ReplicaSet is not coupled with the pods, but by default deleting it will also delete the pods it manages. You can prevent that with the flag --cascade=false

kubectl delete rs <replicaset-name> --cascade=false

Chapter 9 - DaemonSets

The DaemonSet is rather similar to the ReplicaSet in that it manages pods matching a given label selector across the cluster. The main difference is that its purpose is to ensure a copy of the Pod runs on each node of the cluster. That is typically useful for monitoring agents or other kinds of services providing features to the cluster itself rather than consumer-facing features.

Defining a DaemonSet looks like

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: <daemonset-name>
  namespace: <namespace>
  labels:
    <label1-name>: <label1-value>
spec:
  selector:
    matchLabels:
      <label-to-select-name>: <label-to-select-value>
  template:
    metadata:
      labels:
        <label2-name>: <label2-value>
    spec:
      nodeSelector:
        <node-label-name-to-select>: <node-label-value-to-select>
      containers:
      - name: <container-name>
        image: <image>
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
          ...
      terminationGracePeriodSeconds: 30
      volumes:
      - name: <volume-name>
        hostPath:
          path: /path

You can submit it similarly to ReplicaSet

$ kubectl apply -f config.yaml
daemonset "<daemonset-name>" created

While the main use case is to run one copy of the Pod on each node, you can also limit the nodes to which a DaemonSet applies. You need to add labels to the nodes and define a nodeSelector (as shown in the example above).

kubectl label nodes <node-name> <label-name>=<label-value>

To define how a DaemonSet will update its pods, use spec.updateStrategy.type. It defaults to OnDelete, which creates the new Pod only after a manual delete of the existing one, while RollingUpdate automatically starts replacing the pods, ensuring the success of one Pod before moving forward to the others. You can tune it with spec.minReadySeconds and spec.updateStrategy.rollingUpdate.maxUnavailable.
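A sketch of what these fields could look like in the DaemonSet spec (the values are illustrative, not from the book):

spec:
  minReadySeconds: 60
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1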

Delete a DaemonSet with

kubectl delete -f config.yaml

But it will also delete the pods unless --cascade=false is specified.

Chapter 10 - Jobs

Kubernetes can also be used for one-off workloads with the Job object. You can run one or several pods a given number of times (until they exit with 0). You configure this with the completions and parallelism specs.

The three main scenarios are

  • One shot job: completions=1, parallelism=1. You run one Pod once until it completes.
  • Parallel fixed completions: completions=1+, parallelism=1+. You run several pods, as many times as needed until they complete the number of times specified.
  • Work queue, parallel Jobs: completions=1, parallelism=2+. You run several pods; they will run as many times as needed until one completes, and then all be retired as they finish their current run.

The Job config would look like

apiVersion: batch/v1
kind: Job
metadata:
  name: oneshot
  labels:
    chapter: jobs
spec:
  parallelism: 1
  completions: 1
  template:
    metadata:
      labels:
        chapter: jobs
    spec:
      containers:
      - name: <container-name>
        image: <image>
        imagePullPolicy: Always
        args:
        - "--keygen-enable"
        - "--keygen-exit-on-complete"
        - "--keygen-num-to-gen=10"
      restartPolicy: OnFailure

The benefit of the Job object is that if the job fails for any reason (application bug or Pod crash), it will be retried until the completions parameter is fulfilled. One alternative to restartPolicy: OnFailure is restartPolicy: Never. In the first case the existing Pod is restarted if it fails; in the second a new Pod is created each time, which can pollute your cluster.

Chapter 11 - ConfigMaps and Secrets

In order to keep your application decoupled from the infrastructure, you will want to pass some config to it. Kubernetes has the ConfigMap concept for that; it is basically a small filesystem combined with the Pod when it runs.

A ConfigMap is meant to be created from the command line as follows

kubectl create configmap my-config --from-file=my-config.txt --from-literal=key=value

It can then be used in three ways

1- Mounted as a filesystem into the pod

spec:
  containers:
    - name: <container-name>
      ...
      volumeMounts:
        - name: config-volume
          mountPath: /config
  volumes:
    - name: config-volume
      configMap:
        name: my-config

2- Through environment variables

spec:
  containers:
    - name: <container-name>
      ...
      env:
        - name: ANOTHER_PARAM
          valueFrom:
            configMapKeyRef:
              name: my-config
              key: another-param
        - name: EXTRA_PARAM
          valueFrom:
            configMapKeyRef:
              name: my-config
              key: extra-param

3- Command-line argument

Using substitution with $(<env-var-name>).
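A sketch of what this could look like, combining it with the environment variable approach above (the binary path and flag name are hypothetical):

spec:
  containers:
    - name: <container-name>
      image: <image>
      env:
        - name: EXTRA_PARAM
          valueFrom:
            configMapKeyRef:
              name: my-config
              key: extra-param
      command:
        - "/path/to/binary"        # hypothetical entrypoint
        - "--extra=$(EXTRA_PARAM)" # substituted by Kubernetes when the container starts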

Depending on your setup, the main pieces of config you expect to pass to your application directly through the environment are secrets. Of course these are sensitive by nature and you don’t want them accessible outside of the app. Kubernetes offers the concept of Secrets to help with that.

Secrets are created in a similar way

kubectl create secret generic my-secret --from-file=secret.txt

They can be exposed through a secret volume

spec:
  containers:
    - name: <container-name>
      ...
      volumeMounts:
      - name: <volume-name>
        mountPath: "/path"
        readOnly: true
  volumes:
    - name: <volume-name>
      secret:
        secretName: <secret-name>

One special kind of secret is the Docker credentials used to access private registries; they can be created and referenced from a Pod spec as follows

kubectl create secret docker-registry my-image-pull-secret \
  --docker-username=<username> \
  --docker-password=<password> \
  --docker-email=<email-address>

spec:
  containers:
    - name: <container-name>
      ...
  imagePullSecrets:
  - name:  my-image-pull-secret

Secret names have to be made of alphanumeric characters separated by dots and/or underscores.

Chapter 12 - Deployments

With the ReplicaSets seen previously, you can run an application in a highly available way with respect to hardware or software failures. If your workers are provisioned correctly, at least one Pod of a given ReplicaSet should always be working, with the other ones being restarted at worst. But what about deploying a new version? You would need to kill the ReplicaSet and replace it with a new one, which would cause downtime. This is what Deployments fix and automate.

A Deployment initially creates a ReplicaSet, and when updated will create a new one. When the new one is available for service, the old one gets decommissioned (roughly; the details depend on the exact strategy used).

A deployment manifest would look similar to the following.

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  labels:
    <label1-name>: <label1-value>
  name: <name>
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      <label-to-select-name>: <label-to-select-value>
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        <label2-name>: <label2-value>
      annotations:
        kubernetes.io/change-cause: "Initial deployment"
    spec:
      containers:
      - name: <container-name>
        image: <image>
        imagePullPolicy: Always
      dnsPolicy: ClusterFirst
      restartPolicy: Always
  revisionHistoryLimit: 14

The revisionHistoryLimit is useful because the same Deployment could be used for the whole life of an application, so its revision history could otherwise grow out of control over time.

To monitor the creation of a new Deployment, you can use

kubectl rollout status deployments <deployment-name>

To see the history, with revisions and change-causes from the annotations.

kubectl rollout history deployment <deployment-name>

To roll back a deployment

kubectl rollout undo deployments nginx

It will basically do a new rollout based on the previous revision, and that will rewrite the history. For example rolling back from revision 3 to 2 would make the history change from

REVISION        CHANGE-CAUSE
1               <none>
2               Update nginx to 1.9.10
3               Update nginx to 1.10.2

to

REVISION        CHANGE-CAUSE
1               <none>
3               Update nginx to 1.10.2
4               Update nginx to 1.9.10

Chapter 13 - Integrating Storage Solutions and Kubernetes

If you have applications running outside of Kubernetes and want or need to migrate progressively, you can make the process smoother by using the Kubernetes concepts. For example even if a service or database lives outside Kubernetes, you can represent it inside by a Service, so that the other services calling it won’t have to care if and when it moves to Kubernetes. You replace the selector by an externalName as follows

kind: Service
apiVersion: v1
metadata:
  name: external-database
spec:
  type: ExternalName
  externalName: database.company.com

One limitation is that no health check will be performed on external resources.

You can also run a database as a singleton Pod inside Kubernetes. Of course you won’t have the high availability offered by running several instances, but that is often not the case outside of Kubernetes either.

I’m skipping the details here, but basically we need to create a PersistentVolume, where the actual data will be stored, a PersistentVolumeClaim to enable the Pod to use this volume, a ReplicaSet with replicas: 1 to have our one Pod but still have it restarted if it crashes, and a Service to expose it to the rest of the pods.
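As a rough sketch, the PersistentVolumeClaim part could look like this (the name and requested size are assumptions):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi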

Instead of a PersistentVolume, you can look into StorageClass for dynamic volume provisioning.

To get higher availability than the one Pod solution, the StatefulSets object can be used.

Differences compared to a ReplicaSet

  • Each replica gets a persistent hostname with a unique index (pod-0, pod-1, …)
  • These pods are created in order from 0 to x and the Pod n+1 is not created until the Pod n is healthy and available.
  • These pods get deleted in reverse order, from x to 0.

A StatefulSet deploying MongoDB would look like the following:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: "mongo"
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
      - name: mongodb
        image: mongo:3.4.1
        command:
        - mongod
        - --replSet
        - rs0
        ports:
        - containerPort: 27017
          name: peer

You can then manage the DNS entries with a Service as usual. One difference is that it is headless (clusterIP: None), as each Pod has its own specific identity, as opposed to being identical, substitutable instances.

apiVersion: v1
kind: Service
metadata:
  name: mongo
spec:
  ports:
  - port: 27017
    name: peer
  clusterIP: None
  selector:
    app: mongo

You could then use the pods’ hostnames to set up the replication of MongoDB, for example; it will then be able to find the pods through these names even if they get rescheduled on a different worker in the cluster.

To attach PersistentVolumes to StatefulSets, you need to use volumeClaimTemplates; this creates a PersistentVolumeClaim for each Pod created by the StatefulSet. You would add the following to your StatefulSet definition

volumeClaimTemplates:
- metadata:
    name: database
    annotations:
      volume.alpha.kubernetes.io/storage-class: anything
  spec:
    accessModes: [ "ReadWriteOnce" ]
    resources:
      requests:
        storage: 100Gi