Learn Kubernetes basics through hands-on experience with a Flask microservice architecture.
Introduction
Hi there! This article is for anyone interested in picking up the basics of Kubernetes. We are going to understand Kubernetes fundamentals by working on a small project (a video-to-MP3 converter). We will use Flask, a micro web framework written in Python.
The main components of this project will be:
- 3 microservices (auth, converter & notification)
- MySQL database to store user data
- MongoDB (GridFS) to store large files (video & audio). One could ask why we don't leverage cloud storage like S3. The answer is that we could, but to keep the project simple we chose MongoDB.
- Docker to containerize the applications
- Kubernetes
Setting up the project:
Please refer to the README file in this repository. Now let's try to understand Kubernetes and the basics of its components.
What is Kubernetes?
Kubernetes is an application orchestrator. It orchestrates containerized, cloud-native microservices applications.
An orchestrator is a system that deploys and manages applications. It can deploy your applications and dynamically respond to changes. For example, Kubernetes can:
• Deploy your application
• Scale it up and down dynamically based on demand
• Self-heal it when things break
• Perform zero-downtime rolling updates and rollbacks
• Lots more…
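For example, once this project's auth Deployment (shown later in this article) is running, scaling and rollbacks are one-liners:
kubectl scale deployment auth --replicas=3 -- scale up from 1 to 3 Pods
kubectl rollout undo deployment auth -- roll back to the previous revision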
Like many modern cloud-native projects, it's written in Go, built in the open on GitHub, and actively discussed on IRC channels; you can also follow it on Twitter (@kubernetesio) and join the community Slack at slack.k8s.io.
Kubernetes as a cluster
A Kubernetes cluster is made of a control plane and worker nodes. The control plane exposes the API, has a scheduler for assigning work, and records the state of the cluster and apps in a persistent store. Worker nodes are where user applications run.
Kubernetes likes to manage applications declaratively. This is a pattern where you describe what you want in a set of configuration files, post them to Kubernetes, then sit back while Kubernetes makes it all happen. We will look into yaml files where we will declare states of micro services.
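A quick way to see this declarative behavior once the project is running: delete a Pod that a Deployment manages and watch Kubernetes recreate it to restore the declared state (the Pod name below is illustrative; yours will differ):
kubectl get pods -- list running Pods
kubectl delete pod auth-6d4cf5d46f-abcde -- delete one of the auth Pods
kubectl get pods -- a replacement Pod is created automatically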
Control Plane:
Kubernetes control plane nodes run the cluster’s control plane services. These services are the brains of the cluster where all the control and scheduling decisions happen. Behind the scenes, these services include the API server, the cluster store, scheduler, and core controllers.
Worker Node:
Worker nodes are where user applications run. At a high-level they do three things:
1. Watch the API server for new work assignments
2. Execute work assignments
3. Report back to the control plane (via the API server)
The kubelet is the main Kubernetes agent and runs on every worker node. When you join a node to a cluster, the process installs the kubelet, which is then responsible for registering it with the cluster. This process registers the node’s CPU, memory, and storage into the wider cluster pool.
The kubelet needs a container runtime to perform container-related tasks — things like pulling images and starting and stopping containers.
The last piece of the worker node puzzle is the kube-proxy. This runs on every node and is responsible for local cluster networking.
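You can see the nodes in your cluster and the resources they have registered with these standard commands:
kubectl get nodes -- list all nodes and their status
kubectl describe node node_name -- show a node's CPU, memory, and allocated resources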
Packaging apps for Kubernetes
An application needs to tick a few boxes to run on a Kubernetes cluster. These include:
1. Packaged as a container
2. Wrapped in a Pod
3. Deployed via a declarative manifest file
Pods
In the VMware world, the atomic unit of scheduling is the virtual machine (VM). In the Docker world, it’s the container. Well… in the Kubernetes world, it’s the Pod.
Pods ring-fence an area of the host OS, build a network stack, create a bunch of kernel namespaces, and run one or more containers. If you’re running multiple containers in a Pod, they all share the same Pod environment. This includes the network stack, volumes, IPC namespace, shared memory, and more.
A few important points about Pods:
- Pods are atomic: the entire Pod either comes up and is put into service, or it doesn't and it fails.
- If a Pod dies unexpectedly, you don't bring it back to life; a new Pod replaces the old one.
- Pods are also immutable: you don't change them once they're running.
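For illustration only, here's what a minimal standalone Pod manifest for the auth container might look like. This exact file isn't part of the project; as explained next, we deploy Pods via Deployments instead.
apiVersion: v1
kind: Pod
metadata:
  name: auth-pod                # hypothetical name, for illustration only
  labels:
    app: auth
spec:
  containers:
    - name: auth
      image: nilay103/k8sauth   # the same image used by the Deployment later in this article
      ports:
        - containerPort: 5000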
Deployment
You write an application microservice in a language of your choice. You then build it into a container image and store it in a registry. At this point it’s containerized.
Next, you define a Kubernetes Pod to run the containerized application. At the high level we're working at, a Pod is just a wrapper that allows a container to run on a Kubernetes cluster.
While it’s possible to run static Pods like this, the preferred model is to deploy all Pods via higher-level controllers. The most common controller is the Deployment. It offers scalability, self-healing, and rolling updates for stateless apps.
Service
We’ve just learned that Pods are mortal and can die. However, if they’re managed via higher level controllers, they get replaced when they fail. But replacements come with totally different IP addresses. This also happens with rollouts and scaling operations. Events like these cause a lot of IP churn.
This is where Services come in to play. They provide reliable networking for a set of Pods.
As Pods come and go, the Service observes this, automatically updates itself, and continues to provide that stable networking endpoint.
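You can watch this churn yourself once the project is running:
kubectl get pods -o wide -- shows each Pod's IP, which changes when the Pod is replaced
kubectl get services -- shows each Service's ClusterIP, which stays stable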
Now we're ready to jump into the code. If you've followed along so far, I assume you've already cloned the repo to your local machine and followed the steps mentioned in the README file.
Understanding of components used in project
To keep things simple, I have tried to keep the format the same for all the apps' Kubernetes component files.
Deployment
apiVersion: apps/v1        # API version
kind: Deployment           # type of component
metadata:
  name: auth               # name of the deployment
  labels:                  # labels can be used to organize and to select subsets of objects
    app: auth              # a Service's labels must be a subset of the Deployment's labels
spec:
  replicas: 1              # number of Pods running at a time
  selector:
    matchLabels:
      app: auth
  strategy:
    type: RollingUpdate    # https://kubernetes.io/docs/tutorials/kubernetes-basics/update/update-intro/
    rollingUpdate:
      maxSurge: 3
  template:
    metadata:
      labels:
        app: auth
    spec:
      containers:
        - name: auth                   # container name
          image: nilay103/k8sauth      # image name
          ports:
            - containerPort: 5000
          envFrom:
            - configMapRef:
                name: auth-configmap
            - secretRef:
                name: auth-secret
Detailed information about Deployments can be found in the Kubernetes documentation: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
kubectl apply -f filename -- apply a k8s config file
kubectl get deployments -- list all Deployments running on the cluster
kubectl describe deployment deployment_name -- show a Deployment's details
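A few more standard commands that are handy with the auth Deployment:
kubectl rollout status deployment auth -- watch a rolling update progress
kubectl rollout history deployment auth -- list the Deployment's revisions
kubectl logs deployment/auth -- view logs from one of the Deployment's Pods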
Service
Services are loosely coupled with Pods via labels and selectors. For a Service to send traffic to a Pod, the Pod needs every label the Service is selecting on; the Pod can also carry additional labels the Service isn't looking for. In other words, the Service's selector labels must be a subset of the Pod's labels.
apiVersion: v1       # API version
kind: Service        # type of component
metadata:
  name: auth         # name of the service
spec:
  selector:
    app: auth        # label selector to identify Pods
  type: ClusterIP    # type of service; details: https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types
  ports:             # port bindings
    - port: 5000
      targetPort: 5000
      protocol: TCP
Detailed information about Services can be found in the Kubernetes documentation: https://kubernetes.io/docs/concepts/services-networking/service/
kubectl apply -f filename -- apply a k8s config file
kubectl get services -- list all Services running on the cluster
kubectl describe svc service_name -- show a Service's details
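Since the service type is ClusterIP, the auth Service is only reachable from inside the cluster. For quick local testing you can forward a local port to it:
kubectl port-forward svc/auth 5000:5000 -- forward localhost:5000 to the Service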
Persistent Volume and Persistent Volume Claims
In short, PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs) provide storage. The most common use case is defining storage for databases (similar to volumes in Docker).
apiVersion: v1                 # API version
kind: PersistentVolume         # type of component
metadata:
  name: mysql-pvvolume         # name of component
  labels:                      # labels
    type: local
spec:
  storageClassName: manual     # storage class
  capacity:
    storage: 2Gi
  accessModes:                 # access mode, e.g. ReadOnlyMany, ReadWriteOnce, ReadWriteMany
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
---
apiVersion: v1                 # API version
kind: PersistentVolumeClaim    # type of component
metadata:
  name: mysql-pvclaim          # name of component
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce            # access mode
  resources:
    requests:
      storage: 2Gi
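To actually use this storage, the MySQL Deployment mounts the claim as a volume at MySQL's data directory. Here's a simplified sketch (the name and image tag are illustrative; refer to the repository's manifest for the real file):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql                  # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8                  # illustrative image tag
          envFrom:
            - secretRef:
                name: mysql-secret        # defined in the next section
          volumeMounts:
            - name: mysql-storage
              mountPath: /var/lib/mysql   # MySQL's default data directory
      volumes:
        - name: mysql-storage
          persistentVolumeClaim:
            claimName: mysql-pvclaim      # binds to the claim defined above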
Config Maps and Secrets
apiVersion: v1             # API version
kind: Secret               # type of component
metadata:
  name: mysql-secret       # component name
stringData:                # config data (key, value)
  MYSQL_PASSWORD: root
type: Opaque
---
apiVersion: v1             # API version
kind: ConfigMap            # type of component
metadata:
  name: celery-configmap   # component name
data:                      # config data (key, value)
  AUTH_SVC_ADDRESS: auth:5000
  NOTIFICATION_SVC_ADDRESS: ns:5055
  MONGO_HOST: mongodb
  REDIS_HOST: redis
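These values are injected into containers as environment variables via the envFrom section we saw in the Deployment earlier. A few commands to inspect them:
kubectl get configmaps -- list all ConfigMaps
kubectl describe configmap celery-configmap -- view a ConfigMap's key/value pairs
kubectl get secret mysql-secret -o yaml -- view a Secret (values are base64-encoded, not encrypted)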
Congratulations on getting this far. If you have followed the steps mentioned so far, you should be able to run the project on your local machine. To play around, feel free to make changes to the source code, push updated images to the registry, and update the container image section in the associated deployment file. Add new features to the project, and a lot more. Feel free to press the clap button if you liked this article and learned something exciting!
Happy learning!
References
All images and a few topic descriptions are adapted from The Kubernetes Book.