What is Kubernetes (K8s)
Kubernetes is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications. It was initially developed by Google and is now maintained by the Cloud Native Computing Foundation (CNCF).
Kubernetes is a portable, extensible, open source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation. It has a large, rapidly growing ecosystem. Kubernetes services, support, and tools are widely available.
Kubernetes enables users to deploy containerized applications across a cluster of nodes, automatically scaling and managing the application’s resources. It abstracts away the underlying infrastructure, so users can focus on building and running their applications rather than on the machines underneath.
Kubernetes architecture is based on a master-worker node model. The master node runs the Control Plane components, such as the API Server, etcd, and the scheduler, which together manage the overall state of the cluster. Worker nodes run the application workloads and are managed by the Control Plane.
— — — — — — — — — — — —
Container
A container is a runnable instance of an image. You can create, start, stop, move, or delete a container using the Docker API or CLI. You can connect a container to one or more networks, attach storage to it, or even create a new image based on its current state.
A container is defined by its image as well as any configuration options you provide to it when you create or start it. When a container is removed, any changes to its state that are not stored in persistent storage disappear.
A container packages your application code, the libraries it depends on, and their dependencies, everything down to (but not including) the kernel, which is shared with the host. The key concept here is isolation: isolate all your stuff from the rest so that you have better control of it. There are three types of isolation provided by containers.
Workspace isolation (Process, Network)
Resource isolation (CPU, Memory)
File system isolation (Union File System)
Think of containers like VMs on a diet. They are lean, fast (to start up) and small. And all this was not built from the ground up. Instead, containers use constructs already present in Linux (like cgroups and namespaces) and build a nice abstraction over them.
Simply put, a container is a sandboxed process on your machine that is isolated from all other processes on the host machine. That isolation leverages kernel namespaces and cgroups, features that have been in Linux for a long time.
Now that we know what containers are, it is easy to understand why they are so popular. Instead of shipping only your application binary or code, you can ship the whole environment needed to run your application, and do so in a practical way, because containers can be built as very small units. A perfect fix for the “It works on my machine” issue.
— — — — — — — — — — — —
When to use Kubernetes?
All is well with containers, and software developers' lives are much better now. Then why do we need another piece of technology, a container orchestrator like Kubernetes?
You need it when you reach the state where there are too many containers to manage by hand.
Q: Where is my front-end container, and how many of them am I running?
A: Hard to tell. Use a container orchestrator.
Q: How will I make my front-end containers talk to newly created back-end containers?
A: Hardcode the IPs. Or, use a container orchestrator.
Q: How will I do rolling upgrades?
A: Manually, hand-holding each step. Or, use a container orchestrator.
— — — — — — — — — — — —
Why I prefer Kubernetes
There are multiple orchestrators, like Docker Swarm, Mesos and Kubernetes. My choice is Kubernetes (and hence this article) because Kubernetes is …
… like Lego blocks. It not only has the components needed to run a container orchestrator at scale, but also has the flexibility to swap components in and out for custom ones. Want a custom scheduler? Sure, just plug it in. Need a new resource type? Write a CRD (Custom Resource Definition). Also, the community is very active and is evolving the tool rapidly.
— — — — — — — — — — — —
Kubernetes Architecture
Every Kubernetes cluster has two types of nodes (machines): a master and a worker. As the names suggest, the master controls and monitors the cluster, whereas the worker runs the payload (applications).
A cluster can work with a single master node, but it is better to have three of them for high availability (known as HA clusters).
Let us take a closer look at the master and what it is composed of
etcd : Database that stores all the data about Kubernetes objects, their current state, access information and other cluster configuration.
API Server : RESTful API server that exposes endpoints to operate the cluster. Almost all of the components in master and worker nodes communicate with this server to perform their duties.
Scheduler : Decides which payload runs on which machine.
Controller Manager : Runs control loops that watch the state of the cluster (by making calls to the API server) and take actions to bring it to the expected state.
And on each worker node:
kubelet : The heart of the worker node. It communicates with the API server on the master node and runs the containers scheduled for its node.
kube-proxy : Takes care of the networking needs of pods using iptables / IPVS.
Pod : The workhorse of Kubernetes, which runs all your containers. You cannot run containers inside Kubernetes without a pod abstraction over them. A pod adds functionality that is crucial to the Kubernetes way of networking between containers.
A pod can have more than one container, and all the servers running inside these containers can reach each other on localhost. This makes it very convenient to split different aspects of your app into separate containers and load them all together as one pod. There are different pod patterns like sidecar, proxy and ambassador to address different needs.
The pod's network interface lets it communicate with other pods, whether they run on the same node or on other worker nodes.
Also, each pod is assigned its own IP address, which kube-proxy uses to route traffic. This IP address is visible only within the cluster.
A volume mounted inside a pod is visible to all of its containers, and sometimes these volumes can be used to communicate asynchronously between the containers. For example, say your app is a photo uploading app (like Instagram, maybe): it could save the uploaded files in a volume, and another container in the same pod could watch for new files in this volume, process them to create multiple sizes, and upload the results to cloud storage.
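The photo example can be sketched in plain Python, with a temporary directory standing in for the shared volume and two functions standing in for the two containers. The function names, the `incoming` / `processed` folders and the size labels are made up for illustration; a real sidecar would keep watching the mounted path and do actual image resizing and uploading.

```python
import tempfile
from pathlib import Path

SIZES = ("small", "medium", "large")  # hypothetical output sizes


def upload_photo(volume: Path, name: str, data: bytes) -> Path:
    """Plays the app container: saves an uploaded file into the shared volume."""
    target = volume / "incoming" / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target


def process_new_files(volume: Path, seen: set) -> list:
    """Plays the second container: one polling pass over the shared volume.

    Returns the names of the (pretend) resized copies produced in this pass.
    """
    produced = []
    incoming = volume / "incoming"
    if not incoming.exists():
        return produced
    for path in sorted(incoming.iterdir()):
        if path.name in seen:
            continue  # already handled in an earlier pass
        seen.add(path.name)
        for size in SIZES:
            out = volume / "processed" / f"{size}-{path.name}"
            out.parent.mkdir(parents=True, exist_ok=True)
            out.write_bytes(path.read_bytes())  # placeholder for real resizing
            produced.append(out.name)
    return produced


def demo() -> list:
    """Upload one photo, then run two polling passes over the same volume."""
    with tempfile.TemporaryDirectory() as tmp:
        volume = Path(tmp)
        seen = set()
        upload_photo(volume, "cat.jpg", b"fake image bytes")
        first_pass = process_new_files(volume, seen)
        second_pass = process_new_files(volume, seen)  # nothing new this time
        return [first_pass, second_pass]
```

The two functions never call each other; the directory is their only point of contact, which is what makes the communication asynchronous.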
— — — — — — — — — — — —
Controllers
In Kubernetes, there are a lot of controllers, like ReplicaSets, ReplicationControllers, Deployments, StatefulSets and Services. These are objects that control pods in one way or another. Let us look at some of the important ones.
ReplicaSet
ReplicaSet doing what it is good at. Replicating pods.
The main responsibility of this controller is to create replicas of the given pod. If a pod dies for some reason, this controller will be notified and it immediately jumps into action to create a new pod.
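The idea of "notice a missing pod, create a new one" can be shown with a toy in-memory model. This is only a sketch: `ReplicaSetSketch`, `reconcile` and `pod_died` are hypothetical names, and pods are plain strings rather than real Kubernetes objects.

```python
class ReplicaSetSketch:
    """Toy stand-in for a ReplicaSet: keeps a list of pod names at a target size."""

    def __init__(self, name: str, replicas: int):
        self.name = name
        self.replicas = replicas  # desired number of pods
        self.pods = []            # pods currently "running"
        self._next_id = 1

    def reconcile(self) -> list:
        """One pass of the control loop: make the actual state match the desired state."""
        while len(self.pods) < self.replicas:
            # Too few pods: create a replacement with a fresh name.
            self.pods.append(f"{self.name}-{self._next_id}")
            self._next_id += 1
        while len(self.pods) > self.replicas:
            # Too many pods: remove the extras.
            self.pods.pop()
        return list(self.pods)

    def pod_died(self, pod: str) -> None:
        """Simulate a pod disappearing for some reason."""
        self.pods.remove(pod)


rs = ReplicaSetSketch("web", replicas=3)
initial = rs.reconcile()   # creates web-1, web-2, web-3
rs.pod_died("web-2")       # one pod goes away
healed = rs.reconcile()    # the loop notices and creates web-4
```

Note that the dead pod is not revived; a new pod with a new identity takes its place, which matches how the text describes the controller's behaviour.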
Deployment
Deployment (with messy hair) trying to control the ReplicaSet.
A Deployment is a higher-order object which uses a ReplicaSet to manage replicas. It provides rolling upgrades by scaling up a new ReplicaSet and scaling down (eventually removing) the existing ReplicaSet.
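The scale-up / scale-down dance can be traced with a few lines of arithmetic. The sketch below is a simplification under stated assumptions: `rolling_update` is a hypothetical helper, its `max_surge` parameter only loosely mirrors the `maxSurge` setting of a real Deployment, and it ignores pod readiness and `maxUnavailable` entirely.

```python
def rolling_update(old_replicas: int, desired: int, max_surge: int = 1) -> list:
    """Return the (old, new) replica counts after each step of a rolling update."""
    if max_surge < 1:
        raise ValueError("this sketch needs max_surge >= 1 to make progress")
    old, new = old_replicas, 0
    steps = [(old, new)]
    while new < desired or old > 0:
        # Scale the new ReplicaSet up, never exceeding desired + max_surge pods in total.
        room = desired + max_surge - (old + new)
        up = min(room, desired - new)
        if up > 0:
            new += up
            steps.append((old, new))
        # Scale the old ReplicaSet down so the total drops back to the desired count.
        down = min(old, max(old + new - desired, 0))
        if down > 0:
            old -= down
            steps.append((old, new))
    return steps


steps = rolling_update(old_replicas=3, desired=3)
```

Each tuple in `steps` is a snapshot of (pods in the old ReplicaSet, pods in the new ReplicaSet), so the total never drops below the desired count while the new version is rolled in.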
Service
Service represented as a drone delivering data packets to corresponding pods.
A Service is a controller object whose prime responsibility is to work as a load balancer, distributing the “packets” to the corresponding pods. It is basically a controller constructed to group similar pods (usually identified by pod labels) across worker nodes.
Say your “front-end” app wants to communicate with your “back-end” app, and there are many running instances of each. Instead of worrying about hard-coding the IPs of every back-end pod, you send the data packets to the back-end service, which then decides how to load balance and forwards them accordingly.
PS: Note that a service is more like a virtual entity, as all the packet routing is handled by iptables, IPVS or the CNI plugin. It just makes it easier to think of it as a real entity sitting out there in order to understand its role in the Kubernetes ecosystem.
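To make the "group pods by label, then spread the traffic" idea concrete, here is a small in-memory sketch. `ServiceSketch`, the pod dictionaries and the IP addresses are all invented for illustration, and round-robin is used only because it is the simplest policy to show; the real packet routing is done by iptables or IPVS rules, which pick backends in their own way.

```python
import itertools


class ServiceSketch:
    """Groups pods by label and hands out their IPs in round-robin order."""

    def __init__(self, selector: dict):
        self.selector = selector          # labels a pod must carry to belong here
        self._counter = itertools.count()

    def endpoints(self, pods: list) -> list:
        """IPs of all pods whose labels contain every selector key/value pair."""
        return [
            pod["ip"]
            for pod in pods
            if all(pod["labels"].get(k) == v for k, v in self.selector.items())
        ]

    def route(self, pods: list) -> str:
        """Pick the backend IP for the next incoming request."""
        targets = self.endpoints(pods)
        if not targets:
            raise LookupError("no pods match the selector")
        return targets[next(self._counter) % len(targets)]


# Hypothetical pods spread across worker nodes.
pods = [
    {"ip": "10.0.1.4", "labels": {"app": "back-end"}},
    {"ip": "10.0.2.7", "labels": {"app": "back-end"}},
    {"ip": "10.0.1.9", "labels": {"app": "front-end"}},
]
backend = ServiceSketch({"app": "back-end"})
picks = [backend.route(pods) for _ in range(3)]
```

The caller only ever talks to `backend`; which pod answers is decided per request, so pods can come and go without the front end knowing their IPs.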
Ingress
Ingress represented as a floating platform through which all the packets flow into the cluster.
The Ingress controller is the single point of contact through which the outside world talks to all the services running inside the cluster.
This makes it easy for us to set security policies, monitoring and even logging in a single place.
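As a rough illustration of why a single entry point is convenient, the sketch below routes requests to services by path prefix and records every request in one access log. The rule table, the service names and the `default-backend` fallback are assumptions made for the example, not taken from a real Ingress configuration.

```python
def route_request(rules: dict, path: str, access_log: list) -> str:
    """Return the service name for a request path, logging every request in one place."""
    access_log.append(path)  # single choke point: easy to log or monitor
    matches = [
        prefix
        for prefix in rules
        # Match the prefix exactly or on a "/" boundary, so "/api" does not catch "/apiary".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/")
    ]
    if not matches:
        return "default-backend"  # hypothetical catch-all service
    return rules[max(matches, key=len)]  # the most specific (longest) prefix wins


# Hypothetical routing table: path prefix -> service inside the cluster.
rules = {"/": "front-end", "/api": "back-end"}
log = []
targets = [route_request(rules, p, log) for p in ("/index.html", "/api/users", "/apiary")]
```

Because every request passes through `route_request`, the access log (or a security check placed next to it) sees all traffic without any of the services having to implement it themselves.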
P.S: There are a lot of other controller objects in Kubernetes, like DaemonSets, StatefulSets and Jobs. There are also objects like Secrets and ConfigMaps that are used to store application secrets and configuration.
I hope you will find it useful.