Rolling Updates and Rollbacks in Kubernetes: Managing Application Updates

Last Updated : 01 Mar, 2024

Many websites and apps now run in lightweight, isolated packages called containers, spread across clusters of machines. Containers let apps keep running smoothly while teams update and improve the software behind them. A system called Kubernetes manages and updates all those containerized apps. Sometimes app updates go wrong or cause problems for users, so Kubernetes has clever ways of updating apps that avoid those issues.


The main method is called a rolling update. This gradually switches the software behind the scenes from an old version to a new one: a few containers at a time are updated, and Kubernetes checks that each small batch works before updating more. That means no downtime for users! Another useful capability is rollbacks: if a new software version causes glitches, Kubernetes can quickly revert to the previous stable version. No need for a website to crash or stay broken!

Rolling updates and rollbacks in Kubernetes

Rolling Updates: When we update our application in Kubernetes, we want to avoid downtime where users experience errors or outages. So instead of updating everything instantly, Kubernetes does “rolling updates”. It’s like changing clothes by putting on a new pair of pants one leg at a time instead of standing naked to swap your whole outfit all at once!

With rolling updates, Kubernetes slowly replaces individual instances of our application with updated ones. It takes down old ones, brings up new ones in their place, and repeats until everything is updated.

Rollbacks: Another key feature is rollbacks. We all make mistakes changing code that lead to bugs! Rollbacks are like an “undo” button – they revert the application back to the safe, stable previous version. If one of our shiny new updates starts failing, Kubernetes rolls back by removing the buggy updated app instances and scaling the last good version back up. Crisis averted!

Overall, rolling updates help avoid disruption, while automated rollbacks save us from new bugs. They let us iterate and deploy new code faster and more safely. Pretty handy features!

Why do we use rolling updates and rollbacks in Kubernetes?

Rolling updates are the standard way to upgrade apps managed by Kubernetes, avoiding headaches for both users and site owners. The biggest benefit is zero downtime during the upgrade: the site or app remains available throughout the process. Containers running the old software version keep serving until they are gradually replaced by new ones, so users enjoy continuous access.

This compares extremely well to regular software updates. Those require completely stopping the old app, installing the new version, and restarting it. That whole process means taking the app offline for minutes or even hours in some cases. Customers get annoyed by downtime interrupting their work or entertainment!

Rolling updates minimize downtime:

Rolling updates also phase in changes gradually behind the scenes. For example, a website redesign goes live 10 or 20 percent at a time across the container cluster, so if bugs pop up in the new version, only some users are affected before fixes can be made. An all-at-once update, by contrast, could cripple the whole website if it wasn't tested thoroughly.

Continuous deployment, no downtime:

So when is this clever Kubernetes approach especially useful? Continuous integration and deployment workflows rely on frequent, incremental improvements; companies like Amazon and Netflix update their apps multiple times per day, and zero downtime is essential to avoid losing sales or customers along the way. Apps demanding extremely high reliability and uptime also benefit: banking systems, cloud infrastructure, communication platforms. Rolling updates let them evolve without service interruptions during business hours or critical loads. In short, rolling updates minimize headaches for everyone running or using containerized apps. So why take things offline to upgrade when you don't have to?

How Rolling Updates Work

The rolling update process in Kubernetes:

  • Kubernetes handles rolling updates through Deployments, which manage ReplicaSets – groups of identical pods, the units that run containerized application instances.
  • Let's look at a simple example: say a website has 6 pod replicas behind a load balancer, all running containers from version 1.0 of the website software.
  • Kubernetes starts the update by creating a parallel ReplicaSet for version 2.0, while keeping the old 1.0 one.
  • The update begins by spinning up 1 new pod on version 2.0, while traffic still goes mainly to the 6 pods on 1.0. Kubernetes checks that the new pod comes up healthy before continuing. The rolling update then creates another 2.0 pod, ramping up to 2 of 7 pods on the new version, while in parallel scaling down the 1.0 ReplicaSet by removing one of its pods.
  • This gradual ramp-up of new pods while winding down old ones proceeds step by step. Readiness checks on the new pods – plus whatever monitoring you run on response times, error rates, and resource usage – guard the process; if new pods fail their checks, the rollout stalls automatically.
  • Eventually the version 2.0 pods take over as the last few 1.0 pods are removed. In the best case, the whole transition completes with no downtime for users.
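
In practice, a rolling update kicks off whenever the Deployment's pod template changes, most commonly its image tag. A minimal sketch of triggering and watching one (the deployment and container names here are illustrative):

$ kubectl set image deployment/website website=myaccount/website:2.0
$ kubectl rollout status deployment/website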

Kubernetes lets you control the pace and safety checks of a rollout through parameters on the Deployment (see the sketch after this list):

  • Max Unavailable – The maximum number (or percentage) of pods allowed to be offline during the update, e.g. 1.
  • Max Surge – How many extra pods may be created temporarily above the desired count during the transition.
  • Readiness probes – Health checks each new pod must pass before it receives live traffic; failing pods stall the rollout.
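
A minimal sketch of how these knobs appear in a Deployment spec; the values mirror the 6-replica example above:

spec:
  replicas: 6
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one pod offline at a time
      maxSurge: 1         # at most one extra pod during the transition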

With all this power and flexibility, updates can roll out steadily at an organization's preferred pace. And if things go sideways… just roll back!

Performing Rollbacks

Sometimes new versions of apps don't work as expected. Bugs slip through testing, an unexpected spike in traffic slows things down, or a change unintentionally breaks an important flow. When issues pop up, Kubernetes lets you rapidly roll a deployment back to a previous stable release. No need to leave users frustrated or lose business while debugging!

There are two ways to do rollbacks:

  • Manual: The administrator notices problems and directly commands Kubernetes to undo the update (see the commands after this list), rolling back to the revision just before the troublesome release. Kubernetes handles recreating the ReplicaSet for the old version and phasing pods back smoothly.
  • Automatic: With progressive-delivery tooling layered on top, the cluster can monitor custom metrics like response-time thresholds and trigger a rollback automatically when they degrade past a specified value. Kubernetes itself will also flag a rollout as failed if it stops making progress, giving such automation a clear signal to revert.
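
For the manual path, a rollback comes down to a couple of kubectl commands; a sketch using an illustrative deployment name:

$ kubectl rollout history deployment/website
$ kubectl rollout undo deployment/website
$ kubectl rollout undo deployment/website --to-revision=2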

Additional options help control rollbacks:

  • Revision limit: How many old versions to keep in the deployment history, and therefore how far back rollbacks can go.
  • Timeout: How long a rollout may stall before Kubernetes flags it as failed, giving a signal to reverse course (both map to Deployment fields, as sketched below).
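
A sketch of the corresponding Deployment spec fields, with typical values:

spec:
  revisionHistoryLimit: 10        # keep ten old ReplicaSets for rollbacks
  progressDeadlineSeconds: 600    # flag the rollout as failed after 10 minutes stuck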

With these capabilities, bad updates don't have to spell doom. Kubernetes keeps things running for users while giving you breathing room to rework changes. So don't fear upgrades: either undo them manually or program the cluster to watch and self-correct.

Deploying a Node.js sample application on Kubernetes

1. Sample application:

The goal is to have a simple Node.js web application that prints “Hello World!” when accessed via HTTP. The first part is the Dockerfile. This defines how to package up the application into a Docker container image that can run standalone.

Some key points:

  • FROM node:16-alpine: Builds on top of an existing Node.js Alpine Linux image, giving us Node pre-installed on a slim Linux OS.
  • WORKDIR /usr/app: Sets the default directory for the following commands. Good practice for organization.
  • COPY app.js .: Copies our application code file from the host into the image directory.
  • CMD ["node", "app.js"]: Runs the application using the node command to execute our app.js code.

Dockerfile:

FROM node:16-alpine
WORKDIR /usr/app
COPY app.js .
CMD ["node", "app.js"]

Next is the app.js code itself:

  1. const http = require('http'): Imports the Node.js HTTP server library
  2. createServer(): Creates a new HTTP server instance
  3. res.end('Hello World!'): Sends the “Hello World!” text when a request is received
  4. server.listen(3000): Starts the HTTP server on port 3000

In summary, the Dockerfile packages up the Node app, while the app.js prints “Hello World!” via a simple HTTP server on port 3000 for us to access the greeting.

const http = require('http');

const server = http.createServer((req, res) => {
  res.end('Hello World!');
});

server.listen(3000);

Container Image

$ docker build -t myaccount/hello-world:v1 .
$ docker push myaccount/hello-world:v1

The basic flow is:

  1. Package application code into container image
  2. Push image to a registry like Docker Hub
  3. Kubernetes will pull image onto nodes to deploy app

The docker build command:

  • -t myaccount/hello-world:v1: Tags image with repository name and version tag
  • .: Uses the current directory as the build context, where the Dockerfile and app files live
  • Packages the app files into a portable image layered on the Node.js base

Next, docker push:

  • Publishes the application image to Docker Hub repository
  • Requires Docker Hub account and repository for image storage
  • Uploads our built image using account credentials and target repo name
  • Allows Kubernetes cluster to access image from anywhere

The major benefit of this image build/publish flow is decoupling application packaging from deployment. The image serves as a fixed, portable unit that can then run anywhere.

So in summary, we take app code → containerize into image → push to registry → deploy image onto Kubernetes. This enables a consistent environment for the app across infra.

Kubernetes Manifests

The Kubernetes deployment and service YAML definitions are key to actually running our containerized application on the cluster:

Deployment:

The Deployment resource defines the desired state for running our app:

  • selector and labels map deployment to pods using app: hello-world labels
  • spec.template defines the pod structure and container to run
  • image: myaccount/hello-world:v1 tells each pod to run our container image
  • ports exposes container port 3000 to cluster networking

This deployment lets Kubernetes know how to run and distribute our application as pods across nodes.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
spec:
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world
        image: myaccount/hello-world:v1
        ports:
        - containerPort: 3000

Service:

apiVersion: v1
kind: Service
metadata:
  name: hello-world
spec:
  selector:
    app: hello-world
  ports:
  - port: 80
    targetPort: 3000

The Service provides networking for accessing the deployment:

  • selector targets deployment pods with app: hello-world labels
  • Creates single access point to pods for hello-world app
  • targetPort: 3000 directs traffic internally across pods
  • port: 80 is where external requests are received for the app

So the Service exposes and load-balances traffic to our deployment, giving the app reliable networking. Together, the manifests package up the “what and how” for Kubernetes to run our container image as application pods accessible to users.

Deploying the application

$ kubectl apply -f deployment.yaml -f service.yaml
  • The kubectl tool sends the deployment and service definitions to the Kubernetes API server. This central control plane processes the apply request.
  • The API server then persists the deployment and service objects into etcd (the cluster database). This makes our desired state of running the containerized application part of the current known cluster state. The Kubernetes controllers notice the new deployment and service have been created. The controllers manage reconciling actual vs desired state.
  • So the deployment controller spins up pods on the cluster nodes to match the pod template from the deployment YAML. It spreads them across nodes for high availability.
  • The service controller allocates a cluster IP and configures network routing to map traffic from that IP to the pods, making the application accessible inside the cluster. Finally, the kube-proxy component on each node forwards application requests arriving at the service IP to the backing pods, based on updated iptables rules.
  • So in summary, kubectl apply persists our state definitions, and the control plane works to reconcile actual state toward that goal, configuring fundamental resources like pods and networking. The cluster continuously works to match reality to the desired state we specified in the YAMLs! (Verification commands follow.)
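
Once applied, a few standard kubectl commands confirm the result, using the names from the manifests above:

$ kubectl rollout status deployment/hello-world
$ kubectl get pods -l app=hello-world
$ kubectl get service hello-world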

Best Practices for Rolling Updates:

Kubernetes makes it easier than ever to smoothly upgrade apps, but some planning and care still make the process less risky. Here are some best practices:

Deployment Patterns

The basic rolling update works fine. More advanced patterns add extra safety nets:

  1. Blue-Green: Launch the new version in a separate environment first, while the old one keeps serving. Route a small slice of traffic to the new environment as a soft launch. If issues emerge, there's no impact: just send all traffic back to the old version, still intact!
  2. Canary: Similar idea – slowly send an increasing percentage of traffic to the new version, watching metrics carefully with each ramp-up. Roll back or pause easily if needed (a label-based sketch follows).
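
One way to approximate a canary with plain Kubernetes objects is label-based traffic splitting: run a second, smaller Deployment whose pods carry the same app: hello-world label the Service selects, so the canary receives a share of traffic roughly proportional to its replica count. A sketch reusing the hello-world example (the track label and v2 tag are illustrative; in a real setup the stable Deployment would carry track: stable so the two selectors don't overlap):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world-canary
spec:
  replicas: 1           # small slice of total traffic
  selector:
    matchLabels:
      app: hello-world
      track: canary
  template:
    metadata:
      labels:
        app: hello-world   # matched by the existing Service
        track: canary
    spec:
      containers:
      - name: hello-world
        image: myaccount/hello-world:v2
        ports:
        - containerPort: 3000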

Rollout Pace & Checks

Take it slow, especially for big changes! Complex new microservices or dramatic rearchitecting deserve a gradual, cautious rollout:

  • Max Unavailable: Keep this low, maybe just 10% of pods updating at once
  • Max Surge: Allow a buffer of maybe 2 extra pods to handle traffic spikes
  • Readiness Checks: Ensure new pods work before sending them live traffic

Monitor Closely

Watch health metrics for individual pods and the overall workload:

  • Resource Usage: CPU, RAM, disk fills
  • Application Metrics: Response times, error rates
  • Logs: Errors, warnings, stack traces indicating bugs

Dashboards centralize all this into clear views on which to base rollback decisions.

Implement Health Checks

Make sure the app itself reports failures and self-heals using:

  • Liveness Probes: Did the container crash or hang unexpectedly? Restart it!
  • Readiness Probes: Is the app ready to receive traffic after starting up?

This gives Kubernetes a reliable signal for detecting issues instead of guessing.
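
A minimal sketch of both probes for the hello-world container from earlier, assuming the app answers HTTP on / at port 3000; these fields sit alongside image and ports in the container spec:

livenessProbe:          # restart the container if this starts failing
  httpGet:
    path: /
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:         # only send traffic once this passes
  httpGet:
    path: /
    port: 3000
  initialDelaySeconds: 3
  periodSeconds: 5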

Debugging & Partial Rollbacks

If a bad update makes it halfway through rollout before catching issues:

  1. Debug only the new pods having problems, don’t disturb all users
  2. Consider partially rolling back just the troubled services, not everything (pausing the rollout, as sketched below, buys time to investigate)
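
Kubernetes can pause a Deployment mid-rollout so nothing else changes while you investigate; a sketch with the hello-world name again:

$ kubectl rollout pause deployment/hello-world
$ kubectl describe pods -l app=hello-world   # inspect the troubled new pods
$ kubectl rollout resume deployment/hello-world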

Following these practices smooths updates and makes issues easier to manage when they do strike. Proactively add safety checks at likely failure points instead of reacting to complaints or outages. Kubernetes empowers organizations to confidently improve apps continuously.

Conclusion

Summary of rolling update and rollback capabilities of Kubernetes:

  • Kubernetes provides powerful capabilities for rolling out application changes smoothly and minimizing downtime. Features like rolling updates, rollbacks, and advanced deployment strategies make continuous delivery easier.
  • The key benefit is avoiding disruptive traditional update methods that break or offline applications. Rolling updates incrementally replace old software versions with new ones without service interruption. Customizable pacing and automatic safety checks spot issues early. If problems do occur, one-click rollbacks revert applications to previous stable states. No need to scramble fixing things while users or customers suffer extended outages.
  • These capabilities perfectly fit today’s rapid software improvement workflows. Teams can ship changes faster and more frequently, avoiding bottlenecks. Applications evolve effortlessly without annoying users with constant downtime.

The benefits for continuous delivery and zero-downtime deployments:

  • The end result is developers can focus on coding new features and fixes rather than deployment headaches. Operations teams spend less time stressed and overloaded when migrations inevitably hit snags.
  • And end users enjoy familiar apps and services that refresh with innovations without interrupting their daily workflows and habits at all.
  • Kubernetes has revolutionized how applications get deployed, updated, and operated. Rolling updates and intelligent rollbacks are critical to enabling continuous delivery and zero-downtime environments. The future of software looks smooth and seamless thanks to these capabilities!

Rolling Updates and Rollbacks in Kubernetes – FAQs

What are rolling updates and rollbacks? How are they done on Kubernetes?

Rolling updates incrementally replace pods behind the scenes with new software versions, without downtime. Rollbacks revert changes and restore the previous stable version after issues emerge. Kubernetes manages both through declarative Deployment configuration.

When are rolling updates in Kubernetes useful?

Rolling updates shine for continuously deployed applications and systems requiring high uptime during updates, like e-commerce and banking. They minimize downtime risks.

How does Kubernetes perform rolling updates?

Kubernetes uses ReplicaSets to maintain parallel old and new pod groups, then shifts traffic from old pods to new ones in batches while checking health metrics.

What common issues prompt rollbacks of Kubernetes deployments?

Bugs, performance problems under traffic spikes, broken workflows, crashes, scaling problems, security issues, failed integrations with other updated services, and so on.

How fast should updates roll out?

Start slow with reasonable pacing between pod batch updates, especially for big changes. Observe monitoring dashboards for issues. Optimizing rollout speeds comes later.

What safeguards help catch update problems early?

Readiness/liveness checks, health endpoints, synthetic monitoring, dashboards with metrics and logs visibility, automated alerts, and manually verifying functionality.

What deployment patterns complement rolling updates?

Canary testing, blue-green environments, and using feature flags to toggle on/off new functionality. All minimize risk.


