{"id":268,"date":"2021-08-31T14:40:46","date_gmt":"2021-08-31T14:40:46","guid":{"rendered":"https:\/\/fde.cat\/?p=268"},"modified":"2021-08-31T14:40:46","modified_gmt":"2021-08-31T14:40:46","slug":"zero-downtime-node-patching-in-a-kubernetes-cluster","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/zero-downtime-node-patching-in-a-kubernetes-cluster\/","title":{"rendered":"Zero Downtime Node Patching in a Kubernetes Cluster"},"content":{"rendered":"<p><em>Authors: <\/em><a href=\"https:\/\/www.linkedin.com\/in\/vaishnavigalgali\/\"><em>Vaishnavi Galgali<\/em><\/a><em>, <\/em><a href=\"https:\/\/www.linkedin.com\/in\/arpeetkale\/\"><em>Arpeet Kale<\/em><\/a><em>, <\/em><a href=\"https:\/\/www.linkedin.com\/in\/roboxue\/\"><em>Robert\u00a0Xue<\/em><\/a><\/p>\n<h3>Introduction<\/h3>\n<p>The Salesforce Einstein Vision and Language services are deployed in an AWS Elastic Kubernetes Service (EKS) cluster. One of the primary security and compliance requirements is operating system patching. The cluster nodes that the services are deployed on need to have regular operating system updates. Operating system patching mitigates vulnerabilities that may expose the virtual machines to\u00a0attacks.<\/p>\n<figure><img decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1000\/1*HTobdR8Am3n4E_03kfqLQw.png?w=750&#038;ssl=1\" data-recalc-dims=\"1\"><\/figure>\n<h4>Patching Process<\/h4>\n<p>Einstein services are deployed as Kubernetes pods on an immutable EC2 node group, also known as an AWS AutoScaling Group (ASG). The patching process involves building a new Amazon Machine Image (AMI) that contains all of the updated security patches. The new AMI is used to update the node group, which involves launching new EC2 instances, one at a time. As the new instance passes all the health checks, one of the old instances is terminated. This process continues until all of the EC2 instances in the node group are replaced. 
This is also known as a rolling update.<\/p>\n<p>However, this patching process introduces a challenge. As the old EC2 instances are terminated, the service pods running on those instances are terminated with them. Unless pod termination is handled gracefully, this may lead to failures for any user requests being processed at the time of termination. Graceful termination of a pod involves both infrastructure components (the Kubernetes API and AWS ASGs) and application components (the service\/app container).<\/p>\n<h3>Graceful Application Termination<\/h3>\n<p>In this process, the application is gracefully terminated first. Without precautions, terminating a pod abruptly kills the Docker container in the pod, and with it the process running inside. Any requests being processed at that moment are dropped, eventually leading to failures for any upstream service calling the application at that time.<\/p>\n<p>When an EC2 instance is terminated as part of the patching process, the pods on that instance are evicted. This marks the pods for termination, and the kubelet running on that instance commences the pod shutdown process. As part of the pod shutdown, the kubelet sends a SIGTERM signal to each container. If the application running in the pod isn\u2019t configured to handle the SIGTERM signal, any running tasks may be cut short. 
Therefore, you want to update your application to handle this signal and shut down gracefully.<\/p>\n<p>For example, in the case of a Java application, here\u2019s one way to address graceful termination (the details differ from framework to framework):<\/p>\n<pre>public static final int gracefulShutdownTimeoutSeconds = 30;<\/pre>\n<pre>@Override<br>public void onApplicationEvent(@NotNull ContextClosedEvent contextClosedEvent) {<br>    // Stop accepting new connections, then drain the request executor.<br>    this.connector.pause();<br>    Executor executor = this.connector.getProtocolHandler().getExecutor();<br>    if (executor instanceof ThreadPoolExecutor) {<br>        try {<br>            ThreadPoolExecutor threadPoolExecutor = (ThreadPoolExecutor) executor;<br>            threadPoolExecutor.shutdown();<br>            logger.warn(\"Gracefully shutting down the service.\");<br>            if (!threadPoolExecutor.awaitTermination(gracefulShutdownTimeoutSeconds, TimeUnit.SECONDS)) {<br>                logger.warn(\"Forcefully shutting down the service after {} seconds.\", gracefulShutdownTimeoutSeconds);<br>                threadPoolExecutor.shutdownNow();<br>            }<br>        } catch (InterruptedException ex) {<br>            Thread.currentThread().interrupt();<br>        }<br>    }<br>}<\/pre>\n<p>In the above snippet, an orderly shutdown is initiated and the executor is given 30 seconds to finish any in-flight tasks before it is forcefully terminated.<\/p>\n<p>If the pod consists of multiple containers, and the order of container termination matters, then define a container <a href=\"https:\/\/kubernetes.io\/docs\/concepts\/containers\/container-lifecycle-hooks\/\">preStop hook<\/a> to ensure that the containers are terminated in the correct sequence (for example, terminating an application container before terminating a logging sidecar container).<\/p>\n<p>During the process of pod shutdown, the kubelet runs container lifecycle hooks, if defined. 
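The drain-on-SIGTERM pattern above is not Java-specific. As a minimal, illustrative sketch (the class and function names here are our own, not from the post), the same idea in Python is a handler that only flips a flag, with the worker loop checking that flag before picking up new work:

```python
import os
import signal

class GracefulShutdown:
    """Records that SIGTERM arrived so the main loop can drain and exit."""

    def __init__(self):
        self.requested = False
        # Replace the default SIGTERM behavior (immediate exit) with a flag flip.
        signal.signal(signal.SIGTERM, self._handle)

    def _handle(self, signum, frame):
        # Do no real work in the handler; clean up in the main loop instead.
        self.requested = True

def run_worker(process_next_task, shutdown):
    """Serve tasks until shutdown is requested, finishing the current task."""
    while not shutdown.requested:
        process_next_task()
```

Because the in-flight task runs to completion before the loop re-checks the flag, a SIGTERM from the kubelet (or a preStop hook) stops the worker without dropping work, as long as each task finishes within the grace period.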
In our case, we have multiple containers in the same pod, so the order of termination matters for us. We define the preStop hook for our application containers as shown\u00a0below:<\/p>\n<pre>lifecycle:<br>    preStop:<br>        exec:<br>          command:<br>          - \/bin\/sh<br>          - -c<br>          - kill -SIGTERM 1 &amp;&amp; while ps -p 1 &gt; \/dev\/null; do sleep 1; done;<\/pre>\n<p>The action defined in the preStop hook above sends a SIGTERM signal to the process running in the Docker container (PID 1) and then polls once per second until the process has exited. This allows the process to complete any pending tasks and terminate gracefully.<\/p>\n<p>The default termination grace period is 30 seconds, which, in our case, gives the process enough time to terminate gracefully. If the default isn\u2019t sufficient, you can extend it with the terminationGracePeriodSeconds field in the pod\u00a0spec.<\/p>\n<h3>Graceful EC2 Instance Termination<\/h3>\n<p>As mentioned above, our services run on node groups of EC2 instances. Graceful EC2 instance termination can be achieved by using AWS ASG lifecycle hooks and an AWS Lambda\u00a0function.<\/p>\n<h4>AWS EC2 Auto Scaling Lifecycle Hooks<\/h4>\n<p>Lifecycle hooks pause an instance before it finishes launching or terminating so that custom actions can be performed. Once the instance is paused, you can complete the lifecycle action by triggering a Lambda function or running commands on the instance. The instance remains in the wait state until the lifecycle action is completed.<\/p>\n<p>We use the <strong>Terminating:Wait <\/strong>lifecycle hook to put the instance to be terminated in the <strong>WAIT <\/strong>state. 
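Completing the lifecycle action is what releases the instance from the WAIT state. Here is a hedged sketch of that step (the ASG, hook, and instance names are invented for illustration; CompleteLifecycleAction itself is the real Auto Scaling API):

```python
def lifecycle_completion_params(asg_name, hook_name, instance_id):
    """Arguments for the Auto Scaling CompleteLifecycleAction call.

    For a Terminating:Wait hook, both CONTINUE and ABANDON end the wait
    and let the ASG proceed with terminating the (now drained) instance;
    CONTINUE signals that the custom drain action succeeded.
    """
    return {
        "AutoScalingGroupName": asg_name,
        "LifecycleHookName": hook_name,
        "InstanceId": instance_id,
        "LifecycleActionResult": "CONTINUE",
    }

# Inside a Lambda handler (needs boto3 and AWS credentials), roughly:
# boto3.client("autoscaling").complete_lifecycle_action(
#     **lifecycle_completion_params("eks-node-group-asg", "drain-hook", "i-0abc12345"))
```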
For more details on ASG lifecycle hooks, see the AWS\u00a0<a href=\"https:\/\/docs.aws.amazon.com\/autoscaling\/ec2\/userguide\/lifecycle-hooks.html\">docs<\/a>.<\/p>\n<h4>AWS Lambda<\/h4>\n<p>We use the <a href=\"https:\/\/aws.amazon.com\/serverless\/sam\/\">SAM<\/a> framework to deploy a Lambda function (built in-house; we call it node-drainer) that is triggered on specific ASG lifecycle hook events. The following diagram shows the sequence of events involved in gracefully terminating an EC2 instance in the node\u00a0group.<\/p>\n<figure><img decoding=\"async\" alt=\"\" src=\"https:\/\/i0.wp.com\/cdn-images-1.medium.com\/max\/1024\/1*eP2x_IW0viwOrCLyzmR2hQ.jpeg?w=750&#038;ssl=1\" data-recalc-dims=\"1\"><\/figure>\n<ul>\n<li>When the patching automation requests the instance termination, the lifecycle hook kicks in and puts the instance in the <strong>Terminating:Wait <\/strong>state.<\/li>\n<li>Once the instance is in the <strong>Terminating:Wait<\/strong> state, the lifecycle hook event triggers the node-drainer AWS Lambda function.<\/li>\n<li>The Lambda function calls Kubernetes APIs and <a href=\"https:\/\/kubernetes.io\/docs\/concepts\/architecture\/nodes\/#manual-node-administration\">cordons<\/a> the terminating instance. 
Cordoning the instance prevents any new pods from launching on the terminating instance.<\/li>\n<li>Once the instance is cordoned, all the pods on that instance are evicted, and replacements are scheduled onto healthy\u00a0nodes.<\/li>\n<li>Kubernetes takes care of bringing up the new pods on healthy instances.<\/li>\n<li>The lifecycle hook waits until all the pods are evicted from the instance and the new pods come up on a healthy instance.<\/li>\n<li>Once the node is drained completely, the lifecycle hook removes the <strong>WAIT <\/strong>on the node being terminated and continues with the termination.<\/li>\n<li>This ensures that all the existing requests are completed before the pods are evicted from the\u00a0node.<\/li>\n<li>While doing this, we ensure that new healthy pods are up to serve new\u00a0traffic.<\/li>\n<li>This graceful shutdown process ensures that no pods are abruptly shut down and there is no service disruption.<\/li>\n<\/ul>\n<h3>RBAC<\/h3>\n<p>To access Kubernetes resources from the AWS Lambda function, we create an IAM role, a ClusterRole, and a ClusterRoleBinding. The IAM role grants permission to access ASGs. 
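For concreteness, the cordon and eviction requests described in the sequence above can be sketched as follows (illustrative only; the helper names are ours, while the commented calls are real methods of the official Kubernetes Python client):

```python
def cordon_patch():
    """Patch body that marks a node unschedulable (i.e., cordons it)."""
    return {"spec": {"unschedulable": True}}

def eviction_body(pod_name, namespace):
    """Eviction object submitted for each pod on the draining node.

    Using the Eviction subresource (rather than deleting pods directly)
    makes the drain respect PodDisruptionBudgets. policy/v1 is the GA
    API group; clusters older than 1.22 use policy/v1beta1.
    """
    return {
        "apiVersion": "policy/v1",
        "kind": "Eviction",
        "metadata": {"name": pod_name, "namespace": namespace},
    }

# With the kubernetes Python client, roughly:
# v1 = kubernetes.client.CoreV1Api()
# v1.patch_node(node_name, cordon_patch())
# v1.create_namespaced_pod_eviction(pod_name, namespace, eviction_body(pod_name, namespace))
```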
The clusterrole and clusterrolebinding grant the node-drainer Lambda function permissions for Kubernetes Pod eviction.<\/p>\n<p> <strong>IAM role\u00a0policy<\/strong><\/p>\n<pre>{<br>    \"Version\": \"2012-10-17\",<br>    \"Statement\": [<br>        {<br>            \"Action\": [<br>                \"autoscaling:CompleteLifecycleAction\",<br>                \"ec2:DescribeInstances\",<br>                \"eks:DescribeCluster\",<br>                \"sts:GetCallerIdentity\"<br>            ],<br>            \"Resource\": \"*\",<br>            \"Effect\": \"Allow\"<br>        }<br>    ]<br>}<\/pre>\n<p><strong>Clusterrole<\/strong><\/p>\n<pre>kind: ClusterRole<br>apiVersion: rbac.authorization.k8s.io\/v1<br>metadata:<br>  name: lambda-cluster-access<br>rules:<br>  - apiGroups: [\"\"]<br>    resources: [\"pods\", \"pods\/eviction\", \"nodes\"]<br>    verbs: [\"create\", \"list\", \"patch\"]<\/pre>\n<p><strong>Clusterrolebinding<\/strong><\/p>\n<pre>kind: ClusterRoleBinding<br>apiVersion: rbac.authorization.k8s.io\/v1<br>metadata:<br>  name: lambda-user-cluster-role-binding<br>subjects:<br>  - kind: User<br>    name: lambda<br>    apiGroup: rbac.authorization.k8s.io<br>roleRef:<br>  kind: ClusterRole<br>  name: lambda-cluster-access<br>  apiGroup: rbac.authorization.k8s.io<\/pre>\n<h3>Conclusion<\/h3>\n<p>With the combination of AWS Lambda, AWS EC2 AutoScaling Lifecycle hooks, and graceful application process termination, we ensure zero downtime while replacing our EC2 instances frequently during patching.<\/p>\n<p>Please <a href=\"mailto:einstein.ai.platform@salesforce.com\">reach out to us<\/a> with any questions, or if there is something you\u2019d be interested in discussing that we haven\u2019t\u00a0covered.<\/p>\n<h3>References<\/h3>\n<ul>\n<li><a href=\"https:\/\/docs.aws.amazon.com\/autoscaling\/ec2\/userguide\/lifecycle-hooks.html\">https:\/\/docs.aws.amazon.com\/autoscaling\/ec2\/userguide\/lifecycle-hooks.html<\/a><\/li>\n<li><a 
href=\"https:\/\/aws.amazon.com\/serverless\/sam\/\">https:\/\/aws.amazon.com\/serverless\/sam\/<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/aws-samples\/amazon-k8s-node-drainer\">https:\/\/github.com\/aws-samples\/amazon-k8s-node-drainer<\/a><\/li>\n<\/ul>\n<hr>\n<p><a href=\"https:\/\/engineering.salesforce.com\/zero-downtime-node-patching-in-a-kubernetes-cluster-cdceb21c8c8c\">Zero Downtime Node Patching in a Kubernetes Cluster<\/a> was originally published in <a href=\"https:\/\/engineering.salesforce.com\/\">Salesforce Engineering<\/a> on Medium, where people are continuing the conversation by highlighting and responding to this story.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Authors: Vaishnavi Galgali, Arpeet Kale, Robert\u00a0Xue Introduction The Salesforce Einstein Vision and Language services are deployed in an AWS Elastic Kubernetes Service (EKS) cluster. One of the primary security and compliance requirements is operating system patching. The cluster nodes that the services are deployed on need to have regular operating system updates. 
Operating system patching&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2021\/08\/31\/zero-downtime-node-patching-in-a-kubernetes-cluster\/\">Continue reading <span class=\"screen-reader-text\">Zero Downtime Node Patching in a Kubernetes Cluster<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[7],"tags":[],"class_list":["post-268","post","type-post","status-publish","format-standard","hentry","category-technology","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":454,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/looking-at-the-kubernetes-control-plane-for-multi-tenancy\/","url_meta":{"origin":268,"position":0},"title":"Looking at the Kubernetes Control Plane for Multi-Tenancy","date":"August 31, 2021","format":false,"excerpt":"The Salesforce Platform-as-a-Service Security Assurance team is constantly assessing modern compute platforms for security level and features. We use the insights from these research efforts to provide fast and comprehensive support to engineering teams who explore platform options that adequately support their security requirements. Unsurprisingly, Kubernetes is one of the\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":284,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/hadoop-hbase-on-kubernetes-and-public-cloud-part-i\/","url_meta":{"origin":268,"position":1},"title":"Hadoop\/HBase on Kubernetes and Public Cloud (Part I)","date":"August 31, 2021","format":false,"excerpt":"Authors: Dhiraj Hegde, Ashutosh Parekh, and Prashant\u00a0MurthyAt Salesforce, we run a large number of HBase and HDFS clusters in our own data centers. 
More recently, we have started deploying our clusters on Public Cloud infrastructure to take advantage of the on-demand scalability available there. As part of this foray onto\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":279,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/notary-a-certificate-lifecycle-management-controller-for-kubernetes\/","url_meta":{"origin":268,"position":2},"title":"Notary: A Certificate Lifecycle Management Controller for Kubernetes","date":"August 31, 2021","format":false,"excerpt":"Authors: Vaishnavi Galgali, Savithru Lokanath, Arpeet\u00a0KaleIntroductionAll services in the Einstein Vision and Language Platform use TLS\/SSL certificates to encrypt communication between microservices. The certificates are generated in AWS Certificate Manager (ACM) and stored in the AWS Secrets Manager in the form of keystores and truststores (private and public keys). Certificate\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":287,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/optimizing-eks-networking-for-scale\/","url_meta":{"origin":268,"position":3},"title":"Optimizing EKS networking for scale","date":"August 31, 2021","format":false,"excerpt":"Authors: Savithru Lokanath, Arpeet Kale, VaishnavigalgaliElastic Kubernetes Service (EKS) is a service under the Amazon Web Services (AWS) umbrella that provides managed Kubernetes service. It significantly reduces the time to deploy, manage, and scale the infrastructure required to run production-scale Kubernetes clusters. 
AWS has simplified EKS networking significantly with its\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":900,"url":"https:\/\/fde.cat\/index.php\/2024\/07\/22\/data-clouds-lightning-fast-migration-from-amazon-ec2-to-kubernetes-in-6-months\/","url_meta":{"origin":268,"position":4},"title":"Data Cloud\u2019s Lightning-Fast Migration: From Amazon EC2 to Kubernetes in 6 Months","date":"July 22, 2024","format":false,"excerpt":"In our \u201cEngineering Energizers\u201d Q&A series, we delve into the journeys of distinguished engineering leaders. Today, we feature Archana Kumari, Director of Software Engineering at Salesforce. Archana leads our India-based Data Cloud Compute Layer team, which played a pivotal role in a recent transition from Amazon EC2 to Kubernetes for\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":662,"url":"https:\/\/fde.cat\/index.php\/2022\/12\/14\/how-salesforce-uses-immutable-infrastructure-in-hyperforce\/","url_meta":{"origin":268,"position":5},"title":"How Salesforce uses Immutable Infrastructure in Hyperforce","date":"December 14, 2022","format":false,"excerpt":"Credits go to: Armin Bahramshahry, Software Engineering Principal Architect @ Salesforce\u00a0&\u00a0Shan Appajodu, VP, Software Engineering for Developer Productivity Experiences @ Salesforce. 
To leverage the scale and agility of the world\u2019s leading public cloud platforms, our Technology and Products team at Salesforce has worked together over the past few years to\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/268","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=268"}],"version-history":[{"count":1,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/268\/revisions"}],"predecessor-version":[{"id":442,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/268\/revisions\/442"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=268"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=268"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=268"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}