Avoid downtime with Resource Requests and Limits
Note
This section helps you implement ISO 27001, specifically:
- A.12.1.3 Capacity Management
Important
This guardrail is enabled by default and will deny violations. As a result, resources that violate this policy will not be created.
Problem
A major source of application downtime is insufficient capacity. For example, if a Node reaches 100% CPU utilization, the application Pods hosted on it will run slowly, leading to a poor end-user experience. If a Node comes under memory pressure, applications will also slow down, because less memory is available for the page cache. High memory pressure may cause the Node to trigger the infamous Out-of-Memory (OOM) Killer, which kills a victim process: either your application or a platform component.
Solution
To avoid running into capacity issues, Kubernetes lets Pods specify resource requests and limits for each of their containers. This has two benefits:
- It ensures that Pods are scheduled to Nodes that have the requested resources.
- It ensures that a Pod does not exceed its resource limits, hence limiting its blast radius and protecting other application or platform Pods.
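As a minimal sketch of what such a specification can look like, the snippet below writes a Pod manifest with explicit requests and limits to a file. The Pod name, container name, image, file name and all resource values are illustrative assumptions, not values recommended by this guide; choose numbers that match your application's measured usage.

```shell
# Write a minimal Pod manifest with explicit resource requests and limits.
# All names and values below are hypothetical placeholders.
cat > pod-with-resources.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
    - name: app
      image: nginx:stable
      resources:
        requests:        # what the scheduler reserves on the Node
          cpu: 100m
          memory: 128Mi
        limits:          # the ceiling the container may not exceed
          cpu: 200m
          memory: 256Mi
EOF

# Optional check of the manifest without creating anything
# (assumes kubectl is installed and configured for your cluster):
# kubectl apply --dry-run=client -f pod-with-resources.yaml
```

Requests drive the scheduling decision described in the first bullet, while limits bound the container at runtime as described in the second.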
How Does Welkin Help?
To make sure you don't forget to configure resource requests and limits, the administrator can configure Welkin to deny creation of Pods without explicit resource specifications.
If you get the following error:
Error: UPGRADE FAILED: failed to create resource: admission webhook "validation.gatekeeper.sh" denied the request: [denied by require-resource-requests] Container "welkin-user-demo" has no resource requests
then some containers in your Pods are missing resource requests. The user demo gives a good example to get you started.
If your administrator has not enforced this policy yet, you can view current violations of the policy by running:
kubectl get k8sresourcerequests.constraints.gatekeeper.sh require-resource-requests -ojson | jq .status.violations
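If the raw JSON is hard to scan, the output can be condensed to one line per violation. The helper below is a sketch only: the function name `summarize_violations` is hypothetical, and it assumes each violation entry carries `namespace`, `kind`, `name` and `message` fields, as Gatekeeper's audit typically reports them.

```shell
# Print one line per violation as "namespace/kind/name: message".
# Assumption: every entry in .status.violations has these four fields.
# Prints nothing if the constraint reports no violations.
summarize_violations() {
  jq -r '.status.violations // [] | .[] | "\(.namespace)/\(.kind)/\(.name): \(.message)"'
}

# Usage against a live cluster (requires kubectl and jq):
# kubectl get k8sresourcerequests.constraints.gatekeeper.sh require-resource-requests -ojson | summarize_violations
```

Each line identifies a resource to fix, so the list can be worked through before the policy is enforced.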