We don't want a rogue microservice to eat up all the CPU and memory in the cluster. Pods can be restricted to a specific amount of CPU (measured in cores or millicores) and a fixed amount of memory. They can be restricted either individually or at the namespace level.
Pods are individual instances of our microservice. They are wrappers around your Docker containers -- but they can contain more than just one container. You're unlikely to run more than one container inside a pod unless you need something like a timed cron job or a queue listener that the main service running inside the pod depends on. That additional container is called a sidecar. To summarize, for simplicity's sake, a pod is an instance of a container with some K8s-specific metadata.
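As a sketch, a two-container pod with a main service and a queue-listener sidecar might look like this (the names and images are illustrative, not from our project):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: orders-service            # hypothetical name
  namespace: integration
spec:
  containers:
    - name: orders                # the main microservice container
      image: example/orders:1.0   # placeholder image
    - name: queue-listener        # sidecar: consumes messages for the main container
      image: example/listener:1.0 # placeholder image
```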
We'll use resource quotas for individual pods in a future article. For now, let's focus on namespace-level resource quotas and limits. In our scenario we have two namespaces - Production and Integration. We're running these two environments on the same cluster, so we want to make sure that our Integration environment doesn't blow up Production when we run some tests. We can thus limit Integration to have a finite amount of resources available to it, while letting Production consume the rest.
To do this, we'll be writing a new YAML file. Inside the Cluster folder you created in the previous article, add a new file called resources.yml. Here's what it looks like to start:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: resourcequotas
  namespace: integration
spec:
  hard:
    requests.cpu: 1500m
    requests.memory: 2Gi
    limits.cpu: 3000m
    limits.memory: 4Gi
```
Then, apply it:
kubectl apply -f resources.yml
Great, so, what did we do there? Let's talk about it.
The first few lines are the same as in every other YAML file you're going to write. The metadata map has all the same keys. The kind is ResourceQuota, because that's what we're adding! There's a new addition to the metadata map called namespace. This tells K8s to apply these settings to a specific namespace.
Next, we have the spec - short for specification. This map is where you define what you want the ResourceQuota to look like. The hard keyword tells Kubernetes that these are hard limits... that is, they cannot be skirted. If you define a pod whose requests.cpu is higher than its limits.cpu, it will fail to start.
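For instance, a pod spec like this sketch (the name and image are illustrative) would be rejected because its CPU request exceeds its CPU limit:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: bad-pod               # hypothetical name
  namespace: integration
spec:
  containers:
    - name: app
      image: example/app:1.0  # placeholder image
      resources:
        requests:
          cpu: 500m           # the request is higher than the limit below...
        limits:
          cpu: 250m           # ...so the API server rejects this pod
```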
Inside the hard map we have four key-value pairs. The values defined here apply to the sum of the requests and limits of all running pods in the namespace. Let's look at a scenario to illustrate:
PodA requests 100m. PodB requests 300m. PodC requests 1000m. In total, the requests are 1400m. That's within the hard quota of 1500m, so we're okay. If creating a pod would push the sum of all requests past the request quota, that pod will fail to start. The same goes for limits.
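Continuing the scenario, a fourth pod requesting another 200m would push the total to 1600m, over our 1500m quota, so it would be refused (the name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-d                 # hypothetical fourth pod
  namespace: integration
spec:
  containers:
    - name: app
      image: example/app:1.0  # placeholder image
      resources:
        requests:
          cpu: 200m           # 1400m already requested + 200m = 1600m > the 1500m quota
        limits:
          cpu: 200m
```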
You might be wondering ... what is 1500m? 'm' stands for millicpu, or 1/1000th of a CPU. When you provision your cluster, you pick the size of your nodes. For example, you might have 2 vCPUs per node and 8GB of memory. 2 vCPUs is 2000m; 4 vCPUs is 4000m. You can also use decimals for CPU: 0.1, 0.3, 1. That's 100m, 300m, and 1000m respectively. Usually, you'll want to stick to the m notation, but just know that decimals are possible.
For memory, you can use the suffixes k, M, G for powers of 1,000 or Ki, Mi, Gi for powers of 1,024... etc. Nothing new here.
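To make the notation concrete, here's a sketch of how the units compare (you wouldn't mix these in one file; the comments show the conversions):

```yaml
# These two CPU values mean the same thing:
cpu: 300m       # 300 millicpu
cpu: 0.3        # 3/10 of a CPU = 300m

# These two memory values do NOT mean the same thing:
memory: 128M    # 128,000,000 bytes (powers of 1,000)
memory: 128Mi   # 134,217,728 bytes (powers of 1,024)
```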
Okay great, we have our quotas for the integration environment! Let's take a look at one more thing - Limit Ranges.
Resource quotas set limits on namespaces. However, they do not set limits on individual pods. This means that even though a rogue, untested microservice in integration won't bring down our cluster, it may still bring down our integration environment! That's because within a namespace, a single pod may consume as much of the namespace's quota as it wants. That's where limit ranges come in.
Let's define one, then talk about it. Inside the same resources.yml file, after the last written line, add the following:
```yaml
---                         # starts a second YAML document in the same file
apiVersion: v1
kind: LimitRange
metadata:
  name: limitranges
  namespace: integration
spec:
  limits:
    - default:
        cpu: 300m
        memory: 256Mi
      defaultRequest:
        cpu: 10m
        memory: 128Mi
      max:
        cpu: 600m
        memory: 1024Mi
      min:
        cpu: 10m
        memory: 32Mi
      type: Container
```
Let's say you have a pod in which you don't specify request or limit values (remember, you can specify these at the pod level -- we'll get to it in a later article). With the LimitRange set for the namespace, that pod automatically gets the default limit and request values from the default and defaultRequest maps.
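As a sketch (name and image are illustrative), this pod defines no resources section at all, so with the LimitRange above in place it would be admitted with the defaults filled in:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: defaulted-pod         # hypothetical name
  namespace: integration
spec:
  containers:
    - name: app
      image: example/app:1.0  # placeholder image
      # no resources block: the LimitRange injects requests of
      # 10m CPU / 128Mi memory and limits of 300m CPU / 256Mi memory
```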
The max and min maps bound the limits that an individual pod can define inside its YAML file. If you have a pod where you DO define these limits, they must fall within the range defined for the namespace, or the pod will be rejected. These bounds are meant to ensure the stability of the cluster.
Lastly, we assign these limits to containers via the type key. As I mentioned above, a pod is a wrapper around one or more containers. If you have two containers inside a pod, each one gets its own copy of the default, max, and min values defined above.
Go ahead and apply the limit ranges:
kubectl apply -f resources.yml
Defining resource quotas and limit ranges at the namespace level ensures that the cluster won't get hosed. In addition to CPU and memory limits, there are also object count limits. You can limit how many pods can run inside a namespace, for example. We didn't touch on those here, but it's possible. Check the resources at the bottom of this article for more information.
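As a sketch of what an object count quota might look like (the name is hypothetical; this isn't part of our resources.yml), this ResourceQuota would cap the integration namespace at 10 pods and 20 ConfigMaps:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: objectquotas        # hypothetical name
  namespace: integration
spec:
  hard:
    pods: "10"              # at most 10 pods in this namespace
    count/configmaps: "20"  # at most 20 ConfigMaps
```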
You can view more information about the integration namespace and see the resource quotas and limits with this command:
kubectl describe namespace integration
In the next article we'll introduce Deployments.