
How to Fix Pods That Are OOMKilled

An OOMKilled event occurs when a container in a Kubernetes pod exceeds its memory limit, causing the Linux kernel's Out of Memory (OOM) Killer to terminate the container so the node itself does not run out of memory. The OOM Killer is a kernel mechanism that decides which processes to kill in order to free memory when the system is under memory pressure.
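In practice, a container killed this way receives SIGKILL and exits with code 137 (128 + 9), and the kill reason is recorded in the pod's container status. A minimal check, assuming a single-container pod that has already restarted (so the previous termination sits under lastState, and <pod-name> is a placeholder), looks like:

kubectl get pod <pod-name> \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason} {.status.containerStatuses[0].lastState.terminated.exitCode}'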

How to Troubleshoot OOMKilled Pods:

Step 1: Check the Pod Status and Logs

Check the pod status to confirm the OOMKilled reason:

kubectl get pod <pod-name> 
kubectl describe pod <pod-name>

In the describe output, look under the container's State (or Last State) for a Reason of OOMKilled:

State:          Terminated
  Reason:       OOMKilled

The application's own logs will usually also give a hint about the memory problem just before the kill:

kubectl logs <your-pod-name>
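If you have shell access to the node that ran the pod (for example via SSH or kubectl debug node), the kernel itself logs every OOM kill. A node-level check, sketched here on the assumption that you can run commands directly on the node:

# On the node, look for the kernel's OOM log lines
dmesg -T | grep -i -E 'out of memory|killed process'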

Why It Happens

  1. No resource limit: If you didn't set a memory limit, the container can keep consuming memory until the node itself runs low and the kernel starts killing processes (see the QoS check after this list).
  2. Limit too low: You set a memory limit, but your app needs more than that.
  3. Memory leak: App consumes more and more memory over time.
  4. Bursty behavior: Some apps have memory spikes that exceed the limit.
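On the first point: a pod with no resources block at all gets the BestEffort QoS class, which makes it one of the first candidates to be killed under node memory pressure. You can read a pod's QoS class straight from its status (<pod-name> is a placeholder):

kubectl get pod <pod-name> -o jsonpath='{.status.qosClass}'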

How to Fix It

1. Set Proper Memory Requests and Limits

To ensure that your pods are allocated appropriate resources, you should define both memory requests and limits in the pod or deployment configuration.

  • requests specify the amount of memory that will be guaranteed to the pod. The scheduler uses this value to determine which node can run the pod based on available resources.
  • limits specify the maximum memory the container can use. If the container exceeds this limit, it is terminated by the OOM killer and restarted according to the pod's restart policy.
resources:
  requests:
    memory: "512Mi"
  limits:
    memory: "1Gi"
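For context, the resources block lives under each container in the pod template. A minimal sketch of a Deployment carrying it (the my-app name and image are placeholders for your own):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                 # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app         # placeholder container name
          image: my-app:1.0    # placeholder image
          resources:
            requests:
              memory: "512Mi"
            limits:
              memory: "1Gi"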

Then apply it:

kubectl apply -f your-deployment.yaml 
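To confirm the new values landed, you can read them back from the Deployment (the deployment name is a placeholder):

kubectl get deployment <deployment-name> \
  -o jsonpath='{.spec.template.spec.containers[0].resources}'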

2. Increase the Memory Limits if Necessary

If you’ve already set memory limits and the pod is still being OOMKilled, consider increasing the limits if your application needs more memory. You can scale the limits higher to accommodate your workload:

resources:
  requests:
    memory: "1Gi"
  limits:
    memory: "2Gi"
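Rather than guessing at new numbers, base them on what the workload actually uses. Assuming the metrics-server add-on is installed in the cluster, kubectl top gives a current reading per pod, or per container with --containers:

kubectl top pod <pod-name> --containers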

3. Optimize the Application’s Memory Usage

If the OOMKills are due to memory leaks or inefficient memory usage in the application itself, you should profile and optimize the application:

  • Use memory profiling tools specific to your application’s language, such as memory_profiler for Python or heapdump for Node.js.
  • Review the application code to identify potential memory leaks or inefficient memory allocation patterns.
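If you suspect a leak but can't profile inside the container yet, a crude first signal is the memory trend over time. A simple shell sketch, again assuming metrics-server is available and <pod-name> is a placeholder:

# Log a timestamped memory reading every 60 seconds; a value that
# climbs steadily and never plateaus is a typical leak symptom.
while true; do
  echo "$(date -u +%FT%TZ) $(kubectl top pod <pod-name> --no-headers)"
  sleep 60
done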