Memory Problem: Process killed by OOM Killer

Issue

  • Jenkins suddenly crashed and /var/log/kern.log or /var/log/dmesg.log shows:
    [XXXXX] Out of memory: Kill process <JENKINS_PID> (java) score <SCORE> or sacrifice child
    [XXXXX] Killed process <JENKINS_PID> (java) total-vm:XXXkB, anon-rss:XXXkB, file-rss:XXXkB, shmem-rss:XXXkB
    

Background

A Java process is made up of:

  • Java heap space (set via -Xms and -Xmx)
  • the Metaspace (which replaced PermGen as of Java 8)
  • the Native Memory area

Each of these areas consumes RAM. The memory footprint of Jenkins (a Java application) is therefore the sum of the maximum Java heap size, the Metaspace size and the native memory area.
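To make these areas concrete, here is an illustrative sketch of how a Jenkins JVM might be sized and inspected. The flag values are examples rather than recommendations, and Native Memory Tracking (which adds a small overhead) must be enabled at startup:

$ # Illustrative sizing flags: fixed 4 GB heap, capped Metaspace, NMT enabled
$ java -Xms4g -Xmx4g -XX:MaxMetaspaceSize=512m -XX:NativeMemoryTracking=summary -jar jenkins.war

$ # Once the JVM is running, dump a per-area memory breakdown
$ jcmd <JENKINS_PID> VM.native_memory summary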

It is important to understand that the Operating System itself and any other processes running on the machine have their own requirements regarding RAM and CPU. The Operating System uses a certain amount of RAM, leaving the remainder to be split between Jenkins and any other processes on the machine.
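You can check how the machine's RAM is currently divided with free. The figures below are sample output, not real measurements:

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       9.2Gi       1.1Gi       200Mi       5.2Gi       5.8Gi
Swap:         2.0Gi          0B       2.0Gi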

Resolution

(This does not indicate a problem with Jenkins. It indicates that the Operating System is unable to provide enough resources for all the programs it has been asked to run.)

The OOM Killer is a function of the Linux kernel that kills rogue processes requesting more memory than the OS can allocate, so that the system can survive. It applies heuristics (it gives each process a score) to decide which process to kill when the system is in such a state. The process consuming the most memory and not releasing enough of it is the most likely to be killed.
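The score the kernel currently assigns to a process can be read, and biased, through /proc. Note that this only shifts which process gets killed first; it does not fix the underlying memory shortage. The <JENKINS_PID> placeholder and the -500 value are examples:

$ cat /proc/<JENKINS_PID>/oom_score
$ # Lower the likelihood of Jenkins being picked (root required; -1000 exempts it entirely)
$ echo -500 | sudo tee /proc/<JENKINS_PID>/oom_score_adj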

If you are affected by this error, there are two common causes:

  1. Too much memory is allocated to Jenkins
  2. Other processes are running on the same machine as Jenkins

Following are recommendations for each case.

1) Too much memory allocated to Jenkins

You should not allocate to Jenkins all of the memory available on the machine, because the Operating System needs resources of its own to manage the system.

We recommend keeping the maximum memory allocated to the Jenkins JVM at or below 70% of the total memory available on the machine.
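As a worked example with illustrative numbers: on a machine with 16 GB of RAM, the 70% guideline leaves roughly 11 GB for the whole Jenkins JVM footprint. Setting aside, say, 512 MB for Metaspace and 1-2 GB for native memory, a heap in the region of the following would be a reasonable starting point:

$ java -Xmx9g -XX:MaxMetaspaceSize=512m -jar jenkins.war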

2) Other processes are impacting Jenkins

In this scenario, Jenkins is not the only process running on the machine, but it is killed because it is the process consuming the most memory on the OS.

We strongly recommend that Jenkins be the only non-operating-system process running on the machine hosting it. Should you run other processes, such as monitoring agents, ensure that they do not overload the system, or at least that enough resources are available to handle the total load on the machine.
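If another process must coexist with Jenkins, one way to keep it from starving the system is a cgroup memory cap, for instance via a systemd drop-in. The unit name and limit below are hypothetical, and MemoryMax= requires cgroup v2 (older systems use MemoryLimit= instead):

# /etc/systemd/system/monitoring-agent.service.d/limits.conf
[Service]
MemoryMax=512M

$ sudo systemctl daemon-reload
$ sudo systemctl restart monitoring-agent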

How to find the culprit

It is possible to check the processes consuming the most memory at any time on the machine with commands like:

$ top -o %MEM

or:

$ ps aux --sort -pmem
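For instance, to list only the five largest memory consumers (plus the header line):

$ ps aux --sort -pmem | head -6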

You can also dig into /var/log/kern.log or /var/log/dmesg.log. In these logs, locate the “Out of memory: Kill process” message. Just above it, the kernel dumps the stats of the processes that were running at the time. For example:

[...]
[XXXXX] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[XXXXX] [  480]     0   480    13863      113      26        0         -1000 auditd
[XXXXX] [12345]   123 12345  4704977  3306330    6732        0             0 java
[XXXXX] [11939]     0 11939    46699      328      48        0             0 crond
[XXXXX] [11942]     0 11942    28282       45      12        0             0 sh
[XXXXX] [16789]   456 16789  1695936    38643     165        0             0 java
[...]
[XXXXX] Out of memory: Kill process 12345 (java) score 869 or sacrifice child
[...]

In this example, the Jenkins PID was 12345 and it was the process that got killed. Note that total_vm and rss in this dump are counted in 4 kB pages: Jenkins had about 18 GB of virtual memory mapped, of which about 12.6 GB was resident in RAM (rss), and another java process with PID 16789 had about 6.5 GB of virtual memory mapped. You can then investigate what this other process does by running the following command:

$ ps -fp <pid>
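With the dump above, for instance, the second java process would be inspected with ps -fp 16789. To locate the OOM Killer messages quickly in the first place, a grep along these lines works on most distributions (the exact log path varies):

$ dmesg -T | grep -iE 'out of memory|killed process'
$ grep -iE 'out of memory|killed process' /var/log/kern.log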
