Why is my instance crashing with Out Of Memory errors?

Issue

You observe that your controller (formerly known as master) is crashing due to OutOfMemory errors. After capturing a heap dump for the instance, you see a large number of instances loaded for Pipeline steps:

XXXXXX  instances of "org.jenkinsci.plugins.workflow.cps.nodes.StepEndNode", loaded by "hudson.ClassicPluginStrategy$AntClassLoader2 @ 0xxxxxxx"

More notably, the heap dump data contains a large number of FileSystemException entries. These exceptions are logged inside the corresponding files {{name_of_project}}/builds/build_number/workflow/*.xml.

A side effect of logging this many exceptions is that they consume not only disk space but also memory and time whenever the build is loaded.
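To confirm which builds are affected, it can help to count how often FileSystemException appears in the workflow XML files of a project. The following is a minimal sketch, not an official tool. It assumes the project directory contains the builds/build_number/workflow/*.xml layout described above. The function name count_exception_entries and the command-line handling are hypothetical.

```python
from pathlib import Path
import sys

# Text searched for in each workflow XML file.
MARKER = "FileSystemException"


def count_exception_entries(project_dir):
    """Return {xml_path: (occurrences, size_in_bytes)} for every
    builds/*/workflow/*.xml file under project_dir that mentions MARKER."""
    results = {}
    for xml_file in Path(project_dir).glob("builds/*/workflow/*.xml"):
        hits = 0
        try:
            # Read line by line so very large files are not held in memory at once.
            with open(xml_file, encoding="utf-8", errors="replace") as handle:
                for line in handle:
                    hits += line.count(MARKER)
            size = xml_file.stat().st_size
        except OSError:
            # Unreadable files are skipped; they may point to a permissions issue.
            continue
        if hits:
            results[str(xml_file)] = (hits, size)
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: python count_entries.py /path/to/name_of_project
    found = count_exception_entries(sys.argv[1])
    for path, (hits, size) in sorted(found.items(), key=lambda item: -item[1][0]):
        print(f"{hits:8d} entries  {size:12d} bytes  {path}")
```

Files with both a high entry count and a large size are the ones most likely to slow down build loading.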

Environment

Resolution

This issue was traced back to Jenkins Core, more specifically to a change in PathRemover behavior. It is tracked in the community Jira as JENKINS-61841.

Calls to Util.deleteRecursive and other methods that end up calling PathRemover.forceRemoveRecursive on large directories can end up throwing instances of CompositeIOException with a very large number of nested exceptions, leading to excessive memory usage.
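The mechanism described above can be illustrated outside Jenkins. The following Python sketch is only an analogy, not the Jenkins implementation and not a description of the actual fix. CompositeError, remove_all, and the max_recorded parameter are hypothetical names. It shows how a composite exception that keeps one nested exception per failed deletion grows with the size of the directory, and how capping the number of recorded failures keeps it bounded.

```python
import os
import tempfile


class CompositeError(Exception):
    """Hypothetical stand-in for a composite I/O exception holding nested causes."""

    def __init__(self, message, causes):
        super().__init__(message)
        self.causes = list(causes)


def remove_all(paths, max_recorded=None):
    """Try to delete every path; raise one CompositeError if any deletion fails.

    With max_recorded=None every failure is kept, so memory use grows with the
    number of failing paths. With a cap, only the first max_recorded failures
    are kept and the rest are merely counted.
    """
    failures = []
    dropped = 0
    for path in paths:
        try:
            os.remove(path)
        except OSError as exc:
            if max_recorded is None or len(failures) < max_recorded:
                failures.append(exc)
            else:
                dropped += 1
    if failures:
        total = len(failures) + dropped
        raise CompositeError(
            f"{total} deletions failed ({dropped} not recorded)", failures
        )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        # Paths that do not exist, so every deletion fails.
        missing = [os.path.join(scratch, f"missing-{i}") for i in range(100000)]
        for cap in (None, 10):
            try:
                remove_all(missing, max_recorded=cap)
            except CompositeError as err:
                print(f"cap={cap}: {len(err.causes)} nested exceptions kept")
```

When such a composite exception is then serialized into build metadata, every nested exception is written out as well, which matches the large workflow XML files described in the Issue section.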

The fix for this potential problem was released as part of the 2.235.x line.

Workaround

Check for any permissions or storage issues that may be preventing these I/O operations from completing, and schedule an update to version 2.235.x or higher.
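A quick way to look for such issues is to scan the affected directory for subdirectories the Jenkins user cannot modify, and to check the remaining free space. The sketch below is a starting point under stated assumptions: it must be run as the same operating system user as the Jenkins process, it only checks basic POSIX-style permissions (not ACLs, immutable flags, or file locks), and the function names find_blocked_dirs and free_bytes are hypothetical.

```python
import os
import shutil
import sys


def find_blocked_dirs(root):
    """Return directories under root (including root) that the current user
    cannot read, write, and traverse. Deleting an entry requires write and
    execute permission on its parent directory."""
    blocked = []
    needed = os.R_OK | os.W_OK | os.X_OK
    for dirpath, _dirnames, _filenames in os.walk(root):
        if not os.access(dirpath, needed):
            blocked.append(dirpath)
    return blocked


def free_bytes(path):
    """Return the free space, in bytes, on the file system holding path."""
    return shutil.disk_usage(path).free


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage: python check_dir.py /path/to/directory
    target = sys.argv[1]
    print(f"Free space: {free_bytes(target) / (1024 ** 3):.1f} GiB")
    for directory in find_blocked_dirs(target):
        print(f"Not fully accessible: {directory}")
```

An empty list does not prove that deletions will succeed, but any directory it does report is a good first place to investigate.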

Tested product/plugin versions
