You observe that your controller (formerly known as master) is crashing due to
OutOfMemory errors. After capturing a heap dump for the instance, you see a large number of instances loaded for Pipeline steps:
XXXXXX instances of "org.jenkinsci.plugins.workflow.cps.nodes.StepEndNode", loaded by "hudson.ClassicPluginStrategy$AntClassLoader2 @ 0xxxxxxx"
More notably, the heap dump data contains a large number of
FileSystemException entries. These exceptions are being logged inside the corresponding files.
A side effect of logging this many exceptions is that they consume not only disk space but also memory and time whenever the build is loaded.
- CloudBees CI (CloudBees Core)
- CloudBees CI (CloudBees Core) on modern cloud platforms - Managed Master
- CloudBees CI (CloudBees Core) on modern cloud platforms - Operations Center
- CloudBees CI (CloudBees Core) on traditional platforms - Client Master
- CloudBees CI (CloudBees Core) on traditional platforms - Operations Center
- CloudBees Jenkins Enterprise
- CloudBees Jenkins Enterprise - Managed Master
- CloudBees Jenkins Enterprise - Operations Center
- CloudBees Jenkins Platform - Client Master
- CloudBees Jenkins Platform - Operations Center
- CloudBees Jenkins Distribution
- Jenkins LTS < 2.235.x
This issue was traced back to Jenkins core, more specifically to a change in
PathRemover behavior. It is tracked in the community Jira as JENKINS-61841.
Calls to Util.deleteRecursive and other methods that end up calling PathRemover.forceRemoveRecursive on large directories can end up throwing instances of CompositeIOException with a very large number of nested exceptions, leading to excessive memory usage.
The fix for this potential problem was released as part of the 2.235.x line.
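To illustrate why a composite exception with many nested exceptions is costly, here is a minimal sketch. It is not the Jenkins implementation: the class `CompositeFailureSketch`, its methods, the file paths, and the cap of 10 are all hypothetical and chosen only for illustration. It simulates one failure per file and compares keeping every nested exception with keeping a bounded number of them.

```java
import java.io.IOException;
import java.nio.file.FileSystemException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT the Jenkins CompositeIOException class.
class CompositeFailureSketch {

    // Collects per-file failures into one exception, keeping at most `limit` of them.
    static IOException aggregate(List<IOException> failures, int limit) {
        IOException composite =
                new IOException("Unable to remove " + failures.size() + " path(s)");
        int kept = Math.min(failures.size(), limit);
        for (int i = 0; i < kept; i++) {
            // Every retained exception carries its own message and stack trace,
            // so memory use grows with the number of nested exceptions.
            composite.addSuppressed(failures.get(i));
        }
        return composite;
    }

    // Simulates one failed delete per file in a large directory (paths are made up).
    static List<IOException> simulateFailures(int count) {
        List<IOException> failures = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            failures.add(new FileSystemException(
                    "/hypothetical/dir/file-" + i, null, "Operation not permitted"));
        }
        return failures;
    }

    public static void main(String[] args) {
        List<IOException> failures = simulateFailures(50_000);
        IOException unbounded = aggregate(failures, Integer.MAX_VALUE);
        IOException capped = aggregate(failures, 10);
        System.out.println("unbounded nested exceptions: " + unbounded.getSuppressed().length);
        System.out.println("capped nested exceptions: " + capped.getSuppressed().length);
    }
}
```

When the unbounded variant is serialized or logged for each affected build, every nested entry is written out and read back in, which is consistent with the disk, memory, and load-time costs described above.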
Check for any permission or storage issues that prevent these I/O operations from completing, and schedule an update to version 2.235.x or higher.
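As a starting point for the permissions check, the sketch below walks a directory and lists entries whose parent directory is not writable by the current user. This is a heuristic under stated assumptions: on POSIX file systems, removing an entry requires write permission on its parent directory, while other causes (file locks, read-only mounts, full disks) are not detected. The class name `DeletableCheckSketch` and the method `findLikelyUndeletable` are hypothetical, and the directory to scan is supplied by you.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Jenkins or CloudBees CI.
class DeletableCheckSketch {

    // Returns entries under `root` whose parent directory is not writable
    // by the user running this program (heuristic for "cannot be deleted").
    static List<Path> findLikelyUndeletable(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(p -> !p.equals(root))
                    .filter(p -> !Files.isWritable(p.getParent()))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            // Raised when the root itself cannot be opened for traversal.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java DeletableCheckSketch.java <directory>");
            return;
        }
        // Run as the same OS user as the controller so the result reflects its permissions.
        for (Path p : findLikelyUndeletable(Paths.get(args[0]))) {
            System.out.println("parent not writable: " + p);
        }
    }
}
```

If the traversal hits a directory that cannot be read, `Files.walk` fails with an `UncheckedIOException`, which is itself a sign of a permission problem at that location.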