OpenShift pods terminating due to Out-of-Memory condition

Issue

CloudBees CI Controller and Operations Center pods running in OpenShift are restarting frequently. The Jenkins logs contain repeated occurrences of java.lang.OutOfMemoryError: unable to create new native thread.

Environment

Resolution

OpenShift enforces a default container PID limit of 1024. Because the limit counts threads as well as processes, a JVM that exhausts it can no longer create threads, which produces the error shown above. The default likely assumes that pods run microservices, for which 1024 may be a reasonable limit. Jenkins is not a microservice, however, so it makes sense to increase this limit. OpenShift documents the steps for modifying the container PID limit:

https://docs.openshift.com/container-platform/4.7/post_installation_configuration/machine-configuration-tasks.html#create-a-containerruntimeconfig_post-install-machine-configuration-tasks
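
As a rough illustration of what that documentation describes, a ContainerRuntimeConfig resource along the following lines raises the limit for one machine config pool. This is a sketch, not a recommended configuration: the resource name set-pids-limit, the label custom-crio: high-pid-limit, and the value 4096 are placeholders chosen for this example, and the target machine config pool is assumed to already carry that label.

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: set-pids-limit            # hypothetical name
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-crio: high-pid-limit # hypothetical label; must match a label on the target pool
  containerRuntimeConfig:
    pidsLimit: 4096               # illustrative value only; size it for your environment
```

Once a resource like this is created, the Machine Config Operator rolls the change out to the nodes in the matching pool. Follow the linked OpenShift documentation for the exact procedure for your cluster version.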

Note that the appropriate PID limit will vary by environment. PID consumption for an Operations Center (OC) or controller depends on many factors, including installed plugins and their configuration, the number of jobs, job type (e.g. Pipeline, Freestyle) and composition, and job scheduling patterns. Whether CloudBees CI runs in a VM, on bare metal, or in a container does not significantly affect PID consumption, so the PID limit recommendations for CI Traditional also apply to CI Modern. This CRI-O issue provides more insight into container PID limits.
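
Because the right value depends on actual consumption, it can help to see how close a running pod is to its limit before choosing a new one. The following is a minimal sketch, assuming the container exposes the kernel's pids controller files (pids.current and pids.max) at the usual cgroup v2 or cgroup v1 mount points. The paths and helper names are assumptions made for this example and are not part of CloudBees CI or OpenShift.

```python
from pathlib import Path

# Assumed standard mount points: cgroup v2 first, then cgroup v1.
CGROUP_DIRS = (Path("/sys/fs/cgroup"), Path("/sys/fs/cgroup/pids"))


def parse_pids_max(raw):
    """Return the limit as an int, or None when the cgroup reports 'max' (no limit)."""
    value = raw.strip()
    return None if value == "max" else int(value)


def pid_usage(cgroup_dir):
    """Return (current, limit) for one cgroup directory, or None if the files are absent."""
    current_file = cgroup_dir / "pids.current"
    max_file = cgroup_dir / "pids.max"
    if not (current_file.is_file() and max_file.is_file()):
        return None
    current = int(current_file.read_text().strip())
    return current, parse_pids_max(max_file.read_text())


def describe(current, limit):
    """Format the usage as a short human-readable summary."""
    if limit is None:
        return f"{current} tasks in use, no PID limit set"
    return f"{current} of {limit} tasks in use ({100 * current / limit:.0f}%)"


if __name__ == "__main__":
    for directory in CGROUP_DIRS:
        usage = pid_usage(directory)
        if usage is not None:
            print(describe(*usage))
            break
    else:
        print("pids controller files not found")
```

Run inside the controller or Operations Center container (for example through a remote shell into the pod), this reports the current number of tasks, which includes threads, against the enforced limit. Sampling it during peak build activity gives a more realistic picture than a single reading at idle.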

References
