Kubernetes agents are stuck in Pending due to the `NoDelayProvisionerStrategy`

Issue

  • After upgrading CloudBees Core to release 2.190.2.2, planned Kubernetes agents are stuck in “Pending” and builds hang indefinitely with the message “Waiting for next available executor” while they wait for those agents to come online.
  • In such cases, the agent pods are not even scheduled.

Environment

Related Issue(s)

Explanation

This is a bug in the Kubernetes plugin, introduced by JENKINS-56307 in version 1.19.1 of the plugin.

This version introduces a new Node Provisioner strategy, NoDelayProvisionerStrategy, that is enabled by default. The strategy provisions a node as soon as the Node Provisioner detects a need for more agents, as opposed to the default strategy, which makes its decision based on load estimates.
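The difference between the two approaches can be illustrated with a toy model. The Python sketch below is not the plugin's code: the function and class names, the smoothing factor, and the use of an exponentially smoothed queue length as the "load estimate" are all assumptions made for illustration. It does not reproduce the bug, only the intended behaviour.

```python
def no_delay_demand(queue_length: int, available: int, planned: int) -> int:
    """Agents to request immediately: demand not already covered by capacity."""
    return max(0, queue_length - (available + planned))


class LoadEstimate:
    """Exponentially smoothed queue length, standing in for 'load estimates'."""

    def __init__(self, alpha: float = 0.1) -> None:
        self.alpha = alpha  # smoothing factor (hypothetical value)
        self.value = 0.0

    def update(self, queue_length: int) -> float:
        # Move the estimate a fraction alpha of the way toward the new sample.
        self.value += self.alpha * (queue_length - self.value)
        return self.value


estimate = LoadEstimate()
for tick in range(1, 4):
    # Three builds are queued and no agent exists or is planned yet.
    smoothed = estimate.update(3)
    print(tick, no_delay_demand(3, 0, 0), round(smoothed, 2))
```

In this model the no-delay calculation asks for the full shortfall on the first tick, while the smoothed estimate only approaches the real queue length gradually, so a strategy driven by it reacts later.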

There is a bug in the implementation that causes agents to never be provisioned and builds to hang.

Resolution

This issue has been fixed in version 1.21.2 of the kubernetes plugin.
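To check quickly whether an installed plugin version falls in the affected range, a small comparison helper can be used. This is a minimal sketch, assuming that every release from 1.19.1 up to (but not including) 1.21.2 is affected and that versions are plain dot-separated integers; the function names are hypothetical.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.21.1' into (1, 21, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


FIRST_AFFECTED = parse_version("1.19.1")  # release that introduced the strategy
FIRST_FIXED = parse_version("1.21.2")     # release that contains the fix


def is_affected(plugin_version: str) -> bool:
    """True if the version lies in the assumed affected range."""
    return FIRST_AFFECTED <= parse_version(plugin_version) < FIRST_FIXED


for version in ("1.19.0", "1.19.1", "1.21.1", "1.21.2"):
    print(version, is_affected(version))
```

Comparing tuples of integers avoids the pitfall of string comparison, where "1.2" would sort after "1.19".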

Solution

The solution is to upgrade the Kubernetes plugin to version 1.21.2, which contains the fix.

Note: At the time of writing, this version is not yet available under the CloudBees Assurance Program.

Workaround

The workaround is to disable the NoDelayProvisionerStrategy. This can be done by adding the system property -Dio.jenkins.plugins.kubernetes.disableNoDelayProvisioning=true to the Master’s startup arguments. The Master must be restarted for the change to take effect. See How to add Java arguments to Jenkins to understand how to do this.
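Where the Master's JVM arguments are kept as a single string (for example in an environment variable, which is an assumption about the installation), the property can be added without creating duplicates. The helper below is a sketch only; its name is hypothetical and it simply edits a string, leaving the actual deployment change and restart to the method described in the linked article.

```python
import shlex

PROPERTY = "io.jenkins.plugins.kubernetes.disableNoDelayProvisioning"


def with_no_delay_disabled(jvm_args: str) -> str:
    """Return jvm_args with the workaround property set to true exactly once."""
    kept = [
        arg
        for arg in shlex.split(jvm_args)
        # Drop any earlier setting of the same property, whatever its value.
        if arg != f"-D{PROPERTY}" and not arg.startswith(f"-D{PROPERTY}=")
    ]
    kept.append(f"-D{PROPERTY}=true")
    return shlex.join(kept)


print(with_no_delay_disabled("-Xmx2g"))
```

Running the helper on its own output leaves the string unchanged, so it is safe to apply repeatedly.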

1 Comment

  • Aaron Nassiry

    we happen to be using version 1.21.1 of the k8s plugin and still seeing this issue.