Jenkins stops processing builds in the build queue after an error appears in the logs

Issue

Jenkins jobs sit in the build queue and are never started, even though build agents are available for the chosen ‘label’, and the logs show stack traces similar to the following:

SEVERE  hudson.triggers.SafeTimerTask#run: Timer task hudson.model.Queue$MaintainTask@XXXX failed

How do we know what is causing this queue freeze?

Environment

Resolution

Queue.MaintainTask is a periodic task that runs in the instance and is responsible for queue maintenance operations, such as adding elements to the queue and assigning queued elements to nodes or executors. If this task fails for any reason, the queue becomes unresponsive and jobs eventually stop running because they remain stuck in the queue.

To determine the cause of the problem, pay special attention to the full stack trace of the error that appears in the logs.

Some potential causes are shown below. The intent of this list is to help you understand the pattern you can follow to determine the root cause of the failure, and so recover the instance as fast as possible.

SEVERE  hudson.triggers.SafeTimerTask#run: Timer task hudson.model.Queue$MaintainTask@XXX failed
Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from XXXX/XXX:XXX
                at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1743)
                at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
                at hudson.remoting.Channel.call(Channel.java:957)
                at hudson.Launcher$RemoteLauncher.launch(Launcher.java:1059)
                at hudson.Launcher$ProcStarter.start(Launcher.java:455)
                at com.cloudbees.jenkins.plugins.nodesplus.CustomNodeProbeBuildFilterProperty.getProbeResult(CustomNodeProbeBuildFilterProperty.java:180)

In this stack trace there is a clear correlation between the getProbeResult method and the task failure.

SEVERE  hudson.triggers.SafeTimerTask#run: Timer task hudson.model.Queue$MaintainTask@XXX failed
java.lang.NoClassDefFoundError: Could not initialize class org.apache.logging.log4j.core.impl.Log4jLogEvent
    at org.apache.logging.log4j.core.impl.DefaultLogEventFactory.createEvent(DefaultLogEventFactory.java:54)
    at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:401)
    at org.apache.logging.log4j.core.config.DefaultReliabilityStrategy.log(DefaultReliabilityStrategy.java:49)
    at org.apache.logging.log4j.core.Logger.logMessage(Logger.java:146)
    at org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2116)
    at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2100)
    at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:1994)
    at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1966)
    at org.apache.logging.log4j.spi.AbstractLogger.error(AbstractLogger.java:739)
    at com.microfocus.application.automation.tools.octane.events.WorkflowListenerOctaneImpl.onNewHead(WorkflowListenerOctaneImpl.java:79)

In this case the exception thrown is different, but the effect is the same: the periodic task starts failing.

SEVERE  hudson.triggers.SafeTimerTask#run: Timer task hudson.model.Queue$MaintainTask@XXX failed
java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
*.XXX.*XXX
^
    at java.util.regex.Pattern.error(Pattern.java:1957)
    at java.util.regex.Pattern.sequence(Pattern.java:2125)
    at java.util.regex.Pattern.expr(Pattern.java:1998)
    at java.util.regex.Pattern.compile(Pattern.java:1698)
    at java.util.regex.Pattern.<init>(Pattern.java:1351)
    at java.util.regex.Pattern.compile(Pattern.java:1028)
    at java.util.regex.Pattern.matches(Pattern.java:1133)
    at java.lang.String.matches(String.java:2121)
    at hudson.plugins.buildblocker.BlockingJobsMonitor.checkForPlannedBuilds(BlockingJobsMonitor.java:162)
    at hudson.plugins.buildblocker.BlockingJobsMonitor.checkForQueueEntries(BlockingJobsMonitor.java:86)
    at hudson.plugins.buildblocker.BuildBlockerQueueTaskDispatcher.checkAccordingToProperties(BuildBlockerQueueTaskDispatcher.java:151)

Again, the stack trace allows us to determine the source of the problem affecting the queue.
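In the trace above, the value `*.XXX.*XXX` is passed to `String.matches`, so it is compiled as a regular expression, and a leading `*` has nothing to repeat. As a minimal sketch of how such values could be checked before they reach the queue, the helper below reports patterns that fail to compile. It uses Python's `re` module as a stand-in: its syntax overlaps with `java.util.regex` but is not identical, so treat the result as a first-pass check only. The function name `invalid_patterns` is hypothetical.

```python
import re


def invalid_patterns(patterns):
    """Return (pattern, error message) pairs for entries that do not compile.

    Assumption: Python's `re` is used as an approximation of java.util.regex;
    the two engines agree that a leading '*' is an error.
    """
    problems = []
    for pattern in patterns:
        try:
            re.compile(pattern)
        except re.error as exc:
            problems.append((pattern, str(exc)))
    return problems
```

Calling `invalid_patterns(["*.XXX.*XXX", ".*XXX.*XXX"])` flags only the first entry. The second entry, which starts with `.*`, compiles in both engines.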

SEVERE  hudson.triggers.SafeTimerTask#run: Timer task hudson.model.Queue$MaintainTask@XXX failed
java.lang.IndexOutOfBoundsException: Index: 0
    at java.util.Collections$EmptyList.get(Collections.java:4454)
    at org.jenkinsci.plugins.workflow.graph.StandardGraphLookupView.bruteForceScanForEnclosingBlock(StandardGraphLookupView.java:150)
    at org.jenkinsci.plugins.workflow.graph.StandardGraphLookupView.findEnclosingBlockStart(StandardGraphLookupView.java:197)
    at org.jenkinsci.plugins.workflow.graph.StandardGraphLookupView.findAllEnclosingBlockStarts(StandardGraphLookupView.java:217)

SEVERE  hudson.triggers.SafeTimerTask#run: Timer task hudson.model.Queue$MaintainTask@XXX failed
java.lang.NullPointerException
    at org.jenkinsci.plugins.blockqueuedjob.condition.JobResultBlockQueueCondition.isBlocked(JobResultBlockQueueCondition.java:70)
    at org.jenkinsci.plugins.blockqueuedjob.BlockItemQueueTaskDispatcher.canRun(BlockItemQueueTaskDispatcher.java:35)
    at hudson.model.Queue.getCauseOfBlockageForItem(Queue.java:1197)
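The pattern in the traces above is the same each time: skip the JDK, Jenkins core, and logging-library frames, and look at the first frame that belongs to a plugin. A minimal sketch of that triage in Python follows. The prefix list, the log-level list, and the function name `first_plugin_frames` are assumptions made for illustration, not part of Jenkins. The result is a hint about where to look, not a diagnosis.

```python
import re

# Prefixes treated as JDK, Jenkins core, or shared-library frames.
# Assumption: this list is illustrative; extend it for your own instance.
NON_PLUGIN_PREFIXES = (
    "java.", "javax.", "sun.", "jdk.",
    "hudson.model.", "hudson.remoting.", "hudson.triggers.", "hudson.Launcher",
    "jenkins.model.", "jenkins.util.",
    "org.apache.logging.",
)

FAILURE_MARKER = re.compile(
    r"Timer task hudson\.model\.Queue\$MaintainTask@\S+ failed"
)
# Start of any other log record (assumed level names).
LOG_RECORD = re.compile(r"^\s*(SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST)\b")
# A stack frame such as "    at pkg.Class.method(File.java:12)".
FRAME = re.compile(r"^\s*at\s+([\w.$<>]+)\(")


def first_plugin_frames(log_text):
    """For each MaintainTask failure, return the first frame outside the known prefixes."""
    results = []
    in_failure = False
    for line in log_text.splitlines():
        if FAILURE_MARKER.search(line):
            in_failure = True          # a new failure record starts here
            continue
        if LOG_RECORD.match(line):
            in_failure = False         # an unrelated log record ends the trace
            continue
        if not in_failure:
            continue
        match = FRAME.match(line)
        if match and not match.group(1).startswith(NON_PLUGIN_PREFIXES):
            results.append(match.group(1))
            in_failure = False         # only the first candidate per failure
    return results
```

Run against the traces in this article, the function returns frames such as `hudson.plugins.buildblocker.BlockingJobsMonitor.checkForPlannedBuilds`, which point to the plugin to investigate.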

Workaround

Depending on your situation, you might need to upgrade or, if possible, disable the plugin that the stack trace identifies as responsible for the failure.
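If the plugin cannot be disabled from the plugin manager, Jenkins also treats a plugin as disabled when a marker file named after the plugin archive plus `.disabled` exists in the `plugins` directory of the Jenkins home. The change takes effect after a restart. The sketch below creates that marker. The function name `disable_plugin` and its arguments are hypothetical, and the script assumes it runs with write access to the Jenkins home. Check the plugin's dependents before disabling it.

```python
from pathlib import Path


def disable_plugin(jenkins_home, short_name):
    """Create the '<archive>.disabled' marker next to the plugin archive.

    Returns the marker path, or None if no '<short_name>.jpi' or
    '<short_name>.hpi' archive is found. Jenkins must be restarted
    for the marker to take effect.
    """
    plugins_dir = Path(jenkins_home) / "plugins"
    for ext in (".jpi", ".hpi"):
        archive = plugins_dir / (short_name + ext)
        if archive.is_file():
            marker = archive.with_name(archive.name + ".disabled")
            marker.touch()  # an empty file is enough
            return marker
    return None
```

Deleting the marker file and restarting re-enables the plugin, so the step is easy to revert once an upgraded version is available.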

1 Comment

    Ryan Campbell

    Please note that this general class of errors is tracked as https://issues.jenkins-ci.org/browse/JENKINS-59886

    The instances above are actually bugs in those particular plugins -- these RuntimeExceptions should be checked and appropriate responses made to the given extension point.