Dedicated JNLP agents are not able to connect

Issue

Your JNLP agent cannot connect to your Jenkins master.

Environment

Resolution

In order to successfully connect a JNLP agent with your Jenkins environment there are a few important pre-requisites:

  • The CloudBees Core instance must be listening on the JNLP port
  • The CloudBees Core instance must be reachable at HTTP level from the JNLP agent
  • The CloudBees Core instance must be reachable at TCP level from the JNLP agent

Ensure that the CloudBees Core instance is listening on the JNLP port

Go to Manage Jenkins -> Configure Global Security and ensure that the JNLP port is set to either a fixed or a random port, and that at least the agent protocol Inbound TCP Agent Protocol/4 (TLS encryption) is enabled.

Take a thread dump of the instance by going to <JENKINS_URL>/threadDump in your web browser, and look for the TCP agent listener thread.

TCP agent listener port=31966
"TCP agent listener port=31966" Id=89 Group=main RUNNABLE (in native)
	at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
	at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
	at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
	-  locked java.lang.Object@7b1b2a2
	at hudson.TcpSlaveAgentListener.run(TcpSlaveAgentListener.java:186)
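
If you save the thread dump to a file, a small shell helper can pull the listener port out of it. This is only a minimal sketch, not something shipped with Jenkins: the function name extract_listener_port is hypothetical, and it assumes the thread name has the same form as in the sample above.

```shell
# Hypothetical helper: print the port found in the "TCP agent listener" thread name.
# Reads thread dump text on standard input; prints nothing if no such thread is found.
extract_listener_port() {
    sed -n 's/^"TCP agent listener port=\([0-9][0-9]*\)".*/\1/p' | head -n 1
}
```

For example, extract_listener_port < threadDump.txt prints the port when the listener thread is present. Empty output suggests that the listener is not running, or that the thread name differs in your version.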

Ensure that the CloudBees Core instance is reachable through HTTP

From the agent side, run the command curl -ILv <JENKINS_URL> and check that the response includes Jenkins headers such as:

...
< X-Hudson: 1.395
X-Hudson: 1.395
< X-Hudson-CLI-Port: 31966
X-Hudson-CLI-Port: 31966
< X-Jenkins: 2.204.1.3
X-Jenkins: 2.204.1.3
< X-Jenkins-CLI-Host: ec2-74-159-31-69.compute-1.amazonaws.com
X-Jenkins-CLI-Host: ec2-74-159-31-69.compute-1.amazonaws.com
...
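
To read a single header from that output without scanning it by eye, something like the sketch below may help. The function name header_value is hypothetical. It only assumes that headers arrive one per line as "Name: value", optionally prefixed with "< " as in curl's verbose output.

```shell
# Hypothetical helper: print the value of one HTTP header read from standard input.
# $1 is the header name; matching is case-insensitive and only the first match is printed.
header_value() {
    awk -v name="$1" '
        BEGIN { want = tolower(name) ":" }
        {
            line = $0
            sub(/\r$/, "", line)    # drop the trailing CR of HTTP line endings
            sub(/^< /, "", line)    # drop the verbose-mode prefix, if present
            if (index(tolower(line), want) == 1) {
                value = substr(line, length(want) + 1)
                sub(/^[ \t]+/, "", value)
                print value
                exit
            }
        }'
}
```

For example, curl -sIL <JENKINS_URL> | header_value X-Jenkins should print the Jenkins version if the instance is reachable over HTTP. No output means the header was not returned.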

Ensure that the CloudBees Core instance is reachable through TCP

You can test whether the CloudBees Core instance is reachable over TCP with either telnet <JENKINS_HOSTNAME> <JNLP_PORT> or curl <JENKINS_HOSTNAME>:<JNLP_PORT>.
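
If you prefer a check that can be scripted, the sketch below wraps the same idea in a function. It assumes that nc (netcat) is installed on the agent box and supports the -z and -w options, which is not guaranteed for every netcat variant. The function name check_tcp is hypothetical.

```shell
# Hypothetical helper: report whether a TCP connection to host:port can be opened.
# Returns 0 if reachable, 1 if not reachable, 2 if the port argument is not a number.
check_tcp() {
    host=$1
    port=$2
    case $port in
        ''|*[!0-9]*)
            echo "invalid port: $port" >&2
            return 2
            ;;
    esac
    # -z: only test the connection, send no data; -w 5: give up after 5 seconds
    if nc -z -w 5 "$host" "$port" 2>/dev/null; then
        echo "reachable: $host:$port"
    else
        echo "NOT reachable: $host:$port"
        return 1
    fi
}
```

You would call it as check_tcp <JENKINS_HOSTNAME> <JNLP_PORT> from the agent box.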

Ensure that the Java version is at least on the same baseline on both master and agent

A good practice is to run the exact same Java version on both Jenkins and the agent. When this is not possible, it is recommended to run at least the same baseline.

Run java -version on both the Jenkins master box and the agent to check which Java version each one is running.
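
To compare baselines quickly, you can reduce the java -version output to its major version. The sketch below takes "baseline" to mean the major Java version (8, 11, and so on), which is an interpretation rather than something this article defines. The function name java_major is hypothetical.

```shell
# Hypothetical helper: print the major Java version from `java -version` output on stdin.
# Handles both the legacy scheme ("1.8.0_232" -> 8) and the current one ("11.0.6" -> 11).
java_major() {
    sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1 |
        awk -F'[._-]' '{ if ($1 == 1) print $2; else print $1 }'
}
```

Because java -version writes to standard error, run it as java -version 2>&1 | java_major on both boxes and compare the two numbers.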

Ensure that the version of agent.jar matches the one on the master

The main drawback of using JNLP as the agent launcher is that agent.jar is not upgraded automatically on the agent when you upgrade Jenkins, whereas the SSH launcher handles this out of the box.

Check that agent.jar is the same on both sides, for example with md5sum agent.jar. You can download agent.jar from the Jenkins master at the URL below:

<JENKINS_URL>/jnlpJars/agent.jar
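
The comparison can be scripted along the following lines. This is a sketch that assumes md5sum is available on the box and that you have already downloaded the master's copy next to the agent's one. The function name same_checksum is hypothetical.

```shell
# Hypothetical helper: succeed only if two files have the same MD5 checksum.
# Returns 0 if they match, 1 if they differ, 2 if either file cannot be read.
same_checksum() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    sum_a=$(md5sum < "$1" | cut -d' ' -f1)
    sum_b=$(md5sum < "$2" | cut -d' ' -f1)
    [ "$sum_a" = "$sum_b" ]
}
```

For example, if the copy downloaded from the master is saved as master-agent.jar (an arbitrary name), same_checksum agent.jar master-agent.jar succeeds only when the agent is running the same agent.jar as the master serves.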

Use jenkins-cli to check the connection

On the agent box, download <JENKINS_URL>/jnlpJars/jenkins-cli.jar from the Jenkins master and run the command below:

java -jar jenkins-cli.jar -s <JENKINS_URL>/ -auth <USERNAME>:<API_TOKEN> help

Check that the JNLP port and hostname are right

Launch the commands below and check that the port and hostname are the right ones:

curl -I <JENKINS_URL>/computer/<AGENT>/slave-agent.jnlp
curl -I <JENKINS_URL>/tcpSlaveAgentListener/

On a Windows box, curl can be installed using, for example, the curl Download Wizard.

Load balancer or HAProxy

If you are using a load balancer or HAProxy and you are not running Jenkins in HA mode, you might want to bypass them through the agent's advanced option Tunnel connection through.

Clear the Java Web Start Cache

If you see an error like the one below when starting the JNLP file, run the command javaws -clearcache to clear the Java Web Start cache.

java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(Unknown Source)
	at java.net.SocketInputStream.read(Unknown Source)
	at sun.security.ssl.InputRecord.readFully(Unknown Source)
	at sun.security.ssl.InputRecord.read(Unknown Source)
	at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
	at sun.security.ssl.SSLSocketImpl.performInitialHandshake(Unknown Source)
	at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
	at sun.security.ssl.SSLSocketImpl.startHandshake(Unknown Source)
	at sun.net.www.protocol.https.HttpsClient.afterConnect(Unknown Source)
	at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(Unknown Source)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
	at sun.net.www.protocol.http.HttpURLConnection.access$200(Unknown Source)
	at sun.net.www.protocol.http.HttpURLConnection$9.run(Unknown Source)
	at sun.net.www.protocol.http.HttpURLConnection$9.run(Unknown Source)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.security.AccessController.doPrivilegedWithCombiner(Unknown Source)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
	at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(Unknown Source)
	at com.sun.deploy.net.HttpUtils.followRedirects(Unknown Source)
	at com.sun.deploy.net.BasicHttpRequest.doRequest(Unknown Source)
	at com.sun.deploy.net.BasicHttpRequest.doGetRequestEX(Unknown Source)
	at com.sun.deploy.cache.ResourceProviderImpl.checkUpdateAvailable(Unknown Source)
	at com.sun.deploy.cache.ResourceProviderImpl.isUpdateAvailable(Unknown Source)
	at com.sun.deploy.cache.ResourceProviderImpl.getResource(Unknown Source)
	at com.sun.deploy.cache.ResourceProviderImpl.getResource(Unknown Source)
	at com.sun.javaws.LaunchDownload$DownloadTask.call(Unknown Source)
	at java.util.concurrent.FutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)

Information to be attached in case you need to open a Support ticket at CloudBees Support

  • An architecture diagram so we can understand what your environment looks like
  • A support bundle
  • md5sum of agent.jar on both boxes
  • Content of <JENKINS_URL>/computer/<AGENT>/slave-agent.jnlp
  • Content of <JENKINS_URL>/computer/<AGENT>/config.xml
  • The agent and master logs that show the connectivity failure
  • Output of the commands below, run from the agent box:

curl -I <JENKINS_URL>/computer/<AGENT>/slave-agent.jnlp
curl -I <JENKINS_URL>/tcpSlaveAgentListener/
curl -ILv <JENKINS_URL>

4 Comments

  • -1
    John Mellor

    Ok, what is a "CJOC_URL"?  Gobbledegook phrase meaning "Canadian Joint Operations Command"?

    If the slave.jar has a different md5, how do I interpret that?  What API versions are compatible, and how do I determine that compatibility?

    I have a JNLP connection issue in K8S, and this document falls far short of what is required to debug this.

  • 1
    Denys Digtiar

    CJOC stands for CloudBees Jenkins Operations Center. For the purposes of this article, it should have just been JENKINS_URL.

    If the hash sum is different it means the versions are different between master and agent. The backward compatibility is maintained but the recommendation is to keep the slave.jar at the same version on both sides. Therefore if md5 is different, replace agent's slave.jar with the one downloaded from the master.

  • 0
    Byron Kim

    In the Load balancer or ha-proxy section, there should be a comment about Idle Timeouts if your Jenkins node is running through a LB/proxy. This can cause JNLP connection timeout errors.

    The default for ELB for instance is 60s and was causing some builds on slave nodes to fail on certain steps that took a long time to respond.

  • 0
    Mark Kenneally

    CLI authentication format has changed, it should be:

    java -jar jenkins-cli.jar -s http://<JENKINS_URL>/  -auth <username>:<APItoken> help
