Operations Center-Client Master connectivity issues

Symptoms

  • CM appears as disconnected in OC
  • CM shows “Connect to Operations Center” when it should be already connected
  • CM license expires
  • Shared slaves/cloud are not leased to CM

Diagnostic/Treatment

  • Pre-condition: CM and OC were previously correctly connected.
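
Each section below is keyed to a distinct message in the connectivity log. As a rough triage aid, the following sketch maps a saved log file to the matching section. The function name, the category labels and the log file name are hypothetical; the patterns are copied from the log excerpts in this article.

```shell
#!/bin/sh
# Hypothetical helper: classify a saved connectivity log by failure level.
# Patterns are taken from the log excerpts quoted in this article.
classify_connectivity_log() {
  log_file=$1
  if grep -q 'TLS hostname verification failure' "$log_file"; then
    echo 'tls-hostname'
  elif grep -q 'SSLHandshakeException' "$log_file"; then
    echo 'tls'
  elif grep -q 'Connection closed before acknowledgement sent' "$log_file"; then
    echo 'tcp-wrong-port'
  elif grep -q 'Error trying to establish connection to AgentProtocolEndpoint' "$log_file"; then
    echo 'tcp'
  elif grep -q 'Could not connect to Jenkins server' "$log_file"; then
    echo 'http'
  else
    echo 'unknown'
  fi
}

# Usage (the file name is an assumption):
#   classify_connectivity_log operations-center.log
```

The checks run from the most specific message to the most generic one, so a log that contains several failures is reported under the most specific match.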

OC-CM connectivity issue at HTTP level

OC logs

[Mon Oct 24 12:12:09 UTC 2016] Starting discovery on http://cjoc.jenkins.example.com:8888/
[Mon Oct 24 12:12:19 UTC 2016] Discovery on http://cjoc.jenkins.example.com:8888/ failed (will retry) - Could not connect to Jenkins server: http://cjoc.jenkins.example.com:8888/
java.net.ConnectException: Could not connect to Jenkins server: http://cjoc.jenkins.example.com:8888/
    at com.cloudbees.opscenter.agent.AgentProtocolEndpointLocator.locate(AgentProtocolEndpointLocator.java:556)
    at com.cloudbees.opscenter.agent.OperationsCenterConnectorSetTask.run(OperationsCenterConnectorSetTask.java:170)
    at java.lang.Thread.run(Thread.java:745)
    at com.cloudbees.opscenter.client.plugin.AgentThread.run(AgentThread.java:39)
Caused by: java.util.concurrent.ExecutionException: java.net.ConnectException: connection timed out: cjoc.jenkins.example.com/192.168.1.44:8888 to http://cjoc.jenkins.example.com:8888/instance-identity/
    at com.ning.http.client.providers.netty.NettyResponseFuture.abort(NettyResponseFuture.java:328)
    at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:108)
    at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:427)
    at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:418)
    at org.jboss.netty.channel.DefaultChannelFuture.setFailure(DefaultChannelFuture.java:380)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:140)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: connection timed out: cjoc.jenkins.example.com/192.168.1.44:8888 to http://cjoc.jenkins.example.com:8888/instance-identity/
    at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:104)
    ... 12 more
Caused by: org.jboss.netty.channel.ConnectTimeoutException: connection timed out: cjoc.jenkins.example.com/192.168.1.44:8888
    at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:137)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
    ... 3 more

CM logs

[Mon Oct 24 14:50:28 UTC 2016] Starting discovery on http://cjoc.jenkins.example.com:8888/
[Mon Oct 24 14:50:29 UTC 2016] Discovery on http://cjoc.jenkins.example.com:8888/ failed (will retry) - Could not connect to Jenkins server: http://cjoc.jenkins.example.com:8888/
java.net.ConnectException: Could not connect to Jenkins server: http://cjoc.jenkins.example.com:8888/
    at com.cloudbees.opscenter.agent.AgentProtocolEndpointLocator.locate(AgentProtocolEndpointLocator.java:556)
    at com.cloudbees.opscenter.agent.OperationsCenterConnectorSetTask.run(OperationsCenterConnectorSetTask.java:170)
    at java.lang.Thread.run(Thread.java:745)
    at com.cloudbees.opscenter.client.plugin.AgentThread.run(AgentThread.java:39)
Caused by: java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused: cjoc.jenkins.example.com/192.168.1.44:8888 to http://cjoc.jenkins.example.com:8888/instance-identity/
    at com.ning.http.client.providers.netty.NettyResponseFuture.abort(NettyResponseFuture.java:328)
    at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:108)
    at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:427)
    at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:418)
    at org.jboss.netty.channel.DefaultChannelFuture.setFailure(DefaultChannelFuture.java:380)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:109)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: cjoc.jenkins.example.com/192.168.1.44:8888 to http://cjoc.jenkins.example.com:8888/instance-identity/
    at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:104)
    ... 12 more
Caused by: java.net.ConnectException: Connection refused: cjoc.jenkins.example.com/192.168.1.44:8888
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:150)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
    at org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
    ... 3 more
  • A proxy might be configured in CJE and OC is not added as No Proxy Host

If a proxy is configured under Manage Jenkins → Manage Plugins → Advanced tab, ensure that the CJOC hostname (for example, cjoc.jenkins.example.com) is added to No Proxy Host.
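
The same setting can also be checked from a shell on the CJE host. This is a minimal sketch that assumes the proxy settings are persisted in `$JENKINS_HOME/proxy.xml` inside a `<noProxyHost>` element; verify the file location on your installation. The function names are hypothetical, and only exact hostname matches are detected (wildcard entries such as `*.example.com` are not expanded).

```shell
#!/bin/sh
# Print the No Proxy Host entries from a proxy.xml file, one per line.
# Assumes entries are separated by whitespace, commas or "|".
no_proxy_hosts() {
  tr '\n' ' ' < "$1" |
    sed -n 's/.*<noProxyHost>\([^<]*\)<\/noProxyHost>.*/\1/p' |
    tr -s ' \t,|' '\n'
}

# Succeed if HOST ($2) is listed verbatim in the proxy.xml file ($1).
is_no_proxy_host() {
  no_proxy_hosts "$1" | grep -Fxq "$2"
}

# Usage (the path is an assumption):
#   is_no_proxy_host "$JENKINS_HOME/proxy.xml" cjoc.jenkins.example.com \
#     && echo "CJOC bypasses the proxy" || echo "CJOC is NOT in No Proxy Host"
```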

  • CJOC is not reachable from CJE at HTTP level

From CJE host try to curl CJOC instance:

curl -I http://cjoc.jenkins.example.com:8888/

The X-Jenkins header should appear in the output, for example: X-Jenkins: 2.7.19.0.1 (CloudBees Jenkins Operations Center 2.7.19.0.1-fixed)

If this header does not appear, OC is not reachable from CM at HTTP level. Ask your network administrator to resolve the issue.
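
To script this check, the sketch below reads the response headers from standard input and succeeds only when an X-Jenkins header is present. The function name is hypothetical; `-s`, `-I` and `--max-time` are standard curl options.

```shell
#!/bin/sh
# Read HTTP response headers on stdin; succeed if an X-Jenkins header is present.
# Carriage returns are stripped first because HTTP headers end in CRLF.
has_x_jenkins_header() {
  tr -d '\r' | grep -iq '^X-Jenkins:'
}

# Usage from the CJE host:
#   curl -sI --max-time 10 http://cjoc.jenkins.example.com:8888/ | has_x_jenkins_header \
#     && echo "CJOC reachable over HTTP" || echo "no X-Jenkins header received"
```

If the header is missing while a proxy is configured, repeating the request with curl's `--noproxy '*'` option can help tell a proxy problem apart from a network problem.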

Related KB articles:
* How to troubleshoot client master connections

If you are stuck at this point, open a Support ticket and attach a support bundle from BOTH CJOC and CM, plus the connectivity logs:
* http://cje.jenkins.example.com:8080/operations-center/log
* http://cjoc.jenkins.example.com:8888/job/Master-1/log

OC-CM connectivity issue at TCP level

OC logs

[Mon Oct 24 12:15:57 UTC 2016] Starting discovery on http://cjoc.jenkins.example.com:8888/
[Mon Oct 24 12:15:57 UTC 2016] Discovery on http://cjoc.jenkins.example.com:8888/ completed
 Agent address: cjoc.jenkins.example.com/192.168.1.44
 Agent port:  50000
 Identity: 99:e1:56:84:ad:62:80:7e:b1:b8:33:37:72:59:37:49
[Mon Oct 24 12:15:57 UTC 2016] Trying protocol: OperationsCenter2
[Mon Oct 24 12:15:57 UTC 2016] Opening TCP socket connection to cjoc.jenkins.example.com/192.168.1.44 on port 50000

[Mon Oct 24 12:16:07 UTC 2016] Error trying to establish connection to AgentProtocolEndpoint{address=cjoc.jenkins.example.com/192.168.1.44:50000, publicKey=99:e1:56:84:ad:62:80:7e:b1:b8:33:37:72:59:37:49}
java.net.SocketTimeoutException
   at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:118)
   at com.cloudbees.opscenter.agent.OperationsCenterConnectorSetTask.connect(OperationsCenterConnectorSetTask.java:117)
   at com.cloudbees.opscenter.agent.OperationsCenterConnectorSetTask.connectOnce(OperationsCenterConnectorSetTask.java:140)
   at com.cloudbees.opscenter.agent.OperationsCenterConnectorSetTask.run(OperationsCenterConnectorSetTask.java:194)
   at java.lang.Thread.run(Thread.java:745)
   at com.cloudbees.opscenter.client.plugin.AgentThread.run(AgentThread.java:39)
[Mon Oct 24 12:16:07 UTC 2016] Sleeping for 10s before retrying

CM logs

[Mon Oct 24 14:33:08 UTC 2016] Starting discovery on http://cjoc.jenkins.example.com:8888/
[Mon Oct 24 14:33:08 UTC 2016] Discovery on http://cjoc.jenkins.example.com:8888/ completed
Agent address: cjoc.jenkins.example.com/192.168.1.44
Agent port:  50000
Identity: 99:e1:56:84:ad:62:80:7e:b1:b8:33:37:72:59:37:49
[Mon Oct 24 14:33:08 UTC 2016] Trying protocol: OperationsCenter2
[Mon Oct 24 14:33:08 UTC 2016] Opening TCP socket connection to cjoc.jenkins.example.com/192.168.1.44 on port 50000
[Mon Oct 24 14:33:18 UTC 2016] Error trying to establish connection to AgentProtocolEndpoint{address=cjoc.jenkins.example.com/192.168.1.44:50000, publicKey=99:e1:56:84:ad:62:80:7e:b1:b8:33:37:72:59:37:49}
java.net.SocketTimeoutException
  at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:118)
  at com.cloudbees.opscenter.agent.OperationsCenterConnectorSetTask.connect(OperationsCenterConnectorSetTask.java:117)
  at com.cloudbees.opscenter.agent.OperationsCenterConnectorSetTask.connectOnce(OperationsCenterConnectorSetTask.java:140)
  at com.cloudbees.opscenter.agent.OperationsCenterConnectorSetTask.run(OperationsCenterConnectorSetTask.java:194)
  at java.lang.Thread.run(Thread.java:745)
  at com.cloudbees.opscenter.client.plugin.AgentThread.run(AgentThread.java:39)

This means there is an OC-CM connectivity issue at TCP level.

This usually happens because an intermediate element such as HAProxy, a firewall or an ELB is blocking the connection.
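
One way to confirm this from the CM host is to try opening the agent port directly. The sketch below is an assumption-laden helper, not a CloudBees tool: it relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command, and the function name is hypothetical. The 10 second limit mirrors the delay between "Opening TCP socket connection" and the error in the logs above.

```shell
#!/bin/sh
# Succeed if a TCP connection to HOST ($1) on PORT ($2) opens within 10 seconds.
# Requires bash (for /dev/tcp) and coreutils timeout to be installed.
can_open_tcp() {
  timeout 10 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage from the CM host (port 50000 is the agent port shown in the logs):
#   can_open_tcp cjoc.jenkins.example.com 50000 \
#     && echo "TCP port reachable" || echo "TCP port blocked or closed"
```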

CloudBees recommends setting the System Property below on the instance to bypass those intermediate elements.

-Dhudson.TcpSlaveAgentListener.hostName=<MACHINE_HOSTNAME>

Related KB articles:
* JNLP configuration on a HA environment

[Tue Feb 14 09:31:36 AEST 2017] Trying protocol: OperationsCenter2
[Tue Feb 14 09:31:36 AEST 2017] Opening TCP socket connection to cjoc.jenkins.example.com/127.0.0.1 on port 50001
[Tue Feb 14 09:31:46 AEST 2017] Socket connection is closed
[Tue Feb 14 09:31:46 AEST 2017] Connection refused: Connection closed before acknowledgement sent
com.cloudbees.opscenter.agent.protocol.impl.ConnectionRefusalException: Connection closed before acknowledgement sent
	at com.cloudbees.opscenter.agent.protocol.impl.AckFilterLayer.onRecvClosed(AckFilterLayer.java:280)
	at com.cloudbees.opscenter.agent.protocol.FilterLayer.abort(FilterLayer.java:163)
	at com.cloudbees.opscenter.agent.protocol.impl.AckFilterLayer.access$000(AckFilterLayer.java:43)
	at com.cloudbees.opscenter.agent.protocol.impl.AckFilterLayer$1.run(AckFilterLayer.java:176)
	at com.cloudbees.opscenter.agent.protocol.IOHub$DelayedRunnable.run(IOHub.java:935)
	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

The "Connection closed before acknowledgement sent" error above appears when the JNLP port advertised by CJOC is incorrect. The most likely cause is that the System Property -Dhudson.TcpSlaveAgentListener.port=<JNLP_PORT> is set on CJOC but <JNLP_PORT> points to an application that is not Jenkins.
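
To see which process actually owns the advertised port, the listening sockets on the CJOC host can be filtered by port. This is a sketch that assumes the column layout printed by `ss -ltnp` (state in column 1, local address in column 4); the function name is hypothetical.

```shell
#!/bin/sh
# Filter `ss -ltnp` output on stdin, keeping listening sockets bound to PORT ($1).
listeners_on_port() {
  awk -v port="$1" '$1 == "LISTEN" && $4 ~ (":" port "$")'
}

# Usage on the CJOC host (root is usually needed to see process names):
#   ss -ltnp | listeners_on_port 50001
```

If the process shown is not the CJOC Java process, the port in the System Property points at the wrong application.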

If you are stuck at this point, open a Support ticket and attach a support bundle from BOTH CJOC and CM, plus the connectivity logs:
* http://cje.jenkins.example.com:8080/operations-center/log
* http://cjoc.jenkins.example.com:8888/job/Master-1/log

OC-CM connectivity issue at TLS level

Log messages

  • Exception in CJE console:

```
nov 15, 2016 2:52:03 PM com.cloudbees.opscenter.client.plugin.OperationsCenterRegistrar$PushRegistrationConfirmation
WARNING: Pre-validation discovery on https://cjoc.jenkins.example.com:8888/ failed
javax.net.ssl.SSLHandshakeException: TLS Handshake exception establishing connection to Jenkins server: https://cjoc.jenkins.example.com:8888/. You might need to trust server’s self-signed certificate on global security configuration.

Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141)
at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280)
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:382)

```

  • Message in the CJE user interface

TLSHandshakeException.png

The most common cause of this issue is that CJOC is published with a self-signed SSL certificate, and CJE needs to have that certificate installed in its truststore.

Follow these steps to fix the problem:

  1. Download the CJOC self-signed certificate. There are two ways to do it:
    1. Use the openssl command (replace cjoc.jenkins.example.com:8888 with the host and SSL port of your CJOC instance, e.g. cjoc.example.com:443):

       openssl s_client -tls1 -showcerts -connect cjoc.jenkins.example.com:8888 </dev/null 2>/dev/null|openssl x509 -outform PEM > cjoc-certificate.pem

    2. Download it directly from your browser (e.g. Chrome):
      1. In the address bar, click the little lock with the X. This brings up a small information screen. Click the button that says “Certificate Information.”
      2. Click and drag the certificate image to your desktop; the certificate is saved to disk.
  2. Connect via SSH to your CJE instance and copy the downloaded certificate to a temporary directory.

  3. Install the certificate in the Java cacerts truststore:

       keytool -import -alias cjoc.jenkins.example.com -file cjoc-certificate.pem -keystore $JAVA_HOME/jre/lib/security/cacerts
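
Because the openssl command above discards error output, a failed connection leaves an empty cjoc-certificate.pem. The sketch below is a small sanity check to run before importing; the function name is hypothetical.

```shell
#!/bin/sh
# Succeed if FILE ($1) is non-empty and contains a PEM certificate header.
is_pem_certificate() {
  [ -s "$1" ] && grep -q -- '-----BEGIN CERTIFICATE-----' "$1"
}

# Usage:
#   is_pem_certificate cjoc-certificate.pem || echo "download failed, rerun openssl without 2>/dev/null"
# Inspect the downloaded certificate, then confirm the import afterwards:
#   openssl x509 -in cjoc-certificate.pem -noout -subject -issuer -dates
#   keytool -list -alias cjoc.jenkins.example.com -keystore "$JAVA_HOME/jre/lib/security/cacerts"
```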
    

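If you are stuck at this point, open a Support ticket and attach a support bundle from BOTH CJOC and CM, plus the connectivity logs:
* http://cje.jenkins.example.com:8080/operations-center/log
* http://cjoc.jenkins.example.com:8888/job/Master-1/log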

OC-CM connectivity issue at TLS Hostname verification

Log messages

  • Exception in CJE console:
nov 15, 2016 2:52:03 PM com.cloudbees.opscenter.client.plugin.OperationsCenterRegistrar$PushRegistrationConfirmation <init>
WARNING: Pre-validation discovery on https://cjoc.local:8443/ failed
javax.net.ssl.SSLException: TLS hostname verification failure establishing connection to Jenkins server: https://cjoc.local:8443/ Certificate subject: CN=another.local issuer: CN=another.local
	at com.cloudbees.opscenter.agent.AgentProtocolEndpointLocator.locate(AgentProtocolEndpointLocator.java:415)
	at com.cloudbees.opscenter.client.plugin.OperationsCenterRegistrar$PushRegistrationConfirmation.<init>(OperationsCenterRegistrar.java:500)
	at com.cloudbees.opscenter.client.plugin.OperationsCenterRegistrar$DescriptorImpl.doPushRegistration(OperationsCenterRegistrar.java:316)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

  • Message in the CJE user interface

TLSHostnameVerification.png

This message occurs when your CJOC instance uses a self-signed certificate that was issued for another host. In the example above, the certificate was issued for another.local while CJOC is running on cjoc.local.
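
The mismatch can be read straight from the exception message. The sketch below extracts the certificate CN and the URL host so they can be compared; both function names are hypothetical, and the `sed` patterns assume the message format shown in the log excerpt above.

```shell
#!/bin/sh
# Extract the certificate CN from a "TLS hostname verification failure" log line on stdin.
certificate_cn_from_log() {
  sed -n 's/.*Certificate subject: CN=\([^ ,]*\).*/\1/p'
}

# Extract the host part of a URL on stdin (scheme://host[:port]/...).
url_host() {
  sed -n 's#^[a-z]*://\([^/:]*\).*#\1#p'
}

# Usage:
#   cn=$(grep 'TLS hostname verification failure' jenkins.log | certificate_cn_from_log)
#   host=$(echo 'https://cjoc.local:8443/' | url_host)
#   [ "$cn" = "$host" ] || echo "certificate issued for $cn, but CJOC runs on $host"
```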

The solution is to create a new self-signed certificate for cjoc.local and use it to run CJOC:

  1. Create a new self-signed certificate:

    keytool -genkey -keyalg RSA -alias cjoc.local -keystore cjoc.local.jks -storepass jenkins -dname "cn=cjoc.local"
    

  2. Run CJOC using the new self-signed certificate

  3. Install the new certificate in CJE, see OC-CM connectivity issue at TLS level
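
How step 2 is carried out depends on how CJOC is launched. As a sketch only, assuming CJOC is started from its WAR file with the embedded Winstone container, the keystore from step 1 can be passed through the `--httpsPort`, `--httpsKeyStore` and `--httpsKeyStorePassword` options. The function name and the WAR file name are hypothetical; the port and password match the examples above.

```shell
#!/bin/sh
# Build the HTTPS launcher arguments: PORT ($1), KEYSTORE path ($2), PASSWORD ($3).
# --httpPort=-1 disables the plain HTTP listener.
cjoc_https_args() {
  printf -- '--httpPort=-1 --httpsPort=%s --httpsKeyStore=%s --httpsKeyStorePassword=%s' "$1" "$2" "$3"
}

# Usage (cjoc.war is a placeholder for your actual WAR file):
#   java -jar cjoc.war $(cjoc_https_args 8443 "$PWD/cjoc.local.jks" jenkins)
```

If CJOC sits behind a reverse proxy or runs in a different servlet container, configure the certificate there instead.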

If you are stuck at this point, open a Support ticket and attach a support bundle from BOTH CJOC and CM, plus the connectivity logs:
* http://cje.jenkins.example.com:8080/operations-center/log
* http://cjoc.jenkins.example.com:8888/job/Master-1/log

Notes

Operations Center Agent (2.32.0.1, the latest version at the time of writing) does not support TLS SNI.

To work around the problem, configure the certificate that CJE expects as the default certificate on the CJOC reverse proxy. This works only as long as that certificate remains the default one.
