Kubernetes agents are failing with 'SocketTimeoutException: timeout'

Issue

  • My pods are created, but some builds fail or are disconnected with an error similar to the following in the console output or master logs:
java.net.SocketException: Socket closed
    at java.net.SocketInputStream.read(SocketInputStream.java:204)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
    at sun.security.ssl.InputRecord.read(InputRecord.java:503)
    at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:975)
    at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:933)
    at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
    at okio.Okio$2.read(Okio.java:140)
    at okio.AsyncTimeout$2.read(AsyncTimeout.java:237)
Caused: java.net.SocketTimeoutException: timeout
    at okio.Okio$4.newTimeoutException(Okio.java:232)
    at okio.AsyncTimeout.exit(AsyncTimeout.java:285)
    at okio.AsyncTimeout$2.read(AsyncTimeout.java:241)
    at okio.RealBufferedSource.indexOf(RealBufferedSource.java:354)
    at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:226)

Environment

Related Issue(s)

Explanation

The exception java.net.SocketTimeoutException: timeout is raised when the read (or request) timeout is exceeded on a connection between the Jenkins master and a Kubernetes agent. This timeout applies after the connection has been established. By default, the Kubernetes plugin sets it to 15s.
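To illustrate what "applies after the connection has been established" means, here is a minimal sketch using plain java.net sockets. It is an analogy only: it does not use the okhttp client or the Kubernetes plugin, and the class name ReadTimeoutDemo and the method readTimesOut are hypothetical. A local server accepts the connection but never sends data, so the connect succeeds and only the subsequent read times out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of Jenkins or the Kubernetes plugin.
class ReadTimeoutDemo {

    /**
     * Connects to a local server that accepts the connection but stays silent,
     * then attempts a read with the given timeout (in milliseconds).
     * Returns true if the read failed with SocketTimeoutException.
     */
    static boolean readTimesOut(int readTimeoutMillis) {
        // Port 0 lets the OS pick a free port; the server listens on loopback only.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // The connection is established at this point; only reads are limited.
            client.setSoTimeout(readTimeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read(); // blocks because the peer never writes anything
                return false;
            } catch (SocketTimeoutException e) {
                return true; // the read timeout was exceeded
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

In this sketch, raising the value passed to readTimesOut gives a slow peer more time to answer before the exception is thrown, which is the same trade-off as raising the Read Timeout in the cloud configuration.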

Before Kubernetes plugin version 1.22.3, a value of 0 results in a Read Timeout of 10s: no timeout is explicitly set on the Kubernetes client, so the default timeout of the okhttp client is used.

Since Kubernetes plugin version 1.22.3, the minimum value possible for the Read Timeout is 15s.
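The version-dependent rules above can be summarized in a small sketch. This is not the plugin's code: the class EffectiveReadTimeout and its methods are hypothetical, and two points are assumptions rather than statements from this article: that a non-zero value is used as configured before 1.22.3, and that from 1.22.3 onward a value below 15s is raised to 15s. The version comparison assumes purely numeric dotted versions.

```java
// Hypothetical helper; it models the rules described in this article,
// it is not the Kubernetes plugin's implementation.
class EffectiveReadTimeout {
    static final int OKHTTP_DEFAULT_SECONDS = 10;  // used when no timeout is set (value 0)
    static final int PLUGIN_MINIMUM_SECONDS = 15;  // minimum since plugin version 1.22.3

    /** Compares numeric dotted versions such as "1.22.3"; missing parts count as 0. */
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Returns the read timeout, in seconds, that would apply under the stated assumptions. */
    static int effectiveSeconds(String pluginVersion, int configuredSeconds) {
        if (compareVersions(pluginVersion, "1.22.3") < 0) {
            // Before 1.22.3: 0 means nothing is set, so the okhttp default applies.
            // Assumption: any other value is used as configured.
            return configuredSeconds == 0 ? OKHTTP_DEFAULT_SECONDS : configuredSeconds;
        }
        // Since 1.22.3: assumption that lower values are raised to the 15s minimum.
        return Math.max(configuredSeconds, PLUGIN_MINIMUM_SECONDS);
    }

    public static void main(String[] args) {
        System.out.println(effectiveSeconds("1.22.2", 0));
        System.out.println(effectiveSeconds("1.22.3", 0));
        System.out.println(effectiveSeconds("1.22.3", 60));
    }
}
```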

Resolution

If an instance is impacted by this problem, consider increasing the Read Timeout in the Kubernetes Cloud configuration.

CloudBees Core on Modern Platform

If using CloudBees Core on Modern Platform, this can be done from the Operations Center.

In the Operations Center, select the “All” view and configure the item “kubernetes shared cloud”. Then adjust the value of the Read Timeout. Once saved, it may take a few seconds for the change to be applied to all managed masters.

Note: An issue in the Kubernetes plugin prior to version 1.22.3 prevents the configured Read Timeout of a kubernetes shared cloud from being passed to the connected masters. As a result, the Kubernetes cloud configuration that is synchronized across the connected masters has a value of 0 in the Read Timeout field, which results in a timeout of 10s that cannot be changed. The fix will be included in the January 2020 release and we will update the article accordingly. If you are severely impacted and need help before that release, please open a CloudBees Support request.

Any Master

Note: An issue in the Kubernetes plugin prior to version 1.14.9 prevents the configured value of the Read Timeout from being persisted. The Kubernetes plugin must first be upgraded to version 1.14.9 or later (CloudBees Core 2.164.3.2 or later).

Go to Manage Jenkins > Configure System > Cloud and adjust the Read Timeout of the Kubernetes cloud configuration.
