UnknownHostException caused by DNS Resolution issue with Alpine Images

Issue

  • Communication between Masters, Operations Center, or Agents fails with DNS resolution errors, surfaced as an UnknownHostException pointing to an internal DNS name. For example: java.net.UnknownHostException: cjoc.cloudbees-core.svc.cluster.local

Environment

Explanation

Based on various investigations, Alpine images suffer from DNS resolution problems when running in Kubernetes. One reason is that Kubernetes by default configures the DNS resolver of Pods (through /etc/resolv.conf) with ndots set to 5. Another is that Alpine uses a specific C library for DNS resolution: musl.

With ndots set to 5 and the musl library, if the DNS client's lookup of the first search path fails with an unexpected error (that is, any answer other than NXDOMAIN), it does not try the others. This behavior has a negative impact in some environments and causes host resolution to fail, even for Kubernetes internal endpoints. In many cases, the problem is caused by DNS servers returning incorrect answers (NOERROR instead of NXDOMAIN): essentially, the DNS server says the domain exists but has no entry for the requested record type (A). As a result, the musl DNS client (used by Alpine) stops resolution, whereas the glibc library (used by other Linux distributions) would continue with the next search path. For more information have a look at the following:

Resolution

Starting with version 2.204.1.3, CloudBees Core supports a new variant of Docker images based on UBI (Red Hat Universal Base Image), which is now the default variant. Those images do not use the musl library and are therefore not affected by these DNS resolution problems.

The recommended solution is to upgrade to version 2.204.1.3 or later.

Workaround

If you still use Alpine images (for some agents, for example), the workaround is to customize the dnsConfig of the Pods and set ndots to 1:

dnsConfig:
  options:
    - name: ndots
      value: "1"
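
To check that the override is actually applied, a throwaway Pod can print the resolver configuration it receives. The manifest below is only a minimal sketch: the Pod name dns-ndots-check, the container name check and the alpine image are illustrative assumptions, not part of CloudBees Core.

```yaml
# Hypothetical throwaway Pod used only to verify the ndots override.
# The Pod name, container name and image are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: dns-ndots-check
spec:
  restartPolicy: Never
  dnsConfig:
    options:
      - name: ndots
        value: "1"
  containers:
    - name: check
      image: alpine
      # Print the resolver configuration generated for this Pod.
      # The "options" line should show ndots:1 rather than the default ndots:5.
      command: ["cat", "/etc/resolv.conf"]
```

After applying the manifest with kubectl apply -f, the output can be read with kubectl logs dns-ndots-check, and the Pod can then be deleted.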

Agents

For agents, add the following snippet to the YAML configuration of the Pod template:

apiVersion: "v1"
kind: "Pod"
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "1"

Operations Center

Add the following snippet to the cjoc StatefulSet:

apiVersion: "apps/v1"
kind: "StatefulSet"
spec:
  template:
    spec:
      dnsConfig:
        options:
          - name: ndots
            value: "1"

Managed Masters

Add the following snippet to the Managed Master item configuration and restart the Master from CJOC:

apiVersion: "apps/v1"
kind: "StatefulSet"
spec:
  template:
    spec:
      dnsConfig:
        options:
          - name: ndots
            value: "1"

Note: To have this applied to any newly created Master, add this snippet to Manage Jenkins > Configure System > Kubernetes Master Provisioning > Advanced > YAML.

Reference