Required Data: Kubernetes Cloud

Issue

  • My Kubernetes Cloud configuration does not work.
  • I am having issues with spinning up an agent by using a Kubernetes Pod Template.

Quick check

Many issues are related to the image selected for the Pod Template. Before continuing, verify that your Pod Template can spin up an agent using the jenkins/jnlp-slave image, as noted in the description of the plugin:

Tested with jenkins/jnlp-slave, see the Docker image source code.

Required Data Kubernetes Cloud

This article describes how to collect the minimum required information for Kubernetes Cloud on a Client/Managed Master so that the issue can be troubleshot efficiently.

If the required data is larger than 20 MB, you will not be able to upload it through ZenDesk. In this case, please use our upload service to attach all the required information.
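
A quick way to check a file against that limit is sketched below. It assumes that 20 MB means 20 * 1024 * 1024 bytes, and exceeds_upload_limit is a hypothetical helper name, not part of any CloudBees tooling.

```shell
# Sketch: succeed (exit status 0) when the given file is larger than 20 MB.
# Assumption: 20 MB is taken as 20 * 1024 * 1024 bytes.
# exceeds_upload_limit is a hypothetical helper name; argument: <file>.
exceeds_upload_limit() {
  limit=$((20 * 1024 * 1024))
  # wc -c prints the size in bytes; the arithmetic expansion strips any padding.
  size=$(( $(wc -c < "$1") ))
  [ "$size" -gt "$limit" ]
}
```

For example, `exceeds_upload_limit support-bundle.zip && echo "use the upload service"`, where support-bundle.zip stands for whatever archive you are about to attach.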

Environment

Required Data check list

  • From CloudBees Jenkins

    • Jenkins log recorder for Kubernetes Cloud Plugin
    • Jenkins Kubernetes Cloud description
    • Jenkins Kubernetes Pod Template description
    • (Optional) Items required from An issue with a Build of a Job
  • From Kubernetes

    • Kubernetes Cluster Description
    • Agent Events
    • Agent Container logs

From CloudBees Jenkins

Jenkins log recorder for Kubernetes Cloud Plugin

Configure two new Jenkins log recorders:

  1. org.csanchez.jenkins.plugins.kubernetes at ALL level
  2. okhttp3 at DEBUG level

When you generate the support bundle, make sure to select All loggers currently enabled.

Important:

  1. Reproduce the issue in order to populate those logs before producing the support bundle.
  2. After you have verified that those logs have been populated, remove the log recorders. They are for troubleshooting only and should not be left enabled in a production environment.

Jenkins Kubernetes Cloud description

The Jenkins Kubernetes Cloud configuration is saved in $JENKINS_HOME/config.xml. You have two options here:

  • When you generate the support bundle ensure to select the Jenkins Global Configuration File (Encrypted secrets are redacted) option.
  • Send $JENKINS_HOME/config.xml directly.

Jenkins Kubernetes Pod Template description

The Jenkins Kubernetes Pod Template description of the agent you are having issues with. Two options:

Additionally:

  • If the agent is getting provisioned, the Console Output of the job displays the YAML description of the pod
  • If you are not using jenkins/jnlp-slave, attach the Dockerfile of your image

From the Kubernetes Cluster

Not all of the items are needed; it depends on the situation. For instance, if the pod is not Running, you cannot get the container logs.

Kubernetes Cluster Description

Kubernetes Cloud description including:

  • The Cloud provider where the cluster is hosted (OpenShift, AWS, etc.)
  • The cloudbees-cluster-details.txt file produced by:
$> kubectl get node,statefulset,pod,svc,ingress,endpoints,cm,pvc,pv -o wide -n <yournamespace> > cloudbees-cluster-details.txt

Notes:

1.- Replace <yournamespace> with the namespace where you have deployed the CJE cluster (normally cje). If you are using more than one namespace to distribute your applications, please include them as well.

2.- For OpenShift installations, replace the ingress object with route.
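
When several namespaces are involved, the same command can be repeated for each of them. The following is a minimal sketch, not an official script: collect_cluster_details and the per-namespace file naming are assumptions made for illustration.

```shell
# Sketch: run the cluster description command once per namespace given as an argument.
# collect_cluster_details and the output file naming are illustrative assumptions.
# For OpenShift, replace "ingress" with "route" in the resource list.
collect_cluster_details() {
  for ns in "$@"; do
    kubectl get node,statefulset,pod,svc,ingress,endpoints,cm,pvc,pv \
      -o wide -n "$ns" > "cloudbees-cluster-details-${ns}.txt"
  done
}
```

For example, `collect_cluster_details cje build-agents` would write cloudbees-cluster-details-cje.txt and cloudbees-cluster-details-build-agents.txt, where build-agents is a hypothetical second namespace.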

Agent Pods

Get the status of the Jenkins agent pod

> kubectl get -a pods --watch

Agent Events

You have a couple of options to fetch that information:

via events

  • From the Jenkins Console logs, get the agent-UID-example for the build:
...
[Pipeline] { (hide)
[Pipeline] node
Still waiting to schedule task
‘agent-UID-example’ is offline
...

Then, search for the Jenkins agent events for the UID. Make sure you are in the correct cluster and namespace.

$> kubectl get events -n <agent-namespace> | grep agent-UID-example > agent-events.txt
89s         Normal    Scheduled            pod/agent-UID-example                          Successfully assigned cje-support-general/agent-UID-example to gke-cluster-support-gene-default-pool-33d2ba61-t4dl
87s         Normal    Pulling              pod/agent-UID-example                          Pulling image "gcr.io/image2/executor:debug"
87s         Normal    Pulled               pod/agent-UID-example                          Successfully pulled image "example.io/image2/executor:debug"
86s         Normal    Created              pod/agent-UID-example                          Created container image2
85s         Normal    Started              pod/agent-UID-example                          Started container image2
85s         Normal    Pulled               pod/agent-UID-example                          Container image "maven:3.3.9-jdk-8-alpine" already present on machine
84s         Normal    Created              pod/agent-UID-example                          Created container image1
84s         Normal    Started              pod/agent-UID-example                          Started container image1
84s         Normal    Pulled               pod/agent-UID-example                          Container image "cloudbees/cloudbees-core-agent:2.204.2.2" already present on machine
83s         Normal    Created              pod/agent-UID-example                          Created container jnlp
82s         Normal    Started              pod/agent-UID-example                          Started container jnlp
  • If the agent-UID-example is not displayed you have the following alternatives:
> kubectl get events -n <agent-namespace> --watch
> kubectl get events -n <agent-namespace> --sort-by=.metadata.creationTimestamp > agent-events.txt
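
Instead of piping through grep, the events can also be filtered on the server side with a field selector on the involved object name. This is a sketch under the assumption that the pod name is known; agent_events is a hypothetical helper name.

```shell
# Sketch: save only the events whose involved object is the given agent pod,
# oldest first. agent_events is a hypothetical helper name.
# Arguments: <agent-namespace> <pod-name>
agent_events() {
  ns="$1"
  pod="$2"
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.name=${pod}" \
    --sort-by=.metadata.creationTimestamp > "agent-events-${pod}.txt"
}
```

For example, `agent_events <agent-namespace> agent-UID-example` writes agent-events-agent-UID-example.txt in the current directory.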

via describe pod

$> kubectl describe pods my-jenkins-agent > agent-describe-pod.txt

Agent container logs

If the agent pods are Running, use kubectl logs to get the log output of each container running in the pod (e.g. example-container), including jnlp.

kubectl logs my-jenkins-agent -c example-container > jenkins-agent-example-container-logs.txt
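
To avoid repeating the command by hand for every container, the container names can be read from the pod spec and looped over. The sketch below assumes the pod still exists; collect_container_logs and the output file naming are hypothetical.

```shell
# Sketch: write one log file per container declared in the given pod.
# collect_container_logs is a hypothetical helper name.
# Arguments: <agent-namespace> <pod-name>
collect_container_logs() {
  ns="$1"
  pod="$2"
  # List the container names from the pod spec, separated by spaces.
  containers=$(kubectl get pod "$pod" -n "$ns" -o jsonpath='{.spec.containers[*].name}')
  for c in $containers; do
    kubectl logs "$pod" -n "$ns" -c "$c" > "${pod}-${c}-logs.txt"
  done
}
```

For example, `collect_container_logs <agent-namespace> my-jenkins-agent` produces files such as my-jenkins-agent-jnlp-logs.txt, one per container.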

If you are trying to find out why the Pod is being Killed or Terminated before you can get the logs, you can adjust the following properties while troubleshooting (Important: revert to your standard setup after troubleshooting):

  • Increase the Timeout in seconds for Jenkins connection. Two options here:
    • Add the Java property -Dorg.csanchez.jenkins.plugins.kubernetes.PodTemplate.connectionTimeout=60000 to the affected master
    • Set the value to 60000 in the Kubernetes Cloud Templates UI
  • Set podRetention to always()
