- A large number of files in a volume causes a timeout when trying to attach that volume
- There’s a timeout when waiting for volumes to attach when using Kubernetes
- There’s a timeout when trying to attach volumes to a Jenkins instance or Master
- I am seeing a warning or an event similar to the one below when trying to attach a volume:

```
Unable to mount volumes for pod "<POD_IDENTIFIER>": timeout expired waiting for volumes to attach or mount for pod "jenkins"/"<POD_NAME>". list of unmounted volumes=[output]. list of unattached volumes=[output <VOLUMES>]
```
- CloudBees CI (CloudBees Core)
- CloudBees CI (CloudBees Core) on modern cloud platforms - Managed Master
- CloudBees CI (CloudBees Core) on modern cloud platforms - Operations Center
- CloudBees CI (CloudBees Core) on traditional platforms - Client Master
- CloudBees CI (CloudBees Core) on traditional platforms - Operations Center
- CloudBees Jenkins Enterprise
- CloudBees Jenkins Enterprise - Managed Master
- CloudBees Jenkins Enterprise - Operations Center
- CloudBees Jenkins Platform - Client Master
- CloudBees Jenkins Platform - Operations Center
- CloudBees Jenkins Distribution
- Jenkins LTS
Since CloudBees CI pods run as non-root users, the Operations Center pod and Controller (formerly Master) pods have the fsGroup set to the Jenkins group, 1000 by default, so that the volume is writable by the pod user.
With this setting, Kubernetes checks and changes the ownership and permissions of all files and directories of each volume mounted to the pod. When a volume is or becomes very large, this can take a long time and slow down pod startup. In Kubernetes 1.20, there is a beta feature, the volume permission change policy (fsGroupChangePolicy), that can help reduce the time it takes to set the permissions. See Configure volume permission and ownership change policy for Pods and Kubernetes 1.20: Granular Control of Volume Permission Changes.
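As a sketch of that policy (pod name, image, and claim name below are illustrative, not taken from a CloudBees chart), fsGroupChangePolicy is set in the pod's securityContext so that Kubernetes only walks the volume when the ownership of its root does not already match:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: jenkins-example            # illustrative name
spec:
  securityContext:
    fsGroup: 1000
    # Only recursively change ownership/permissions when the root of the
    # volume does not already match the expected fsGroup.
    fsGroupChangePolicy: "OnRootMismatch"
  containers:
    - name: jenkins
      image: jenkins/jenkins:lts   # illustrative image
      volumeMounts:
        - name: jenkins-home
          mountPath: /var/jenkins_home
  volumes:
    - name: jenkins-home
      persistentVolumeClaim:
        claimName: jenkins-home    # illustrative claim name
```

With `OnRootMismatch`, a large volume whose root already has the correct group is mounted without the recursive permission walk.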
When impacted, this shows up as multiple occurrences of `timeout expired waiting for volumes to attach or mount for pod` in the pod events. Note, however, that the same message can also appear when the volume backend has not yet been provisioned and attached to the host.
Check if Pod is impacted
When seeing this timeout, first check that the volume is correctly provisioned and attached to the host - for example, in AWS / EKS, check that the EBS volume is attached to the host.
- If the volume is attached, then the fsGroup permission change is most likely the problem
- If the volume is not attached, then this is a different problem related to the provisioning of the external storage, and the use of
fsGroup is most likely not related.
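One way to perform these checks (pod, namespace, claim, and volume identifiers below are placeholders) is via kubectl and, on AWS / EKS, the AWS CLI:

```shell
# Show recent events for the pod, including mount/attach timeouts
kubectl describe pod <POD_NAME> -n <NAMESPACE>

# Find the PersistentVolume backing the pod's claim
kubectl get pvc <CLAIM_NAME> -n <NAMESPACE>

# On AWS / EKS, check whether the backing EBS volume is attached
aws ec2 describe-volumes --volume-ids <VOLUME_ID> \
  --query 'Volumes[0].Attachments[0].State'
```

If the last command reports `attached` while the pod events still show the mount timeout, the fsGroup permission change is the likely cause.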
The workaround for the problem is to remove the fsGroup: go to the Controller item configuration, leave the FS Group field empty, and Save. Then restart the Controller from Operations Center.
Note: It is required to keep the fsGroup of 1000 on volume creation - such as when creating a Controller. It is safe to remove the fsGroup of existing volumes. The recommended strategy is to remove the fsGroup only when impacted by this particular problem.
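To confirm the change took effect, one approach (assuming the Controller runs as a StatefulSet, as in modern CloudBees CI; names are placeholders) is to inspect the pod template's securityContext:

```shell
# Print the securityContext of the Controller's StatefulSet pod template;
# fsGroup should no longer appear after the change is applied
kubectl get statefulset <CONTROLLER_NAME> -n <NAMESPACE> \
  -o jsonpath='{.spec.template.spec.securityContext}'
```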
- AKS - Azure Kubernetes Service
- EKS - Amazon Elastic Kubernetes Service