Provisioning of Pods taking an abnormal amount of time

Issue

  • A large number of pods are waiting in the queue to be scheduled, and pods go into Failed or Pending status with a message like:
    Pod randomPodName marked as unschedulable can be scheduled on ip-XXX-XX-XX-XX.ec2.internal. Ignoring in scale up.

Environment

Related Issue

Explanation

This is caused by an issue in Kubernetes and the cluster autoscaler that affects versions prior to Kubernetes 1.11.7.

Resolution

Kubernetes versions affected

  • As a temporary workaround, taint the node reported in the message so that Kubernetes no longer schedules pods on it.
    The taint prevents new pods from being scheduled there, and after a few seconds the autoscaler should start scaling up properly.
  • As a long-term resolution, upgrade to Kubernetes 1.11.7 and a supported Cluster Autoscaler version,
    following the guidelines listed in the related issue: Autoscaler fails to scale up nodes with pending pods
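
The temporary workaround can be sketched as a small Python helper that pulls the node name out of the message quoted in the Issue section and builds the matching kubectl taint command. This is a minimal sketch, not an official procedure: the taint key and value (autoscaler-workaround=true) and the function names are hypothetical, chosen only for illustration, and the NoSchedule effect is assumed to be the appropriate one because it blocks new pods without evicting running ones.

```python
import re

# Pattern for the message quoted in the Issue section.
MESSAGE_RE = re.compile(
    r"Pod (?P<pod>\S+) marked as unschedulable can be scheduled on (?P<node>\S+)\. "
    r"Ignoring in scale up\."
)

# Hypothetical taint key and value (an assumption, not from the article).
# NoSchedule keeps new pods off the node without evicting running ones.
TAINT = "autoscaler-workaround=true:NoSchedule"


def reported_node(log_line):
    """Return the node name from the message, or None if the line does not match."""
    match = MESSAGE_RE.search(log_line)
    return match.group("node") if match else None


def taint_command(node, taint=TAINT):
    """Build the kubectl argument list that applies the taint to the node."""
    return ["kubectl", "taint", "nodes", node, taint]


def untaint_command(node, taint=TAINT):
    """Build the kubectl argument list that removes the taint again.

    kubectl removes a taint when the key:effect pair is followed by '-'.
    """
    key = taint.split("=", 1)[0]
    effect = taint.rsplit(":", 1)[1]
    return ["kubectl", "taint", "nodes", node, f"{key}:{effect}-"]
```

To apply the taint, pass the list returned by taint_command to subprocess.run(..., check=True) on a machine whose kubectl context points at the affected cluster. Once the cluster has been upgraded, untaint_command builds the command that makes the node schedulable again.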