How To Increase CloudBees High Availability Timeout

Issue

Environment

Description

HA failover is often a sign of an underlying issue. A common root cause is long-running GC cycles that last longer than the default failover timeout of ~10 seconds. Following the Best Practices is therefore a must.

If you are experiencing HA failover too often, we encourage you to Submit a Support Request so we can diagnose the root cause.

The CloudBees High Availability (HA) Plugin uses JGroups, which is configured through a jgroups.xml file that can be placed inside ${JENKINS_HOME}. If you do not have the file, you can download its reference version from Amazon S3.

The <FD> element in jgroups.xml determines how long a member can be unresponsive before failover is triggered. The window is roughly timeout*max_tries (+ the VERIFY_SUSPECT timeout), with all values in milliseconds. Therefore, with the default settings:

<FD timeout="3000" max_tries="3"/><VERIFY_SUSPECT timeout="1500"/>

3000*3(+1500) = 10500 ms, or ~10 seconds
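The arithmetic above can be checked against a configuration file with a short script. This is a minimal sketch, not part of the plugin: the function name `failover_window_ms` and the `SAMPLE` fragment are hypothetical, the `<config>` wrapper element is an assumption about the file layout, and the formula is the approximation described in this article rather than an exact model of JGroups behavior.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment using the default values quoted in this article.
# The <config> wrapper is an assumption; a real jgroups.xml contains many
# more protocol elements.
SAMPLE = """<config>
  <FD timeout="3000" max_tries="3"/>
  <VERIFY_SUSPECT timeout="1500"/>
</config>"""


def local_name(tag):
    """Strip an XML namespace prefix such as '{urn:...}FD' down to 'FD'."""
    return tag.rsplit("}", 1)[-1]


def failover_window_ms(xml_text):
    """Approximate the failover window as timeout*max_tries + verify_suspect."""
    root = ET.fromstring(xml_text)
    # Index every element's attributes by its tag name, ignoring namespaces.
    attrs = {local_name(el.tag): el.attrib for el in root.iter()}
    fd = attrs["FD"]
    verify = attrs.get("VERIFY_SUSPECT", {})
    return int(fd["timeout"]) * int(fd["max_tries"]) + int(verify.get("timeout", 0))


print(failover_window_ms(SAMPLE))  # 10500
```

To check your own file, you could pass the contents of the jgroups.xml in ${JENKINS_HOME} to `failover_window_ms` instead of `SAMPLE`.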

Resolution

If you have an immediate or exceptional need to increase the timeout, you can raise these values, for example by increasing max_tries:

<FD timeout="3000" max_tries="10"/><VERIFY_SUSPECT timeout="1500"/>

3000*10(+1500) = 31500 ms, or ~30 seconds
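To pick a max_tries value for a different target, the same approximation can be inverted. The sketch below assumes the timeout*max_tries (+ verify_suspect) formula from this article; the helper names `max_tries_for` and `window_ms` are hypothetical and exist only for illustration.

```python
import math


def window_ms(fd_timeout_ms, max_tries, verify_suspect_ms):
    """Approximate failover window, per the formula used in this article."""
    return fd_timeout_ms * max_tries + verify_suspect_ms


def max_tries_for(target_ms, fd_timeout_ms=3000, verify_suspect_ms=1500):
    """Smallest max_tries whose approximate window is at least target_ms."""
    remaining = target_ms - verify_suspect_ms
    return max(1, math.ceil(remaining / fd_timeout_ms))


tries = max_tries_for(30000)
print(tries, window_ms(3000, tries, 1500))  # 10 31500
```

With the default FD timeout of 3000 and VERIFY_SUSPECT timeout of 1500, a 30-second target yields max_tries="10", which matches the value shown above.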
