How To Increase CloudBees High Availability Timeout




Frequent HA failover is often a sign of an underlying issue. A common root cause is a long-running GC cycle that lasts longer than the default timeout (10 seconds for versions lower than 2.303.2.5, 30 seconds for version 2.303.2.5 and greater). Following the Best Practices is therefore a must.

If you are experiencing HA failover too often, we encourage you to Submit a Support Request so we can diagnose the root cause.

The CloudBees High Availability (HA) Plugin uses JGroups, which reads a configurable jgroups.xml file that can live inside ${JENKINS_HOME}. By default this file is not present; if you want to customize the JGroups settings, you will need to create it. This article has reference copies of the file that you can use as a basis. Be sure to choose the file that matches your version of CloudBees CI.
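Before creating the file, it can help to confirm whether an override is already in place. The following is a minimal sketch, not CloudBees tooling: the helper name `find_jgroups_override` is hypothetical, and the Jenkins home path is passed in explicitly because its location varies by installation.

```python
from pathlib import Path
from typing import Optional


def find_jgroups_override(jenkins_home: str) -> Optional[Path]:
    """Return the path to jgroups.xml inside jenkins_home, or None if absent.

    Hypothetical helper for illustration; it only checks that the file exists
    and does not validate its contents.
    """
    candidate = Path(jenkins_home) / "jgroups.xml"
    return candidate if candidate.is_file() else None
```

Calling `find_jgroups_override` with your actual ${JENKINS_HOME} value returns `None` when no custom file exists, which is the default state described above.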

The following <FD> node within jgroups.xml determines the timeout period before failover. The period is roughly timeout * max_tries, plus the <VERIFY_SUSPECT> timeout. Therefore, with the default settings for versions lower than 2.303.2.5:

<FD timeout="3000" max_tries="3"/><VERIFY_SUSPECT timeout="1500"/>

3000*3(+1500) = 10500 ms, or ~10 seconds


If you have an immediate or unusual need to increase the timeout, you can increase the values, for example by raising max_tries:

<FD timeout="3000" max_tries="10"/><VERIFY_SUSPECT timeout="1500"/>

3000*10(+1500) = 31500 ms, or ~30 seconds

This 30-second timeout became the default in release 2.303.2.5 with the change Increased High Availability (HA) default timeout (BEE-106).
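To check the arithmetic for other values before editing jgroups.xml, the calculation can be sketched as below. This is an illustration under stated assumptions: the function name `failover_timeout_ms` is hypothetical, it accepts only the two-element fragment shown in this article (not a complete jgroups.xml, which may carry namespaces and other protocols), and it implements the approximate timeout * max_tries + verify_suspect rule described above rather than JGroups' exact internal behavior.

```python
import xml.etree.ElementTree as ET


def failover_timeout_ms(fragment: str) -> int:
    """Approximate failover timeout in milliseconds for an FD/VERIFY_SUSPECT fragment.

    Hypothetical helper: applies timeout * max_tries (+ VERIFY_SUSPECT timeout).
    """
    # The fragment has two sibling elements, so wrap it in a dummy root to parse it.
    root = ET.fromstring(f"<wrapper>{fragment}</wrapper>")
    fd = root.find("FD")
    if fd is None:
        raise ValueError("fragment does not contain an <FD> element")
    total = int(fd.get("timeout")) * int(fd.get("max_tries"))
    verify = root.find("VERIFY_SUSPECT")
    if verify is not None:
        total += int(verify.get("timeout"))
    return total


old_default = '<FD timeout="3000" max_tries="3"/><VERIFY_SUSPECT timeout="1500"/>'
increased = '<FD timeout="3000" max_tries="10"/><VERIFY_SUSPECT timeout="1500"/>'

print(failover_timeout_ms(old_default))  # 10500
print(failover_timeout_ms(increased))    # 31500
```

The two printed values correspond to the roughly 10-second and roughly 30-second timeouts discussed in this article.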
