Issue
- I need to increase the default timeout for the CloudBees High Availability (HA) Plugin
Environment
- CloudBees CI (CloudBees Core)
- CloudBees CI (CloudBees Core) on traditional platforms - Client controller
- CloudBees CI (CloudBees Core) on traditional platforms - Operations Center
- CloudBees Jenkins Platform - Client controller
- CloudBees Jenkins Platform - Operations Center
- CloudBees High Availability (HA) Plugin
Description
HA failover is often a sign of an underlying issue. A common root cause is long-running GC cycles that last longer than the default timeout (10s for versions lower than 2.303.2.5, 30s for version 2.303.2.5 and greater). Therefore, following the Best Practices is a must.
If you are experiencing HA failover too often, we encourage you to Submit a Support Request so we can diagnose the root cause.
The CloudBees High Availability (HA) Plugin utilizes jgroups, which has a configurable jgroups.xml file that can live inside of ${JENKINS_HOME}.
By default this file is not present; if you want to customize the jgroups settings you will need to create the file. This article has reference copies of the file which you can use as a basis. Be sure to choose the file that matches your version of CloudBees CI.
The following <FD> node within jgroups.xml is what determines the timeout period before failover. It essentially works like: timeout*max_tries (+ verify_suspect), with all values in milliseconds. Therefore, with the default settings for versions lower than 2.303.2.5:
<FD timeout="3000" max_tries="3"/><VERIFY_SUSPECT timeout="1500"/>
3000*3(+1500) = 10500 ms, or ~10 seconds
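The arithmetic above can be sketched as a small helper. This is only an illustration of the timeout*max_tries (+ verify_suspect) rule of thumb described in this article, not code from the plugin or from jgroups; the function name failover_timeout_ms is hypothetical.

```python
def failover_timeout_ms(fd_timeout_ms: int, max_tries: int, verify_suspect_ms: int = 0) -> int:
    """Approximate time before failover, in milliseconds.

    Implements the rule of thumb from this article:
    timeout * max_tries (+ verify_suspect).
    The function name is hypothetical and exists only for illustration.
    """
    return fd_timeout_ms * max_tries + verify_suspect_ms


# Values taken from <FD timeout="3000" max_tries="3"/> and <VERIFY_SUSPECT timeout="1500"/>
default_ms = failover_timeout_ms(3000, 3, 1500)
print(default_ms)  # 10500
```

Changing only max_tries or only timeout in the call shows how each attribute contributes to the overall window before you edit jgroups.xml.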
Resolution
If you have an immediate or outlying need to increase the timeout, you can increase the values:
<FD timeout="3000" max_tries="10"/><VERIFY_SUSPECT timeout="1500"/>
3000*10(+1500) = 31500 ms, or ~30 seconds
This 30 second timeout became the default timeout in release 2.303.2.5 with the change Increased High Availability (HA) default timeout (BEE-106).
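Before deploying a customized file, it can help to check what timeout a given pair of nodes works out to. The following is a minimal sketch that reads the two nodes from an XML fragment and applies the timeout*max_tries (+ verify_suspect) rule from the Description. The <stack> wrapper element and the function name are assumptions made so the fragment parses on its own; a real jgroups.xml has its own root element and may declare an XML namespace, in which case the lookups would need adjusting.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element (<stack>) so the fragment is well-formed by itself.
# The FD and VERIFY_SUSPECT values are the ones shown in the Resolution above.
FRAGMENT = """<stack>
  <FD timeout="3000" max_tries="10"/>
  <VERIFY_SUSPECT timeout="1500"/>
</stack>"""


def failover_timeout_from_xml(xml_text: str) -> int:
    """Return timeout * max_tries (+ verify_suspect) in milliseconds."""
    root = ET.fromstring(xml_text)
    fd = root.find("FD")                      # direct child lookup, no namespace assumed
    verify = root.find("VERIFY_SUSPECT")
    if fd is None:
        raise ValueError("no <FD> node found in the fragment")
    fd_timeout = int(fd.get("timeout"))
    max_tries = int(fd.get("max_tries"))
    verify_timeout = int(verify.get("timeout")) if verify is not None else 0
    return fd_timeout * max_tries + verify_timeout


print(failover_timeout_from_xml(FRAGMENT))  # 31500
```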