Issue
- We would like to apply patches to worker / controller
- We would like to apply patches to worker / controller periodically
Environment
- CloudBees Jenkins Enterprise (CJE) - AWS/Anywhere
Resolution
Updating the OS images used in CJE requires a restart of the components, namely the controllers and workers. This may produce a lot of downtime.
You can reduce this downtime by launching new worker nodes and destroying the old ones once the new ones are up, then restarting each controller one by one.
With this approach, the downtime is limited to the time needed to re-provision CJOC and the Managed controllers onto the new workers.
Note: The rollout of the workers can be staged. It can be done one instance at a time (add one worker, then remove an old one) or all at once (add N workers, then remove the N old ones).
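The two rollout styles in the note differ only in batch size. The following is a minimal sketch of that ordering, not part of CJE: the function name `plan_worker_rollout` and the worker names are hypothetical, and the code only produces a list of steps to follow manually.

```python
from typing import List


def plan_worker_rollout(old_workers: List[str],
                        new_workers: List[str],
                        batch_size: int) -> List[str]:
    """Return the ordered steps: add a batch of new workers,
    then remove the same number of old workers, and repeat."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if len(old_workers) != len(new_workers):
        raise ValueError("this sketch assumes a one-for-one replacement")

    steps: List[str] = []
    for start in range(0, len(old_workers), batch_size):
        end = start + batch_size
        # New workers are always added before any old worker is removed,
        # so capacity never drops below the starting level.
        for worker in new_workers[start:end]:
            steps.append(f"add {worker}")
        for worker in old_workers[start:end]:
            steps.append(f"remove {worker}")
    return steps


if __name__ == "__main__":
    old = ["old-1", "old-2"]
    new = ["new-1", "new-2"]
    print(plan_worker_rollout(old, new, batch_size=1))  # one at a time
    print(plan_worker_rollout(old, new, batch_size=2))  # all at once
```

A batch size of 1 needs only one spare IP and one extra instance at any moment, while a batch size of N finishes faster but temporarily doubles the worker count.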
AWS
Pre-requisites:
- Prepare new AMIs
- Ensure enough IPs are available in VPC(s)
Constraints:
- Ensure 2 controllers are available at all times
- Ensure CJOC is / can be provisioned when executing worker-add and worker-remove operations
Process:
- Use cje upgrade --config-only --force to update the CJE config to use the new AMI (see How to change controllers and workers AMI)
- Use the cje prepare worker-add operation to add the new worker(s), using the new AMI
- Provision a “Test” controller to check that the provisioning works on the new worker(s) - CPU / Memory resources can be tweaked to ensure only the new worker(s) can accept the offer
- Use cje prepare worker-remove to remove the old worker(s)
- Use cje prepare controller-restart to restart each of the controllers, one at a time. By ensuring that only one controller is down at a time, there shouldn’t be any downtime.
Note: Step 3 is optional but recommended to ensure that the patched worker(s) behave as expected.
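The "Test" controller step relies on choosing a CPU / memory request that no old worker can satisfy but at least one new worker can. The following is a minimal sketch of that check, assuming you have read the free resources of each worker yourself. The function names and the figures are hypothetical and are not CJE or Mesos APIs.

```python
from typing import List, Tuple

# (free CPU cores, free memory in MB) on one worker; values are illustrative.
Resources = Tuple[float, int]


def fits(request: Resources, free: Resources) -> bool:
    """True if a worker with `free` resources could accept `request`."""
    return request[0] <= free[0] and request[1] <= free[1]


def lands_only_on_new(request: Resources,
                      old_free: List[Resources],
                      new_free: List[Resources]) -> bool:
    """True if at least one new worker can accept the request
    and no old worker can."""
    fits_old = any(fits(request, free) for free in old_free)
    fits_new = any(fits(request, free) for free in new_free)
    return fits_new and not fits_old


if __name__ == "__main__":
    old_free = [(1.0, 2048), (0.5, 4096)]   # nearly full old workers
    new_free = [(4.0, 8192)]                # empty new worker
    print(lands_only_on_new((2.0, 3072), old_free, new_free))
    print(lands_only_on_new((0.5, 1024), old_free, new_free))
```

In this example the first request is too large for either old worker but fits on the new one, whereas the second request is small enough that an old worker could still accept it.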
Anywhere
Pre-requisites:
- Prepare new hosts for the new workers / controllers
Constraints:
- [IMPORTANT] Ensure new controllers have the same IPs / DNS hostnames (the Mesos / Zookeeper cluster is based on the IPs / DNS hostnames provided, so changing them would require re-creating the cluster)
- Ensure 2 controllers are available at all times
- Ensure CJOC is / can be provisioned when executing worker-add and worker-remove operations
Process:
- Use the cje prepare worker-add operation to add the new worker(s)
- Provision a “Test” controller to check that the provisioning works on the new worker(s) - CPU / Memory resources can be tweaked to ensure only the new worker(s) can accept the offer
- Use cje prepare worker-remove to remove the old worker(s)
- Shut down the old worker(s)
- Replace each controller one at a time. By ensuring that only one controller is down at a time, there shouldn’t be any downtime. For each controller:
  - Shut down the old controller
  - Update the Load Balancer / DNS (removing the old controller / adding the new controller)
  - Use cje prepare controller-restart to initialize the new controller
Note: Step 2) is optional but recommended to ensure that the patched worker(s) behave.
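The controller replacement loop and the "2 controllers available at all times" constraint can be sketched as a simple plan generator. This is an illustrative sketch only: the function name `plan_controller_replacement` and the controller names are hypothetical, and the code prints steps rather than running any cje command.

```python
from typing import List


def plan_controller_replacement(controllers: List[str],
                                min_available: int = 2) -> List[str]:
    """Return the ordered steps to replace controllers one at a time,
    refusing to plan if taking one down would break the constraint."""
    # Only one controller is down at any moment, so the number still
    # available during a replacement is len(controllers) - 1.
    if len(controllers) - 1 < min_available:
        raise ValueError(
            f"need at least {min_available + 1} controllers to keep "
            f"{min_available} available during replacement"
        )

    steps: List[str] = []
    for name in controllers:
        steps.append(f"shut down old controller {name}")
        steps.append(f"update Load Balancer / DNS for {name}")
        steps.append(f"run controller-restart to initialize new {name}")
    return steps


if __name__ == "__main__":
    for step in plan_controller_replacement(["ctrl-1", "ctrl-2", "ctrl-3"]):
        print(step)
```

With three controllers the plan proceeds, while a two-controller cluster is rejected because a single shutdown would leave only one controller available.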