Any recommendations around managing workspaces in CloudBees CD?
A general overview on workspaces can be found here:
Every customer will arrange their workspace needs differently. Setting up an initial preferred workspace and creating another as the “default” for backup purposes is a good starting point to ensure some redundancy.
Some other common separation considerations would be:
- creating different workspaces for different operating systems
- creating different workspaces for different zones in the environment
- creating specific workspaces for handling specific job types
- separating development work from production
- separating output from highly secure work
You can use CloudBees CD ACL restrictions on a workspace to guarantee the desired levels of access for the placement or viewing of any workspace files from within the product.
However, it is equally important that the procedure or pipeline steps involved are written in a manner that also limits access to the physical files to the appropriate groups and users.
Apart from the default zone, a zone for CloudBees CD sits behind a gateway, ensuring that the work being done by any resource inside that zone is protected from outside access.
This requires that a local workspace be available for the zone (or just disconnected workspaces on particular resources inside the zone).
If the zone is large enough, or has a large block of localized users, you will probably want to install a webserver inside this zone as well.
This way, users in this zone can connect to the local webserver and directly view any log files stored in the workspaces associated with the zone.
If certain output needs to be shared across the zones, then the ecremotefilecopy technique (see: https://docs.cloudbees.com/docs/cloudbees-cd/latest/automation-platform/installed-tools#ecremotefilecopy) can be used to transfer workspace data on an as-needed basis back to a default-zone workspace.
This can allow the more commonly used webservers to have access to some key files being generated inside the secured zone.
Some advantages of allocating separate workspaces for different types of work are:
- Simplified process for instituting cleanup rules with varying retention rates
- Depending on labelling, this may make it easier to identify work type when viewing files directly in the file system
Workspace cleanup can be managed on a job-by-job basis or by using traditional disk space cleanup via “rm” or “del”, as outlined near the end of this article:
If you set up data retention rules early and run them consistently, then the built-in job-by-job approach can help to keep your workspace content trimmed. However, it’s fairly common for customers to neglect this setup in the early stages of their solution planning and only consider cleaning things up much later, after large amounts of data have taken up disk space.
For example, a customer has 3 years of history and now wants to institute a policy of keeping only 6 months of results. If they set up the 6-month rule, 2.5 years’ worth of content will need trimming.
Individually deleting workspace data tied to a long list of historical work will be time-consuming for the background deleter. In some cases it may also result in incomplete cleanups: the Data Retention feature cleans up a job's workspace data by reading the job details and assuming that the files can be reached through the resource assigned to that job, so the cleanup can fail when the original resource from an older job is no longer part of your existing resource set.
There is no direct workspace-to-resource mapping inside the database, so the feature relies on this simplified model for identifying how to clean out the workspace data.
Therefore, a “traditional” cleanup approach to removing large batches of files tends to fit better under such circumstances, as identifying the directories created before a certain date is something scriptable via shell or Windows commands.
Even with such large batches of files, these removals can be completed in seconds, and may save time when managing a disk space shortage crisis if administrative oversight has not been consistent.
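As an illustration of the scripted approach, here is a minimal Python sketch that lists the job directories directly under a workspace root that were last modified before a cutoff and, only when explicitly asked, removes them. The function names, the 183-day figure and the workspace path are hypothetical and not part of CloudBees CD; the sketch also assumes each job writes to its own directory directly under the workspace root, so adjust it if your layout differs. Note that a directory's modification time only changes when entries are added or removed directly inside it, so treat it as an approximation of the job's age.

```python
import shutil
import time
from pathlib import Path


def find_stale_dirs(workspace_root, max_age_days, now=None):
    """Return immediate subdirectories last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    root = Path(workspace_root)
    return sorted(
        entry
        for entry in root.iterdir()
        # Skip plain files and symlinks; only real directories are candidates.
        if entry.is_dir() and not entry.is_symlink()
        and entry.stat().st_mtime < cutoff
    )


def remove_stale_dirs(workspace_root, max_age_days, dry_run=True, now=None):
    """Report (dry_run=True) or delete (dry_run=False) the stale directories."""
    stale = find_stale_dirs(workspace_root, max_age_days, now=now)
    for directory in stale:
        if dry_run:
            print(f"would remove {directory}")
        else:
            shutil.rmtree(directory)
            print(f"removed {directory}")
    return stale
```

A call such as `remove_stale_dirs("/path/to/workspace", max_age_days=183)` only prints what would be removed. Review that output first, then pass `dry_run=False` to actually delete the directories.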