KBEC-00446 - Some considerations on the Change-Tracking feature

Summary

Change-Tracking is a feature that lets you view specific changes made to objects inside the Flow system.
Recording these changes carries a performance overhead, and this article offers some perspective on that cost.

What is Change-Tracking?

The Change Tracking (CT) feature was added in v5.3 in 2014 to help identify when an object in the system was changed.  Here is the overview provided in the documentation:

https://docs.cloudbees.com/docs/cloudbees-flow/latest/change-tracking/

Similar to how SCM systems work, the idea behind CT was to capture a baseline copy of each object and then record every subsequent change, providing visibility into who changed something and when.  That can be a starting point for digging into why a change was made, although comments are not recorded with changes.

What are some of the concerns?

Collecting change information was a good idea in theory, but the implementation is arguably overbuilt: capturing every minor tweak can produce an excessive volume of data.  These risks were known when the feature was released in 2014, and the documentation describes them in these 2 sections:

https://docs.cloudbees.com/docs/cloudbees-flow/latest/change-tracking/ctperformance
https://docs.cloudbees.com/docs/cloudbees-flow/latest/change-tracking/ctbest-practices

To summarize these concerns:

A)  Activating CT for a project has a cost, because every existing object in the project must be written to the CT tables as its baseline.
B)  The cost of activating a project is incurred at system start time.
C)  The amount of churn in a project needs to be considered.  Projects with heavy churn or automated updates should not use CT.
D)  Projects may benefit from being split so that more stable data is kept in one location, which helps with CT costs.  (However, this can add costs due to “context-switching” at runtime.)
E)  If you store very high volumes of data under the “/server” area, there is an option to exclude that data from CT.
F)  Because of (A) and (B), turning CT on for any project on your production system must be done carefully, since the system “downtime” can increase.

How well does Change-Tracking work?

In practice, we have seen that customers often do not review the risks associated with this feature in advance, and so do not make informed decisions about their usage.  They then experience a performance drag later on, which often results in the customer simply turning the feature off.

The snapshot feature for pipelines currently depends on the CT feature.  As such, CT was set up to be active by default for any new project.  This also lets a project accumulate CT data organically as it grows, rather than incurring a “startup” cost when CT is activated later.  Since the CT feature was released, a number of customers who don’t use snapshots have ultimately decided to turn it off altogether.

While CT can provide perspective on smaller projects, it is typically less important to know who changed what on such projects, so there is essentially a “Catch-22” in wanting to use this feature.  On larger projects, where that visibility matters more, the CT feature tends to take too long to present its data to be usable in a general sense.

What is the future of Change-Tracking?

Given the known limitations and the usability issues reported by customers, our Engineering team intends to revisit the CT feature.  One consideration is that the scope of what the feature covers may need to be scaled back to reduce the volume of updates being collected.

It is also true that DSL did not exist when work on Change Tracking began.  In fact, DSL came about late in the CT work and was ultimately released in the subsequent v5.4 Feature Release in April of 2015.  DSL now also offers an overwrite option (v9.2), which allows code written on a separate system to be applied idempotently to the production system.  The best practice we now encourage is therefore for customers to do development work, with all its numerous iterations, against a DEV system (or perhaps still PROD), store that work in their SCM, and fully review any design choices before deploying those changes to their PROD Flow systems.  With that workflow, the need for Change Tracking inside the production system can be significantly reduced.

How does this affect my current system? 

Every customer makes different implementation choices for their own reasons and has a different usage model.  As such, it is not possible to know without experimentation what the ultimate impact of any alteration will be on your day-to-day use.  We do know that customers who have turned this feature off altogether have avoided some significant performance drags, as have those with very large projects who simply deactivated CT data collection for those projects.  We would expect similar benefits in your environment, although the extent of the improvement in day-to-day use is not easy to predict precisely.  That said, if you are unlikely to need snapshots in the near to medium term, changing these settings is recommended.  Doing so relieves your teams of a burden that will likely be removed anyhow in a future version, once the project to trim this feature is carried out.

What are the options for improving our system at this time?

1.  Start by reviewing your projects to determine which ones should have CT turned off.  Larger projects that you know are constantly churning are poor candidates for CT.  Turning CT off for them stops any further growth of CT data and should allow work involving these projects to proceed more efficiently.

2.  If you are not using snapshots at this time and do not see them in your future plans, system-wide deactivation is an option.  This only requires adding 1 line to your database.properties file and restarting the system.  However:

a.  Any later decision to re-activate a project will incur the one-time data creation cost of re-establishing the object baselines for each project that you re-activate.

b.  The cost of re-activating any project can be measured in advance using a test system.

3.  If you believe that the “/system” area is heavily used (churning) and could benefit from being removed from CT collection, you could de-activate data recording for this sub-system.  This doesn’t affect any project usage per se, but would help some regular operations perform more efficiently.  However:

a.  As in (2), any later decision to re-activate the /system data will incur the cost of establishing a baseline copy of every object at the /system level.  You can gauge this by setting up a test system, de-activating, and then re-activating to see how much time a re-activation requires.
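
The project review in option 1 can be sketched as a simple triage.  The Python below is only an illustration: the ProjectStats fields, the triage function, and the 500 changes-per-day threshold are all hypothetical, and the figures would have to come from whatever statistics you gather for your own system (nothing here calls a Flow API).  It flags high-churn projects as candidates for turning CT off, and keeps CT on for any project that relies on snapshots.

```python
from dataclasses import dataclass


@dataclass
class ProjectStats:
    """Hypothetical per-project figures gathered by the administrator."""
    name: str
    object_count: int       # objects that would need a CT baseline on re-activation
    changes_per_day: float  # observed or estimated rate of object modifications
    uses_snapshots: bool    # pipeline snapshots depend on CT


def triage(projects, max_changes_per_day=500.0):
    """Return (name, decision) pairs, highest churn first.

    max_changes_per_day is an arbitrary illustrative threshold, not a
    documented limit; choose one that fits your own system.
    """
    ranked = sorted(projects, key=lambda p: p.changes_per_day, reverse=True)
    decisions = []
    for p in ranked:
        if p.uses_snapshots:
            decision = "keep CT (snapshots depend on it)"
        elif p.changes_per_day >= max_changes_per_day:
            decision = "candidate to disable CT"
        else:
            decision = "keep CT"
        decisions.append((p.name, decision))
    return decisions


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        ProjectStats("small-tools", 100, 10.0, False),
        ProjectStats("nightly-automation", 50000, 2000.0, False),
        ProjectStats("release-pipelines", 2000, 900.0, True),
    ]
    for name, decision in triage(sample):
        print(f"{name}: {decision}")
```

The object_count field is not used in the decision itself; it is carried along because it indicates how expensive a later re-activation would be, as described in 2(a).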

Where can I find out about the different settings involved?

Details for enabling or disabling of CT in any area can be found on these 2 pages: 

https://docs.cloudbees.com/docs/cloudbees-flow/latest/change-tracking/change-tracking-config
https://docs.cloudbees.com/docs/cloudbees-flow/latest/change-tracking/ctperformance

Please file a ticket if you have further questions about this feature or would like additional clarification.

Applies to

  • Product versions: 5.4 releases of Flow or later
  • OS versions: Linux and Windows
