What are common questions I can answer with Run Insights of CD platform monitoring?
First off, here is a detailed article explaining the different metrics and widgets that are part of your Run Insights dashboard: https://go.cloudbees.com/docs/cloudbees-documentation/devoptics-user-guide/run_insights/
What questions can Run Insights answer, and how does it help identify areas for improvement?
- What is the current CD platform efficiency?
- When is the earliest time to restart the servers in your cluster with least impact?
- When is a good time for scheduled maintenance or upgrades?
- How to identify executor inefficiencies and increase performance?
- How can I cut down infrastructure costs?
- What is the infrastructure capacity and usage per node?
Official comment
- What is the current CD platform efficiency?
The activity bar gives live insight into the current usage and performance of your Jenkins CD platform. The following metrics in particular show current performance and its impact on your teams:
- Runs Waiting to Start: shows how many job/pipeline runs are currently in the queue waiting to start
- Average Time Waiting to Start: provides an estimate of how long before builds can start
- When is the earliest time to restart servers in your cluster with least impact?
Current Time to Idle in the activity bar at the top provides an estimate of how long it will be before all currently active jobs are completed. Paired with Runs Waiting to Start and the Activity per hour per day analysis below, it gives a good estimate of the expected load on the system over the next few hours, which helps you identify a time to restart servers in your cluster with the least possible impact.
- When is a good time for scheduled maintenance or upgrades?
The Activity per hour per day analysis gives visibility into historical usage and load, which helps you schedule a maintenance window or upgrades.
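As an illustration of how historical activity can guide a maintenance window, here is a minimal Python sketch. It assumes you have copied the activity values into a dictionary keyed by (weekday, hour); that data shape and the function name `quietest_window` are hypothetical and not part of Run Insights. Hours with no entry are treated as zero activity, which is also an assumption.

```python
from typing import Dict, Tuple

# Hypothetical data shape: (weekday, hour 0-23) -> average number of runs
# observed historically for that hour of that day.
Activity = Dict[Tuple[str, int], float]


def quietest_window(activity: Activity, window_hours: int = 2) -> Tuple[str, int, float]:
    """Return (weekday, start_hour, total_runs) for the quietest contiguous
    window that fits within a single day.

    Hours missing from `activity` count as 0.0 (an assumption of this sketch).
    """
    if not activity:
        raise ValueError("activity is empty")

    best = None
    for day in sorted({day for day, _ in activity}):
        for start in range(0, 24 - window_hours + 1):
            total = sum(activity.get((day, start + h), 0.0) for h in range(window_hours))
            # Keep the first window with the strictly lowest total.
            if best is None or total < best[2]:
                best = (day, start, total)
    return best
```

Calling `quietest_window(activity, window_hours=2)` returns the day and starting hour with the lowest combined historical activity. Treat the result as a starting point and confirm it against Current Time to Idle before acting.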
- How to identify executor inefficiencies and increase performance?
If the number of Idle Executors is high and Runs Waiting to Start is also high, the load on your cluster may be mismatched with its capacity, or the jobs/pipelines waiting to start may have restrictions that prevent them from running on the available executors.
- How can I cut down infrastructure costs?
Idle Executors shows how many executors are waiting to receive jobs. It also indicates the expected idle executor count for this time and day, so anomalies are easy to spot in real time. Too many idle executors indicates unnecessary infrastructure cost.
- What is the infrastructure capacity and usage per node?
The Runs per Node per Label tab helps you understand infrastructure capacity and usage.
- The Average Executors in Use column displays the average percentage of used executor capacity for each label
- The Average Queue Time column displays the average time that jobs spend in the queue for each given label over the selected time period
- The Average Queue Length column displays the average length of the job queue for each given label over the selected time period
A high number of executors in use combined with a high queue length suggests that demand is outstripping supply at certain points in the work day.
A high queue time or queue length combined with a low average number of executors in use may indicate that jobs are set up with label expressions (e.g. A && B) and that the number of nodes that actually satisfy them is low.
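The per-label reasoning above can be sketched as a small Python helper. This is only an illustration: the `LabelStats` class, the `diagnose` function, and all thresholds are hypothetical and should be tuned to your own cluster; none of them come from Run Insights itself.

```python
from dataclasses import dataclass


@dataclass
class LabelStats:
    """Values read from the Runs per Node per Label tab for one label."""
    label: str
    avg_executors_in_use_pct: float  # 0-100
    avg_queue_time_s: float          # kept for reporting; not used in the rules below
    avg_queue_length: float


def diagnose(stats: LabelStats,
             busy_pct: float = 80.0,    # hypothetical threshold
             idle_pct: float = 30.0,    # hypothetical threshold
             long_queue: float = 5.0    # hypothetical threshold
             ) -> str:
    """Map usage and queue length for a label to a rough diagnosis."""
    queued = stats.avg_queue_length >= long_queue
    if queued and stats.avg_executors_in_use_pct >= busy_pct:
        # Executors are busy and work is still piling up.
        return "under-provisioned"
    if queued and stats.avg_executors_in_use_pct <= idle_pct:
        # Work is waiting while executors sit idle: check label expressions.
        return "label-restricted"
    if not queued and stats.avg_executors_in_use_pct <= idle_pct:
        # Little work and mostly idle executors: possible cost saving.
        return "over-provisioned"
    return "ok"
```

For example, `diagnose(LabelStats("A && B", 10.0, 900.0, 8.0))` returns `"label-restricted"`, which matches the case where few nodes satisfy a label expression.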