KBEC-00504 - Testing considerations after an upgrade or install of CD

Problem

What are some sanity tests to help ensure our system is safely installed or upgraded?

Solution

The most commonly used sanity test for confirming that your CD system is set up correctly is to run a basic “helloworld” type job against a resource. Consider repeating this a few times: once against the local resource, and then again against one or more remote resources.
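
The same check can also be scripted from the command line once you are logged in with ectool. The following is a minimal sketch, not an official procedure: the function name run_smoke_job and the POLL_SECONDS variable are hypothetical, and it assumes that ectool getJobStatus returns XML containing <status> and <outcome> elements.

```shell
#!/bin/sh
# Sketch only: the project and procedure names are supplied by the caller.

# run_smoke_job PROJECT PROCEDURE [MAX_POLLS]
# Starts the procedure, then polls the job until it completes.
# Returns 0 if the job completed with outcome "success", 1 otherwise.
run_smoke_job() {
    project=$1
    procedure=$2
    max_polls=${3:-60}

    # runProcedure prints the ID of the job it launched.
    job_id=$(ectool runProcedure "$project" --procedureName "$procedure") || return 1

    i=0
    while [ "$i" -lt "$max_polls" ]; do
        # Assumption: the response contains <status>...</status> and <outcome>...</outcome>.
        status=$(ectool getJobStatus "$job_id") || return 1
        case $status in
            *"<status>completed</status>"*)
                case $status in
                    *"<outcome>success</outcome>"*) return 0 ;;
                    *) return 1 ;;
                esac
                ;;
        esac
        i=$((i + 1))
        # POLL_SECONDS is a hypothetical knob for the wait between polls.
        sleep "${POLL_SECONDS:-5}"
    done
    return 1   # job did not complete within the allowed number of polls
}
```

For example, `run_smoke_job "MyProject" "HelloWorld" && echo "smoke test passed"`, where the project and procedure names stand in for your own test job. Pointing the test procedure at a different resource each time covers the local and remote cases.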

Being able to run a simple job from the UI helps to confirm:
a) That you are able to log in, either with the default admin user or with your own account if you have configured LDAP/AD
b) That your webService is up and running
c) That your server connection to the DB is up and running
d) That your agents are up and available for use
e) That at least one workspace is available for use and able to store the resulting data
f) That the webServer is properly configured to access the workspace so that the results can be displayed

This is a good starting list, but it is not comprehensive, as the full solution can include other services and elements.
Below are some suggested explorations should any of the listed system elements not be functioning as expected. Keep in mind that a problem with one element can cause issues for other elements:

a) WebService status?

  • Perhaps the webService is up, but you are seeing an error like this:
Web server could not connect to the CD server at https://<address> . Perhaps it is not running or your service is configured incorrectly
  • Review the settings in the /conf folder for this webserver

  • Check on the webserver machine whether the webserver service is running, and start it if it is not:

/etc/init.d/commanderApache start
  • Is the agent on the webServer machine running? Its startup script is:

/opt/electriccloud/electriccommander/startup/ecmdrAgent
  • Check the status of the specific webserver agent from the CD server machine using:

ectool getResource <resourceName>
  • On a few occasions a customer has downloaded the installer and the file was corrupted in transfer, yet the installer still appeared to start cleanly. You may therefore want to download the installer again and re-install this element from scratch.

See: KBEC-00389-Checking status of installer when receiving an error with init tcl during installation

If this is an HA system, perhaps the Webserver is having issues with the Load Balancer in front of the Servers.

Also consider
- KBEC-00394-Steps-to-check-your-Apache-and-PHP-version
- KBEC-00386-Confirming your ElectricFlow server addresses
- KBEC-00281-Configuring Load Balancers in ElectricFlow Clusters
- KBEC-00041-ElectricFlow TCP port usage diagram and descriptions

b) Is your DB accessible?

  • Is the DB up and running?
  • Has the DB schema update completely finished?
  • Is your CD server configured with the correct username/password for this DB? (Check the database.properties file.)
  • Should you use the ecconfigure command to update any settings for the DB?
  • Is the DB version you have been using still compatible with the newer version of the CD server you have installed?

Also consider:
- KBEC-00437-Required changes after changing the hostname of the CloudBees Flow server
- KBEC-00390-Using ecconfigure to change hostnames post install
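
When reviewing database.properties, it can help to print only the active settings with any password value masked, so the output is safe to paste into a support ticket. This is a minimal sketch: show_db_settings is a hypothetical helper, the file path is passed in by the caller, and no particular key names are assumed beyond "a key whose name contains password".

```shell
#!/bin/sh
# show_db_settings FILE
# Prints the non-comment, non-blank lines of a properties file and
# masks the value of any key whose name contains "password".
show_db_settings() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    sed -e '/^[[:space:]]*#/d' \
        -e '/^[[:space:]]*$/d' \
        -e 's/^\([^=]*[Pp][Aa][Ss][Ss][Ww][Oo][Rr][Dd][^=]*\)=.*/\1=****/' "$1"
}
```

Call it with the path to the file in your server's conf folder, for example `show_db_settings /path/to/conf/database.properties`.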

c) Login test?

  • If you can’t access the WebServer, you can still confirm that you are able to log in to the system by using the API directly from the command line

From the server machine:

ectool login <username> <password>

From a remote machine:

ectool --server <FlowServer IP or hostname> login <username> <password>
  • If your username from LDAP/AD is not working, you can always try using your default administrator account to log in to the UI or the command line:
ectool login admin <password>
  • If you suspect the problem could be with the LDAP/AD configuration, check with your administrator on the settings you are using
    Also consider:
    KBEC-00413-Login access restriction
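
The login checks above can be combined into one small script that tries your own account first and falls back to the default admin account. This is a sketch under assumptions: try_login and login_with_fallback are hypothetical helper names, and it relies on ectool returning a non-zero exit code when the login fails.

```shell
#!/bin/sh
# try_login SERVER USER PASSWORD
# Returns 0 if ectool reports a successful login, non-zero otherwise.
try_login() {
    ectool --server "$1" login "$2" "$3" >/dev/null 2>&1
}

# login_with_fallback SERVER USER PASSWORD ADMIN_PASSWORD
# Tries the given user first, then the default admin account.
login_with_fallback() {
    if try_login "$1" "$2" "$3"; then
        echo "login ok: $2"
    elif try_login "$1" admin "$4"; then
        # The server is reachable, so the problem is specific to this user.
        echo "login ok: admin (check LDAP/AD settings for $2)"
    else
        echo "login failed for $2 and admin" >&2
        return 1
    fi
}
```

If only the admin login succeeds, the server itself is answering and the LDAP/AD configuration is the more likely culprit. Note that passwords passed as arguments may be visible to other users on the machine, so run this only on a trusted host.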

d) Are agents accessible?

  • Check on the UI to confirm that the agents you are trying to use with your job have a Green Status
  • If you believe the agent is up but it is not yet Green, the agent may need a ping to get activated, so you may want to try using:
ectool pingResource <resourceName>
  • If the hostname/IP address of the server (or of the load balancer in an HA setup) has changed, then a ping may be required

  • If you still encounter connectivity issues, check that your agents’ SSL settings match expectations.

  • Check the version of the agent in question. Upgrading agents is rarely a strict requirement for a server update, but for some use cases, or when working with some plugins, you may also need an updated agent version to ensure compatibility.
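
When several resources need checking after an upgrade, the ping can be looped over a list of names. The sketch below is illustrative only: resource_alive and report_resources are hypothetical helpers, and it assumes that the XML returned by ectool pingResource contains <alive>1</alive> for a reachable agent. Verify this against the output on your own system before relying on it.

```shell
#!/bin/sh
# resource_alive NAME
# Pings one resource and returns 0 if the response marks the agent as alive.
resource_alive() {
    # Assumption: a reachable agent is reported as <alive>1</alive>.
    ectool pingResource "$1" 2>/dev/null | grep -q '<alive>1</alive>'
}

# report_resources NAME...
# Prints one line per resource; returns 1 if any resource is not alive.
report_resources() {
    failed=0
    for name in "$@"; do
        if resource_alive "$name"; then
            echo "OK   $name"
        else
            echo "DOWN $name"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, `report_resources local webAgent buildAgent1`, where the resource names are placeholders for your own.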

e) Is the workspace involved actually available on the host machines?

  • Check your configuration settings to make sure you are pointing to the right place
  • Confirm that the resource using this workspace has the folder set up with appropriate permissions for the user that is running the Agent service
  • Confirm that the workspace is mounted if this is relying on a shared space
  • See KBEC-00266-Using an externally mounted drive to store Commander logs

Also consider:
KBEC-00336-Workspace Issue Agent returned error 404
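
The folder and permission checks above can be approximated with a short script run on the agent machine as the user that runs the Agent service. This is a sketch: check_workspace_dir is a hypothetical helper, and the workspace path is whatever your workspace definition points to on that host.

```shell
#!/bin/sh
# check_workspace_dir DIR
# Confirms that DIR exists and that the current user can create a file in it.
check_workspace_dir() {
    dir=$1
    [ -d "$dir" ] || { echo "missing: $dir"; return 1; }

    # Create and remove a small probe file to test write access.
    probe="$dir/.ws_probe.$$"
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        echo "writable: $dir"
    else
        echo "not writable: $dir"
        return 1
    fi
}
```

A "missing" result on a shared workspace often means the share is not mounted on that host, which is worth checking before adjusting permissions.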

f) Webserver agent access to the workspace?

  • If this is not properly configured, you won’t be able to view the log output for a jobStep from the UI
  • Confirm inside the actual workspace folders that log output was indeed generated for your test job
  • As with (e), confirm that the webServer agent user has the proper permissions to access this folder
  • As with (e), confirm that the workspace is properly mounted if relying on a shared space
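
To confirm that log output exists and is readable, a script like the following can be run on the webServer machine as the webServer agent user. It is a sketch under assumptions: list_job_logs is a hypothetical helper, and it assumes that the job's step logs are files ending in .log inside a per-job directory under the workspace root.

```shell
#!/bin/sh
# list_job_logs WORKSPACE_DIR JOB_DIR_NAME
# Lists the .log files in a job's workspace directory and reports whether
# the current user can read each one.
list_job_logs() {
    job_dir="$1/$2"
    [ -d "$job_dir" ] || { echo "no job directory: $job_dir" >&2; return 1; }

    found=0
    for f in "$job_dir"/*.log; do
        [ -e "$f" ] || continue   # the glob matched nothing
        found=1
        if [ -r "$f" ]; then
            echo "readable   $f"
        else
            echo "unreadable $f"
        fi
    done
    [ "$found" -eq 1 ] || { echo "no .log files in $job_dir" >&2; return 1; }
}
```

If the files are listed as readable here but the UI still cannot display them, the workspace path configured for the webServer agent is the next thing to compare.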

g) Plugins

  • Run a job that relies on a plugin that was updated during the upgrade to confirm that all necessary pieces are in place

h) Repository Server

  • Run a job that tries to retrieve a previously existing artifact

i) Devops Insight / CloudBees Analytics

  • Confirm that dashboards can be viewed
  • Run a test job or pipeline to confirm that associated dashboards properly reflect the new executions

j) Credentials

  • Run a job that relies on credentials to confirm that no issues with any credential change will block use

e.g. When upgrading to v9.2 or higher from an earlier version, the older 128-bit passkey file is replaced with a new 256-bit passkey, which updates all existing credentials.
- Testing an upgrade of your production system requires that the passkey file from the production system be used on the test system; otherwise, credentials in the DB will be changed. This may not be a concern depending on the types of tests you intend to perform, but it is something to consider.
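
Before starting a test upgrade, you can confirm that the test system is using a byte-identical copy of the production passkey file. A minimal sketch, assuming you have copied the production file to the test machine; same_passkey is a hypothetical helper and both paths are supplied by the caller.

```shell
#!/bin/sh
# same_passkey PROD_COPY TEST_FILE
# Compares two passkey files byte for byte.
# Returns 0 if identical, 1 if different, 2 if either file is unreadable.
same_passkey() {
    [ -r "$1" ] && [ -r "$2" ] || { echo "cannot read one of the files" >&2; return 2; }
    if cmp -s "$1" "$2"; then
        echo "passkey files match"
    else
        echo "passkey files differ"
        return 1
    fi
}
```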
