CJP Performance Best Practices for Linux

Poor performance of a Jenkins instance is frequently caused by misconfiguration or by not following good practices.

This document aims to help you avoid the simplest and most frequent performance issues seen in enterprise environments.

The article is divided into best practices that should be applied at the infrastructure level and at the application level.

Infrastructure

Open files and new processes

Jenkins usually keeps more files open than the default limits configured in almost any Linux distribution allow. After a migration or a fresh installation of a Jenkins instance, this is usually the first issue you will face, because Jenkins is not able to open as many files as it requires.

The stack trace shown in the Jenkins logs when this happens is:

Caused by: java.io.IOException: Too many open files
	at java.io.UnixFileSystem.createFileExclusively(Native Method)
	at java.io.File.createNewFile(File.java:1006)
	at java.io.File.createTempFile(File.java:1989)

To see the current limits in the OS, use the command ulimit -a. Typical default values look like this:

max user processes              (-u) 1024
open files                      (-n) 1024

The recommended values below should be set in /etc/security/limits.conf.

jenkins      soft   nofile  4096
jenkins      hard   nofile  8192
jenkins      soft   nproc   30654        
jenkins      hard   nproc   30654
  • Note that this assumes jenkins is the Unix user running the Jenkins process. If you’re running JOC, the user is probably jenkins-oc.
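To confirm that the new limits actually apply to the running process, one option is to read them from /proc/<pid>/limits. The following is a minimal sketch, assuming a Linux /proc filesystem; the helper names soft_nofile and nofile_ok are hypothetical and not part of any Jenkins tooling.

```shell
#!/bin/sh
# Hypothetical helper: print the soft "Max open files" limit found in a
# /proc/<pid>/limits-style file passed as the first argument.
soft_nofile() {
  # The relevant line looks like: "Max open files   4096   8192   files"
  awk '/^Max open files/ { print $4 }' "$1"
}

# Hypothetical helper: succeed when the soft limit in file $1 is at least
# the minimum given in $2 (4096 is the soft value recommended above).
nofile_ok() {
  limit=$(soft_nofile "$1")
  if [ "$limit" = "unlimited" ]; then
    return 0
  fi
  [ -n "$limit" ] && [ "$limit" -ge "$2" ]
}

# Example (assumes pgrep is available and the JVM runs as user "jenkins"):
#   nofile_ok "/proc/$(pgrep -o -u jenkins java)/limits" 4096 && echo "limit is high enough"
```

Reading the limits of the live process, rather than running ulimit in an interactive shell, avoids being misled by a session whose limits differ from those of the service.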

You can find detailed information about this problem in our KB article Too many open files.

Huge pages

Some Linux distributions have Transparent Huge Pages (THP) enabled, which is known to cause performance issues with Java workloads on big servers. In case you would like more background on this, take a look at this CentOS issue: http://bugs.centos.org/view.php?id=5716 and this JDK issue: https://bugs.openjdk.java.net/browse/JDK-8024838.

The recommendation for CJP is to disable THP. To do so, run this command as root (the path below is the one used by RHEL/CentOS 6; most other distributions use /sys/kernel/mm/transparent_hugepage/enabled instead):

echo "never" > /sys/kernel/mm/redhat_transparent_hugepage/enabled

Detailed information about disabling THP can be found in the RHEL KB: https://access.redhat.com/solutions/46111
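To check which THP mode is currently active, you can read the same sysfs file: the kernel marks the active mode with square brackets, for example "always madvise [never]". The sketch below assumes one of the two common sysfs locations exists; the helper names thp_mode and report_thp are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the active THP mode from an "enabled" file,
# i.e. the word enclosed in square brackets.
thp_mode() {
  sed -n 's/.*\[\([a-z]*\)].*/\1/p' "$1"
}

# Hypothetical helper: look at both commonly used sysfs locations and report
# the first one that is readable. Returns non-zero if neither is present.
report_thp() {
  for f in /sys/kernel/mm/transparent_hugepage/enabled \
           /sys/kernel/mm/redhat_transparent_hugepage/enabled; do
    if [ -r "$f" ]; then
      echo "$f: $(thp_mode "$f")"
      return 0
    fi
  done
  return 1
}

# Example: report_thp
```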

$JENKINS_HOME on a shared location through NFS or similar

This is perhaps the most frequent mistake made by Jenkins administrators, and it has a significant performance impact on the instance. When $JENKINS_HOME is on a shared filesystem, an important optimization is to make sure the .war file is not extracted to the $JENKINS_HOME/war directory, so that the application's read operations do not go through the shared location.

Some configurations may do this by default, but .war extraction can easily be redirected to a local cache (ideally SSD for better Jenkins core I/O) on the container/VM’s local filesystem with the JENKINS_ARGS options --webroot=$LOCAL_FILESYSTEM/war --pluginroot=$LOCAL_FILESYSTEM/plugins. For example, on Debian installations, where $NAME refers to the name of the Jenkins instance: --webroot=/var/cache/$NAME/war --pluginroot=/var/cache/$NAME/plugins.

  • Note (if Jenkins is running in a web container): The --pluginroot and --webroot options are specific to Winstone. The alternative to --pluginroot is to add the system property -Dhudson.PluginManager.workDir=$LOCAL_FILESYSTEM/plugins. There is no need for an alternative to --webroot since the .war is extracted in a local directory of the container manager. For example in Tomcat, if the application name is jenkins the .war is extracted under $CATALINA_HOME/webapps/jenkins.

  • Note: The --pluginroot option and the -Dhudson.PluginManager.workDir system property are only supported since jenkins-1.649. If either is added to an older Jenkins version, Jenkins might not be able to start.
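Because an unsupported option can prevent startup, it may be worth guarding --pluginroot behind a version check when the arguments are generated by a script. The following is a minimal sketch; the helper names are hypothetical, and the version comparison relies on sort -V, which is a GNU coreutils extension.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the Jenkins version in $1 is 1.649 or
# newer, the first release that understands --pluginroot (per the note above).
supports_pluginroot() {
  oldest=$(printf '%s\n' "1.649" "$1" | sort -V | head -n 1)
  [ "$oldest" = "1.649" ]
}

# Hypothetical helper: build the JENKINS_ARGS value for a local cache
# directory ($1) and a Jenkins version ($2).
local_cache_args() {
  args="--webroot=$1/war"
  if supports_pluginroot "$2"; then
    args="$args --pluginroot=$1/plugins"
  fi
  printf '%s\n' "$args"
}

# Example: JENKINS_ARGS="$(local_cache_args /var/cache/jenkins 2.60.3)"
```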

$JENKINS_HOME is read intensively during startup. If bandwidth to your shared storage is limited, startup performance is where you will see the most impact. High latency causes a similar issue, but it can be mitigated somewhat by raising the startup concurrency with the system property -Djenkins.InitReactorRunner.concurrency=8.
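Where the system property goes depends on how Jenkins is launched. As an illustration only, the fragment below assumes a Debian-style defaults file in which the JVM options are held in a JAVA_ARGS variable; other packages and containers use different variable names.

```shell
# Sketch of a service defaults fragment. JAVA_ARGS is an assumption: it is the
# variable read by the Debian package; other installs may use another name.
JAVA_ARGS="${JAVA_ARGS:-} -Djenkins.InitReactorRunner.concurrency=8"
```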

NFS

Use NFSv3, or NFSv4.1 or later. NFSv4.0 is NOT recommended due to known performance problems. Please follow the NFS best practices described in our KB.

Application

Master build executors

Never run builds on the master node, as they would compete for resources with the Jenkins master itself. Go to Manage Jenkins -> Manage Nodes -> master and:

  • Set # of executors to 0
  • Change the Usage strategy to Only build jobs with label expressions matching this node

Memory

Heap memory

It is recommended to start Jenkins with at least the -Xmx parameter set, so that the heap size is not left to the default chosen by the JDK at application start. The typical error messages you will see in the Jenkins logs when the heap is too small are the ones below.

java.lang.OutOfMemoryError
java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.OutOfMemoryError: Java heap space

The first step is to increase the memory available to the JVM. To increase the heap size, add or update the -Xms and -Xmx parameters in the JVM command-line arguments.

If your instance needs more than -Xmx10g, one of the following is probably happening:

  • Factors other than the heap size are not being taken into consideration to improve performance. Please keep reading this article.
  • Your instance is not able to support its current workload (number of jobs and configuration), so you should consider scaling your infrastructure horizontally (more masters) to divide this workload more efficiently.

Garbage collector

The right strategy for the GC will make your masters more responsive and stable, especially with large heap sizes.

G1GC (-XX:+UseG1GC) should be used for 4GB or larger Java heaps on multi-core systems, and Concurrent Mark Sweep (CMS) (-XX:+UseConcMarkSweepGC) should be used for 2GB - 4GB Java heaps on multi-core systems.
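The rule above can be written down as a small helper when JVM options are generated by a provisioning script. This is only a sketch: the function names are hypothetical, heap sizes are taken in whole gigabytes, and setting -Xms equal to -Xmx is an assumption of the example rather than a recommendation from this article.

```shell
#!/bin/sh
# Hypothetical helper: print the collector flag suggested above for a heap
# size given in whole gigabytes. Heaps under 2 GB are not covered by the
# guidance, so nothing is printed for them.
gc_flag() {
  if [ "$1" -ge 4 ]; then
    echo "-XX:+UseG1GC"
  elif [ "$1" -ge 2 ]; then
    echo "-XX:+UseConcMarkSweepGC"
  fi
}

# Hypothetical helper: combine heap and collector flags for a heap of $1 GB.
# Using the same value for -Xms and -Xmx is an assumption of this sketch.
jvm_flags() {
  flags="-Xms${1}g -Xmx${1}g"
  gc=$(gc_flag "$1")
  if [ -n "$gc" ]; then
    flags="$flags $gc"
  fi
  printf '%s\n' "$flags"
}

# Example: jvm_flags 8
```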

A detailed explanation of the importance of the GC in the Jenkins context can be found in the blog post Joining the Big Leagues: Tuning Jenkins GC For Responsiveness and Stability.

A summary of that blog post is available in Prepare Jenkins for support, under section B. Java Parameters.

Triggers

Whenever possible, use webhooks as explained in [SCM Best Practices][].

If for any reason you cannot move away from SCM polling, limit concurrent SCM polling to no more than 10 under Manage Jenkins -> Configure System [SCM Polling -> Max # of concurrent polling].

The usual problem is that Jenkins users create overly aggressive polling schedules such as * * * * * (poll every minute).
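When reviewing a list of polling specs, the every-minute schedule is easy to flag automatically. The helper below is a minimal sketch with a hypothetical name; it only handles single-line specs and only recognizes the literal five-star schedule, not equivalent forms.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a polling spec is the literal
# "every minute" schedule, ignoring extra whitespace between fields.
is_every_minute() {
  # awk re-joins the fields with single spaces and drops leading/trailing blanks.
  spec=$(printf '%s\n' "$1" | awk '{ $1 = $1; print }')
  [ "$spec" = "* * * * *" ]
}

# Example: is_every_minute '* * * * *' && echo "polls every minute"
```

A less aggressive spec, for instance H/15 * * * * in Jenkins' hashed schedule syntax, would not be flagged by this check.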

The script below can be run under Manage Jenkins -> Script Console to print the SCM polling spec of every job configured in the instance.

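import hudson.triggers.*;
// Requires the Maven Integration plugin; drop this import and the Maven section below if it is not installed.
import hudson.maven.MavenModuleSet;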

println("--- SCM Polling for FreeStyle jobs ---");
List<FreeStyleProject> freeStyleProjectList = Jenkins.getInstance().getAllItems(FreeStyleProject.class);
for (FreeStyleProject freeStyleProject : freeStyleProjectList) {
  SCMTrigger scmTrigger = freeStyleProject.getSCMTrigger();
  if (scmTrigger!= null) {
    String spec = scmTrigger.getSpec();
    if (spec != null) {
      println(freeStyleProject.getFullName() + " with spec " + spec);
    }
  }
}

println("--- SCM Polling for Maven jobs ---");
List<MavenModuleSet> mavenModuleSetList = Jenkins.getInstance().getAllItems(MavenModuleSet.class);
for (MavenModuleSet mavenModuleSet : mavenModuleSetList) {
  SCMTrigger scmTrigger = mavenModuleSet.getTrigger(SCMTrigger.class);
  if (scmTrigger!= null) {
    String spec = scmTrigger.getSpec();
    if (spec != null) {
      println(mavenModuleSet.getFullName() + " with spec " + spec);
    }
  }
}

JobConfigHistory

JobConfigHistory is one of the most widely used plugins in enterprise environments. However, it is often poorly configured, which causes significant performance issues.

  • Use different history directory than default: set this especially when $JENKINS_HOME is on NFS or a similar shared filesystem.
  • Max number of history entries to keep: set a limit so that the option Do not save duplicate history does not slow down the instance while it looks for duplicates.
  • Max number of days to keep history entries: set a limit for the same reason, so that the option Do not save duplicate history has less history to search for duplicates.
  • System configuration exclude file pattern: consider adding |com.cloudbees.opscenter.client.plugin.OperationsCenterCredentialsProvider so that the product's own activity does not cause slowness.

You can find detailed information in the KB article JobConfigHistory Plugin Best Practices.

Folder plugin

Since version 5.14 (December 5, 2016) of the CloudBees Folders Plugin, folder health is cached more effectively. However, on big instances we still recommend disabling the weather column for better performance.

REST API

It is very common to see instances receiving a huge number of REST API calls without the Jenkins administrators noticing.

The number of REST API calls received can be monitored with the Monitoring plugin at http/s:///monitoring. The Jetty or Tomcat access logs can also indicate whether or not this is a problem.
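As an illustration of what the access logs can reveal, the sketch below counts REST API requests per client address. It assumes the log is in NCSA common or combined format (client address in field 1, request path in field 7) and treats any path containing /api/json, /api/xml or /api/python as a REST API call; the helper name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count REST API requests per client address in the
# access log given as $1, busiest client first.
# Assumes NCSA common/combined format: field 1 = client, field 7 = path.
api_calls_by_client() {
  awk '$7 ~ /\/api\/(json|xml|python)/ { n[$1]++ }
       END { for (ip in n) print n[ip], ip }' "$1" | sort -rn
}

# Example: api_calls_by_client /path/to/access.log | head
```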

To block all REST API calls, you can use the CloudBees enterprise Request Filter plugin.

If your business relies on the REST API, follow the recommendations in Best Practice For Using Jenkins REST API.

Suffering performance issues?

Please file a support ticket on the CloudBees support platform and proactively attach the information requested in Required Data: High CPU On Linux.

[SCM Best Practices]:
