I have run the script listDProcessesNativeStacks.sh and shared the data with CloudBees Support.
What is this script, and how does it help us understand what is causing my master to hang?
- Linux environment
- CloudBees Core
- CloudBees Core on modern cloud platforms - Managed Master
- CloudBees Core on modern cloud platforms - Operations Center
- CloudBees Core on traditional platforms - Client Master
- CloudBees Core on traditional platforms - Operations Center
- CloudBees Jenkins Enterprise
- CloudBees Jenkins Enterprise - Managed Master
- CloudBees Jenkins Enterprise - Operations Center
- CloudBees Jenkins Team
- CloudBees Jenkins Platform - Client Master
- CloudBees Jenkins Platform - Operations Center
- Jenkins LTS
The script listDProcessesNativeStacks.sh uses a combination of tools, including cat, to identify processes in a D state and dump their native stacks.
It is usually used together with the output of jenkinshangWithJstack.sh to help us identify exactly what the native thread of a process is doing.
What is a D state process
It is a process that is in an uninterruptible sleep. Usually this means that the process is waiting on I/O.
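As a rough illustration of the idea, D state processes can be found by reading the State: line of /proc/<pid>/status, and their kernel stacks read from /proc/<pid>/stack. The sketch below is not the source of listDProcessesNativeStacks.sh; the function names and the PROC_ROOT override are hypothetical and exist only to make the example easy to try.

```shell
#!/bin/sh
# Illustrative sketch only; this is NOT the source of listDProcessesNativeStacks.sh.
# PROC_ROOT is a hypothetical override used to point the functions at a copy of /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Print the one-letter state (R, S, D, Z, ...) of the given PID,
# taken from the "State:" line of <PROC_ROOT>/<pid>/status.
proc_state() {
    status_file="$PROC_ROOT/$1/status"
    [ -r "$status_file" ] || return 1
    while read -r key value rest; do
        if [ "$key" = "State:" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done < "$status_file"
    return 1
}

# Print the PID of every process currently in state D (uninterruptible sleep).
list_d_pids() {
    for dir in "$PROC_ROOT"/[0-9]*; do
        [ -d "$dir" ] || continue
        pid="${dir##*/}"
        if [ "$(proc_state "$pid")" = "D" ]; then
            printf '%s\n' "$pid"
        fi
    done
}

# For each D state PID, print a header followed by its kernel stack.
# Reading <pid>/stack normally requires root.
dump_d_stacks() {
    for pid in $(list_d_pids); do
        printf '=== PID %s ===\n' "$pid"
        cat "$PROC_ROOT/$pid/stack" 2>/dev/null || printf '(stack not readable)\n'
    done
}
```

After sourcing these definitions as root, calling dump_d_stacks prints one section per D state process. A process can leave the D state between the two reads, so an empty or unreadable stack is not necessarily an error.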
What does this script bring to jenkinshangWithJstack
With the jenkinshangWithJstack.sh script, we only have a Java view of the stack, so we cannot see what is happening at the OS level. For instance, from the following stack we can only infer that the JVM is trying to write something; we have no idea what is happening at a lower level:
"Executor #-1 for master : executing myJob #11" Id=1305944 Group=main RUNNABLE (in native)
    at sun.nio.ch.FileDispatcherImpl.pwrite0(Native Method)
    at sun.nio.ch.FileDispatcherImpl.pwrite(FileDispatcherImpl.java:66)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:89)
    at sun.nio.ch.IOUtil.write(IOUtil.java:51)
All we can say is that the JVM is waiting on a native I/O operation. Now, running the listDProcessesNativeStacks.sh in
this context, we can extract more information:
jenkins+ 10656 4238440 2146620 ? D 00:00:00
[<ffffffff81168d1e>] sleep_on_page+0xe/0x20
[<ffffffff81168aa6>] wait_on_page_bit+0x86/0xb0
[<ffffffff81168be1>] filemap_fdatawait_range+0x111/0x1b0
[<ffffffff8116abff>] filemap_write_and_wait_range+0x3f/0x70
[<ffffffffa0422c7e>] nfs_file_fsync+0x7e/0x100 [nfs]
[<ffffffff8120ff8b>] vfs_fsync+0x2b/0x40
[<ffffffffa0422f0a>] nfs_file_flush+0x7a/0xb0 [nfs]
[<ffffffff811dc9f4>] filp_close+0x34/0x80
[<ffffffff811fd348>] __close_fd+0x78/0xa0
[<ffffffff811de103>] SyS_close+0x23/0x50
[<ffffffff81646d52>] tracesys+0xdd/0xe2
[<ffffffffffffffff>] 0xffffffffffffffff
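The native stack shows the process closing a file (SyS_close) and blocking in nfs_file_fsync while it waits for pages to be written back, so we can now start investigating the NFS.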
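When many stacks have been collected, a quick way to see which kernel modules are involved is to pull out the trailing [module] tag that some frames carry, such as [nfs] above. The helper below is a minimal sketch; the name stack_modules is hypothetical, and it assumes one frame per line on standard input.

```shell
#!/bin/sh
# Illustrative helper: read kernel stack frames on stdin (one per line) and
# print each distinct module tag, i.e. the trailing "[name]" on frames such as
#   [<ffffffffa0422c7e>] nfs_file_fsync+0x7e/0x100 [nfs]
# Frames without a trailing tag belong to the core kernel and are skipped.
stack_modules() {
    sed -n 's/.*[[:space:]]\[\([A-Za-z0-9_]*\)][[:space:]]*$/\1/p' | sort -u
}
```

Feeding it the frames of one captured stack lists the modules that appear in that stack, which indicates the subsystem to investigate first.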
But how exactly can you use this?
The script is simple to use. It is designed to work on any Linux system with
cat (even with the busybox variant).
You need to run it with sudo or as the root user. It takes no parameters.
You can set the output directory with the D_PROCESSES_OUTPUT_DIR environment variable.
If you run it with sudo, make sure to pass the environment variable through to the script with the -E switch, e.g.:
export D_PROCESSES_OUTPUT_DIR=/tmp
sudo -E ./listDProcessesNativeStacks.sh
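Before launching the script with a custom output directory, it can help to confirm that the variable is set and points to a writable directory. The check below is a small sketch and not part of the script; check_output_dir and its return codes are hypothetical.

```shell
#!/bin/sh
# Pre-flight check (illustrative, not part of listDProcessesNativeStacks.sh):
# verify that D_PROCESSES_OUTPUT_DIR is set and names a writable directory.
# Returns 0 if usable, 1 if not a writable directory, 2 if the variable is unset.
check_output_dir() {
    if [ -z "${D_PROCESSES_OUTPUT_DIR:-}" ]; then
        echo "D_PROCESSES_OUTPUT_DIR is not set" >&2
        return 2
    fi
    if [ -d "$D_PROCESSES_OUTPUT_DIR" ] && [ -w "$D_PROCESSES_OUTPUT_DIR" ]; then
        echo "output directory OK: $D_PROCESSES_OUTPUT_DIR"
        return 0
    fi
    echo "not a writable directory: $D_PROCESSES_OUTPUT_DIR" >&2
    return 1
}
```

With the function defined in the current shell and the variable exported, it can be chained as check_output_dir && sudo -E ./listDProcessesNativeStacks.sh, so the script only starts when the directory is usable.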
Make sure to attach the output of the script to the Support Ticket.