Elasticsearch NoShardAvailableActionException

Issue

Accessing the Analytics Dashboard fails with a red error banner and a stack trace such as:

Courier Fetch Error: unhandled error Error: Request to Elasticsearch failed: {"_index":"kibana-4-cloudbees","_type":"config","_id":"4.1.2","error":"NoShardAvailableActionException[[kibana-4-cloudbees][2] null]"}
at http://operations-center-server/plugin/operations-center-analytics-viewer/index.js?_b=7562:43092:38
at Function.Promise.try (http://operations-center-server/plugin/operations-center-analytics-viewer/index.js?_b=7562:46435:26)
at http://operations-center-server/plugin/operations-center-analytics-viewer/index.js?_b=7562:46413:27
at Array.map (native)
at Function.Promise.map (http://operations-center-server/plugin/operations-center-analytics-viewer/index.js?_b=7562:46412:30)
at callResponseHandlers (http://operations-center-server/plugin/operations-center-analytics-viewer/index.js?_b=7562:43064:22)
at http://operations-center-server/plugin/operations-center-analytics-viewer/index.js?_b=7562:43182:16
at wrappedCallback (http://operations-center-server/plugin/operations-center-analytics-viewer/index.js?_b=7562:20893:81)
at wrappedCallback (http://operations-center-server/plugin/operations-center-analytics-viewer/index.js?_b=7562:20893:81)
at http://operations-center-server/plugin/operations-center-analytics-viewer/index.js?_b=7562:20979:26

This article uses the kibana-4-cloudbees Lucene index as its example, but the exception can affect other indexes too. Take note of the name of the corrupted index, which appears in the error message:

{"_index":"kibana-4-cloudbees","_type":"config","_id":"4.1.2",
"error":"NoShardAvailableActionException[[kibana-4-cloudbees][2] null]"}

To resolve the issue for another index, apply the resolution below and replace the index name accordingly.
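As an illustration, the index name and shard number can be pulled out of the error payload programmatically. This is a minimal sketch that assumes the payload has the same shape as the example above; the function name parse_no_shard_error is hypothetical and not part of any CloudBees or Elasticsearch tooling.

```python
import json
import re

# Matches e.g. "NoShardAvailableActionException[[kibana-4-cloudbees][2] null]"
_PATTERN = re.compile(
    r"NoShardAvailableActionException\[\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\]"
)


def parse_no_shard_error(payload):
    """Return (index_name, shard_number) from the JSON error payload, or None."""
    doc = json.loads(payload)
    match = _PATTERN.search(doc.get("error", ""))
    if match is None:
        return None
    return match.group("index"), int(match.group("shard"))


# Payload copied from the error shown in this article.
payload = (
    '{"_index":"kibana-4-cloudbees","_type":"config","_id":"4.1.2",'
    '"error":"NoShardAvailableActionException[[kibana-4-cloudbees][2] null]"}'
)
print(parse_no_shard_error(payload))  # ('kibana-4-cloudbees', 2)
```

The first element of the result is the index name to use in the resolution steps.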

Environment

  • CloudBees Jenkins Operations Center
  • Remote Elasticsearch / Kibana

Resolution

The kibana-4-cloudbees Lucene index, which stores the Analytics Dashboard data (metrics, configurations…), is corrupted. Many issues can cause this kind of index corruption, but the usual cause is poor performance of the Elasticsearch service.

How to get a fresh index

kibana-4-cloudbees contains the Analytics Dashboard configuration, so if you don’t have a backup, that configuration is lost. You can get a fresh Kibana index by following the steps below (all custom dashboards will be removed):

  1. Stop CloudBees Jenkins Operations Center
  2. Stop Elasticsearch
  3. Back up and remove the content of the kibana-4-cloudbees index folder, which is usually located at /var/lib/elasticsearch/nodes/number_of_nodes_you_have_defined/indices/kibana-4-cloudbees.
  4. Start Elasticsearch
  5. Start CloudBees Jenkins Operations Center (this regenerates the index with the default dashboard configuration)
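Step 3 above could be scripted along these lines. This is only a sketch under assumptions: the helper name backup_and_clear_index and the backup location are hypothetical, the real index path depends on your installation, and the script must only be run while Elasticsearch is stopped.

```python
import shutil
from pathlib import Path


def backup_and_clear_index(index_dir, backup_dir):
    """Archive index_dir into backup_dir, then remove the folder's contents.

    Returns the path of the .tar.gz archive that was written.
    """
    index_dir, backup_dir = Path(index_dir), Path(backup_dir)
    if not index_dir.is_dir():
        raise FileNotFoundError(f"index folder not found: {index_dir}")
    backup_dir.mkdir(parents=True, exist_ok=True)

    # make_archive appends ".tar.gz" to the base name and returns the full path.
    archive = shutil.make_archive(
        str(backup_dir / index_dir.name),
        "gztar",
        root_dir=str(index_dir.parent),
        base_dir=index_dir.name,
    )

    # Remove the contents only after the archive has been written successfully.
    for entry in index_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    return Path(archive)


# Example call (paths are placeholders, adjust them to your installation):
# backup_and_clear_index("<path to the kibana-4-cloudbees index folder>", "/tmp/es-backup")
```

The archive is written before anything is deleted, so a failure during the backup leaves the index folder untouched.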

ES basic-config recommendations

By default, Elasticsearch starts with 1G of heap and mlockall set to false. The recommended configuration is 2G of heap and, in many cases, mlockall set to true. Evaluate the mlockall change before applying it: it may impact the infrastructure, and swapping is not harmful on many platforms.

Give Half Your Memory to Lucene

With the default memory configuration, your Elasticsearch instance has just 1G of heap, but Elastic recommends following its "Give Half Your Memory to Lucene" guidance. In /etc/sysconfig/elasticsearch, change the #ES_HEAP_SIZE entry to an uncommented ES_HEAP_SIZE=2g. 2G of heap is a minimal recommended configuration and is not enough for big instances.
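To confirm that the setting is actually active and not still commented out, the file contents can be checked with a small script. This is a minimal sketch; the function name effective_heap_size is hypothetical, and it assumes the file uses plain KEY=value lines as in the example above.

```python
import re


def effective_heap_size(sysconfig_text):
    """Return the active ES_HEAP_SIZE value, or None if unset or commented out."""
    value = None
    for line in sysconfig_text.splitlines():
        # Commented lines start with '#', so they never match this pattern.
        match = re.match(r"\s*(?:export\s+)?ES_HEAP_SIZE=(\S+)", line)
        if match:
            value = match.group(1).strip("\"'")  # last active assignment wins
    return value


print(effective_heap_size("#ES_HEAP_SIZE=2g"))  # None
print(effective_heap_size("ES_HEAP_SIZE=2g"))   # 2g
```

On a real host, pass in the text read from /etc/sysconfig/elasticsearch; a result of None means the default heap size still applies.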

Swapping Is the Death of Performance

Swapping deserves attention because, as the Elastic documentation puts it, "Swapping Is the Death of Performance"; review the Elastic documentation on this topic.

Remember that swap exists for a reason, and disabling it may impact the rest of the infrastructure. Swapping is not harmful on many platforms.
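If you do enable mlockall, it is worth verifying that every node reports it. The sketch below assumes the nodes info API (GET /_nodes/process) returns a process.mlockall flag per node, as it does in the Elasticsearch 1.x line; the function name nodes_without_mlockall is hypothetical and the sample response is trimmed and made up for illustration.

```python
import json


def nodes_without_mlockall(nodes_process_json):
    """Return the names of nodes whose process info does not report mlockall as true."""
    doc = json.loads(nodes_process_json)
    return sorted(
        node.get("name", node_id)
        for node_id, node in doc.get("nodes", {}).items()
        if not node.get("process", {}).get("mlockall", False)
    )


# Illustrative response only; node ids and names are invented.
sample = (
    '{"nodes": {'
    '"a1": {"name": "es-1", "process": {"mlockall": false}}, '
    '"b2": {"name": "es-2", "process": {"mlockall": true}}}}'
)
print(nodes_without_mlockall(sample))  # ['es-1']
```

An empty list means all nodes in the response have their memory locked.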
