Agent-skills elasticsearch-security-troubleshooting

install
source · Clone the upstream repo
git clone https://github.com/elastic/agent-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/elastic/agent-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/elasticsearch/elasticsearch-security-troubleshooting" ~/.claude/skills/elastic-agent-skills-elasticsearch-security-troubleshooting && rm -rf "$T"
manifest: skills/elasticsearch/elasticsearch-security-troubleshooting/SKILL.md
source content

Elasticsearch Security Troubleshooting

Diagnose and resolve common Elasticsearch security issues. This skill provides a structured triage workflow for authentication failures, authorization errors, TLS problems, API key issues, role mapping mismatches, Kibana login failures, and license-expiry lockouts.

For authentication methods and API key management, see the elasticsearch-authn skill. For roles, users, and role mappings, see the elasticsearch-authz skill. For license management, see the elasticsearch-license skill.

For diagnostic API endpoints, see references/api-reference.md.

Deployment note: Diagnostic API availability differs between self-managed, ECH, and Serverless. See Deployment Compatibility for details.

Jobs to Be Done

  • Diagnose HTTP 401 authentication failures
  • Diagnose HTTP 403 permission denied errors
  • Troubleshoot TLS/SSL handshake or certificate errors
  • Investigate expired or invalid API keys
  • Debug role mappings that do not grant expected roles
  • Fix Kibana login failures, redirect loops, or CORS errors
  • Recover from a license-expiry lockout
  • Determine why a user lacks access to a specific index

Prerequisites

ItemDescription
Elasticsearch URLCluster endpoint (e.g.
https://localhost:9200
or a Cloud deployment URL)
AuthenticationAny valid credentials — even minimal — to reach the cluster
Cluster privileges
monitor
for read-only diagnostics;
manage_security
for fixes

Prompt the user for any missing values. If the user cannot authenticate at all, start with TLS and Certificate Errors or License Expiry Recovery.

Diagnostic Workflow

Route the symptom to the correct section:

SymptomSection
HTTP 401,
authentication_exception
Authentication Failures
HTTP 403,
security_exception
, access denied
Authorization Failures
SSL/TLS handshake error, certificate rejectedTLS and Certificate Errors
API key rejected, expired, or ineffectiveAPI Key Issues
Role mapping not granting expected rolesRole Mapping Issues
Kibana login broken, redirect loop, CORS errorKibana Authentication Issues
All users locked out, paid features disabledLicense Expiry Recovery

Each section follows a Gather - Diagnose - Resolve pattern.

Diagnostic Toolkit

Use these APIs at the start of any security investigation:

curl <auth_flags> "${ELASTICSEARCH_URL}/_security/_authenticate"

Confirms identity, realm, and roles. If this fails with 401, the problem is authentication.

curl <auth_flags> "${ELASTICSEARCH_URL}/_xpack"

Confirms whether security is enabled (

features.security.enabled
). If security is disabled, all security APIs return errors.

curl -X POST "${ELASTICSEARCH_URL}/_security/user/_has_privileges" \
  <auth_flags> \
  -H "Content-Type: application/json" \
  -d '{
    "index": [
      { "names": ["'"${INDEX_PATTERN}"'"], "privileges": ["read"] }
    ]
  }'

Tests whether the authenticated user holds specific privileges without requiring

manage_security
.

curl <auth_flags> "${ELASTICSEARCH_URL}/_license"

Check license type and status. An expired paid license disables paid realms and features.

Authentication Failures (401)

A 401 response means Elasticsearch could not verify the caller's identity.

Gather

curl -v <auth_flags> "${ELASTICSEARCH_URL}/_security/_authenticate" 2>&1

The

-v
flag shows headers and the response body. Look for:

  • WWW-Authenticate
    header — indicates which auth schemes the cluster accepts.
  • authentication_exception
    in the response body — the
    reason
    field describes what failed.

Diagnose

SymptomLikely cause
unable to authenticate user
Wrong username or password
unable to authenticate with provided credentials
Credentials do not match any realm in the chain
user is not enabled
The native user account is disabled
token is expired
API key or bearer token has expired
No
WWW-Authenticate
header
Security may be disabled; check
GET /_xpack

If the user authenticates via an external realm (LDAP, AD, SAML, OIDC), the realm chain order matters. Elasticsearch tries realms in configured order and stops at the first match. If a higher-priority realm rejects the credentials before the intended realm is reached, authentication fails.

Resolve

CauseAction
Wrong credentialsVerify username/password or API key value. See elasticsearch-authn.
Disabled user
PUT /_security/user/{name}/_enable
. See elasticsearch-authz.
Expired API keyCreate a new API key. See API Key Issues.
Realm chain orderCheck
elasticsearch.yml
realm order (self-managed only).
Security disabledEnable
xpack.security.enabled: true
in
elasticsearch.yml
and restart.
Paid realm after expiryLicense expired — see License Expiry Recovery.

Authorization Failures (403)

A 403 response means the user is authenticated but lacks the required privileges.

Gather

Test the specific privileges the operation requires:

curl -X POST "${ELASTICSEARCH_URL}/_security/user/_has_privileges" \
  <auth_flags> \
  -H "Content-Type: application/json" \
  -d '{
    "index": [
      { "names": ["logs-*"], "privileges": ["read", "view_index_metadata"] }
    ],
    "cluster": ["monitor"]
  }'

The response contains a

has_all_requested
boolean and per-resource breakdowns.

Also check the user's effective roles:

curl <auth_flags> "${ELASTICSEARCH_URL}/_security/_authenticate"

Inspect the

roles
array and
authentication_realm
to confirm the user is who you expect.

Diagnose

SymptomLikely cause
has_all_requested: false
for an index
Role is missing the required index privilege
has_all_requested: false
for a cluster
Role is missing the required cluster privilege
User has fewer roles than expectedRoles array was replaced (not merged) on last update
API key returns 403 on previously allowedAPI key privileges are a snapshot — role changes after
operationcreation do not propagate to existing keys

Resolve

CauseAction
Missing index privilegeAdd the privilege to the role or create a new role. See elasticsearch-authz.
Missing cluster privilegeAdd the cluster privilege. See elasticsearch-authz.
Roles replaced on updateFetch current roles first, then update with the full array. See elasticsearch-authz.
Stale API key privilegesCreate a new API key with updated
role_descriptors
. See elasticsearch-authn.

TLS and Certificate Errors

TLS errors prevent the client from establishing a connection at all.

Gather

curl -v --cacert "${CA_CERT}" "https://${ELASTICSEARCH_HOST}:9200/" 2>&1 | head -30

Look for:

  • SSL certificate problem: unable to get local issuer certificate
    — CA not trusted.
  • SSL certificate problem: certificate has expired
    — certificate past its validity date.
  • SSL: no alternative certificate subject name matches target host name
    — hostname mismatch.

For deeper inspection (self-managed only):

openssl s_client -connect "${ELASTICSEARCH_HOST}:9200" -showcerts </dev/null 2>&1

This displays the full certificate chain, expiry dates, and subject alternative names.

Diagnose

Error messageLikely cause
unable to get local issuer certificate
Missing or wrong CA certificate
certificate has expired
Server or CA certificate past expiry
no alternative certificate subject name matches
Certificate SAN does not include the hostname
self-signed certificate
Self-signed cert not in the trust store
SSLHandshakeException
(Java client)
Truststore missing the CA or wrong password

Resolve

CauseAction
Wrong CA certPass the correct CA with
--cacert
or add it to the system trust store.
Expired certificateRegenerate certificates with
elasticsearch-certutil
(self-managed).
Hostname mismatchRegenerate the certificate with the correct SAN entries.
Self-signed certDistribute the CA cert to all clients or use a publicly trusted CA.
Quick workaroundUse
curl -k
/
--insecure
to skip verification. Not for production.

On ECH, TLS is managed by Elastic — certificate errors usually indicate the client is not using the correct Cloud endpoint URL. On Serverless, TLS is fully managed and transparent.

API Key Issues

Gather

Retrieve the key's metadata:

curl "${ELASTICSEARCH_URL}/_security/api_key?name=${KEY_NAME}" <auth_flags>

Check

expiration
,
invalidated
, and
role_descriptors
in the response.

Diagnose

SymptomLikely cause
401 when using the keyKey expired or invalidated
403 on operations that should be allowedKey was created with insufficient
role_descriptors
Derived key has no accessAPI key created another API key — derived keys have no privilege
Key works for some indices but not others
role_descriptors
scope is too narrow

Resolve

CauseAction
Expired keyCreate a new key with appropriate
expiration
. See elasticsearch-authn.
Invalidated keyCreate a new key. Invalidated keys cannot be reinstated.
Wrong scopeCreate a new key with correct
role_descriptors
. See elasticsearch-authn.
Derived key problemUse
POST /_security/api_key/grant
with user credentials instead. See elasticsearch-authn.

Role Mapping Issues

Role mappings grant roles to users from external realms. When they fail silently, users authenticate but get no roles.

Gather

curl <auth_flags> "${ELASTICSEARCH_URL}/_security/_authenticate"

Note the

username
,
authentication_realm.name
, and
roles
array.

curl <auth_flags> "${ELASTICSEARCH_URL}/_security/role_mapping"

List all mappings and inspect their

rules
and
enabled
fields.

Diagnose

SymptomLikely cause
User has empty
roles
array
No mapping matches the user's attributes
User gets wrong rolesA different mapping matched first or the rule is too broad
Mapping exists but does not apply
enabled
is
false
Mustache template produces wrong role nameTemplate syntax error or unexpected attribute value

Compare the user's

authentication_realm.name
and
groups
(from
_authenticate
) against each mapping's
rules
to find the mismatch.

Resolve

CauseAction
No matching ruleUpdate the mapping rules to match the user's realm and attributes.
Mapping disabledSet
"enabled": true
on the mapping.
Template errorTest the Mustache template with known attribute values. See elasticsearch-authz.
Rule too broadAdd
all
/
except
conditions to narrow the match. See elasticsearch-authz.

Kibana Authentication Issues

Missing
kbn-xsrf
header

All mutating Kibana API requests require the

kbn-xsrf
header:

curl -X PUT "${KIBANA_URL}/api/security/role/my-role" \
  <auth_flags> \
  -H "kbn-xsrf: true" \
  -H "Content-Type: application/json" \
  -d '{ ... }'

Without it, Kibana returns

400 Bad Request
with
"Request must contain a kbn-xsrf header"
.

SAML/OIDC redirect loop

Common causes:

  • Incorrect
    xpack.security.authc.realms.saml.*.sp.acs
    or
    idp.metadata.path
    in
    elasticsearch.yml
    .
  • Clock skew between the IdP and Elasticsearch nodes (SAML assertions have a validity window).
  • Kibana
    server.publicBaseUrl
    does not match the SAML ACS URL.

Verify the SAML realm configuration:

curl <auth_flags> "${ELASTICSEARCH_URL}/_security/_authenticate"

If this returns a valid user via a non-SAML realm, the SAML realm itself is not being reached. Check realm chain order.

Kibana cannot reach Elasticsearch

Kibana logs

Unable to retrieve version information from Elasticsearch nodes
. Verify the
elasticsearch.hosts
setting in
kibana.yml
points to a reachable endpoint and the credentials (
elasticsearch.username
/
elasticsearch.password
or
elasticsearch.serviceAccountToken
) are valid.

License Expiry Recovery

When a paid license expires, the cluster enters a security-closed state: paid realms (SAML, LDAP, AD, PKI) stop working and users authenticating through them are locked out. Native and file realms remain functional.

Quick triage

curl <auth_flags> "${ELASTICSEARCH_URL}/_license"

If

license.status
is
"expired"
, proceed with recovery.

Recovery steps

Follow the detailed recovery workflow in the elasticsearch-license skill. The critical first step depends on deployment type:

DeploymentFirst step
Self-managedLog in with a file-based user (
elasticsearch-users
CLI) or native user.
ECHContact Elastic support or renew via the Cloud console.
ServerlessNot applicable — licensing is fully managed by Elastic.

Examples

User gets 403 when querying logs

Symptom: "I get a 403 when searching

logs-*
."

  1. Verify identity:
curl -u "joe:${PASSWORD}" "${ELASTICSEARCH_URL}/_security/_authenticate"

Response shows

"roles": ["viewer"]
.

  1. Test privileges:
curl -X POST "${ELASTICSEARCH_URL}/_security/user/_has_privileges" \
  -u "joe:${PASSWORD}" \
  -H "Content-Type: application/json" \
  -d '{"index": [{"names": ["logs-*"], "privileges": ["read"]}]}'

Response:

"has_all_requested": false
— the
viewer
role does not include
read
on
logs-*
.

  1. Fix: create a
    logs-reader
    role and assign it to Joe. See elasticsearch-authz.

API key stopped working

Symptom: "My API key returns 401 since yesterday."

  1. Check the key:
curl -u "admin:${PASSWORD}" "${ELASTICSEARCH_URL}/_security/api_key?name=my-key"

Response shows

"expiration": 1709251200000
— the key expired.

  1. Fix: create a new API key with a suitable
    expiration
    . See elasticsearch-authn.

SAML login redirects to error

Symptom: "Clicking the SSO button in Kibana redirects to an error page."

  1. Check if the SAML realm is reachable by authenticating with a non-SAML method:
curl -u "elastic:${PASSWORD}" "${ELASTICSEARCH_URL}/_security/_authenticate"
  1. Verify the IdP metadata URL is accessible from the Elasticsearch nodes (self-managed):
curl -s "${IDP_METADATA_URL}" | head -5
  1. Check for clock skew — SAML assertions are time-sensitive. Ensure NTP is configured on all nodes.
  2. Verify
    server.publicBaseUrl
    in
    kibana.yml
    matches the SAML ACS URL configured in the IdP.

Users locked out after license expired

Symptom: "Nobody can log in to Kibana. We use SAML."

  1. Check license:
curl -u "admin:${PASSWORD}" "${ELASTICSEARCH_URL}/_license"

Response shows

"status": "expired"
,
"type": "platinum"
.

  1. The SAML realm is disabled because the paid license expired. Follow the recovery steps in elasticsearch-license: log in with a file-based or native user, then upload a renewed license or revert to basic.

Guidelines

Always start with
_authenticate

Run

GET /_security/_authenticate
as the first diagnostic step. It reveals the user's identity, realm, roles, and authentication type in a single call. Most issues become apparent from this response alone.

Check the license early

Before investigating realm or privilege issues, verify the license is active with

GET /_license
. An expired paid license disables realms and features, producing symptoms that mimic misconfiguration.

Use
_has_privileges
before manual inspection

Instead of reading role definitions and mentally computing effective access, use

POST /_security/user/_has_privileges
to test specific privileges directly. This is faster and accounts for role composition, DLS, and FLS.

Avoid superuser credentials

Never use the built-in

elastic
superuser for day-to-day troubleshooting. Create a dedicated admin user or API key with
manage_security
privileges. Reserve the
elastic
user for initial setup and emergency recovery only.

Do not bypass TLS in production

Using

curl -k
or
--insecure
skips certificate verification and masks real TLS issues. Use it only for initial diagnosis, then fix the underlying certificate problem.

Deployment Compatibility

Diagnostic tool and API availability differs across deployment types.

Tool / APISelf-managedECHServerless
_security/_authenticate
YesYesYes
_security/user/_has_privileges
YesYesYes
_xpack
YesYesLimited
_license
YesYes (read)Not available
_security/api_key
(GET)
YesYesYes
_security/role_mapping
YesYesYes
elasticsearch-users
CLI
YesNot availableNot available
openssl s_client
on nodes
YesNot availableNot available
Elasticsearch logsYesVia Cloud UIVia Cloud UI

ECH notes:

  • No node-level access, so the
    elasticsearch-users
    CLI and direct log/certificate inspection are not available.
  • TLS is managed by Elastic — certificate errors typically indicate an incorrect endpoint URL.
  • Use the Cloud console for log inspection and deployment configuration.

Serverless notes:

  • Licensing APIs are not exposed. License-related lockouts do not occur.
  • Native users do not exist — authentication issues are handled at the organization level.
  • TLS is fully managed and transparent.