Troubleshooting Kubernetes Leases

Updated  by Bryan.Seay@sysdig.com

Summary

Agent version 12.0.0 and later will try to use Kubernetes Leases to control how data is pulled from the Kubernetes API Server. If it cannot create leases, it will fall back to a previous algorithm. This document addresses how to fix problems when the agent tries to create leases.

Symptom

Agent logs show one of the following errors:

Error, lease_pool_manager[2989554]: Cannot access leases objects: leases.coordination.k8s.io is forbidden: User "system:serviceaccount:sysdig-agent:sysdig-agent" cannot list resource "leases" in API group "coordination.k8s.io" in the namespace "sysdig-agent"
---
Error, k8s_parser:245: cointerface[10125]: delegation: error creating the lease manager: unable to Init leasePoolManager as agent doesn't have lease permissions: unauthorized to get leases:

Resolution

Prerequisites

  • Sysdig Agent v12.0.0+
  • Kubernetes v1.14+

Benefits of using Kubernetes Leases

By using Kubernetes Leases the Agent can efficiently control when and how it pulls data from the Kubernetes API server. In small Kubernetes clusters (< 50 nodes) this is a nice-to-have feature which gives an easy insight into what the Agent is doing. However, in larger Kubernetes clusters (>= 200 nodes), using leases is strongly recommended to ensure the Agent does not overload the Kubernetes API server.

Agent privileges

For most Kubernetes objects, the Agent has get, list and watch privileges. But for leases it uses get, list, watch, create and update. This is required so the Agent can properly create and update lease objects that are used to make distributed decisions.

If the Agent is not given create and update privileges, lease setup fails right after the Agent starts and the Agent falls back to the previous method of gathering Kubernetes data. That method puts significantly more load on the Kubernetes API server, so it is not recommended for Kubernetes clusters with more than 200 nodes, or for any cluster where the Kubernetes API server lacks sufficient CPU headroom.
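
One way to check whether the Agent's service account already holds these privileges is kubectl auth can-i with impersonation. The sketch below is illustrative and not part of the official procedure: the function name missing_lease_verbs is hypothetical, the namespace and service account both default to sysdig-agent (taken from the error message in the Symptom section), and the user running it needs permission to impersonate service accounts.

```shell
# Print the lease verbs that the Agent's service account is NOT allowed to use.
# $1 = namespace, $2 = service account name; both default to "sysdig-agent"
# (an assumption based on the error message, adjust for your install).
missing_lease_verbs() {
  ns="${1:-sysdig-agent}"
  sa="${2:-sysdig-agent}"
  missing=""
  for verb in get list watch create update; do
    # "kubectl auth can-i" exits non-zero when the action is not allowed.
    if ! kubectl auth can-i "$verb" leases.coordination.k8s.io \
        --as="system:serviceaccount:${ns}:${sa}" -n "$ns" >/dev/null 2>&1; then
      missing="${missing:+$missing }$verb"
    fi
  done
  echo "$missing"
}
```

Calling missing_lease_verbs with no arguments prints the verbs that are not yet granted, separated by spaces. An empty line means all five verbs are allowed.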

Configuring existing Agent installations

 CAUTION: If you deployed the Agent using the Sysdig Helm Chart, note that starting with sysdig/sysdig-deploy v1.7.2 and the sysdig-deploy agent subchart v1.7.0, lease permissions are configured as a Role. As a result, the RoleBinding has changed as well.

GitHub - sysdiglabs/charts - role.yaml
GitHub - sysdiglabs/charts - clusterrole.yaml

GitHub - sysdiglabs/charts - rolebinding.yaml
GitHub - sysdiglabs/charts - daemonset.yaml


 NOTE: This is only applicable to users who configured an Agent before September 2021 and who are not using Sysdig Helm Charts to upgrade the Agent version.

Existing users need to update the ClusterRole and DaemonSet to match the latest version:


Step 1: Add leases permission to ClusterRole sysdig-agent

The following patch updates the Agent's ClusterRole, assuming it is named sysdig-agent as in a default installation. It adds the ability to get, list, watch, create and update leases.

kubectl patch clusterrole sysdig-agent -n sysdig-agent --patch='[{"op": "add", "path": "/rules/-", "value": {"apiGroups": ["coordination.k8s.io"], "resources": ["leases"], "verbs": ["get", "list", "create", "update", "watch"]}}]' --type json

Alternatively, edit the ClusterRole sysdig-agent directly and add the following:

rules:
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - get
  - list
  - create
  - update
  - watch
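
To confirm that the rule was added, the ClusterRole can be read back and searched for the leases resource. This is a rough sketch, not an official check: the helper name clusterrole_has_leases is hypothetical, and it only looks for the string "leases" in the JSON output without validating the verbs.

```shell
# Exit status 0 when the named ClusterRole mentions the "leases" resource,
# non-zero otherwise. $1 = ClusterRole name, defaulting to "sysdig-agent"
# (an assumption, adjust if your ClusterRole is named differently).
clusterrole_has_leases() {
  kubectl get clusterrole "${1:-sysdig-agent}" -o json | grep -q '"leases"'
}
```

For example, running clusterrole_has_leases && echo "leases rule present" prints the message only when the rule is found.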


Step 2: Add DownwardAPI to sysdig-agent DaemonSet

The following passes the Agent's pod name and namespace down to the Agent so that it knows this information before it ever contacts the Kubernetes API server. Edit the DaemonSet and add the podinfo volume and its volume mount:

spec:
  template:
    spec:
      volumes:
      - name: podinfo
        downwardAPI:
          defaultMode: 420
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
            path: name
      containers:
      - name: sysdig-agent
        volumeMounts:
        - mountPath: /etc/podinfo
          name: podinfo
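
Once the DaemonSet has rolled out, the mounted files can be inspected from inside an Agent pod. This is a minimal sketch under a few assumptions: the namespace is sysdig-agent, you supply the name of a real Agent pod, and cat is available in the container image. The helper name show_podinfo is hypothetical.

```shell
# Print the pod name and namespace exposed by the downward API volume.
# $1 = Agent pod name (required); $2 = namespace, defaulting to
# "sysdig-agent" (an assumption, adjust for your install).
show_podinfo() {
  pod="$1"
  ns="${2:-sysdig-agent}"
  # The file names match the "path" entries of the podinfo volume.
  for f in name namespace; do
    printf '%s: ' "$f"
    kubectl exec -n "$ns" "$pod" -- cat "/etc/podinfo/$f"
    # Downward API files carry no trailing newline, so add one.
    echo
  done
}
```

If the volume is mounted correctly, show_podinfo followed by a pod name prints that pod's name and namespace on two lines. An error from cat suggests the volume or the mount is missing.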

Known issues

 NOTE: Cold-start leases intentionally spread out the load on the Kubernetes API server. As a result, each Agent takes longer to build its cache, which can lead to missing metadata whenever an Agent pod or Agent process is restarted.

Environment

* Related Products: Agent

* Related Versions: 

  • Agent 12.0.0+
  • Kubernetes v1.14+

* On-Prem or SaaS