Summary
Symptom
Agent logs show the following error:
Error, lease_pool_manager[2989554]: Cannot access leases objects: leases.coordination.k8s.io is forbidden: User "system:serviceaccount:sysdig-agent:sysdig-agent" cannot list resource "leases" in API group "coordination.k8s.io" in the namespace "sysdig-agent"
Resolution
Prerequisites
Sysdig Agent v12.0.0 or above (a subset of the features has been available since v11.3.0)
Kubernetes v1.14 or above
Benefits of Using Leases
Using leases, the agent can efficiently control when and how it pulls data from the API Server. In small Kubernetes clusters (fewer than 50 nodes), leases are a nice-to-have feature that gives easy insight into what the agent is doing. In large clusters (more than 200 nodes), using leases is strongly recommended to ensure that the agent does not overload the API Server.
Agent Privileges to Create and Update Leases
For most Kubernetes objects, the agent has `get`, `list`, and `watch` privileges. For leases, it needs `get`, `list`, `watch`, `create`, and `update`, so that it can create and update the lease objects used to make distributed decisions.
If the agent isn’t given create and update permissions, the lease mechanism fails right after boot and the agent falls back to the previous method of gathering Kubernetes data. That method has a larger impact on the API Server and is not recommended for Kubernetes clusters larger than 200 nodes, or for any cluster where the API Server(s) do not have significant CPU headroom.
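The permissions described above can be checked from the command line. The following is a minimal sketch, assuming the service account and namespace shown in the error message (`system:serviceaccount:sysdig-agent:sysdig-agent` in `sysdig-agent`); the function name `check_lease_verbs` is made up for illustration. It relies on `kubectl auth can-i` with `--as`, which requires that the user running it is allowed to impersonate service accounts.

```shell
# Sketch: ask the API server whether the agent's service account holds each
# lease verb. The service account and namespace are assumptions taken from
# the error message; adjust them if your installation differs.
check_lease_verbs() {
  sa="system:serviceaccount:sysdig-agent:sysdig-agent"
  for verb in get list watch create update; do
    # `kubectl auth can-i` prints "yes" or "no" for the impersonated user.
    answer="$(kubectl auth can-i "$verb" leases.coordination.k8s.io \
      --as="$sa" -n sysdig-agent)"
    echo "$verb: $answer"
  done
}

# Run with: check_lease_verbs
```

Any verb reported as `no` indicates a permission the ClusterRole still lacks.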
Configuring Existing Agent Installation to Use Leases
Note: This applies only to users who configured an agent before September 2021 and who aren’t using Helm charts to upgrade their agent version.
Existing users need to update their ClusterRole and DaemonSet to match the latest versions:
sysdig-cloud-scripts/sysdig-agent-clusterrole.yaml at master · draios/sysdig-cloud-scripts
sysdig-cloud-scripts/sysdig-agent-daemonset-v2.yaml at master · draios/sysdig-cloud-scripts
Step 1: Add lease permissions to clusterrole
The following patch updates the agent’s ClusterRole (named `sysdig-agent` in the command below) and adds permission to read and write leases. ClusterRoles are cluster-scoped, so the `-n sysdig-agent` flag is accepted but has no effect.
$ kubectl patch clusterrole sysdig-agent -n sysdig-agent --patch='[{"op": "add", "path": "/rules/-", "value": {"apiGroups": ["coordination.k8s.io"], "resources": ["leases"], "verbs": ["get", "list", "create", "update", "watch"]}}]' --type json
Alternatively, edit the ClusterRole and add the following:
rules:
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - get
      - list
      - create
      - update
      - watch
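After either approach, it can help to confirm that the rule is present. The following is a small sketch, assuming the ClusterRole is named `sysdig-agent` as in the patch command; the function name `show_lease_rule` is hypothetical.

```shell
# Sketch: print the lease rule of the agent's ClusterRole, if it exists.
# The role name `sysdig-agent` is an assumption taken from the patch command.
show_lease_rule() {
  # `kubectl describe clusterrole` lists each resource with its verbs;
  # grep keeps only the line(s) that mention leases.
  kubectl describe clusterrole sysdig-agent | grep leases
}

# Run with: show_lease_rule
```

If nothing is printed, the rule has not been added; the matching line should list all five verbs.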
Step 2: Add DownwardAPI to daemonset
The following will pass the agent’s pod name and namespace down to the agent so that the agent knows this information before ever contacting the API Server.
Edit the DaemonSet and add the `podinfo` volume and its volume mount:
spec:
  template:
    spec:
      volumes:
        - name: podinfo
          downwardAPI:
            defaultMode: 420
            items:
              - fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
                path: namespace
              - fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
                path: name
      containers:
        - name: sysdig-agent
          volumeMounts:
            - mountPath: /etc/podinfo
              name: podinfo
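Once the agent pods have restarted with the new spec, the mounted files and the resulting leases can be inspected. The following is a sketch under a few assumptions: the agent runs in the `sysdig-agent` namespace, the first pod listed there is an agent pod, and the container is named `sysdig-agent` as in the snippet above. The function name `verify_podinfo` is hypothetical.

```shell
# Sketch: read the DownwardAPI files from one agent pod, then list leases.
# Namespace, pod selection, and container name are assumptions (see lead-in).
verify_podinfo() {
  ns="sysdig-agent"
  # Take the first pod in the namespace; output has the form "pod/<name>".
  pod="$(kubectl get pods -n "$ns" -o name | head -n 1)"
  [ -n "$pod" ] || { echo "no pods found in $ns" >&2; return 1; }
  for f in namespace name; do
    printf '%s: ' "$f"
    # Each file holds one value and typically has no trailing newline.
    kubectl exec -n "$ns" "$pod" -c sysdig-agent -- cat "/etc/podinfo/$f"
    echo
  done
  # With the lease permissions in place, lease objects should be listed here.
  kubectl get leases -n "$ns"
}

# Run with: verify_podinfo
```

The `namespace` and `name` values should match the pod that was queried.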
Known Issues
Cold-start leases intentionally spread out the load on the API Server. As a result, each agent takes longer to build its cache, which can lead to missing metadata when an agent pod or process is restarted.