Alerting rules
The tables below list each available alert rule, with a description of what it monitors, whether it is editable, and the fixed or default conditions required to trigger it.
Portainer
Backup Failure
Alert when a backup operation fails.
Not editable. Triggers a critical alert for 1 failed backup operation.
Security
Brute Force Attack
Alert on a specified number of authentication failures in a specified amount of time.
Editable. Defaults to triggering a critical alert when there are more than 150 failed authentications in a 5-minute window.
High Authentication Failures (Single User)
Alert when a single user's authentication failures exceed a specified number in a specified amount of time.
Editable. Defaults to triggering a warning alert when a single user fails authentication more than 10 times in a 5-minute window.
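Both authentication rules above count failures inside a trailing time window. The following is a minimal sketch of that idea, not Portainer's implementation: the class name `SlidingWindowCounter` and its methods are hypothetical, and it assumes the rule fires only when the count is strictly greater than the threshold.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events that fall inside a trailing time window (hypothetical helper)."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._timestamps = deque()

    def record(self, timestamp):
        """Record one failure; return True if the count now exceeds the threshold."""
        self._timestamps.append(timestamp)
        # Drop events that have aged out of the window.
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        # Assumption: "more than N" means strictly greater than N.
        return len(self._timestamps) > self.threshold


# Default for the single-user rule: more than 10 failures in 5 minutes.
single_user = SlidingWindowCounter(window_seconds=300, threshold=10)
# Eleven failures, one second apart.
fired = [single_user.record(float(t)) for t in range(11)]
```

In this sketch only the eleventh failure returns `True`; a later failure that arrives after the earlier ones have left the window starts the count again.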
TLS Certificate Expired
Alert when the TLS certificate for Portainer has expired.
Not editable. Triggers a critical alert for an expired TLS certificate.
Environment
Environment Down
Alert when an environment becomes unhealthy or down (status 2).
Not editable. Triggers a critical alert for 1 down or unhealthy environment.
Environment High CPU Usage %
Alert when environment CPU usage exceeds the configured threshold.
Editable for each severity. Defaults to triggering a critical alert at 95% CPU usage, a warning alert at 85% CPU usage, and an info alert at 70% CPU usage.
Environment High Memory Usage %
Alert when environment memory usage exceeds the configured threshold.
Editable for each severity. Defaults to triggering a critical alert at 95% memory usage, a warning alert at 85% memory usage, and an info alert at 70% memory usage.
Environment High Network Usage (Inbound and Outbound)
Alert when environment network usage exceeds the configured threshold in the specified amount of time.
Editable for each severity. Defaults to triggering a critical alert at 1 GB usage, a warning alert at 500 MB usage, and an info alert at 200 MB usage in less than 15 minutes.
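The CPU, memory, and network rules each map one measured value to the most severe level whose threshold it crosses. The following is a rough sketch of that mapping using the default values listed above. The names `severity_for`, `CPU_DEFAULTS`, and `NETWORK_DEFAULTS_MB` are hypothetical, and the sketch assumes a strict "exceeds" comparison, which may differ from how Portainer evaluates the boundary value.

```python
from typing import Optional, Sequence, Tuple

# Default thresholds from the rules above, ordered from most to least severe.
CPU_DEFAULTS = [("critical", 95.0), ("warning", 85.0), ("info", 70.0)]
# Network defaults in MB per 15-minute window (assuming 1 GB = 1000 MB).
NETWORK_DEFAULTS_MB = [("critical", 1000.0), ("warning", 500.0), ("info", 200.0)]


def severity_for(value: float, thresholds: Sequence[Tuple[str, float]]) -> Optional[str]:
    """Return the most severe level whose threshold the value exceeds, or None."""
    for level, limit in thresholds:
        # Assumption: the rule fires only when the value is strictly above the limit.
        if value > limit:
            return level
    return None


cpu_level = severity_for(90.0, CPU_DEFAULTS)
network_level = severity_for(1200.0, NETWORK_DEFAULTS_MB)
```

Ordering the thresholds from most to least severe means a single reading yields one level rather than three overlapping alerts.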
Etcd Unhealthy
Alert when the Kubernetes API server cannot reach etcd. This alert requires a standard etcd setup: the API server must connect to etcd over a real network or local socket. It is not reliable on distributions that use SQLite as the backing store (such as single-node k3s or k0s), as those always report etcd as healthy regardless of storage state. Check your distribution's documentation before enabling it.
Editable. Defaults to triggering a critical alert when the Kubernetes API server cannot reach etcd for 5 minutes.
Kubernetes API Server TLS Certificate Expiry
Alert when the Kubernetes API server TLS certificate expires within the configured number of days.
Editable for each severity. Defaults to triggering a critical alert at 7 days until expiry and a warning alert at 14 days until expiry.
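The expiry rule above compares the time remaining on a certificate against a number of days per severity. The following is a minimal sketch of that comparison with the default values. The function `expiry_severity` is hypothetical, and treating an already expired certificate as critical is an assumption rather than documented behavior.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def expiry_severity(
    not_after: datetime,
    now: datetime,
    critical_days: int = 7,
    warning_days: int = 14,
) -> Optional[str]:
    """Classify a certificate by how long remains until its expiry time."""
    remaining = not_after - now
    # Assumption: an already expired certificate (negative remaining) is critical.
    if remaining <= timedelta(days=critical_days):
        return "critical"
    if remaining <= timedelta(days=warning_days):
        return "warning"
    return None


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
level = expiry_severity(now + timedelta(days=10), now)
```

Here a certificate with 10 days left falls between the two defaults, so it is classified as a warning.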
Kubernetes API Server Unhealthy
Alert when the Kubernetes API server reports unhealthy on its liveness probe /livez. This detects a degraded-but-running API server.
Editable. Defaults to triggering a critical alert when the Kubernetes API server reports unhealthy for 5 minutes.
Node NotReady
Alert when a node is NotReady for the configured duration. Cordoned nodes are excluded.
Editable. Defaults to triggering a critical alert when the node is NotReady for 5 minutes.
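The Etcd Unhealthy, Kubernetes API Server Unhealthy, and Node NotReady rules fire only after a condition has held continuously for a set duration. The following is a minimal sketch of that "sustained for N minutes" logic, not Portainer's implementation: the class `SustainedCondition` and its `observe` method are hypothetical, and it assumes any healthy observation resets the timer.

```python
class SustainedCondition:
    """Tracks whether a failing condition has held for a minimum duration (hypothetical helper)."""

    def __init__(self, hold_seconds):
        self.hold_seconds = hold_seconds
        self._since = None  # timestamp of the first failing observation in the current run

    def observe(self, failing, timestamp):
        """Feed one observation; return True once the failure has lasted long enough."""
        if not failing:
            # Assumption: a single healthy observation resets the timer.
            self._since = None
            return False
        if self._since is None:
            self._since = timestamp
        return timestamp - self._since >= self.hold_seconds


# Default: NotReady for 5 minutes.
not_ready = SustainedCondition(hold_seconds=300)
early = not_ready.observe(True, 0.0)
late = not_ready.observe(True, 300.0)
```

Requiring the condition to persist avoids alerting on brief blips, such as a node that reports NotReady for a few seconds during a restart.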