# Alerting rules

The tables below list each available alert rule, with a description of what it monitors, whether it is editable, and the fixed or default conditions that trigger it.

<figure><img src="/files/qmBxG2VlOMAUYd2usNyH" alt=""><figcaption></figcaption></figure>

### Portainer

| Rule           | Description                          | Trigger details                                                                                                     |
| -------------- | ------------------------------------ | ------------------------------------------------------------------------------------------------------------------- |
| Backup Failure | Alert when a backup operation fails. | <p>Not editable. <br>Triggers a <strong>critical</strong> alert for <strong>1</strong> failed backup operation.</p> |

### Security

| Rule                                       | Description                                                                                             | Trigger details                                                                                                                                                                         |
| ------------------------------------------ | ------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Brute Force Attack                         | Alert on a specified number of authentication failures in a specified amount of time.                   | <p>Editable.<br>Defaults to <strong>critical</strong> when there are more than <strong>150</strong> failed authentications in a <strong>5</strong>-minute window.</p>                  |
| High Authentication Failures (Single User) | Alert when a single user's authentication failures exceed a specified number in a specified amount of time. | <p>Editable.<br>Defaults to <strong>warning</strong> when a single user fails authentication more than <strong>10</strong> times in a <strong>5</strong>-minute window.</p> |
| TLS Certificate Expired                    | Alert when the TLS certificate for Portainer has expired.                                               | <p>Not editable. <br>Triggers a <strong>critical</strong> alert for an expired TLS certificate.</p>                                                                                     |
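
The count-within-a-window rules above can be pictured as a sliding-window counter. The following is a minimal sketch, not Portainer's implementation: the class name `FailureWindow`, its parameters, and the use of timestamps in seconds are all assumptions made for illustration, with defaults mirroring the Brute Force Attack rule (more than 150 failures in 5 minutes).

```python
from collections import deque


class FailureWindow:
    """Hypothetical helper: counts failures inside a trailing time window."""

    def __init__(self, threshold=150, window_seconds=5 * 60):
        self.threshold = threshold            # rule fires when count exceeds this
        self.window_seconds = window_seconds  # length of the trailing window
        self._events = deque()                # failure timestamps, oldest first

    def record(self, timestamp):
        """Record one failure at `timestamp` (seconds, non-decreasing).

        Returns True if the number of failures still inside the window
        is strictly greater than the threshold.
        """
        self._events.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop failures that have aged out of the window.
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        return len(self._events) > self.threshold
```

For the single-user rule, the same idea would apply with one counter per user and the defaults changed to a threshold of 10; how Portainer actually tracks these counts is not described on this page.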

### Environment

| Rule                                                  | Description                                                                                                                                                                                                                                                                                                                                                                                                           | Trigger details                                                                                                                                                                                                                                                                                             |
| ----------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Environment Down                                      | Alert when an environment becomes unhealthy or down (status 2).                                                                                                                                                                                                                                                                                                                                                       | <p>Not editable. <br>Triggers a <strong>critical</strong> alert for <strong>1</strong> down or unhealthy environment.</p>                                                                                                                                                                                   |
| Environment High CPU Usage %                          | Alert when environment CPU usage exceeds the configured threshold.                                                                                                                                                                                                                                                                                                                                                    | <p>Editable for each severity.<br>Defaults to:<br>- <strong>Critical</strong> at <strong>95%</strong> CPU usage,<br>- <strong>Warning</strong> at <strong>85%</strong> CPU usage,<br>- <strong>Info</strong> at <strong>70%</strong> CPU usage.</p>                                                         |
| Environment High Memory Usage %                       | Alert when environment memory usage exceeds the configured threshold.                                                                                                                                                                                                                                                                                                                                                 | <p>Editable for each severity.<br>Defaults to:<br>- <strong>Critical</strong> at <strong>95%</strong> memory usage,<br>- <strong>Warning</strong> at <strong>85%</strong> memory usage,<br>- <strong>Info</strong> at <strong>70%</strong> memory usage.</p>                                                |
| Environment High Network Usage (Inbound and Outbound) | Alert when environment network usage exceeds the configured threshold in the specified amount of time. | <p>Editable for each severity.<br>Defaults to:<br>- <strong>Critical</strong> at <strong>1 GB</strong> usage,<br>- <strong>Warning</strong> at <strong>500 MB</strong> usage,<br>- <strong>Info</strong> at <strong>200 MB</strong> usage<br>in <strong>less than</strong> <strong>15</strong> minutes.</p> |
| Etcd Unhealthy                                        | Alert when the Kubernetes API server cannot reach etcd. This alert requires a standard etcd setup: the API server must connect to etcd over a real network or local socket. It is not reliable on distributions that use SQLite as the backing store (such as single-node k3s or k0s), because those always report etcd as healthy regardless of storage state. Check your distribution's documentation before enabling. | <p>Editable.<br>Defaults to <strong>critical</strong> when the Kubernetes API server cannot reach etcd for <strong>5</strong> minutes.</p> |
| Kubernetes API Server TLS Certificate Expiry          | Alert when the Kubernetes API server TLS certificate expires within the configured number of days.                                                                                                                                                                                                                                                                                                                    | <p>Editable for each severity.<br>Defaults to:<br>- <strong>Critical</strong> at <strong>7</strong> days until expiry,<br>- <strong>Warning</strong> at <strong>14</strong> days until expiry.</p>                                                                                                          |
| Kubernetes API Server Unhealthy                       | Alert when the Kubernetes API server reports unhealthy on its liveness probe `/livez`. This detects a degraded-but-running API server.                                                                                                                                                                                                                                                                                | <p>Editable.<br>Defaults to <strong>critical</strong> when the Kubernetes API server reports unhealthy for <strong>5</strong> minutes.</p>                                                                                                                                                                  |
| Kubernetes API Server High Request Latency            | Alert when Kubernetes API server internal request latency exceeds the configured threshold.            | <p>Editable for each severity.<br>Defaults to:<br>- <strong>Critical</strong> when internal request latency exceeds <strong>4</strong> seconds,<br>- <strong>Warning</strong> when it exceeds <strong>2</strong> seconds<br>for <strong>more than</strong> <strong>5</strong> minutes.</p> |
| Node NotReady                                         | Alert when a node is NotReady for the configured duration. Cordoned nodes are excluded.                                                                                                                                                                                                                                                                                                                               | <p>Editable.<br>Defaults to <strong>critical</strong> when the node is NotReady for <strong>5</strong> minutes.</p>                                                                                                                                                                                         |
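
Rules that are editable for each severity carry one threshold per severity level. The sketch below shows one way such thresholds could map a measured value to a severity, using the default CPU usage values from the table. It is an illustration only: the names `CPU_DEFAULTS` and `classify` are hypothetical, and treating a threshold as crossed only when the value is strictly greater than it is an assumption, not documented Portainer behavior.

```python
# Default CPU usage thresholds from the table above, in percent.
CPU_DEFAULTS = {"critical": 95.0, "warning": 85.0, "info": 70.0}


def classify(value, thresholds):
    """Return the most severe level whose threshold `value` exceeds, or None.

    Assumption: a threshold counts as crossed only when value > threshold.
    """
    # Check the highest threshold first so the most severe match wins.
    ordered = sorted(thresholds.items(), key=lambda item: item[1], reverse=True)
    for severity, limit in ordered:
        if value > limit:
            return severity
    return None
```

Under these assumptions, `classify(90.0, CPU_DEFAULTS)` yields `"warning"` and `classify(50.0, CPU_DEFAULTS)` yields `None`. The same function would work for the memory usage defaults, since they use the same three levels.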

