Alerts¶
Compliant Kubernetes (CK8S) includes alerts via Alertmanager.
Important
By default, you will get some platform alerts. This may benefit you, by giving you improved "situational awareness". Please decide if these alerts are of interest to you or not. Feel free to silence them, as the Compliant Kubernetes administrator will take responsibility for them.
Your focus should be on user alerts or application-level alerts, i.e., alerts under the control and responsibility of the Compliant Kubernetes user. We will focus on user alerts in this document.
Compliance needs¶
Many regulations require you to have an incident management process. Alerts help you discover abnormal application behavior that needs attention. This maps to ISO 27001 – Annex A.16: Information Security Incident Management.
Enabling user alerts¶
User alerts are handled by a project called Alertmanager, which needs to be enabled by the administrator. Get in touch with the administrator and they will be happy to help.
Configuring user alerts¶
User alerts are configured via the Secret alertmanager-alertmanager located in the alertmanager namespace. The format of this configuration file is specified in the upstream Alertmanager configuration documentation.
# retrieve the old configuration
kubectl get -n alertmanager secret alertmanager-alertmanager -o jsonpath='{.data.alertmanager\.yaml}' | base64 -d > alertmanager.yaml
# edit alertmanager.yaml as needed
# patch the new configuration
kubectl patch -n alertmanager secret alertmanager-alertmanager -p "{\"data\":{\"alertmanager.yaml\":\"$(base64 -w 0 < alertmanager.yaml)\"}}"
Make sure to configure and test a receiver for your alerts, e.g., Slack or OpsGenie.
Note
If you get an access denied error, check with your Compliant Kubernetes administrator.
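As an illustration of the receiver configuration mentioned above, a minimal alertmanager.yaml with a Slack receiver might look like the sketch below. The receiver name, channel, and webhook URL are placeholders chosen for this example, not values shipped with Compliant Kubernetes.

```yaml
# Minimal sketch of alertmanager.yaml.
# All names and the webhook URL are placeholders (assumptions for illustration).
route:
  # Default receiver for every alert that no more specific route matches.
  receiver: slack-notifications
  # Alerts sharing these labels are batched into one notification.
  group_by: ['alertname', 'namespace']
receivers:
- name: slack-notifications
  slack_configs:
  - api_url: https://hooks.slack.com/services/REPLACE/WITH/YOUR-WEBHOOK
    channel: '#alerts'
    # Also notify when the alert stops firing.
    send_resolved: true
```

If the retrieved file already contains routes or receivers, it is probably safer to merge these entries into it than to replace the file wholesale.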
Accessing user Alertmanager¶
If you want to access Alertmanager, for example to confirm that its configuration was picked up correctly, or to configure silences, proceed as follows:
- Type: kubectl proxy
- Open this link in your browser.
Configuring alerts¶
Before setting up an alert, you must first collect metrics from your application by setting up either ServiceMonitors or PodMonitors. In general, ServiceMonitors are recommended over PodMonitors and are the most common way to configure metrics collection.
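To make the metrics collection step concrete, here is a minimal ServiceMonitor sketch. The resource name, namespace, label, and port name are hypothetical; it assumes your application is exposed by a Service that has a port named http serving Prometheus metrics at /metrics.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app              # hypothetical name
  namespace: my-namespace   # hypothetical namespace
spec:
  selector:
    matchLabels:
      # Must match the labels on your Service (assumed label for this example).
      app.kubernetes.io/name: my-app
  endpoints:
  - port: http       # name of the Service port that serves metrics
    path: /metrics   # path where the application exposes metrics
    interval: 30s    # how often Prometheus scrapes the endpoint
```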
Then create a PrometheusRule, following the examples below or the upstream documentation, with an expression that evaluates to the condition to alert on. Prometheus will pick up the rules, evaluate them, and send notifications to Alertmanager.
The API reference for Prometheus Operator describes how the Kubernetes resource is configured and the configuration reference for Prometheus describes the rules themselves.
In Compliant Kubernetes, the Prometheus Operator in the workload cluster is configured to pick up all PrometheusRules, regardless of which namespace they are in or which labels they have.
Running Example¶
The user demo already includes a PrometheusRule that configures an alert:
{{- if .Values.prometheusRule.enabled -}}
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: {{ include "ck8s-user-demo.fullname" . }}
  labels:
    {{- include "ck8s-user-demo.labels" . | nindent 4 }}
spec:
  groups:
  - name: ./example.rules
    rules:
    - alert: ApplicationIsActuallyUsed
      expr: rate(http_request_duration_seconds_count[1m])>1
{{- end }}
The screenshot below gives an example of the application alert, as seen in Alertmanager.
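The PrometheusRule template in this section is rendered only when prometheusRule.enabled is set in the chart values. A values override along the following lines should therefore turn the alert on; this is a sketch that assumes you install the user demo chart with your own values file and that no other values are required for the rule.

```yaml
# Values override for the user demo chart (sketch).
# Enables rendering of the PrometheusRule guarded by .Values.prometheusRule.enabled.
prometheusRule:
  enabled: true
```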
Detailed example¶
A PrometheusRule can contain two kinds of rules: alert rules, which fire an alert based on an expression, and record rules, which record a new metric based on an expression.
The former is the way to create alerting rules, and the latter is a way to precompute complex queries whose results are stored as separate metrics:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: example
    role: alert-rules
  name: prometheus-example-rules
spec:
  groups:
  - name: ./example.rules
    # interval: 30s # optional parameter to configure how often groups of rules are evaluated
    rules:
    - alert: ExampleAlert
      expr: vector(1)
      # for: 1m # optional parameter to configure how long an alert must be triggered to be fired
      labels:
        severity: high
      annotations:
        summary: "Example Alert has been fired!"
        description: "The Example Alert has been fired! It shows the value {{ $value }}."
    - record: example_record_metric
      expr: vector(1)
      labels:
        record: example
For alert rules, labels and annotations can be added or overridden, and they will be present in the resulting alert notifications. In addition, annotations support Go templating, which gives access to the evaluated value via the $value variable and to all labels from the expression via the $labels variable.
For recording rules, labels can be added or overridden, and they will be present in the resulting metric.
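The following sketch shows how a recording rule and an alert rule using $labels and $value might fit together. The resource name, rule names, threshold, and severity are hypothetical, and it assumes the scraped series carry namespace and pod labels, which is typical for targets discovered through a ServiceMonitor but should be checked against your own metrics.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-templated-rules   # hypothetical name
spec:
  groups:
  - name: ./templated.rules
    rules:
    # Recording rule: precompute the per-pod request rate as a new metric.
    - record: pod:http_requests:rate5m
      expr: sum by (namespace, pod) (rate(http_request_duration_seconds_count[5m]))
    # Alert rule: evaluated against the recorded metric above.
    - alert: HighRequestRate
      expr: pod:http_requests:rate5m > 100   # threshold is an arbitrary example
      for: 5m
      labels:
        severity: warning
      annotations:
        # $labels exposes the labels of the series that triggered the alert.
        summary: "High request rate on {{ $labels.pod }}"
        # $value is the evaluated value of the expression.
        description: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} is serving {{ $value }} requests per second."
```

Building the alert on top of the recorded metric keeps the alert expression short and lets the same precomputed series be reused in dashboards.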