How to configure Prometheus alert rules and Alertmanager?

February 21, 15:40

Prometheus alert configuration and Alertmanager usage:

Alert Rule Configuration:

yaml
groups:
  - name: example_alerts
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage is {{ $value }}%"

Key Fields:

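  • expr: PromQL expression to evaluate; every series it returns becomes an alert
  • for: How long the expression must stay true before the alert moves from pending to firing
  • labels: Extra labels attached to the alert, used for routing, grouping, and inhibition (for example, severity)
  • annotations: Human-readable text such as a summary and description; templates like {{ $labels.instance }} and {{ $value }} are supported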
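
For the rules above to take effect, Prometheus has to load the rule file and know where Alertmanager is running. The fragment below is a minimal sketch of that wiring in prometheus.yml. The rule file path and the Alertmanager address are assumptions for illustration; 9093 is Alertmanager's default port.

```yaml
# prometheus.yml (fragment)
global:
  evaluation_interval: 15s        # how often alert rules are evaluated

rule_files:
  - "rules/*.yml"                 # hypothetical location of the rule files

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "localhost:9093"    # assumed Alertmanager address
```

Before reloading Prometheus, a rule file can be validated with `promtool check rules <file>`.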

Alertmanager Configuration:

yaml
route:
  group_by: ['alertname', 'cluster']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 12h
  receiver: 'default'
receivers:
  - name: 'default'
    email_configs:
      # Email delivery also needs an SMTP server, set here as `smarthost`
      # or once under `global.smtp_smarthost`.
      - to: 'alert@example.com'
        from: 'prometheus@example.com'
    webhook_configs:
      - url: 'http://webhook.example.com/alert'

Alert Grouping:

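  • group_by: Labels used to group alerts; alerts with the same values for these labels are batched into one notification
  • group_wait: How long to wait before sending the first notification for a new group, so that alerts firing at nearly the same time are merged
  • group_interval: How long to wait before sending a notification about new alerts added to a group that has already been notified
  • repeat_interval: How long to wait before re-sending a notification for alerts that are still firing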
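
Grouping settings can also be overridden per child route. The sketch below assumes Alertmanager 0.22 or later (for the `matchers` syntax) and sends critical alerts to a separate receiver with shorter timers. The `oncall` receiver and its URL are hypothetical, and the timer values are illustrative rather than recommendations.

```yaml
route:
  receiver: 'default'
  group_by: ['alertname', 'cluster']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  routes:
    # Child route: inherits settings from the parent unless overridden.
    - matchers:
        - severity="critical"
      receiver: 'oncall'          # hypothetical receiver name
      group_wait: 10s             # notify sooner for critical alerts
      repeat_interval: 1h         # and repeat more often

receivers:
  - name: 'default'
    webhook_configs:
      - url: 'http://webhook.example.com/alert'
  - name: 'oncall'
    webhook_configs:
      - url: 'http://webhook.example.com/oncall'   # hypothetical URL
```

Alerts that match no child route fall through to the top-level receiver. The file can be checked with `amtool check-config <file>`.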

Alert Inhibition:

yaml
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'instance']
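
This rule mutes a warning alert while a critical alert with the same alertname and instance is firing. Alertmanager 0.22 and later deprecate `source_match` and `target_match` in favor of matcher lists; assuming such a version, an equivalent rule can be sketched as:

```yaml
inhibit_rules:
  - source_matchers:
      - severity="critical"       # the alert that does the muting
    target_matchers:
      - severity="warning"        # the alert that gets muted
    # Inhibition applies only when source and target agree on these labels.
    equal: ['alertname', 'instance']
```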

Alert Silencing:

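  • Create silences through the Alertmanager HTTP API, the web UI, or the amtool CLI
  • Each silence has a start time, an end time, and a set of label matchers; matching alerts send no notifications while it is active
  • Suitable for planned maintenance windows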

Best Practices:

  • Set reasonable alert thresholds to avoid alert fatigue
  • Use tiered alerts (info, warning, critical)
  • Regularly review and optimize alert rules
  • Combine with Grafana dashboards to visualize the metrics behind each alert
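
As one way to apply tiered alerts, the sketch below defines the same alert name at two severities. The thresholds and durations are illustrative assumptions, not recommendations. Because both rules share the alertname and instance labels, an inhibition rule keyed on those labels can mute the warning while the critical alert is firing.

```yaml
groups:
  - name: tiered_cpu_alerts
    rules:
      # Warning tier: sustained moderate load.
      - alert: HighCPUUsage
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
      # Critical tier: higher threshold, shorter wait.
      - alert: HighCPUUsage
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 95
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Very high CPU usage on {{ $labels.instance }}"
```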
Tags: Prometheus