What types of health checks does Consul support, and how do you configure and use them?

February 21, 16:12

Health checks are how Consul tracks service availability. Each check reports a status for a node or service instance, and Consul uses that status to keep unhealthy instances out of DNS answers and health-filtered API results.

Health Check Types

1. Script Check

Check service health status by executing scripts or commands:

json
{
  "check": {
    "id": "script-check",
    "name": "Script Health Check",
    "args": ["/usr/local/bin/check_script.sh"],
    "interval": "10s",
    "timeout": "5s"
  }
}

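The exit code decides the status: 0 means passing, 1 means warning, and any other code means critical. Script checks are disabled by default, so the agent must be started with enable_script_checks or enable_local_script_checks set to true.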
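As a rough illustration of the exit-code contract, here is a minimal Python sketch of a check script that watches disk usage. The thresholds, the default path, and the function names are assumptions made for this example; none of them come from Consul.

```python
import shutil

# Exit codes that Consul interprets for script checks.
PASSING, WARNING, CRITICAL = 0, 1, 2


def exit_code_for_usage(used_fraction, warn_at=0.80, crit_at=0.90):
    """Map a disk-usage fraction (0.0 to 1.0) to a script-check exit code."""
    if used_fraction >= crit_at:
        return CRITICAL
    if used_fraction >= warn_at:
        return WARNING
    return PASSING


def main(path="/"):
    """Measure disk usage at `path`, print a note, and return the exit code."""
    usage = shutil.disk_usage(path)
    used_fraction = usage.used / usage.total
    # Whatever the script prints is stored by Consul as the check output.
    print(f"{path} is {used_fraction:.0%} full")
    return exit_code_for_usage(used_fraction)
```

To run it as an actual check, the file would end with `raise SystemExit(main())` and be referenced from the check's args.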

2. HTTP Check

Check service endpoints via HTTP requests:

json
{
  "check": {
    "id": "http-check",
    "name": "HTTP Health Check",
    "http": "http://localhost:8080/health",
    "method": "GET",
    "header": { "Authorization": ["Bearer token"] },
    "interval": "10s",
    "timeout": "5s",
    "tls_skip_verify": false
  }
}

Any 2xx status code means passing, 429 (Too Many Requests) means warning, and every other response, including a timeout or connection failure, means critical.
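The service side of an HTTP check can be very small. The sketch below uses only the Python standard library to expose a /health endpoint whose status code follows the mapping above. The state names ("ok", "degraded", "down") and the helper names are assumptions for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Map an application-level state to the HTTP status the check will see.
STATUS_FOR_STATE = {"ok": 200, "degraded": 429, "down": 503}


def make_health_handler(get_state):
    """Build a handler class whose /health response reflects get_state()."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/health":
                self.send_error(404)
                return
            state = get_state()
            body = state.encode("utf-8")
            # Unknown states are reported as 503 so the check turns critical.
            self.send_response(STATUS_FOR_STATE.get(state, 503))
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the example quiet

    return HealthHandler


def serve(get_state, host="127.0.0.1", port=8080):
    """Create (but do not start) a server exposing /health."""
    return HTTPServer((host, port), make_health_handler(get_state))
```

Calling `serve(lambda: "ok").serve_forever()` would answer the check defined above with 200 until the callback starts returning another state.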

3. TCP Check

Check service port via TCP connection:

json
{
  "check": {
    "id": "tcp-check",
    "name": "TCP Health Check",
    "tcp": "localhost:3306",
    "interval": "10s",
    "timeout": "5s"
  }
}

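If the connection is established, the check is passing; if it is refused or times out, the check is critical. A TCP check only proves that the port accepts connections, not that the application behind it works.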
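What a TCP check does can be approximated in a few lines. This is a sketch of the idea, not Consul's implementation; the function name and return strings are chosen for this example.

```python
import socket


def tcp_check(host, port, timeout=5.0):
    """Return "passing" if a TCP connection succeeds, otherwise "critical"."""
    try:
        # The connection is closed again as soon as it has been established.
        with socket.create_connection((host, port), timeout=timeout):
            return "passing"
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return "critical"
```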

4. gRPC Check

Check a service through the standard gRPC health checking protocol, which the service must implement:

json
{
  "check": {
    "id": "grpc-check",
    "name": "gRPC Health Check",
    "grpc": "localhost:9090",
    "grpc_use_tls": true,
    "interval": "10s",
    "timeout": "5s"
  }
}

5. Docker Check

Run a command inside a running Docker container and use its exit code as the result:

json
{
  "check": {
    "id": "docker-check",
    "name": "Docker Health Check",
    "docker_container_id": "abc123",
    "shell": "/bin/bash",
    "args": ["curl", "-sf", "http://localhost:8080/health"],
    "interval": "10s"
  }
}

6. TTL Check

A TTL (Time To Live) check is passive: Consul does not probe the service, and the service itself must report its status periodically:

json
{
  "check": {
    "id": "ttl-check",
    "name": "TTL Health Check",
    "ttl": "30s"
  }
}

The service must call the agent API before the TTL expires; otherwise the check turns critical:

bash
curl -X PUT http://localhost:8500/v1/agent/check/pass/ttl-check
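From inside an application, the same update can be sent with the standard library. The sketch below assumes a local agent on http://localhost:8500 without ACL tokens; the helper names are hypothetical.

```python
import urllib.parse
import urllib.request


def ttl_update_url(check_id, status="pass", agent="http://localhost:8500"):
    """Build the agent URL for a TTL update; status is pass, warn, or fail."""
    if status not in ("pass", "warn", "fail"):
        raise ValueError(f"unsupported status: {status}")
    quoted_id = urllib.parse.quote(check_id, safe="")
    return f"{agent}/v1/agent/check/{status}/{quoted_id}"


def send_ttl_update(check_id, status="pass", agent="http://localhost:8500", timeout=5):
    """Send one TTL update with an HTTP PUT and return the response status code."""
    request = urllib.request.Request(
        ttl_update_url(check_id, status, agent), method="PUT"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A service would typically call `send_ttl_update("ttl-check")` on a timer that is comfortably shorter than the TTL, for example every 10 seconds for a 30-second TTL, so that a single delayed update does not flip the check to critical.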

Health Check Status

Consul defines the following health statuses:

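  • passing: The check succeeded and the service is running normally
  • warning: The service may have a problem but is still available
  • critical: The check failed and the service is considered unavailable
  • maintenance: Not a result that a check returns. When maintenance mode is enabled for a node or service, Consul registers a critical check for it, which takes the instance out of rotation until maintenance mode is disabled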

Check Parameter Configuration

Core Parameters

  • interval: How often the check runs, such as "10s" or "1m"
  • timeout: How long a single check may run before it counts as failed
  • failures_before_critical: Number of consecutive failures required before the check is marked critical
  • success_before_passing: Number of consecutive successes required before the check is marked passing
  • deregister_critical_service_after: How long a check bound to a service may stay critical before Consul deregisters that service automatically, such as "5m"; the minimum is 1 minute
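The effect of the two threshold parameters is easiest to see in a small simulation. The class below is a simplified model written for this article, not Consul's implementation, and the exact counting in a given Consul version may differ; it only shows how consecutive-result thresholds dampen flapping.

```python
class ThresholdModel:
    """Simplified model of consecutive-result thresholds (not Consul's code)."""

    def __init__(self, failures_before_critical=3, success_before_passing=2,
                 initial="passing"):
        self.failures_before_critical = failures_before_critical
        self.success_before_passing = success_before_passing
        self.status = initial
        self.failures = 0
        self.successes = 0

    def observe(self, ok):
        """Record one check result and return the status reported afterwards."""
        if ok:
            self.successes += 1
            self.failures = 0  # a success breaks any failure streak
            if self.successes >= self.success_before_passing:
                self.status = "passing"
        else:
            self.failures += 1
            self.successes = 0  # a failure breaks any success streak
            if self.failures >= self.failures_before_critical:
                self.status = "critical"
        return self.status
```

With the values used here, two isolated failures leave the status at passing; only a third consecutive failure flips it to critical, and two consecutive successes are then needed to return to passing.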

Advanced Parameters

json
{
  "check": {
    "id": "advanced-check",
    "name": "Advanced Health Check",
    "http": "http://localhost:8080/health",
    "interval": "10s",
    "timeout": "5s",
    "failures_before_critical": 3,
    "success_before_passing": 2,
    "deregister_critical_service_after": "5m",
    "status": "passing",
    "notes": "Custom health check"
  }
}

Service-Level Health Check

Health checks can be bound to service registration:

json
{
  "service": {
    "name": "web",
    "port": 8080,
    "check": {
      "id": "web-check",
      "http": "http://localhost:8080/health",
      "interval": "10s"
    }
  }
}

Multiple Health Checks

A service can have multiple health checks:

json
{
  "service": {
    "name": "web",
    "port": 8080,
    "checks": [
      {
        "id": "web-http",
        "http": "http://localhost:8080/health",
        "interval": "10s"
      },
      {
        "id": "web-disk",
        "args": ["/usr/local/bin/check_disk.sh"],
        "interval": "30s"
      }
    ]
  }
}
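When a service has several checks, its overall health is determined by the worst one: a single critical check is enough to make the instance unhealthy. A minimal sketch of that rule, with a function name chosen for this example:

```python
# Higher numbers are worse.
SEVERITY = {"passing": 0, "warning": 1, "critical": 2}


def aggregate_status(check_statuses):
    """Return the worst status among a service's checks ("passing" if none)."""
    worst = "passing"
    for status in check_statuses:
        if SEVERITY[status] > SEVERITY[worst]:
            worst = status
    return worst
```

For the service above, a passing web-http check combined with a critical web-disk check yields critical for the instance as a whole.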

Health Check API

Query Health Checks

bash
# Query all checks
curl http://localhost:8500/v1/health/state/any
# Query checks in the passing state
curl http://localhost:8500/v1/health/state/passing
# Query the checks of a specific service
curl http://localhost:8500/v1/health/checks/web
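The responses are JSON arrays of check objects, which makes them easy to process in a script. The sketch below fetches the checks of one service and groups their IDs by status. It assumes a local agent without ACL tokens, and the helper names are hypothetical; CheckID and Status are fields of the check objects returned by these endpoints.

```python
import json
import urllib.request
from collections import defaultdict


def fetch_checks(service, agent="http://localhost:8500", timeout=5):
    """Fetch the checks of one service from /v1/health/checks/<service>."""
    url = f"{agent}/v1/health/checks/{service}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def group_by_status(checks):
    """Group check IDs by the Status field of each check object."""
    grouped = defaultdict(list)
    for check in checks:
        grouped[check["Status"]].append(check["CheckID"])
    return dict(grouped)
```

For example, `group_by_status(fetch_checks("web"))` gives a quick overview of which checks of the web service are failing.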

Manually Update Check Status

bash
# Mark as passing
curl -X PUT http://localhost:8500/v1/agent/check/pass/ttl-check
# Mark as warning
curl -X PUT http://localhost:8500/v1/agent/check/warn/ttl-check
# Mark as critical
curl -X PUT http://localhost:8500/v1/agent/check/fail/ttl-check

Best Practices

  1. Set reasonable check intervals: Too frequent increases load, too long affects failure detection speed
  2. Set appropriate timeout: Avoid misjudgment due to network latency
  3. Use multiple checks: Verify service health from different angles
  4. Configure failure thresholds: Avoid frequent status switching due to temporary failures
  5. Monitor checks themselves: Ensure health check mechanism works properly

Failure Handling

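When a service's health check becomes critical:

  1. Removal from discovery: The instance is excluded from DNS answers and from API queries filtered on passing checks; it is deregistered only if deregister_critical_service_after is set and that period elapses
  2. Load balancer adjustment: Consumers that rely on those results stop routing traffic to the unhealthy instance
  3. Alert notification: Alerts can be triggered through Consul watches or external monitoring systems
  4. Automatic recovery: Once the check passes again, the instance returns to discovery results; a service that was deregistered must register again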
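On the consumer side, this filtering is what a client relies on when it asks for /v1/health/service/web?passing and only receives healthy instances. The sketch below turns such a response into a list of endpoints; the function name is an assumption, and the sample entries in the usage note are illustrative rather than real output.

```python
def healthy_endpoints(entries):
    """Extract (address, port) pairs from /v1/health/service/<name>?passing entries."""
    endpoints = []
    for entry in entries:
        service = entry["Service"]
        # An empty service address means the node's address should be used.
        address = service.get("Address") or entry["Node"]["Address"]
        endpoints.append((address, service["Port"]))
    return endpoints
```

Given two entries, one with Service.Address "10.0.0.5" and one with an empty Service.Address on a node at "10.0.0.6", both on port 8080, the function returns both endpoints with the node address filled in for the second.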

Consul's health check mechanism is flexible and powerful, capable of meeting service monitoring needs in various scenarios.

Tags: Consul