What types of health check mechanisms does Consul have? How to configure and use health checks - 面试题

Consul's health check mechanism is a key feature that ensures service availability. It monitors service status through various check methods and promptly notifies when services become unavailable.

Health Check Types

1. Script Check

Check service health status by executing scripts or commands:

json
{
  "check": {
    "id": "script-check",
    "name": "Script Health Check",
    "args": ["/usr/local/bin/check_script.sh"],
    "interval": "10s",
    "timeout": "5s"
  }
}

Script returns 0 for healthy, non-zero for unhealthy.

2. HTTP Check

Check service endpoints via HTTP requests:

json
{
  "check": {
    "id": "http-check",
    "name": "HTTP Health Check",
    "http": "http://localhost:8080/health",
    "method": "GET",
    "header": {
      "Authorization": ["Bearer token"]
    },
    "interval": "10s",
    "timeout": "5s",
    "tls_skip_verify": false
  }
}

HTTP status code 2xx indicates healthy, others indicate unhealthy.

3. TCP Check

Check service port via TCP connection:

json
{
  "check": {
    "id": "tcp-check",
    "name": "TCP Health Check",
    "tcp": "localhost:3306",
    "interval": "10s",
    "timeout": "5s"
  }
}

Successful connection establishment indicates healthy.

4. gRPC Check

Check service via gRPC call:

json
{
  "check": {
    "id": "grpc-check",
    "name": "gRPC Health Check",
    "grpc": "localhost:9090",
    "grpc_use_tls": true,
    "interval": "10s",
    "timeout": "5s"
  }
}

5. Docker Check

Check Docker container status:

json
{
  "check": {
    "id": "docker-check",
    "name": "Docker Health Check",
    "docker_container_id": "abc123",
    "shell": "/bin/bash",
    "script": "curl -s http://localhost:8080/health",
    "interval": "10s"
  }
}

6. TTL Check

TTL (Time To Live) based check where services need to periodically update status:

json
{
  "check": {
    "id": "ttl-check",
    "name": "TTL Health Check",
    "ttl": "30s"
  }
}

Services need to periodically call API to update status:

bash
curl -X PUT http://localhost:8500/v1/agent/check/pass/ttl-check

Health Check Status

Consul defines the following health statuses:

passing: Healthy, service running normally
warning: Warning, service may have issues but still available
critical: Critical, service unavailable
maintenance: Maintenance mode, service temporarily unavailable

Check Parameter Configuration

Core Parameters

interval: Check interval, such as "10s", "1m"
timeout: Check timeout
failures_before_critical: How many consecutive failures before marking as critical
successes_before_passing: How many consecutive successes before marking as passing
deregister_critical_service_after: How long to automatically deregister after service becomes critical

Advanced Parameters

json
{
  "check": {
    "id": "advanced-check",
    "name": "Advanced Health Check",
    "http": "http://localhost:8080/health",
    "interval": "10s",
    "timeout": "5s",
    "failures_before_critical": 3,
    "successes_before_passing": 2,
    "deregister_critical_service_after": "5m",
    "status": "passing",
    "notes": "Custom health check"
  }
}

Service-Level Health Check

Health checks can be bound to service registration:

json
{
  "service": {
    "name": "web",
    "port": 8080,
    "check": {
      "id": "web-check",
      "http": "http://localhost:8080/health",
      "interval": "10s"
    }
  }
}

Multiple Health Checks

A service can have multiple health checks:

json
{
  "service": {
    "name": "web",
    "port": 8080,
    "checks": [
      {
        "id": "web-http",
        "http": "http://localhost:8080/health",
        "interval": "10s"
      },
      {
        "id": "web-disk",
        "script": "/usr/local/bin/check_disk.sh",
        "interval": "30s"
      }
    ]
  }
}

Health Check API

Query Health Checks

bash
# Query all checks
curl http://localhost:8500/v1/health/state/any

# Query passing status checks
curl http://localhost:8500/v1/health/state/passing

# Query specific service checks
curl http://localhost:8500/v1/health/checks/web

Manually Update Check Status

bash
# Mark as passing
curl -X PUT http://localhost:8500/v1/agent/check/pass/ttl-check

# Mark as warning
curl -X PUT http://localhost:8500/v1/agent/check/warn/ttl-check

# Mark as critical
curl -X PUT http://localhost:8500/v1/agent/check/fail/ttl-check

Best Practices

Set reasonable check intervals: Too frequent increases load, too long affects failure detection speed
Set appropriate timeout: Avoid misjudgment due to network latency
Use multiple checks: Verify service health from different angles
Configure failure thresholds: Avoid frequent status switching due to temporary failures
Monitor checks themselves: Ensure health check mechanism works properly

Failure Handling

When service health check fails:

Automatic deregistration: Service removed from service list
Load balancer adjustment: Traffic no longer routed to unhealthy services
Alert notification: Can trigger alerts via Consul Watch or external monitoring systems
Automatic recovery: Service automatically re-registers after recovery

Consul's health check mechanism is flexible and powerful, capable of meeting service monitoring needs in various scenarios.