Consul's health check mechanism is a key feature for keeping service discovery accurate. It monitors nodes and services through several check types and records failures in the catalog, so unhealthy instances drop out of DNS responses and health-filtered queries.
Health Check Types
1. Script Check
Check service health status by executing scripts or commands:
```json
{
  "check": {
    "id": "script-check",
    "name": "Script Health Check",
    "args": ["/usr/local/bin/check_script.sh"],
    "interval": "10s",
    "timeout": "5s"
  }
}
```
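The script's exit code determines the status: 0 is passing, 1 is warning, and any other code is critical. Script checks are disabled by default; the agent must be configured with enable_local_script_checks (or enable_script_checks) set to true.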
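As an illustration of the exit-code convention (0 passing, 1 warning, anything else critical), a check script could be structured like the minimal sketch below. The function name `classify_usage` and the 80% / 90% thresholds are assumptions made for this example; they are not part of Consul.

```bash
#!/bin/sh
# Hypothetical check script: maps a disk-usage percentage to the exit codes
# Consul expects from a script check. Thresholds are illustrative assumptions.

# classify_usage PERCENT
# Prints a short message (stored by Consul as the check output) and returns
# 0 (passing), 1 (warning) or 2 (critical).
classify_usage() {
  usage="$1"
  if [ "$usage" -ge 90 ]; then
    echo "disk usage ${usage}% is critical"
    return 2
  elif [ "$usage" -ge 80 ]; then
    echo "disk usage ${usage}% is high"
    return 1
  fi
  echo "disk usage ${usage}% is ok"
  return 0
}

# In a real script the value would come from the system, for example:
#   usage=$(df -P / | awk 'NR==2 { sub("%", "", $5); print $5 }')
#   classify_usage "$usage"; exit $?
```

Keeping the classification in a function makes the thresholds easy to exercise by hand before the script is wired into a check definition.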
2. HTTP Check
Check service endpoints via HTTP requests:
```json
{
  "check": {
    "id": "http-check",
    "name": "HTTP Health Check",
    "http": "http://localhost:8080/health",
    "method": "GET",
    "header": { "Authorization": ["Bearer token"] },
    "interval": "10s",
    "timeout": "5s",
    "tls_skip_verify": false
  }
}
```
A 2xx status code marks the check as passing, 429 Too Many Requests marks it as warning, and any other response (or a failed connection) marks it as critical.
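When debugging an endpoint it can help to reproduce that mapping by hand. The sketch below assumes `curl` is available; `http_code_to_status` is a hypothetical helper name, and the mapping it encodes (2xx passing, 429 warning, everything else critical) reflects the documented behavior of HTTP checks rather than Consul's actual code.

```bash
# http_code_to_status CODE
# Prints the check status an HTTP check would report for a status code.
http_code_to_status() {
  case "$1" in
    2??) echo "passing" ;;   # any 2xx response
    429) echo "warning" ;;   # Too Many Requests
    *)   echo "critical" ;;  # everything else, including 000 (no response)
  esac
}

# Probing the endpoint from the example above (curl prints 000 when the
# connection fails, which falls through to "critical"):
#   code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' http://localhost:8080/health)
#   http_code_to_status "$code"
```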
3. TCP Check
Check that a service port accepts TCP connections:
```json
{
  "check": {
    "id": "tcp-check",
    "name": "TCP Health Check",
    "tcp": "localhost:3306",
    "interval": "10s",
    "timeout": "5s"
  }
}
```
A successful connection marks the check as passing; a refused or timed-out connection marks it as critical.
4. gRPC Check
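Check a service through the standard gRPC health checking protocol (grpc.health.v1.Health), which the target server must implement:
```json
{
  "check": {
    "id": "grpc-check",
    "name": "gRPC Health Check",
    "grpc": "localhost:9090",
    "grpc_use_tls": true,
    "interval": "10s",
    "timeout": "5s"
  }
}
```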
5. Docker Check
Run a command inside a running Docker container; the exit code is interpreted in the same way as for a script check:
```json
{
  "check": {
    "id": "docker-check",
    "name": "Docker Health Check",
    "docker_container_id": "abc123",
    "shell": "/bin/bash",
    "args": ["curl", "-sf", "http://localhost:8080/health"],
    "interval": "10s"
  }
}
```
6. TTL Check
A TTL (time to live) check is passive: the service itself must report its status before the TTL expires, otherwise the check turns critical:
```json
{
  "check": {
    "id": "ttl-check",
    "name": "TTL Health Check",
    "ttl": "30s"
  }
}
```
The service reports its status by calling the agent API periodically:
```bash
curl -X PUT http://localhost:8500/v1/agent/check/pass/ttl-check
```
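A service typically sends this update in a loop, at an interval comfortably shorter than the TTL. The following is a minimal sketch, assuming the agent listens on localhost:8500 and that a 10-second beat is acceptable for the 30-second TTL; `ttl_url`, `heartbeat`, and the `CONSUL_ADDR` variable are names chosen for this example.

```bash
# Base address of the local agent (assumption: default HTTP port).
CONSUL_ADDR="${CONSUL_ADDR:-http://localhost:8500}"

# ttl_url ACTION CHECK_ID
# Builds the agent endpoint for a TTL update; ACTION is pass, warn or fail.
ttl_url() {
  printf '%s/v1/agent/check/%s/%s\n' "$CONSUL_ADDR" "$1" "$2"
}

# heartbeat CHECK_ID INTERVAL_SECONDS
# Marks the check as passing forever, logging (but tolerating) failed updates.
heartbeat() {
  while :; do
    curl -sf -X PUT "$(ttl_url pass "$1")" >/dev/null \
      || echo "heartbeat for $1 failed" >&2
    sleep "$2"
  done
}

# Usage (runs until killed):
#   heartbeat ttl-check 10 &
```

If the process dies or hangs, the updates stop and the check turns critical once the TTL expires, which is the point of a TTL check.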
Health Check Status
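Consul defines three check statuses:
- passing: Healthy, service running normally
- warning: Warning, service may have issues but is still available
- critical: Critical, service unavailable
Maintenance mode is not a separate status. Enabling it registers a special check in the critical state, so the node or service is excluded from DNS and health-filtered queries until maintenance mode is disabled.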
Check Parameter Configuration
Core Parameters
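- interval: How often the check runs, such as "10s" or "1m"
- timeout: How long a single check run may take before it is treated as failed
- failures_before_critical: Number of consecutive failures required before the check is marked as critical
- successes_before_passing: Number of consecutive successes required before the check is marked as passing
- deregister_critical_service_after: How long a service's check may stay critical before the agent deregisters the service; the minimum is 1 minute, and the process that reaps critical services runs every 30 seconds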
Advanced Parameters
```json
{
  "check": {
    "id": "advanced-check",
    "name": "Advanced Health Check",
    "http": "http://localhost:8080/health",
    "interval": "10s",
    "timeout": "5s",
    "failures_before_critical": 3,
    "successes_before_passing": 2,
    "deregister_critical_service_after": "5m",
    "status": "passing",
    "notes": "Custom health check"
  }
}
```
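The effect of the two thresholds can be pictured with a small simulation. This is a simplified model of consecutive-result debouncing, not Consul's implementation, and the agent's exact counting may differ in edge cases; the function name `debounce` and its "ok" / "fail" inputs are assumptions for this sketch.

```bash
# debounce FAILURES_BEFORE_CRITICAL SUCCESSES_BEFORE_PASSING RESULT...
# Each RESULT is "ok" or "fail". Prints the reported status after each
# result, starting from an initial status of "passing".
debounce() {
  need_fail="$1"
  need_ok="$2"
  shift 2
  status="passing"
  fails=0
  oks=0
  for result in "$@"; do
    if [ "$result" = "ok" ]; then
      oks=$((oks + 1))
      fails=0
      # Flip to passing only after enough consecutive successes.
      if [ "$oks" -ge "$need_ok" ]; then status="passing"; fi
    else
      fails=$((fails + 1))
      oks=0
      # Flip to critical only after enough consecutive failures.
      if [ "$fails" -ge "$need_fail" ]; then status="critical"; fi
    fi
    echo "$status"
  done
}

# With the thresholds from the configuration above (3 and 2):
#   debounce 3 2 fail fail fail ok ok
# prints: passing, passing, critical, critical, passing (one per line)
```

In this model a single failed run between successes never changes the reported status, which is what keeps a briefly slow endpoint from flapping.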
Service-Level Health Check
A check defined inside a service definition is bound to that service, so a failure marks only that service (not the whole node) as unhealthy:
```json
{
  "service": {
    "name": "web",
    "port": 8080,
    "check": {
      "id": "web-check",
      "http": "http://localhost:8080/health",
      "interval": "10s"
    }
  }
}
```
Multiple Health Checks
A service can have multiple health checks:
```json
{
  "service": {
    "name": "web",
    "port": 8080,
    "checks": [
      { "id": "web-http", "http": "http://localhost:8080/health", "interval": "10s" },
      { "id": "web-disk", "args": ["/usr/local/bin/check_disk.sh"], "interval": "30s" }
    ]
  }
}
```
Health Check API
Query Health Checks
```bash
# Query all checks
curl http://localhost:8500/v1/health/state/any
# Query checks in the passing state
curl http://localhost:8500/v1/health/state/passing
# Query checks for a specific service
curl http://localhost:8500/v1/health/checks/web
```
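The state endpoints return a JSON array of check objects, which is easier to scan once reduced to a few fields. A minimal sketch, assuming `jq` is installed; `failing_checks` is a hypothetical helper name, while `Node`, `ServiceName`, and `CheckID` are fields of the check objects returned by the health API.

```bash
# failing_checks
# Reads the JSON array returned by /v1/health/state/<state> on stdin and
# prints one "node service check-id" line per check. Node-level checks have
# an empty ServiceName, so their middle column is blank.
failing_checks() {
  jq -r '.[] | "\(.Node) \(.ServiceName) \(.CheckID)"'
}

# Usage:
#   curl -s http://localhost:8500/v1/health/state/critical | failing_checks
```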
Manually Update Check Status
```bash
# Mark as passing
curl -X PUT http://localhost:8500/v1/agent/check/pass/ttl-check
# Mark as warning
curl -X PUT http://localhost:8500/v1/agent/check/warn/ttl-check
# Mark as critical
curl -X PUT http://localhost:8500/v1/agent/check/fail/ttl-check
```
Best Practices
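- Set reasonable check intervals: Very short intervals add load on the agent and the service, while long intervals delay failure detection
- Set an appropriate timeout: Allow enough time that normal network latency does not cause false failures
- Use multiple checks: Verify service health from different angles, such as an HTTP endpoint plus a resource check
- Configure failure thresholds: Use failures_before_critical and successes_before_passing to avoid status flapping on transient failures
- Monitor the checks themselves: Confirm that the health check mechanism is running and reporting as expected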
Failure Handling
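When a service's health check turns critical:
- Exclusion from discovery: The instance stays registered but is omitted from DNS responses and from health queries that filter on passing
- Automatic deregistration: Happens only if deregister_critical_service_after is set and the check stays critical for that long
- Load balancer adjustment: Consumers that resolve instances through Consul stop routing traffic to the unhealthy instance
- Alert notification: Alerts can be triggered via Consul watches or external monitoring systems
- Automatic recovery: Once the check passes again the instance is returned in results; a service that was deregistered must register again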
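A consumer that builds its own upstream list can rely on this by asking only for passing instances, for example with `/v1/health/service/web?passing`. The sketch below assumes `jq` is installed and uses a hypothetical helper name, `healthy_endpoints`; it falls back to the node address when a service registered without an address of its own.

```bash
# healthy_endpoints
# Reads the JSON array returned by /v1/health/service/<name>?passing on
# stdin and prints one "address:port" line per healthy instance.
healthy_endpoints() {
  jq -r '.[]
    | (.Service.Address // "") as $addr
    | (if $addr != "" then $addr else .Node.Address end)
      + ":" + (.Service.Port | tostring)'
}

# Usage:
#   curl -s "http://localhost:8500/v1/health/service/web?passing" | healthy_endpoints
```

Because the filtering happens on the Consul side, an instance whose check is critical never appears in the output, and it reappears without any client change once the check passes again.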
Consul's health check mechanism is flexible and powerful, capable of meeting service monitoring needs in various scenarios.