TCP Keep-Alive Mechanism Explained
TCP Keep-Alive is a mechanism for detecting whether an idle connection is still alive, so that dead connections can be discovered and cleaned up promptly.
Keep-Alive Mechanism Principle
Workflow
- Idle wait: Once the connection has been idle for a set period (2 hours by default), TCP starts sending Keep-Alive probes
- Send probe: The probe is a TCP segment with no data whose sequence number is one less than the next sequence number the sender would use, so the peer treats it as already-acknowledged data and answers with an ACK
- Wait for response:
  - ACK received: The connection is healthy; reset the idle timer
  - RST received: The peer has reset the connection; close it
  - No response before the timeout: The connection may be dead; keep probing until the maximum number of probes is reached, then close the connection
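The workflow above can be sketched as a small decision function. This is only an illustrative model, not how any kernel implements it: the `Reply` enum, the `keepalive_outcome` name, and the returned strings are all hypothetical.

```python
from enum import Enum


class Reply(Enum):
    """Hypothetical outcomes of a single Keep-Alive probe."""
    ACK = "ack"          # peer answered the probe
    RST = "rst"          # peer reset the connection
    TIMEOUT = "timeout"  # no answer within the probe interval


def keepalive_outcome(replies, max_probes=9):
    """Return the fate of an idle connection given the reply to each probe.

    `replies` holds one Reply per probe sent, in order.
    """
    unanswered = 0
    for reply in replies:
        if reply is Reply.ACK:
            return "alive"    # connection is fine, the idle timer restarts
        if reply is Reply.RST:
            return "reset"    # peer no longer knows this connection
        unanswered += 1
        if unanswered >= max_probes:
            return "dead"     # probe budget exhausted, give up
    return "probing"          # budget not exhausted yet, keep probing
```

For example, `keepalive_outcome([Reply.TIMEOUT, Reply.ACK])` reports the connection as alive, while nine consecutive timeouts with the default budget report it as dead.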
Probe Parameters
- tcp_keepalive_time: Idle time before the first probe is sent (default 7200 seconds)
- tcp_keepalive_intvl: Interval between successive probes (default 75 seconds)
- tcp_keepalive_probes: Number of unanswered probes after which the connection is considered dead (default 9)
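Taken together, the three parameters determine roughly how long it takes to notice a silently dead peer: the idle time plus one interval per unanswered probe. The helper below is a minimal sketch of that arithmetic; the function name is hypothetical and the result is an approximation, not an exact kernel guarantee.

```python
def worst_case_detection_seconds(keepalive_time=7200,
                                 keepalive_intvl=75,
                                 keepalive_probes=9):
    """Approximate seconds from the last traffic until a dead peer is detected."""
    # Idle wait before the first probe, then one interval per unanswered probe.
    return keepalive_time + keepalive_intvl * keepalive_probes


def as_hms(seconds):
    """Format a duration in seconds as (hours, minutes, seconds)."""
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return hours, minutes, secs
```

With the defaults this gives 7200 + 75 × 9 = 7875 seconds, a little over 2 hours 11 minutes. With values of 600, 30, and 3 it drops to 690 seconds, or 11.5 minutes.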
Purpose of Keep-Alive
1. Detect Invalid Connections
- A network outage, device failure, or similar fault can break a connection without either endpoint being notified
- Keep-Alive discovers such dead connections so they can be cleaned up and their resources released
2. Detect Half-Open ("Zombie") Connections
- The peer application or host went away without closing the connection normally, so the local side still considers it established even though it is no longer usable
- Keep-Alive probes reveal this situation
3. Keep Connection Active
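- Some intermediate devices (such as NAT gateways) discard the state of connections that stay idle for too long
- Keep-Alive probes count as traffic and can keep the connection from being cleaned up, provided the idle time is set shorter than the device's timeout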
Disadvantages of Keep-Alive
1. Resource Consumption
- The system must maintain Keep-Alive state and a timer for every connection
- With a large number of connections, the timers and probe traffic add to the system load
2. Risk of Misjudgment
- Probes or their ACKs can be lost or delayed on a congested or high-latency path, making a healthy connection look dead
- With short intervals and few probes, a healthy but slow connection may be closed by mistake
3. Slow Detection
- The default idle time before the first probe is long (2 hours)
- With the default settings, a connection failure therefore cannot be detected quickly
Application Layer Heartbeat vs Keep-Alive
Keep-Alive
- Advantage: Supported by the operating system; no application-layer implementation needed
- Disadvantage: Limited flexibility; the defaults are system-wide with a long wait time, and probes cannot carry application data
Application Layer Heartbeat
- Advantage: Flexible and controllable, can carry business data, detects failures sooner
- Disadvantage: Must be implemented in the application, which increases development cost
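To make the comparison concrete, here is a minimal sketch of an application-layer heartbeat. The `PING`/`PONG` message format, the function names, and the timeouts are all assumptions chosen for illustration; a real protocol would define its own framing and would usually carry the heartbeat alongside business messages.

```python
import socket
import threading

PING = b"PING\n"  # hypothetical heartbeat request
PONG = b"PONG\n"  # hypothetical heartbeat reply


def send_heartbeat(sock, timeout=1.0):
    """Send one heartbeat and report whether the peer answered in time."""
    sock.settimeout(timeout)
    try:
        sock.sendall(PING)
        # A short message normally arrives in one piece; a real
        # implementation would loop until a full frame is read.
        return sock.recv(len(PONG)) == PONG
    except OSError:  # includes timeouts and connection errors
        return False


def answer_heartbeat(sock):
    """Peer side: reply to one PING with a PONG."""
    if sock.recv(len(PING)) == PING:
        sock.sendall(PONG)


def demo():
    """Run one heartbeat against a responsive peer and a silent peer."""
    a, b = socket.socketpair()
    responder = threading.Thread(target=answer_heartbeat, args=(b,))
    responder.start()
    responsive = send_heartbeat(a)
    responder.join()
    a.close()
    b.close()

    c, d = socket.socketpair()  # d never answers
    silent = send_heartbeat(c, timeout=0.2)
    c.close()
    d.close()
    return responsive, silent
```

Unlike Keep-Alive, the timeout here is chosen by the application per call, and a reply proves that the peer application itself is responding, not just its TCP stack.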
Configuration Example
Linux System Configuration
```bash
# View Keep-Alive parameters
sysctl net.ipv4.tcp_keepalive_time
sysctl net.ipv4.tcp_keepalive_intvl
sysctl net.ipv4.tcp_keepalive_probes

# Modify Keep-Alive parameters
sysctl -w net.ipv4.tcp_keepalive_time=600
sysctl -w net.ipv4.tcp_keepalive_intvl=30
sysctl -w net.ipv4.tcp_keepalive_probes=3
```
Programming Configuration (Python)
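```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Enable Keep-Alive on this socket (it is off by default)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# Per-socket overrides of the system-wide defaults (Linux option names)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 600)  # idle seconds before the first probe
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 30)  # seconds between probes
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)     # unanswered probes before giving up
```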
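The `TCP_KEEP*` constants are not exposed on every platform, so a slightly more defensive variant may be useful. The sketch below is one possible approach, with a hypothetical helper name: it applies only the options Python's `socket` module exposes on the current system and reads each value back with `getsockopt` so the caller can see what actually took effect.

```python
import socket


def enable_keepalive(sock, idle=600, interval=30, count=3):
    """Enable Keep-Alive and return the per-socket values that were applied."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    applied = {}
    wanted = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    )
    for name, value in wanted:
        option = getattr(socket, name, None)
        if option is None:
            continue  # option not exposed on this platform, keep system default
        sock.setsockopt(socket.IPPROTO_TCP, option, value)
        # Read the value back so the caller sees what is really in effect.
        applied[name] = sock.getsockopt(socket.IPPROTO_TCP, option)
    return applied
```

Any option missing from the returned dictionary was skipped, which means the system-wide default still applies to it.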
Related Questions
- What is the difference between Keep-Alive and heartbeat packets?
- How to optimize Keep-Alive parameters?
- Why is the default Keep-Alive time so long?