How does cURL handle large file downloads and resumable transfers? - Interview question

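Downloading large files and resuming interrupted transfers are important cURL capabilities: they help manage bandwidth, save time, and recover from network interruptions.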

Basic downloads

bash
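# Basic download (shows the progress meter, saves under the remote file name)
curl -O https://example.com/large-file.zip

# Choose the output file name
curl -o myfile.zip https://example.com/large-file.zip

# Follow redirects while downloading
curl -L -O https://example.com/download/file

# Silent download (no progress meter or error messages)
curl -s -O https://example.com/large-file.zip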

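Resuming interrupted downloads

Resuming is one of cURL's most useful features for large files: after an interruption, the transfer continues from the bytes already on disk instead of starting over.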

bash
# Resume (-C - lets curl work out the offset from the existing local file)
curl -C - -O https://example.com/large-file.zip

# Resume from an explicit offset (skip the first 1024 bytes)
curl -C 1024 -O https://example.com/large-file.zip

# Full example: resume with retries
for i in {1..5}; do
    curl -C - -o large-file.zip https://example.com/large-file.zip && break
    echo "Attempt $i failed, retrying..."
    sleep 5
done
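
To make the mechanism behind `-C -` concrete: for HTTP, resuming amounts to asking the server for the bytes after those already on disk. The sketch below is only an illustration under that assumption. `resume_range_header` is a hypothetical helper name, and real curl also checks that the server actually honours the range before appending.

```bash
#!/bin/bash
# Hypothetical helper (not part of curl): print the Range header that a
# resume of FILE would need, based on how many bytes are already on disk.
resume_range_header() {
    local file="$1" offset=0
    # The resume offset is the current size of the partial file;
    # a missing file means the download starts at byte 0.
    if [ -f "$file" ]; then
        offset=$(wc -c < "$file" | tr -d '[:space:]')
    fi
    printf 'Range: bytes=%s-\n' "$offset"
}
```

For example, with a 1024-byte partial `large-file.zip`, `resume_range_header large-file.zip` prints `Range: bytes=1024-`, which is roughly the request that `curl -C - -o large-file.zip ...` sends on your behalf.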

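Chunked downloads

Splitting a large file into byte ranges lets the parts be fetched separately, and in parallel if desired. This can improve throughput when a single connection is the bottleneck, provided the server supports range requests.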

bash
# Download specific byte ranges of a file
# Download the first 1 MiB
curl -r 0-1048575 -o part1.zip https://example.com/large-file.zip

# Download the second 1 MiB
curl -r 1048576-2097151 -o part2.zip https://example.com/large-file.zip

# Download the rest
curl -r 2097152- -o part3.zip https://example.com/large-file.zip

# Merge the parts in order
cat part1.zip part2.zip part3.zip > complete.zip

# Delete the part files
rm part1.zip part2.zip part3.zip
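
The range boundaries above follow from simple arithmetic on the file size. As a minimal sketch, the hypothetical helper `chunk_ranges` below prints one `-r` argument per chunk. It assumes the size is at least as large as the chunk count, and it leaves the last range open-ended so that it absorbs any remainder.

```bash
#!/bin/bash
# Hypothetical helper: print CHUNKS byte ranges covering a file of SIZE bytes,
# one per line, in the form accepted by curl -r.
chunk_ranges() {
    local size="$1" chunks="$2"
    local chunk_size=$((size / chunks))
    local i start end
    for ((i = 0; i < chunks; i++)); do
        start=$((i * chunk_size))
        if [ "$i" -eq $((chunks - 1)) ]; then
            # Last chunk: open-ended range, so no bytes are lost to rounding.
            printf '%s-\n' "$start"
        else
            end=$((start + chunk_size - 1))
            printf '%s-%s\n' "$start" "$end"
        fi
    done
}
```

Calling `chunk_ranges 3145728 3` prints `0-1048575`, `1048576-2097151`, and `2097152-`, matching the three manual commands above.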

Parallel download script

bash
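#!/bin/bash
# Parallel chunked download script

URL="https://example.com/large-file.zip"
FILE="large-file.zip"
CHUNKS=4
# Follow redirects and keep the last Content-Length seen (the final response)
FILE_SIZE=$(curl -sIL "$URL" | grep -i '^content-length:' | tail -n 1 | awk '{print $2}' | tr -d '\r')
if [ -z "$FILE_SIZE" ]; then
    echo "Could not determine file size" >&2
    exit 1
fi
CHUNK_SIZE=$((FILE_SIZE / CHUNKS))

echo "File size: $FILE_SIZE bytes"
echo "Chunk size: $CHUNK_SIZE bytes"

# Download the chunks in parallel
for i in $(seq 0 $((CHUNKS-1))); do
    START=$((i * CHUNK_SIZE))
    if [ "$i" -eq $((CHUNKS-1)) ]; then
        END=""   # open-ended range: the last chunk takes the remainder
    else
        END=$((START + CHUNK_SIZE - 1))
    fi

    echo "Downloading chunk $i: $START-$END"
    # -f: fail on HTTP errors, -sS: no progress meter but keep error messages
    curl -fsSL -r "$START-$END" -o "${FILE}.part$i" "$URL" &
done

# Wait for all downloads to finish
wait

# Merge the parts in order, starting from an empty output file
: > "$FILE"
for i in $(seq 0 $((CHUNKS-1))); do
    cat "${FILE}.part$i" >> "$FILE"
    rm "${FILE}.part$i"
done

echo "Download complete: $FILE"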
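
After merging, it is worth confirming that the result has the size the server reported, since a failed chunk would otherwise go unnoticed. The following is a minimal sketch of such a check. `size_matches` is a hypothetical helper, and the expected size is assumed to come from the `Content-Length` value read earlier.

```bash
#!/bin/bash
# Hypothetical helper: succeed only if FILE exists and is exactly EXPECTED bytes long.
size_matches() {
    local file="$1" expected="$2" actual
    [ -f "$file" ] || return 1
    # wc -c counts bytes; strip the padding some implementations add.
    actual=$(wc -c < "$file" | tr -d '[:space:]')
    [ "$actual" = "$expected" ]
}
```

In the script above this could be used as `size_matches "$FILE" "$FILE_SIZE" || echo "Size mismatch" >&2`. A matching size does not prove the content is intact, so a checksum is still the stronger test.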

Rate-limited downloads

bash
# Limit the download speed to 100 KB/s
curl --limit-rate 100K -O https://example.com/large-file.zip

# Limit to 1 MB/s
curl --limit-rate 1M -O https://example.com/large-file.zip

# Rate limit + resume
curl --limit-rate 500K -C - -O https://example.com/large-file.zip

Controlling progress output

bash
# Show a progress bar instead of the default meter
curl --progress-bar -O https://example.com/large-file.zip

# Same progress bar via the short option (-# is equivalent to --progress-bar)
curl -# -o large-file.zip https://example.com/large-file.zip

# Silent download (no progress output)
curl -s -o large-file.zip https://example.com/large-file.zip

# Print download statistics
curl -w "\nDownloaded: %{size_download} bytes\nSpeed: %{speed_download} bytes/s\nTime: %{time_total}s\n" \
     -o large-file.zip \
     -s https://example.com/large-file.zip

Checking server-side support

bash
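# Check whether the server supports resuming (-s hides the progress meter when piping)
curl -sI https://example.com/large-file.zip | grep -i "accept-ranges"

# If the response includes "Accept-Ranges: bytes", resuming is supported

# Check the file size
curl -sI https://example.com/large-file.zip | grep -i "content-length"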
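
In a script, the same two checks are easier to act on as functions that read response headers from standard input. This is a sketch under stated assumptions: `supports_byte_ranges` and `content_length` are hypothetical helper names, and the last `Content-Length` seen is assumed to belong to the final response when redirects are followed.

```bash
#!/bin/bash
# Hypothetical helper: succeed if the headers on stdin advertise byte-range support.
supports_byte_ranges() {
    # HTTP header lines end in CRLF and header names are case-insensitive.
    tr -d '\r' | grep -iqE '^accept-ranges:[[:space:]]*bytes[[:space:]]*$'
}

# Hypothetical helper: print the last Content-Length value found on stdin.
content_length() {
    tr -d '\r' | awk 'tolower($1) == "content-length:" { len = $2 } END { print len }'
}
```

A possible use is `curl -sIL "$URL" | supports_byte_ranges && echo "Range requests supported"`. Some servers honour range requests without sending `Accept-Ranges`, so a negative result here is a hint, not proof.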

Download recovery strategy

bash
#!/bin/bash
# Download recovery script: resume with a growing delay between retries

URL="https://example.com/large-file.zip"
OUTPUT="large-file.zip"
MAX_RETRIES=10
RETRY_COUNT=0

while [ "$RETRY_COUNT" -lt "$MAX_RETRIES" ]; do
    echo "Download attempt $((RETRY_COUNT + 1))..."

    if curl -C - -o "$OUTPUT" "$URL"; then
        echo "Download completed successfully!"
        exit 0
    else
        RETRY_COUNT=$((RETRY_COUNT + 1))
        echo "Download failed, waiting to retry..."
        sleep $((RETRY_COUNT * 2))  # linear backoff: 2s, 4s, 6s, ...
    fi
done

echo "Download failed after $MAX_RETRIES attempts"
exit 1
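
If a doubling (exponential) delay between retries is preferred, the wait can be computed by a small function. The sketch below is one way to do it. `backoff_delay` is a hypothetical helper, and the base of 2 and the 60-second cap are illustrative assumptions, not values from curl.

```bash
#!/bin/bash
# Hypothetical helper: print the delay in seconds before retry number ATTEMPT (1-based).
# BASE and CAP are assumed defaults chosen for illustration.
backoff_delay() {
    local attempt="$1" base="${2:-2}" cap="${3:-60}"
    # Exponential growth: base^attempt, i.e. 2, 4, 8, 16, ... for base 2.
    local delay=$((base ** attempt))
    # Cap the delay so later retries do not wait unreasonably long.
    if [ "$delay" -gt "$cap" ]; then
        delay=$cap
    fi
    printf '%s\n' "$delay"
}
```

In the retry loop this would replace the fixed formula, for example `sleep "$(backoff_delay "$RETRY_COUNT")"`.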

Comparison of multi-connection download tools

Feature	cURL	wget	aria2
Resume	✅	✅	✅
Multi-connection	Needs a script	❌	✅
Rate limiting	✅	✅	✅
Automatic retry	✅ (--retry)	✅	✅
BitTorrent	❌	❌	✅

Best practices

bash
# 1. Always use resume when downloading large files
curl -C - -O https://example.com/large-file.zip

# 2. Combine it with retries
curl --retry 5 --retry-delay 2 -C - -O https://example.com/large-file.zip

# 3. Limit the rate so the download does not use all available bandwidth
curl --limit-rate 2M -C - -O https://example.com/large-file.zip

# 4. Verify the integrity of the download
curl -o file.zip https://example.com/file.zip
md5sum file.zip
# Compare with the MD5 published by the server

# 5. Download in the background
nohup curl -C - -o large-file.zip https://example.com/large-file.zip > download.log 2>&1 &
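
The integrity check in step 4 can be automated so that a script fails on a mismatch. The sketch below swaps in SHA-256 for MD5, which is an assumption about what the server publishes. `verify_sha256` is a hypothetical helper, and it assumes the GNU coreutils `sha256sum` command is installed.

```bash
#!/bin/bash
# Hypothetical helper: succeed only if FILE's SHA-256 digest equals EXPECTED (hex).
# Assumes GNU coreutils sha256sum is available.
verify_sha256() {
    local file="$1" expected="$2" actual
    [ -f "$file" ] || return 1
    # sha256sum prints "<digest>  <file name>"; keep only the digest.
    actual=$(sha256sum "$file" | awk '{print $1}')
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

Usage might look like `verify_sha256 file.zip "$PUBLISHED_DIGEST" || echo "Checksum mismatch" >&2`, where `PUBLISHED_DIGEST` is a placeholder for the value taken from the download page.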

Troubleshooting common problems

bash
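# Problem 1: the server does not support resuming
# Fix: restart the download from scratch, or use a tool that supports multiple sources

# Problem 2: the download is too slow
# Fix: use the parallel chunked download script, or switch to another download source

# Problem 3: not enough disk space
# Fix: check the free disk space, or process the data as a stream
df -h

# Problem 4: the download was interrupted
# Fix: use -C - to resume automatically
curl -C - -O https://example.com/large-file.zip

# Problem 5: the saved file name is garbled
# Fix: set the file name explicitly with -o
curl -o "filename.txt" https://example.com/file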
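
For problem 3, the disk-space check can be done before the transfer starts by comparing the expected size with the free space in the target directory. This is a minimal sketch. `has_free_space` is a hypothetical helper, and the needed size is assumed to come from the server's `Content-Length` header.

```bash
#!/bin/bash
# Hypothetical helper (not part of curl): succeed if DIR has at least NEEDED bytes free.
has_free_space() {
    local dir="$1" needed="$2" avail_kb
    # df -P prints one POSIX-format line per filesystem; with -k, column 4
    # is the available space in 1024-byte blocks.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 {print $4}')
    [ -n "$avail_kb" ] && [ $((avail_kb * 1024)) -ge "$needed" ]
}
```

A script could run `has_free_space . "$FILE_SIZE" || { echo "Not enough disk space" >&2; exit 1; }` before calling curl, where `FILE_SIZE` is the size in bytes reported by the server.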