Performance tuning matters when FFmpeg processes audio and video at scale. Choosing the right hardware acceleration and encoding parameters can significantly reduce processing time.
Hardware Acceleration
NVIDIA GPU Acceleration (NVENC/NVDEC)
    # Decode with NVDEC, encode with NVENC
    ffmpeg -hwaccel cuda -i input.mp4 -c:v h264_nvenc output.mp4

    # Encode with NVENC (decoding stays on the CPU)
    ffmpeg -i input.mp4 -c:v h264_nvenc -preset fast -b:v 5M output.mp4

    # Complete GPU workflow: decoded frames stay in GPU memory
    ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 \
        -c:v h264_nvenc -preset fast -b:v 5M output.mp4

    # Specify the GPU device (-hwaccel_device only takes effect together with -hwaccel)
    ffmpeg -hwaccel cuda -hwaccel_device 0 -i input.mp4 -c:v h264_nvenc output.mp4
Intel QSV Acceleration
    # Use QSV for decoding and encoding
    ffmpeg -hwaccel qsv -i input.mp4 -c:v h264_qsv output.mp4

    # QSV encoding parameters
    ffmpeg -i input.mp4 -c:v h264_qsv -preset fast -b:v 5M output.mp4
AMD AMF Acceleration (VCE/VCN)
    # Encode through AMF, which drives the VCE/VCN hardware encoder
    ffmpeg -i input.mp4 -c:v h264_amf -quality speed -b:v 5M output.mp4
VideoToolbox Acceleration (macOS)
    # Use VideoToolbox for encoding
    ffmpeg -i input.mp4 -c:v h264_videotoolbox -b:v 5M output.mp4

    # Use ProRes encoding
    ffmpeg -i input.mp4 -c:v prores_videotoolbox -profile:v 3 output.mov
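Which of these encoders can actually be used depends on how the local FFmpeg build was compiled. A minimal sketch of choosing one automatically follows. `pick_h264_encoder` is a hypothetical helper, not part of FFmpeg, and the order in which it tries encoders is arbitrary. It only checks that an encoder appears in the `ffmpeg -encoders` listing; a listed encoder can still fail at run time if the matching GPU or driver is missing, so treat the result as a first guess.

```shell
# pick_h264_encoder (hypothetical helper): reads the text printed by
# `ffmpeg -encoders` on stdin and prints the first hardware H.264 encoder
# found in it, or libx264 if none is listed.
pick_h264_encoder() {
    phe_list=$(cat)
    for phe_name in h264_nvenc h264_qsv h264_amf h264_videotoolbox; do
        # -w matches the encoder name as a whole word only
        if printf '%s\n' "$phe_list" | grep -w -- "$phe_name" >/dev/null; then
            printf '%s\n' "$phe_name"
            return 0
        fi
    done
    printf 'libx264\n'
}

# Usage (requires ffmpeg on PATH):
#   encoder=$(ffmpeg -hide_banner -encoders 2>/dev/null | pick_h264_encoder)
#   ffmpeg -i input.mp4 -c:v "$encoder" -b:v 5M output.mp4
```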
Encoding Parameter Optimization
Preset Selection
    # Fastest encode, lowest compression efficiency (larger file or lower quality)
    ffmpeg -i input.mp4 -c:v libx264 -preset ultrafast output.mp4

    # Very fast encode, moderate compression efficiency
    ffmpeg -i input.mp4 -c:v libx264 -preset veryfast output.mp4

    # Balance speed and compression (medium is the default preset)
    ffmpeg -i input.mp4 -c:v libx264 -preset medium output.mp4

    # Better compression at the cost of speed (slower and veryslow go further)
    ffmpeg -i input.mp4 -c:v libx264 -preset slow output.mp4
CRF Quality Control
    # High quality (larger file)
    ffmpeg -i input.mp4 -c:v libx264 -crf 18 output.mp4

    # Default quality
    ffmpeg -i input.mp4 -c:v libx264 -crf 23 output.mp4

    # Low quality (smaller file)
    ffmpeg -i input.mp4 -c:v libx264 -crf 28 output.mp4
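Because the right CRF depends on the content, it can help to encode the same clip at several values and compare the results. A minimal sketch follows, assuming a hypothetical helper `sweep_crf` and output names of the form `crf_<value>.mp4`. The `FFMPEG` variable is also an assumption of this sketch: pointing it at a stub command allows a dry run.

```shell
# sweep_crf INPUT CRF... (hypothetical helper)
# Encodes INPUT once per CRF value and writes crf_<value>.mp4 for each.
# Existing outputs are overwritten (-y). The command that is run defaults to
# ffmpeg and can be replaced through the FFMPEG variable for a dry run.
sweep_crf() {
    sc_input=$1
    shift
    for sc_crf in "$@"; do
        "${FFMPEG:-ffmpeg}" -nostdin -y -i "$sc_input" \
            -c:v libx264 -preset medium -crf "$sc_crf" \
            -c:a copy "crf_${sc_crf}.mp4" || return 1
    done
}

# Usage:
#   sweep_crf input.mp4 18 23 28
#   ls -l crf_*.mp4    # compare file sizes, then judge quality by eye
```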
Thread Optimization
    # Set thread count
    ffmpeg -i input.mp4 -threads 4 -c:v libx264 output.mp4

    # Auto thread count
    ffmpeg -i input.mp4 -threads 0 -c:v libx264 output.mp4
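When several FFmpeg processes run at once, an explicit `-threads` value keeps them from oversubscribing the CPU. The sketch below divides the available cores among a number of jobs. `cpu_count` and `threads_per_job` are hypothetical helper names, and an even split is only a starting assumption to benchmark against.

```shell
# cpu_count: number of online logical CPUs (falls back to 1 if unknown)
cpu_count() {
    getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# threads_per_job CORES JOBS: integer share of CORES per job, at least 1.
# JOBS must be a positive integer.
threads_per_job() {
    tpj_share=$(( $1 / $2 ))
    if [ "$tpj_share" -lt 1 ]; then
        tpj_share=1
    fi
    echo "$tpj_share"
}

# Usage: four parallel jobs sharing the machine
#   t=$(threads_per_job "$(cpu_count)" 4)
#   ffmpeg -i input.mp4 -threads "$t" -c:v libx264 output.mp4
```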
Multi-threaded Processing
Parallel Processing Multiple Files
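    # Use GNU parallel ({/.} expands to the input's basename without its extension)
    find input_dir -name "*.mp4" | parallel -j 4 ffmpeg -i {} -c:v libx264 output_dir/{/.}.mp4

    # Use xargs ({/.} is GNU parallel syntax, so derive the output name in a small shell command)
    find input_dir -name "*.mp4" -print0 | xargs -0 -P 4 -I {} \
        sh -c 'ffmpeg -nostdin -i "$1" -c:v libx264 "output_dir/$(basename "${1%.*}").mp4"' _ {}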
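The `{/.}` placeholder is a GNU parallel feature; other tools need the output name derived explicitly. A minimal sketch of that mapping in plain shell follows. `output_path` is a hypothetical helper, and the `.mp4` output extension is assumed.

```shell
# output_path INPUT OUTPUT_DIR (hypothetical helper)
# Prints OUTPUT_DIR/<basename of INPUT without its last extension>.mp4
output_path() {
    op_name=${1##*/}        # drop the directory part
    op_name=${op_name%.*}   # drop the last extension, if any
    printf '%s/%s.mp4\n' "$2" "$op_name"
}

# Usage: sequential loop; -nostdin keeps ffmpeg from consuming the file list
#   find input_dir -name '*.mp4' | while IFS= read -r f; do
#       ffmpeg -nostdin -i "$f" -c:v libx264 "$(output_path "$f" output_dir)"
#   done
```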
Segmented Processing
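    # Split at keyframes without re-encoding (segment lengths are approximate)
    ffmpeg -i input.mp4 -c copy -f segment -segment_time 60 -reset_timestamps 1 segment_%03d.mp4

    # Transcode each segment (sequential here; run the jobs in parallel to gain speed)
    for f in segment_*.mp4; do ffmpeg -i "$f" -c:v libx264 transcoded_"$f"; done

    # Merge with the concat demuxer (process substitution requires bash)
    ffmpeg -f concat -safe 0 \
        -i <(for f in transcoded_segment_*.mp4; do echo "file '$PWD/$f'"; done) \
        -c copy output.mp4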
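The concat demuxer reads a list of `file '...'` lines. Writing that list to a regular file avoids the bash-only process substitution and leaves something to inspect if the merge fails. A minimal sketch, assuming a hypothetical helper `write_concat_list` and file names that contain no single quotes (those would need escaping):

```shell
# write_concat_list FILE... (hypothetical helper)
# Prints one concat-demuxer entry per argument, in the order given.
# Assumption: no file name contains a single quote.
write_concat_list() {
    for wcl_file in "$@"; do
        printf "file '%s'\n" "$wcl_file"
    done
}

# Usage (absolute paths, hence -safe 0):
#   write_concat_list "$PWD"/transcoded_segment_*.mp4 > list.txt
#   ffmpeg -f concat -safe 0 -i list.txt -c copy output.mp4
```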
Memory Optimization
Reduce Memory Usage
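    # Discard the output with the null muxer (useful for measurement; no file is written)
    ffmpeg -i input.mp4 -c:v libx264 -f null -

    # Cap the rate-control (VBV) buffer; -bufsize needs -maxrate and limits bitrate spikes, not FFmpeg's memory use
    ffmpeg -i input.mp4 -c:v libx264 -maxrate 5M -bufsize 1M output.mp4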
Performance Analysis
View Encoding Information
    # Show detailed encoding information
    ffmpeg -i input.mp4 -c:v libx264 -v verbose output.mp4

    # Show progress statistics (-stats is enabled by default)
    ffmpeg -i input.mp4 -c:v libx264 -stats output.mp4
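For scripted monitoring, FFmpeg's `-progress` option writes periodic `key=value` lines that are easier to parse than the human-readable status line. The sketch below extracts the most recent `speed` value. `last_speed` is a hypothetical helper, and the exact set of keys may vary between FFmpeg versions.

```shell
# last_speed (hypothetical helper): reads -progress output on stdin and
# prints the value of the last speed= line, with leading padding removed.
last_speed() {
    sed -n 's/^speed=[[:space:]]*//p' | tail -n 1
}

# Usage (requires ffmpeg on PATH):
#   ffmpeg -nostats -progress pipe:1 -i input.mp4 -c:v libx264 output.mp4 | last_speed
```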
Benchmark Testing
    # Test decoding performance
    ffmpeg -benchmark -i input.mp4 -f null -

    # Test encoding performance
    ffmpeg -benchmark -i input.mp4 -c:v libx264 -f null -
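With `-benchmark`, FFmpeg prints a summary line to stderr at the end of the run. The sketch below pulls the elapsed wall-clock time out of that line for use in scripts. It assumes the line has the form `bench: utime=...s stime=...s rtime=...s`; check the output of your own build, since the format is not guaranteed. `bench_rtime` is a hypothetical helper.

```shell
# bench_rtime (hypothetical helper): reads ffmpeg's stderr text on stdin and
# prints the rtime (wall-clock seconds) from the last "bench:" summary line.
bench_rtime() {
    sed -n 's/.*bench:.*rtime=\([0-9.]*\)s.*/\1/p' | tail -n 1
}

# Usage (requires ffmpeg on PATH):
#   ffmpeg -benchmark -i input.mp4 -c:v libx264 -f null - 2>&1 | bench_rtime
```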
Common Performance Issues and Solutions
High CPU Usage
    # Reduce encoding complexity (-tune fastdecode also makes the output cheaper to decode)
    ffmpeg -i input.mp4 -c:v libx264 -preset ultrafast -tune fastdecode output.mp4

    # Offload decoding and encoding to the GPU
    ffmpeg -hwaccel cuda -i input.mp4 -c:v h264_nvenc output.mp4
High Memory Usage
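    # Reduce thread count (each encoder thread keeps its own frame buffers)
    ffmpeg -i input.mp4 -threads 2 -c:v libx264 output.mp4

    # Diagnostic run with the null muxer: encodes but writes no file, so it does not produce a usable result
    ffmpeg -i input.mp4 -c:v libx264 -f null -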
Slow Processing Speed
    # Use faster preset
    ffmpeg -i input.mp4 -c:v libx264 -preset ultrafast output.mp4

    # Enable hardware acceleration
    ffmpeg -hwaccel cuda -i input.mp4 -c:v h264_nvenc output.mp4
Best Practices
- Choose acceleration that matches the hardware: prefer hardware encoding when a supported GPU is available and speed matters more than compression efficiency
- Balance speed and quality: Choose appropriate preset and CRF based on application scenario
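- Set a reasonable thread count: `-threads 0` (automatic) is usually a good default; when running several FFmpeg processes in parallel, keep the total at roughly 1-2 times the number of CPU cores
- Split large jobs: consider segmented processing for large files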
- Monitor resource usage: Use system monitoring tools to observe CPU, GPU, and memory usage
The best settings depend on the specific hardware and application scenario. Benchmark several parameter combinations on representative input before settling on one.