Common text processing tools in Shell scripts include grep, sed, awk, and cut.
grep - Text Search Tool
Basic Usage
```bash
# Search for text in a file
grep "pattern" file.txt

# Search multiple files
grep "pattern" file1.txt file2.txt

# Recursive search in a directory
grep -r "pattern" /path/to/directory

# Case-insensitive search
grep -i "pattern" file.txt

# Show line numbers
grep -n "pattern" file.txt

# Invert match (exclude matching lines)
grep -v "pattern" file.txt

# Show only the names of matching files
grep -l "pattern" *.txt

# Count matching lines
grep -c "pattern" file.txt
```
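The flags above can be tried on a small throwaway file; the path and contents here are just illustrative:

```bash
# Create a small sample file to search
printf 'alpha\nBeta\nbeta\ngamma\n' > /tmp/grep_demo.txt

# Case-insensitive count: matches both "Beta" and "beta"
matches=$(grep -ci "beta" /tmp/grep_demo.txt)
echo "case-insensitive matches: $matches"

# Inverted, case-insensitive count: lines that do NOT contain "beta"
others=$(grep -vci "beta" /tmp/grep_demo.txt)
echo "non-matching lines: $others"

rm /tmp/grep_demo.txt
```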
Regular Expressions
```bash
# Match start of line
grep "^start" file.txt

# Match end of line
grep "end$" file.txt

# Match digits
grep "[0-9]" file.txt

# Match a specific repetition count (three a's)
grep "a\{3\}" file.txt

# Use extended regular expressions (alternation)
grep -E "pattern1|pattern2" file.txt
```
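A quick sanity check of these patterns against known input (the sample lines are made up for the demo):

```bash
printf 'start here\nthe end\nabc123\naaa\n' > /tmp/re_demo.txt

anchored=$(grep -c "^start" /tmp/re_demo.txt)    # lines beginning with "start"
digits=$(grep -c "[0-9]" /tmp/re_demo.txt)       # lines containing a digit
triple=$(grep -c "a\{3\}" /tmp/re_demo.txt)      # lines with three consecutive a's
either=$(grep -Ec "start|end" /tmp/re_demo.txt)  # extended-regex alternation

echo "$anchored $digits $triple $either"
rm /tmp/re_demo.txt
```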
Practical Applications
```bash
# Find a process
ps aux | grep "nginx"

# Find errors in logs
grep "ERROR" /var/log/syslog

# Find files containing specific content
grep -r "TODO" ./src

# Count lines per file
grep -c "^" *.py
```
sed - Stream Editor
Basic Usage
```bash
# Replace the first occurrence on each line
sed 's/old/new/' file.txt

# Global replacement (every occurrence)
sed 's/old/new/g' file.txt

# Delete line 3
sed '3d' file.txt

# Delete matching lines
sed '/pattern/d' file.txt

# Print line 5 only
sed -n '5p' file.txt

# Print lines 1-5
sed -n '1,5p' file.txt

# Insert before line 2
sed '2i\new line' file.txt

# Append after line 2
sed '2a\new line' file.txt
```
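The substitution, deletion, and line-printing forms can be verified on a scratch file (names and contents are arbitrary):

```bash
printf 'one old line\ntwo\nthree\nfour\n' > /tmp/sed_demo.txt

# Replace "old" with "new"; capture the first output line
replaced=$(sed 's/old/new/' /tmp/sed_demo.txt | head -n 1)

# Delete line 3 and count what remains
remaining=$(sed '3d' /tmp/sed_demo.txt | wc -l | tr -d ' ')

# Print only line 2
line2=$(sed -n '2p' /tmp/sed_demo.txt)

echo "$replaced / $remaining / $line2"
rm /tmp/sed_demo.txt
```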
Advanced Usage
```bash
# Use regular expressions (strip all digits)
sed 's/[0-9]\+//g' file.txt

# Multiple replacements in one pass
sed -e 's/old1/new1/g' -e 's/old2/new2/g' file.txt

# In-place editing (modifies the original file)
sed -i 's/old/new/g' file.txt

# In-place editing with a backup
sed -i.bak 's/old/new/g' file.txt

# Use shell variables (double quotes so $var expands)
var="pattern"
sed "s/$var/replacement/g" file.txt
```
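In-place editing with a backup and variable expansion are easy to get wrong, so here is a minimal sketch against a throwaway config file (`-i.bak` works on both GNU and BSD sed):

```bash
printf 'port=8080\n' > /tmp/conf_demo.ini

# In-place edit, keeping the original as /tmp/conf_demo.ini.bak
sed -i.bak 's/8080/9090/' /tmp/conf_demo.ini

edited=$(cat /tmp/conf_demo.ini)
backup=$(cat /tmp/conf_demo.ini.bak)
echo "edited: $edited, backup: $backup"

# Double quotes let the shell expand $var inside the sed program
var="9090"
final=$(sed "s/$var/3000/" /tmp/conf_demo.ini)

rm /tmp/conf_demo.ini /tmp/conf_demo.ini.bak
```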
Practical Applications
```bash
# Replace a value in a config file
sed -i 's/port=8080/port=9090/' config.ini

# Delete comment lines
sed '/^#/d' file.txt

# Delete empty lines
sed '/^$/d' file.txt

# Collapse runs of whitespace into a single space
sed 's/\s\+/ /g' file.txt
```
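A combined clean-up pass over a fabricated file. Note that `\s` is a GNU sed extension; the POSIX-portable equivalent `[[:space:]]\{1,\}` is used here:

```bash
# Sample file with comments, blank lines, and extra spaces
printf '# comment\n\nkey    =    value\n\n# another\ndata\n' > /tmp/clean_demo.txt

# Strip comments and blank lines, count the survivors
lines=$(sed '/^#/d; /^$/d' /tmp/clean_demo.txt | wc -l | tr -d ' ')

# Squeeze runs of whitespace on line 3 ("key    =    value")
squeezed=$(sed 's/[[:space:]]\{1,\}/ /g' /tmp/clean_demo.txt | sed -n '3p')

echo "$lines surviving lines; squeezed: $squeezed"
rm /tmp/clean_demo.txt
```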
awk - Text Processing Tool
Basic Usage
```bash
# Print the first column
awk '{print $1}' file.txt

# Print multiple columns
awk '{print $1, $3}' file.txt

# Specify the field delimiter
awk -F: '{print $1}' /etc/passwd

# Print line numbers
awk '{print NR, $0}' file.txt

# Conditional printing
awk '$3 > 100 {print $0}' file.txt
```
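Column selection and conditional printing, demonstrated on a small fabricated table (name, age, score):

```bash
printf 'alice 30 200\nbob 25 50\ncarol 41 150\n' > /tmp/awk_demo.txt

# First column only, joined onto one line for display
names=$(awk '{print $1}' /tmp/awk_demo.txt | tr '\n' ' ')

# Names of rows whose third field exceeds 100
big=$(awk '$3 > 100 {print $1}' /tmp/awk_demo.txt | tr '\n' ' ')

echo "names: $names/ big: $big"
rm /tmp/awk_demo.txt
```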
Built-in Variables
```bash
NR      # Current record number (line number)
NF      # Number of fields in the current record
$0      # The complete record
$1, $2  # First and second fields
FS      # Input field separator (default: whitespace)
OFS     # Output field separator
RS      # Record separator (default: newline)
ORS     # Output record separator
```
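A quick demo of `NR`, `NF`, and `OFS` on two lines of made-up input. The `$1=$1` assignment is a common idiom that forces awk to rebuild the record with the new output separator:

```bash
printf 'a b c\nd e\n' > /tmp/nf_demo.txt

# NR is the line number, NF the field count on that line
summary=$(awk '{print NR ":" NF}' /tmp/nf_demo.txt | tr '\n' ' ')

# Rejoin the first line's fields with commas via OFS
joined=$(awk 'NR == 1 {OFS = ","; $1 = $1; print}' /tmp/nf_demo.txt)

echo "$summary/ $joined"
rm /tmp/nf_demo.txt
```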
Patterns and Actions
```bash
# Pattern matching
awk '/pattern/ {print $0}' file.txt

# BEGIN and END blocks
awk 'BEGIN {print "Start"} {print $0} END {print "End"}' file.txt

# Calculate a sum
awk '{sum += $1} END {print sum}' file.txt

# Calculate an average
awk '{sum += $1; count++} END {print sum/count}' file.txt
```
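The sum and average one-liners, checked against known numbers:

```bash
printf '10\n20\n30\n' > /tmp/sum_demo.txt

# Accumulate in the per-line action, report in END
total=$(awk '{sum += $1} END {print sum}' /tmp/sum_demo.txt)
average=$(awk '{sum += $1; count++} END {print sum/count}' /tmp/sum_demo.txt)

echo "sum=$total avg=$average"
rm /tmp/sum_demo.txt
```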
Practical Applications
```bash
# Total size of files in a listing
ls -l | awk '{sum += $5} END {print sum}'

# Find the maximum value
awk '{if ($1 > max) max = $1} END {print max}' file.txt

# Format output with printf
awk '{printf "%-10s %10s\n", $1, $2}' file.txt

# Process a CSV file
awk -F, '{print $1, $3}' data.csv
```
cut - Text Cutting Tool
Basic Usage
```bash
# Cut by characters
cut -c 1-5 file.txt      # characters 1-5
cut -c 1,5,10 file.txt   # characters 1, 5, and 10

# Cut by bytes
cut -b 1-10 file.txt

# Cut by fields
cut -d: -f1 /etc/passwd    # first field
cut -d: -f1,3 /etc/passwd  # first and third fields
```
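Character and field extraction on a single passwd-style line (copied into a scratch file so the demo does not depend on the real `/etc/passwd`):

```bash
printf 'root:x:0:0:root:/root:/bin/bash\n' > /tmp/cut_demo.txt

first5=$(cut -c 1-5 /tmp/cut_demo.txt)     # first five characters
user=$(cut -d: -f1 /tmp/cut_demo.txt)      # first field
userid=$(cut -d: -f1,3 /tmp/cut_demo.txt)  # first and third fields

echo "$first5 $user $userid"
rm /tmp/cut_demo.txt
```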
Practical Applications
```bash
# Extract usernames
cut -d: -f1 /etc/passwd

# Extract IP addresses (this pipeline targets the older "inet addr:" ifconfig output format)
ifconfig | grep "inet addr:" | cut -d: -f2 | cut -d' ' -f1

# Extract a file extension
echo "file.txt" | cut -d. -f2
```
Combined Usage Examples
Log Analysis
```bash
# Count errors
grep "ERROR" /var/log/app.log | wc -l

# Extract log entries within a time window
sed -n '/2024-01-01 10:00/,/2024-01-01 11:00/p' /var/log/app.log

# Extract IP addresses (field positions depend on the log format)
grep "ERROR" /var/log/app.log | awk '{print $5}' | cut -d: -f2

# Count error types
grep "ERROR" /var/log/app.log | awk '{print $6}' | sort | uniq -c
```
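An end-to-end version of the error-counting pipeline against a fabricated mini log. Real log formats vary, so the field positions below are assumptions tied to this made-up format:

```bash
# Fabricated log: date, time, level, error type
cat > /tmp/app_demo.log <<'EOF'
2024-01-01 10:05 ERROR timeout
2024-01-01 10:20 INFO ok
2024-01-01 10:45 ERROR timeout
2024-01-01 11:30 ERROR refused
EOF

# Total error lines
errors=$(grep -c "ERROR" /tmp/app_demo.log)

# Most frequent error type: group, count, sort by count, take the top name
top=$(grep "ERROR" /tmp/app_demo.log | awk '{print $4}' \
      | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')

echo "errors=$errors top=$top"
rm /tmp/app_demo.log
```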
Text Processing
```bash
# Delete empty lines and comments
sed '/^$/d; /^#/d' file.txt

# Replace multiple spaces with a single space
sed 's/\s\+/ /g' file.txt

# Extract a column and deduplicate
awk '{print $1}' file.txt | sort -u

# Calculate the average of the first column
awk '{sum += $1} END {print sum/NR}' file.txt
```
System Administration
```bash
# Processes with the highest CPU usage (-n for numeric sort on column 3)
ps aux | sort -rnk 3 | head -n 5

# Processes with the highest memory usage
ps aux | sort -rnk 4 | head -n 5

# Count processes per user
ps aux | awk '{print $1}' | sort | uniq -c

# Find the PID using a specific port (tail skips the header line)
lsof -i :8080 | awk '{print $2}' | tail -n +2
```
Best Practices
- Combine tools with pipes: `grep | awk | sort | uniq`
- Prefer grep for searching: it is the fastest option for simple searches
- Use sed for replacement: the standard tool for text substitution
- Use awk for column data: the best choice for structured, field-oriented text
- Use cut for fixed positions: simple, fast extraction of characters or fields
- Be aware of regex dialects: grep, sed, and awk support slightly different regex syntax by default
- Test commands first: try them on sample data before processing important files