Text Processing: grep, sed, and awk — The Engineer's Power Tools
grep: Searching Inside Text Files
When a factory server generates thousands of log lines per hour, you need tools to find specific information fast. grep is the most important text search tool in Linux.
grep "ERROR" /var/log/scada.log # Find lines containing ERROR
grep -i "warning" /var/log/scada.log # Case-insensitive
grep -n "ALARM" /var/log/scada.log # Show line numbers
grep -c "CRITICAL" /var/log/scada.log # Count matches
grep -r "modbus" /opt/scada/config/ # Search recursively
grep -v "DEBUG" /var/log/app.log # Lines NOT matching (invert)
grep -A 3 "FAULT" /var/log/plc.log # Show 3 lines after each match
grep -B 2 "FAULT" /var/log/plc.log # 2 lines before
The context flags (-A, -B, -C) are essential for troubleshooting. When you find an error, you almost always need surrounding lines.
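As a quick sketch of the context flags, the snippet below builds a small made-up log (the file name and entries are hypothetical) and uses `-C 1` to show one line of context on each side of the match:

```shell
# Create a tiny sample log to demonstrate context flags (hypothetical data)
cat > /tmp/demo_plc.log <<'EOF'
08:00:01 pump_start
08:00:02 valve_open
08:00:03 FAULT pressure_high
08:00:04 pump_stop
08:00:05 valve_close
EOF

# -C 1 prints one line before AND after each match, so you see
# what happened around the fault, not just the fault itself
grep -C 1 "FAULT" /tmp/demo_plc.log
```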
Regular Expressions: Advanced Search Patterns
grep "^2026-04-15" sensor.log # Lines starting with a date
grep "\.csv$" filelist.txt # Lines ending with .csv
grep "sensor[AB]" data.log # sensorA or sensorB
grep -E "(ALARM|FAULT|CRITICAL)" system.log # Multiple keywords
grep -E "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" access.log # IP addresses
| Pattern | Matches |
|---|---|
| `.` | Any single character |
| `*` | Zero or more of the previous character |
| `^` | Start of line |
| `$` | End of line |
| `[abc]` | Any character in the set |
| `[^abc]` | Any character NOT in the set |
| `+` | One or more of the previous character (requires `-E`) |
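Several of these patterns are often combined. The sketch below, on made-up data (file name and contents are hypothetical), anchors the match at the start of the line, uses a character class, and uses `+` with `-E`:

```shell
# Hypothetical inventory: device ID, reading
printf 'sensor_01,72.3\nsensor_2,80.0\npump_01,55.1\n' > /tmp/demo_ids.csv

# ^ anchors at line start, [0-9]+ requires one or more digits (-E needed for +)
grep -E "^sensor_[0-9]+," /tmp/demo_ids.csv
```

Both `sensor_` lines match; the `pump_01` line does not, because the anchor `^` pins the pattern to the beginning of the line.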
sed: Search and Replace in Files
sed 's/old/new/' file.txt # Replace first per line
sed 's/old/new/g' file.txt # Replace ALL per line
sed -i 's/old/new/g' file.txt # Edit in place
sed -i.bak 's/old/new/g' file.txt # Edit in place, keep backup
Practical uses:
sed -i 's/192\.168\.1\.50/192.168.1.100/g' /opt/scada/config/*.yaml # Update IPs (escape dots in the search pattern; an unescaped . matches any character)
sed '/^#/d' config.yaml # Delete comment lines
sed '/^$/d' report.txt # Delete blank lines
sed -n '10,20p' large_file.csv # Print only lines 10-20
Always use -i.bak on production servers to create automatic backups before changes.
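A minimal sketch of the backup workflow, on a throwaway config file (the file name and addresses are made up for illustration):

```shell
# Hypothetical config file
printf 'host: 10.0.0.5\nport: 502\n' > /tmp/demo.yaml

# -i.bak edits the file in place AND writes the original to demo.yaml.bak
sed -i.bak 's/10\.0\.0\.5/10.0.0.6/' /tmp/demo.yaml

cat /tmp/demo.yaml      # contains the new address
cat /tmp/demo.yaml.bak  # untouched original, ready for rollback
```

If the replacement goes wrong, restoring is a single `cp /tmp/demo.yaml.bak /tmp/demo.yaml`.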
awk: Processing Column-Based Data
awk treats each line as fields separated by whitespace or a custom delimiter. It excels at structured data like CSV and log files.
awk '{print $1}' access.log # Print first column
awk -F',' '{print $2}' data.csv # Comma delimiter, column 2
awk '$3 > 100' sensor_readings.csv # Lines where column 3 > 100
awk -F',' '{sum += $2} END {print sum}' data.csv # Sum column 2
awk -F',' '$4 == "ALARM"' events.csv # Filter by status field
Formatted output:
awk -F',' '{printf "Sensor: %-10s Temp: %6.1f\n", $1, $2}' readings.csv
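awk can also aggregate per key using associative arrays, which is handy for per-sensor statistics. A sketch on made-up readings (file name, sensor names, and values are hypothetical):

```shell
# Hypothetical readings: sensor ID, temperature
printf '%s\n' \
  'sensor_01,72.3' \
  'sensor_02,98.7' \
  'sensor_01,74.1' > /tmp/demo_readings.csv

# sum[] and n[] are arrays indexed by sensor ID; END reports the averages
awk -F',' '{sum[$1] += $2; n[$1]++}
           END {for (s in n) printf "%s avg %.1f\n", s, sum[s]/n[s]}' /tmp/demo_readings.csv
```

Because the arrays are keyed by the first field, this scales to any number of sensors without knowing their names in advance.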
cut, sort, and uniq: Complementary Tools
cut -d',' -f1,3 data.csv # Extract fields 1 and 3
sort -n numbers.txt # Numeric sort
sort -t',' -k3 -n data.csv # Sort by column 3
sort error_codes.txt | uniq -c # Count occurrences (must sort first)
sort error_codes.txt | uniq -d # Show only duplicated lines
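The `sort | uniq -c | sort -rn` pattern is a standard frequency ranking. A short sketch on made-up error codes (file name and codes are hypothetical):

```shell
# Hypothetical error codes, one per line
printf 'E101\nE205\nE101\nE101\nE205\nE330\n' > /tmp/demo_codes.txt

# sort groups duplicates, uniq -c counts them, sort -rn ranks by frequency
sort /tmp/demo_codes.txt | uniq -c | sort -rn
```

The most frequent code ends up on top, which is usually the first thing you want to know when triaging a noisy log.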
Practical Example: Extracting Temperature Alarms From a Sensor Log
Given /var/log/sensors/temp_2026-04-15.csv:
2026-04-15T08:00:01,sensor_01,72.3,NORMAL
2026-04-15T08:00:01,sensor_02,98.7,ALARM
2026-04-15T08:00:02,sensor_02,101.4,CRITICAL
# Find all alarm events
grep -E "(ALARM|CRITICAL)" /var/log/sensors/temp_2026-04-15.csv
# Count alarms per sensor
grep -E "(ALARM|CRITICAL)" /var/log/sensors/temp_2026-04-15.csv | \
awk -F',' '{print $2}' | sort | uniq -c | sort -rn
# Highest temperature reading
awk -F',' '{print $3}' /var/log/sensors/temp_2026-04-15.csv | sort -n | tail -1
# Save CRITICAL events to a report
grep "CRITICAL" /var/log/sensors/temp_2026-04-15.csv > /tmp/critical_report.csv
This combination of grep, awk, sort, and uniq processes millions of lines in seconds.
Summary
In this lesson you learned the core text processing tools:
- `grep` searches for patterns; use `-r` for directories, `-E` for extended regex.
- Regular expressions match complex patterns like IPs and date ranges.
- `sed` performs search-and-replace; always use `-i.bak` for safety.
- `awk` processes column-based data with filtering and calculations.
- `cut`, `sort`, and `uniq` complement the main tools for extraction and deduplication.
- Combining these tools lets you analyze sensor logs and extract alarms in seconds.
In the next lesson, you will learn the Linux permission model for securing SCADA configuration files and controlling access to sensitive data.