20 Essential Linux Command-Line Tools — sed, awk, jq, fzf Guide

Text Processing Classics

1. sed — Stream Editor

sed performs text transformations on input streams. The most common use case is substitution, either printed to standard output or applied in place with -i:

# Replace first occurrence per line
sed 's/foo/bar/' file.txt

# Replace ALL occurrences (global flag)
sed 's/foo/bar/g' file.txt

# In-place edit (creates backup with .bak)
sed -i.bak 's/localhost/production.example.com/g' config.yml

# Delete lines matching pattern
sed '/^#/d' config.ini   # Remove comment lines

# Print lines 5-10
sed -n '5,10p' access.log

# Multiple operations
sed -e 's/foo/bar/g' -e '/^$/d' file.txt
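
To make the effect of chained expressions concrete, here is a small worked sketch. The file name app.conf and its contents are invented for illustration; only the sed options come from the examples above.

```shell
# Hypothetical sample file; the contents are invented for illustration.
workdir=$(mktemp -d)
printf '%s\n' '# comment' 'host=localhost' '' 'url=http://localhost/localhost' \
  > "$workdir/app.conf"

# Drop comment lines and blank lines, then replace every "localhost".
sed_result=$(sed -e '/^#/d' -e '/^$/d' -e 's/localhost/example.com/g' "$workdir/app.conf")
printf '%s\n' "$sed_result"

# The file on disk is unchanged because -i was not used.
rm -r "$workdir"
```

Only the two non-comment, non-blank lines should remain, with both occurrences on the url line replaced because of the g flag.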

2. awk — Data Extraction and Reporting

awk processes text line by line, splitting each line into fields. Think of it as a mini programming language for text:

# Print specific columns (fields are split on runs of whitespace by default)
awk '{print $1, $3}' access.log

# Use custom field separator
awk -F: '{print $1, $3}' /etc/passwd   # username:uid

# Sum a column (-F, sets the comma separator that a CSV file needs)
awk -F, '{sum += $5} END {print "Total:", sum}' sales.csv

# Filter rows
awk '$3 > 100 {print $1, $3}' data.txt

# Count occurrences
awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' \
    access.log | sort -rn | head -10
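
The field splitting and the END block can be combined into a group-by total. The following is a minimal sketch on invented data (the region,units layout is an assumption, not taken from any real file):

```shell
# Hypothetical sales data in "region,units" form (invented for illustration).
sales='north,10
south,5
north,7
south,1'

# -F, splits on commas; totals[] is an associative array keyed by field 1.
# awk does not guarantee the order of "for (r in totals)", so sort the result.
awk_totals=$(printf '%s\n' "$sales" |
  awk -F, '{totals[$1] += $2} END {for (r in totals) print r, totals[r]}' |
  sort)
printf '%s\n' "$awk_totals"
```

This is the same pattern as the "count occurrences" example, with a sum in place of a counter.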

3. jq — JSON Processor

jq is the de facto standard for parsing and transforming JSON on the command line. Essential for working with APIs:

# Pretty print JSON
curl -s api.example.com/users | jq .

# Extract a field
jq '.name' user.json
jq '.[].name' users.json   # All names from array

# Filter array
jq '[.[] | select(.age > 18)]' users.json

# Transform structure
jq '[.[] | {name: .name, email: .email}]' users.json

# Get keys
jq 'keys' object.json

# Format for shell scripts
NAME=$(curl -s api.example.com/user/1 | jq -r '.name')

# Validate JSON (jq exits non-zero on a parse error)
echo '{"key": "value"}' | jq . > /dev/null && echo "Valid JSON"
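
As a rough illustration of how filtering and raw output fit together in a script, the sketch below runs jq on an inline JSON string. The user records are invented, and jq is assumed to be installed.

```shell
# Hypothetical API response (invented for illustration).
users='[{"name":"Ada","age":36},{"name":"Tim","age":12}]'

# select() keeps matching objects; -r prints strings without JSON quotes,
# which is usually what a shell variable should hold.
adult_names=$(printf '%s' "$users" | jq -r '.[] | select(.age > 18) | .name')

# Without -r the result is still a JSON string, quotes included.
quoted=$(printf '%s' "$users" | jq '.[0].name')

printf '%s\n' "$adult_names" "$quoted"
```

Forgetting -r is a common source of stray quote characters in variables that are later used in paths or URLs.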

Modern Replacements

4. ripgrep (rg) — Faster grep

ripgrep is typically much faster than grep for recursive searches, respects .gitignore, and supports Unicode by default:

# Basic search
rg "pattern" .

# Search specific file types
rg "TODO" --type py

# Case insensitive
rg -i "error" logs/

# Show context (3 lines before and after each match)
rg -C 3 "segfault" /var/log/

# Count matches per file
rg --count "import" src/

5. bat — Better cat

bat is a cat clone with syntax highlighting, line numbers, and Git diff indicators:

bat README.md
bat --language python script.py
bat --paging never config.yml   # Disable paging

# As a man page reader
export MANPAGER="sh -c 'col -bx | bat -l man -p'"

6. fzf — Fuzzy Finder

fzf is an interactive fuzzy finder for any list of items. It filters whatever is piped into it and prints the selection, so it composes with almost any command:

# Interactive file finder
vim "$(fzf)"

# Fuzzy history search (Ctrl+R replacement)
history | fzf

# Kill a process interactively
kill -9 $(ps aux | fzf | awk '{print $2}')

# Git branch switcher (--format drops the "* " marker on the current branch)
git checkout "$(git branch --format='%(refname:short)' | fzf)"

# Shell integration (fzf 0.48.0 or later)
# Ctrl+T: paste selected files
# Ctrl+R: search command history
# Alt+C: cd into selected directory
eval "$(fzf --bash)"        # in ~/.bashrc
source <(fzf --zsh)         # in ~/.zshrc

System Monitoring

7. htop — Interactive Process Viewer

htop                    # Interactive process viewer
htop -u username        # Filter by user
htop -p 1234,5678       # Monitor specific PIDs

8. ncdu — Disk Usage Analyzer

ncdu /                  # Analyze entire filesystem
ncdu ~                  # Analyze home directory interactively

9. ss — Socket Statistics (netstat replacement)

ss -tlnp                # TCP listening sockets with process names
ss -tunp                # Established TCP/UDP connections (add -a to include listening sockets)
ss -s                   # Summary statistics

File and Data Tools

10. fd — find replacement

fd -e py                # Find Python files (by extension)
fd -t f -e log          # Find regular files ending in .log
fd --hidden '\.env'     # Include hidden files (the pattern is a regex)

11. xargs — Build and Execute Commands

# Delete all .tmp files (breaks on names with spaces; see the null-delimited form below)
find . -name "*.tmp" | xargs rm

# Run in parallel (-P 4 = 4 parallel processes)
cat urls.txt | xargs -P 4 -I {} curl -sO {}

# Null-delimited (safe for filenames with spaces)
find . -name "*.log" -print0 | xargs -0 gzip
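
To see why the null-delimited form matters, the sketch below creates a throwaway directory containing a file name with a space and deletes only the .tmp files. The file names are invented for the demonstration.

```shell
# Throwaway directory with one awkward file name (invented for illustration).
workdir=$(mktemp -d)
touch "$workdir/plain.tmp" "$workdir/with space.tmp" "$workdir/keep.txt"

# -print0 and -0 pass each path as one NUL-terminated argument,
# so "with space.tmp" is not split into two names.
find "$workdir" -name '*.tmp' -print0 | xargs -0 rm --

remaining=$(ls "$workdir")
printf '%s\n' "$remaining"
rm -r "$workdir"
```

With a plain pipe and no -0, xargs would split the second name at the space and rm would report two missing files instead of deleting it.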

12. curl — HTTP Client

# GET with headers
curl -H "Authorization: Bearer $TOKEN" https://api.example.com/users

# POST JSON
curl -X POST -H "Content-Type: application/json" \
  -d '{"name":"Alice"}' https://api.example.com/users

# Save to file, follow redirects, show progress
curl -L -o output.html https://example.com

# Show only HTTP status code
curl -o /dev/null -s -w "%{http_code}" https://example.com

Terminal Multiplexing

13. tmux — Terminal Multiplexer

tmux new -s myproject       # New session named "myproject"
tmux attach -t myproject    # Attach to existing session

# Key bindings (Ctrl+B prefix):
# Ctrl+B c     - new window
# Ctrl+B "     - split into top and bottom panes
# Ctrl+B %     - split into left and right panes
# Ctrl+B d     - detach (session persists)
# Ctrl+B [     - copy mode (scroll back through output)

14. watch — Repeat Command Periodically

watch -n 2 "docker stats --no-stream"   # Update every 2s
watch -d "ls -la /tmp"                  # Highlight changes

Text Manipulation

15. sort and uniq

# Count unique IP addresses in log
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -20

# Sort by 3rd column numerically (-t, sets the CSV delimiter)
sort -t, -k3,3n data.csv

# Find duplicate lines
sort file.txt | uniq -d
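
The sort before uniq is not optional: uniq only collapses adjacent duplicate lines. A small sketch on invented addresses shows the full counting pipeline:

```shell
# Invented sample data: one IP address per line, deliberately unsorted.
ips='10.0.0.2
10.0.0.1
10.0.0.2
10.0.0.2
10.0.0.1
10.0.0.3'

# sort groups identical lines, uniq -c counts each group,
# sort -rn orders by count, and awk strips uniq's leading padding.
top_ips=$(printf '%s\n' "$ips" | sort | uniq -c | sort -rn | awk '{print $1, $2}')
printf '%s\n' "$top_ips"
```

Running uniq -c on the unsorted input would instead report 10.0.0.2 twice, once for each separate run.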

16. cut — Extract Columns

cut -d: -f1,3 /etc/passwd       # Fields 1 and 3, colon delimiter
cut -c1-10 file.txt             # First 10 characters per line
ls -la | cut -c40-              # Characters from column 40 onward (ls column positions vary)

17. tr — Translate Characters

echo "hello world" | tr '[:lower:]' '[:upper:]'   # HELLO WORLD
echo "hello   world" | tr -s ' '                   # Squeeze spaces
cat file.txt | tr -d '\r'                          # Remove Windows line endings

Debugging and Tracing

18. strace — System Call Tracer

strace -p 1234                  # Trace running process
strace -e trace=network ls      # Only network syscalls
strace -c python script.py      # Count syscalls

19. lsof — List Open Files

lsof -i :8080                   # What's using port 8080?
lsof -u username                # Files opened by user
lsof +D /mnt/disk               # Open files under a directory (recursive, can be slow)

20. tee — Split Output

# Write to file AND stdout simultaneously
make build 2>&1 | tee build.log

# Append mode
./test.sh | tee -a test-results.log
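
The following sketch shows both behaviours in one place: tee passing data on to the next command while writing a file, and -a appending rather than truncating. The log contents are invented for illustration.

```shell
workdir=$(mktemp -d)

# tee copies stdin to the file and to stdout, so the pipeline continues.
line_count=$(printf 'step 1\nstep 2\n' | tee "$workdir/build.log" | wc -l | tr -d ' ')

# -a appends to the existing file instead of overwriting it.
printf 'step 3\n' | tee -a "$workdir/build.log" > /dev/null
log_lines=$(wc -l < "$workdir/build.log" | tr -d ' ')

printf 'piped lines: %s, lines in log: %s\n' "$line_count" "$log_lines"
rm -r "$workdir"
```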

Frequently Asked Questions

What is the difference between sed and awk?

Use sed for simple substitutions and line deletions. Use awk when you need to work with specific columns, perform calculations, or apply conditional logic. awk is a complete programming language; sed is for stream editing.
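
As a rough side-by-side on invented data (the name and amount columns are assumptions for the example), a substitution is a natural fit for sed, while a per-column condition with a running total is a natural fit for awk:

```shell
# Invented sample: "name amount" per line.
log='alice 120
bob 80
carol 300'

# sed: a plain text substitution applied to each line.
sed_out=$(printf '%s\n' "$log" | sed 's/^bob /robert /')

# awk: a condition on field 2 plus a count and a sum.
awk_out=$(printf '%s\n' "$log" | awk '$2 > 100 {n++; sum += $2} END {print n, sum}')

printf '%s\n' "$sed_out" "$awk_out"
```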

How do I install fzf, ripgrep, and bat?

All are available via package managers: apt install fzf ripgrep bat on Debian/Ubuntu (where the bat binary is installed as batcat), brew install fzf ripgrep bat on macOS, or cargo install ripgrep bat via Rust's package manager.

Is jq available on macOS?

Yes. Install it with brew install jq. For visual JSON exploration without installing anything, use DevKits JSON Formatter. For complex regex used in text processing, try DevKits Regex Tester.
