8.1 Shell Scripting Fundamentals

Overview and Objectives

Learning Objectives

After completing this section, you will be able to:

  1. Create properly structured shell scripts with appropriate shebang lines, comments, and execution permissions for production use
  2. Implement conditional logic using if/then/else and case statements to make scripts respond to different situations
  3. Apply boolean operators and file/environment tests to create robust decision-making logic in scripts
  4. Design and implement loops using for, while, and until constructs to automate repetitive tasks
  5. Create and utilize functions to organize script code and create reusable components

Real-world Context

Shell scripting transforms you from someone who types commands manually into someone who builds reusable automation solutions. In my four decades of system administration, I’ve seen how proper scripting separates competent administrators from exceptional ones. Scripts handle routine maintenance, respond to system events, process log files, manage user accounts, and coordinate complex deployment procedures.

Every system administrator eventually reaches a point where manual command execution becomes insufficient. You might need to check disk space on dozens of servers, deploy configuration changes across multiple environments, or process hundreds of log files for security analysis. Shell scripts make these tasks manageable and reliable, ensuring consistent execution while reducing human error.

Modern infrastructure increasingly relies on automation, which makes scripting skills essential for career progression. Whether you implement configuration management, build CI/CD pipelines, or create monitoring solutions, shell scripting provides the foundation for more advanced automation tools. The principles you learn here apply directly to larger automation frameworks and cloud-native environments.

Script Structure and Foundation Elements

Building reliable shell scripts starts with understanding proper structure and conventions. A well-structured script communicates its purpose clearly, executes predictably, and can be maintained by other administrators. This foundation enables you to create scripts that work consistently across different systems and environments.

The shebang line serves as your script’s execution directive and tells the system which interpreter to use. While #!/bin/bash works on most systems, #!/usr/bin/env bash provides better portability because it searches the PATH for bash. This matters when scripts move between different Linux distributions or when you install bash in non-standard locations.

Comments serve multiple purposes beyond simple documentation. They explain complex logic, document assumptions about the environment, provide usage examples, and help future maintainers understand the script’s purpose. Well-commented scripts are particularly important in team environments where multiple people might modify the same automation tools.

Here’s how proper script structure looks in practice:

#!/usr/bin/env bash
# Script: system-cleanup.sh
# Purpose: Automated system maintenance and cleanup
# Author: Jochen (Monospace Mentor)
# Version: 1.2
# Last Modified: 2025-01-15

# Set strict error handling
set -euo pipefail

# Global variables
SCRIPT_NAME=$(basename "$0")
LOG_FILE="/var/log/system-cleanup.log"
TEMP_DIR="/tmp"
MAX_LOG_SIZE=100000000  # 100MB in bytes

# Function definitions
log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}

# Main script execution starts here
log_message "Starting system cleanup process"

This example demonstrates a well-structured script foundation with clear sections for variables, functions, and main execution. The set -euo pipefail line enables strict error handling, while the global variables provide consistent configuration throughout the script. The log_message function shows how to create reusable components early in your script structure.

Execution permissions are often overlooked but critical for script deployment. When you use chmod +x scriptname.sh, you make your script executable, but you need to understand permission implications for security. Scripts that run with elevated privileges need careful permission management to prevent unauthorized modifications.
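
As a quick illustration (the file names and install path are hypothetical), a deployment might set permissions like this:

# Owner can modify and run the script, group members can run it, others cannot
chmod 750 system-cleanup.sh

# For scripts invoked by cron or with sudo, root ownership prevents tampering
sudo chown root:root /usr/local/sbin/system-cleanup.sh
sudo chmod 755 /usr/local/sbin/system-cleanup.sh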

The next essential element is implementing decision-making capabilities through conditional statements.

Conditional Statements for Decision Making

Conditional statements enable your scripts to respond intelligently to different situations. Instead of executing the same commands regardless of circumstances, you can create scripts that adapt their behavior based on file existence, user input, system conditions, or environment variables.

Understanding the if Statement

The if statement forms the foundation of conditional logic in bash. It evaluates a test condition and executes a block of commands only when that condition is true. The basic syntax follows this pattern:

if [ condition ]; then
    commands
fi

The square brackets [ ] represent the test command, which evaluates the condition and returns an exit status. If the condition is true (exit status 0), bash executes the commands between then and fi. If the condition is false (non-zero exit status), bash skips the command block entirely.

The semicolon after the closing bracket is required when placing then on the same line as if. Alternatively, you can place then on the next line without the semicolon:

if [ condition ]
then
    commands
fi

Extending Logic with else

The else clause provides an alternative path when the if condition fails. This allows your script to handle both positive and negative test results with grace:

if [ condition ]; then
    commands_for_true
else
    commands_for_false
fi

When the condition is true, bash executes the commands after then and skips everything after else. When the condition is false, bash skips the commands after then and executes the commands after else. This ensures exactly one path executes, never both.

Multiple Conditions with elif

The elif (else if) clause allows you to test multiple conditions in sequence. This is more efficient and readable than nested if statements when you need to choose between several mutually exclusive options:

if [ condition1 ]; then
    commands_for_condition1
elif [ condition2 ]; then
    commands_for_condition2
elif [ condition3 ]; then
    commands_for_condition3
else
    commands_for_none_true
fi

Bash evaluates conditions in order and executes the commands for the first true condition it encounters. Once a condition matches, bash skips all remaining elif and else clauses. The final else clause is optional and provides a default action when none of the conditions are true.

Test Conditions and Operators

The condition inside the brackets uses test operators to compare values, check file properties, or evaluate strings. Common test operators include:

  • String comparisons: = (equal), != (not equal), -z (empty), -n (non-empty)
  • Numeric comparisons: -eq (equal), -ne (not equal), -lt (less than), -gt (greater than)
  • File tests: -f (regular file), -d (directory), -e (exists), -r (readable), -w (writable), -x (executable)

Always quote variables in test conditions to prevent errors when variables contain spaces or are empty: [ "$variable" = "value" ] instead of [ $variable = value ].
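
A quick demonstration of why this matters (the filename is hypothetical):

file="monthly report.txt"

# Unquoted: expands to [ -f monthly report.txt ] and fails with "too many arguments"
if [ -f $file ]; then echo "found"; fi

# Quoted: the test receives a single argument and behaves correctly
if [ -f "$file" ]; then echo "found"; fi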

Here’s a practical example of conditional logic in a backup script:

#!/usr/bin/env bash
# Backup script with intelligent decision making

BACKUP_SOURCE="/home/users"
BACKUP_DEST="/backup/daily"
FREE_SPACE=$(df "$BACKUP_DEST" | tail -1 | awk '{print $4}')
REQUIRED_SPACE=5000000  # 5GB in KB

# Check if source directory exists
if [ ! -d "$BACKUP_SOURCE" ]; then
    echo "Error: Source directory $BACKUP_SOURCE does not exist"
    exit 1
elif [ ! -w "$BACKUP_DEST" ]; then
    echo "Error: Cannot write to backup destination $BACKUP_DEST"
    exit 2
elif [ "$FREE_SPACE" -lt "$REQUIRED_SPACE" ]; then
    echo "Warning: Insufficient disk space for backup"
    echo "Required: ${REQUIRED_SPACE}KB, Available: ${FREE_SPACE}KB"
    exit 3
else
    echo "All conditions met, proceeding with backup"
    tar -czf "$BACKUP_DEST/backup-$(date +%Y%m%d).tar.gz" "$BACKUP_SOURCE"
fi

This backup script demonstrates how conditional logic creates intelligent automation. The script checks multiple prerequisites before proceeding: source directory existence, write permissions, and available disk space. Each condition uses appropriate test operators (-d for directories, -w for write permissions, -lt for numeric comparison), and the script provides specific error messages and exit codes for different failure scenarios.

Understanding Case Statements

Case statements provide elegant solutions when you compare a single variable against multiple possibilities. They’re particularly useful for processing command-line arguments, handling user menu selections, or responding to different system conditions. Case statements are often more readable than long chains of elif statements when you match against specific values.

The basic syntax of a case statement follows this pattern:

case $variable in
    pattern1)
        commands
        ;;
    pattern2)
        commands
        ;;
    *)
        default_commands
        ;;
esac

Case Statement Components

Variable Expression: The $variable after case can be any string expression - a variable, command substitution, or literal string. Bash evaluates this expression once and compares it against each pattern.

Patterns: Each pattern can be a literal string, a glob pattern with wildcards (*, ?, [...]), or multiple patterns separated by pipe symbols (|). Patterns use glob matching, like filename expansion, not regular expressions.

Commands: After each pattern and closing parenthesis, you place the commands to execute when that pattern matches. Commands continue until the double semicolon (;;).

Double Semicolon: The ;; terminates each case and tells bash to skip the remaining patterns. This prevents fall-through behavior found in some other programming languages.

Default Case: The * pattern matches anything not caught by previous patterns, serving as a default case. While optional, it’s good practice for handling unexpected input.

Case Termination: The esac keyword (case spelled backward) closes the case statement, similar to how fi closes if statements.

Pattern Matching Features

Case patterns support powerful matching capabilities:

  • Literal strings: "DEBUG" matches exactly “DEBUG”
  • Wildcards: "*.log" matches any string ending in “.log”
  • Character classes: "[0-9]*" matches strings starting with a digit
  • Multiple patterns: "start"|"restart" matches either “start” or “restart”
  • Negation: !(pattern) matches anything except the pattern (requires shopt -s extglob)
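
The negation pattern deserves a quick illustration, since it needs extended globbing switched on first. A minimal sketch (the filename variable is hypothetical):

filename="report.txt"
shopt -s extglob

case "$filename" in
    !(*.tmp))
        echo "Processing $filename"
        ;;
    *)
        echo "Skipping temporary file $filename"
        ;;
esac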

Here’s how case statements work in practice:

# Handle different log levels in a monitoring script
case "$LOG_LEVEL" in
    "DEBUG")
        echo "Debug mode enabled - verbose output"
        set -x
        ;;
    "INFO")
        echo "Normal operation mode"
        ;;
    "WARNING")
        echo "Warning level - reduced output"
        ;;
    "ERROR")
        echo "Error level - critical messages only"
        exec 2>/dev/null
        ;;
    *)
        echo "Unknown log level: $LOG_LEVEL"
        echo "Valid options: DEBUG, INFO, WARNING, ERROR"
        exit 1
        ;;
esac

This case statement shows how to handle different logging levels in a monitoring script. Each case performs appropriate actions for the specified log level, from enabling debug output with set -x to redirecting errors to /dev/null for error-only mode. The wildcard pattern (*) provides a default case that handles invalid input gracefully with helpful usage information.

Understanding how to combine conditions with boolean operators multiplies the power of your conditional statements.

Boolean Operators and Complex Logic

Boolean operators allow you to combine multiple conditions into sophisticated decision-making logic. Instead of writing separate if statements for each condition, you can create complex tests that evaluate multiple criteria simultaneously. This makes your scripts more efficient and your logic more precise.

Understanding the AND Operator (&&)

The AND operator (&&) requires all conditions to be true for the overall test to succeed. This implements logical conjunction, where every component must be satisfied for the entire expression to be true.

Syntax: condition1 && condition2

Behavior: If condition1 is false, bash doesn’t evaluate condition2 at all (short-circuit evaluation). Only when condition1 is true does bash proceed to test condition2. The overall expression is true only when both conditions are true.

Use Cases: Perfect for situations where multiple requirements must be met before proceeding. For example, you might check that both a file exists AND the user has write permission before attempting to modify it.
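
For example, a minimal sketch (the configuration path is hypothetical):

CONFIG="/etc/myapp/settings.conf"

# Modify the file only when it exists AND is writable by the current user
if [ -f "$CONFIG" ] && [ -w "$CONFIG" ]; then
    echo "log_level=INFO" >> "$CONFIG"
fi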

Understanding the OR Operator (||)

The OR operator (||) succeeds if any of the conditions are true. This implements logical disjunction, where only one component needs to be satisfied for the entire expression to be true.

Syntax: condition1 || condition2

Behavior: If condition1 is true, bash doesn’t evaluate condition2 (short-circuit evaluation). Only when condition1 is false does bash test condition2. The overall expression is true when either condition is true.

Use Cases: Useful for fallback scenarios or when multiple valid conditions exist. You might check if a process is running OR if a service file exists before attempting to start a service.
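
For example, a minimal sketch (the paths are hypothetical):

# Accept either the system-wide or the per-user configuration file
if [ -r /etc/myapp/main.conf ] || [ -r "$HOME/.myapp.conf" ]; then
    echo "Configuration found - continuing"
else
    echo "No configuration available" >&2
    exit 1
fi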

Understanding the NOT Operator (!)

The NOT operator (!) inverts the result of a condition, allowing you to test for the absence of something or the opposite of a condition. This implements logical negation.

Syntax: ! condition

Behavior: If the condition would normally be true, ! makes it false. If the condition would normally be false, ! makes it true.

Use Cases: Particularly useful when checking if files don’t exist, processes aren’t running, or users don’t have specific permissions. Often more readable than using negative test operators.
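
For example (the directory is hypothetical):

# Create the data directory only if it does not already exist
if [ ! -d /var/lib/myapp ]; then
    mkdir -p /var/lib/myapp
fi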

Combining Operators with Parentheses

Parentheses allow you to group conditions and control evaluation order, just like in mathematical expressions. This becomes essential when you combine AND and OR operators in complex conditions, and ensures your logic evaluates exactly as you intend.

Syntax: ( condition1 && condition2 ) || condition3

Behavior: Operations inside parentheses are evaluated first, then the result is used in the larger expression. This allows you to create complex logical structures like “if (A and B) or C”.
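
A minimal sketch of grouped conditions (the variables and paths are hypothetical). Note that ( ) runs the grouped tests in a subshell, which is harmless for pure condition checks:

DATA_FILE="/srv/data/export.csv"

# Back up when the file exists and is readable, or when FORCE overrides the checks
if ( [ -f "$DATA_FILE" ] && [ -r "$DATA_FILE" ] ) || [ "${FORCE:-no}" = "yes" ]; then
    cp "$DATA_FILE" /backup/
fi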

Short-Circuit Evaluation

Understanding short-circuit evaluation is crucial for writing efficient and safe conditions:

  • With AND (&&): If the first condition is false, bash doesn’t evaluate subsequent conditions because the overall result must be false.
  • With OR (||): If the first condition is true, bash doesn’t evaluate subsequent conditions because the overall result must be true.

This behavior affects performance and can prevent errors from subsequent tests that might fail if earlier conditions aren’t met.
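
A single line illustrates the safety benefit (the path is hypothetical):

cfg="/etc/myapp.conf"

# grep never runs on a missing file because && stops after the failed test
[ -f "$cfg" ] && grep -q "^enabled" "$cfg" && echo "Feature is enabled"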

Here’s how boolean operators work in practice:

#!/usr/bin/env bash
# System health check with complex conditions

LOAD_THRESHOLD=2.0
DISK_THRESHOLD=90
MEMORY_THRESHOLD=85

# Get current system metrics
CURRENT_LOAD=$(uptime | awk '{print $(NF-2)}' | sed 's/,//')
DISK_USAGE=$(df / | tail -1 | awk '{print $5}' | sed 's/%//')
MEMORY_USAGE=$(free | grep Mem | awk '{printf "%.0f", $3/$2 * 100}')

# Complex condition checking multiple thresholds
if [ "$(echo "$CURRENT_LOAD > $LOAD_THRESHOLD" | bc)" -eq 1 ] && \
   [ "$DISK_USAGE" -gt "$DISK_THRESHOLD" ] || \
   [ "$MEMORY_USAGE" -gt "$MEMORY_THRESHOLD" ]; then
    
    echo "ALERT: System resources critical"
    echo "Load: $CURRENT_LOAD, Disk: ${DISK_USAGE}%, Memory: ${MEMORY_USAGE}%"
    
    # Take corrective action
    if [ "$DISK_USAGE" -gt "$DISK_THRESHOLD" ] && [ -d "/tmp" ]; then
        echo "Cleaning temporary files..."
        find /tmp -type f -atime +7 -delete
    fi
    
elif [ "$CURRENT_LOAD" > "1.0" ] || [ "$DISK_USAGE" -gt "75" ]; then
    echo "WARNING: System resources elevated"
    echo "Load: $CURRENT_LOAD, Disk: ${DISK_USAGE}%, Memory: ${MEMORY_USAGE}%"
else
    echo "System resources normal"
fi

This system health monitoring script demonstrates complex boolean logic in action. The script gathers multiple system metrics and uses compound conditions to determine system status. The first condition combines high load AND high disk usage, OR high memory usage; because && and || have equal precedence and associate left to right, the expression groups as (load AND disk) OR memory. The bc command handles floating-point comparison for load averages in both the initial test and the elif, while the elif's lower thresholds provide early warning.

File and environment variable checks form a crucial part of most shell scripts, providing the foundation for robust system automation.

File and Environment Variable Tests

System administration scripts constantly need to check file properties and environment conditions before taking action. These tests prevent errors, ensure scripts run in appropriate environments, and provide the information needed for intelligent decision making. When you master these tests, you create reliable automation.

File Existence and Type Tests

File test operators provide detailed information about files and directories. Beyond simple existence checks, you can test for read/write permissions, file types, size comparisons, and modification times.

Basic Existence Tests:

  • -e file: Returns true if the file exists (any type - regular file, directory, device, etc.)
  • -f file: Returns true if the file exists AND is a regular file (not a directory or special file)
  • -d file: Returns true if the file exists AND is a directory
  • -L file: Returns true if the file exists AND is a symbolic link
  • -S file: Returns true if the file exists AND is a socket
  • -p file: Returns true if the file exists AND is a named pipe (FIFO)

File Permission Tests:

  • -r file: Returns true if the file exists and is readable by the current user
  • -w file: Returns true if the file exists and is writable by the current user
  • -x file: Returns true if the file exists and is executable by the current user
  • -O file: Returns true if the file exists and is owned by the current user
  • -G file: Returns true if the file exists and is owned by the current user’s group

File Property Tests:

  • -s file: Returns true if the file exists and has a size greater than zero
  • ! -s file: Returns true if the file is missing or empty (note: -z is a string test, not a file test; use negated -s to detect zero-size files)
  • -u file: Returns true if the file exists and has the setuid bit set
  • -g file: Returns true if the file exists and has the setgid bit set
  • -k file: Returns true if the file exists and has the sticky bit set

File Comparison Tests

You can compare files directly using special operators:

  • file1 -nt file2: Returns true if file1 is newer than file2 (based on modification time)
  • file1 -ot file2: Returns true if file1 is older than file2
  • file1 -ef file2: Returns true if file1 and file2 refer to the same file (same device and inode)
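
The -nt operator enables simple make-style freshness checks; a minimal sketch with hypothetical paths:

# Re-deploy only when the local copy is newer than the deployed one
if [ ./app.conf -nt /etc/myapp/app.conf ]; then
    cp ./app.conf /etc/myapp/app.conf
fi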

Environment Variable Tests

Environment variable tests allow scripts to adapt to different runtime environments and user configurations. You can check if variables are set, compare their values, or use them to control script behavior.

Variable State Tests:

  • -z "$VAR": Returns true if the variable is empty or unset (zero length)
  • -n "$VAR": Returns true if the variable is non-empty (non-zero length)
  • [ "$VAR" ]: Shorthand for -n "$VAR", returns true if variable is non-empty

String Comparison Tests:

  • "$VAR" = "value": Returns true if the variable exactly equals the specified value
  • "$VAR" != "value": Returns true if the variable does not equal the specified value
  • "$VAR" < "value": Returns true if the variable is lexicographically less than the value
  • "$VAR" > "value": Returns true if the variable is lexicographically greater than the value

Numeric Comparison Tests (for variables containing numbers):

  • "$VAR" -eq number: Returns true if the variable equals the number
  • "$VAR" -ne number: Returns true if the variable does not equal the number
  • "$VAR" -lt number: Returns true if the variable is less than the number
  • "$VAR" -le number: Returns true if the variable is less than or equal to the number
  • "$VAR" -gt number: Returns true if the variable is greater than the number
  • "$VAR" -ge number: Returns true if the variable is greater than or equal to the number

Parameter Expansion for Variable Testing

Bash provides special parameter expansion syntax for handling undefined or empty variables:

  • ${VAR:-default}: Returns the value of VAR if set and non-empty, otherwise returns “default”
  • ${VAR:=default}: Like above, but also assigns “default” to VAR if it was unset or empty
  • ${VAR:+alternate}: Returns “alternate” if VAR is set and non-empty, otherwise returns nothing
  • ${VAR:?message}: Returns the value of VAR if set and non-empty, otherwise prints “message” and exits
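
Two of these expansions in combination, as a short sketch (the variable names are hypothetical):

# Fall back to a default location when BACKUP_DIR is unset or empty
backup_dir="${BACKUP_DIR:-/var/backups}"

# Abort with a clear message when a required variable is missing
: "${DATABASE_URL:?DATABASE_URL must be set before running this script}"

echo "Backing up to $backup_dir"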

Important Quoting Considerations

Always quote variables in test conditions to prevent errors when variables contain spaces or are empty. Use [ "$variable" = "value" ] instead of [ $variable = value ]. Unquoted variables can cause syntax errors or unexpected behavior when they expand to multiple words or contain special characters.

Here’s a practical example of file testing in a log rotation script:

#!/usr/bin/env bash
# Log rotation script with comprehensive file checks

LOG_FILE="/var/log/application.log"
ARCHIVE_DIR="/var/log/archives"
MAX_SIZE=10485760  # 10MB in bytes

# Comprehensive file and directory checks
if [ ! -e "$LOG_FILE" ]; then
    echo "Log file $LOG_FILE does not exist - nothing to rotate"
    exit 0
elif [ ! -f "$LOG_FILE" ]; then
    echo "Error: $LOG_FILE exists but is not a regular file"
    exit 1
elif [ ! -r "$LOG_FILE" ]; then
    echo "Error: Cannot read $LOG_FILE - check permissions"
    exit 2
fi

# Check archive directory
if [ ! -d "$ARCHIVE_DIR" ]; then
    echo "Creating archive directory: $ARCHIVE_DIR"
    mkdir -p "$ARCHIVE_DIR" || {
        echo "Error: Cannot create archive directory"
        exit 3
    }
elif [ ! -w "$ARCHIVE_DIR" ]; then
    echo "Error: Cannot write to archive directory $ARCHIVE_DIR"
    exit 4
fi

# Check file size
if [ "$(stat -f%z "$LOG_FILE" 2>/dev/null || stat -c%s "$LOG_FILE")" -gt "$MAX_SIZE" ]; then
    echo "Rotating log file (size exceeds ${MAX_SIZE} bytes)"
    mv "$LOG_FILE" "$ARCHIVE_DIR/application-$(date +%Y%m%d-%H%M%S).log"
    touch "$LOG_FILE"
    chmod 644 "$LOG_FILE"
else
    echo "Log file size within limits - no rotation needed"
fi

This log rotation script showcases comprehensive file testing techniques. The script performs multiple validation steps: checking file existence with -e, verifying it’s a regular file with -f, and testing read permissions with -r. The directory checks ensure the archive location is available and writable. The stat command with portable options determines file size, and the script includes fallback for different operating systems. Finally, the script creates new files with appropriate permissions after rotation.

Environment variable tests are particularly important for scripts that run in different environments or under different user accounts, where the same code must adapt to different configurations.

Common environment checks include verifying the current user ($USER), checking the home directory ($HOME), examining the PATH for required tools, and reading custom configuration variables. These checks make scripts more portable and reliable across different systems and deployments.

Here’s an example of environment variable testing in a deployment script:

#!/usr/bin/env bash
# Deployment script with environment validation

# Required environment variables
REQUIRED_VARS=("DEPLOY_ENV" "APP_VERSION" "DATABASE_URL")

# Check for required variables
for var in "${REQUIRED_VARS[@]}"; do
    if [ -z "${!var}" ]; then
        echo "Error: Required variable $var is not set"
        exit 1
    fi
done

# Validate deployment environment
case "$DEPLOY_ENV" in
    "development"|"staging"|"production")
        echo "Deploying to $DEPLOY_ENV environment"
        ;;
    *)
        echo "Error: Invalid deployment environment: $DEPLOY_ENV"
        echo "Valid environments: development, staging, production"
        exit 2
        ;;
esac

# Check if we're running as the correct user
if [ "$USER" != "deploy" ] && [ "$DEPLOY_ENV" = "production" ]; then
    echo "Error: Production deployments must run as 'deploy' user"
    exit 3
fi

# Proceed with deployment
echo "Environment validated - proceeding with deployment"

This deployment script demonstrates environment validation best practices. The script uses an array to define required variables and loops through them to ensure all are set. The case statement validates that the deployment environment matches approved values, preventing accidental deployments to wrong environments. User validation adds an additional safety check for production deployments, ensuring critical operations run under the correct account.

With solid conditional logic in place, the next step is implementing repetitive operations through various loop constructs.

Loop Constructs for Repetitive Operations

Loops are where shell scripting really shines for system administration tasks. Instead of manually executing the same command on dozens of servers or hundreds of files, loops automate repetitive operations with consistent, reliable execution. When you understand different loop types, you choose the right approach for each automation challenge.

Understanding For Loops

The for loop excels at processing lists of items - files, directories, server names, or user accounts. It takes each item from a list and executes a block of commands for that item. This makes it perfect for batch operations where you know the set of targets in advance.

Basic Syntax:

for variable in list; do
    commands
done

Components Explained:

  • variable: A temporary variable that holds the current item from the list
  • list: A space-separated list of items to process
  • do/done: Keywords that mark the beginning and end of the loop body
  • commands: The operations to perform for each item

List Sources: The list can come from various sources:

  • Literal values: for day in Monday Tuesday Wednesday
  • Glob patterns: for file in /var/log/*.log
  • Command substitution: for user in $(cut -d: -f1 /etc/passwd)
  • Brace expansion: for num in {1..10}
  • Arrays: for item in "${array[@]}"

Understanding While Loops

While loops continue executing as long as a condition remains true. They’re ideal for monitoring situations, processing data streams, or implementing retry logic. While loops are particularly useful when you don’t know in advance how many iterations you’ll need.

Basic Syntax:

while [ condition ]; do
    commands
done

Components Explained:

  • condition: A test expression that’s evaluated before each iteration
  • Loop continues as long as the condition returns true (exit status 0)
  • Loop terminates when the condition returns false (non-zero exit status)
  • commands: The operations to perform during each iteration

Common Patterns:

  • Infinite loops: while true or while : for continuous monitoring
  • Counter loops: Using a variable that increments until reaching a limit
  • File processing: Reading input line by line until end of file
  • Service monitoring: Checking status until a desired state is reached
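
A counter-driven while loop is worth a quick sketch (the host name is hypothetical):

# Retry a connectivity check up to five times before giving up
attempt=1
while [ "$attempt" -le 5 ]; do
    if ping -c1 -W2 backup-server >/dev/null 2>&1; then
        echo "Host reachable on attempt $attempt"
        break
    fi
    attempt=$((attempt + 1))
    sleep 2
done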

Understanding Until Loops

Until loops provide the opposite logic of while loops - they execute until a condition becomes true. This is particularly useful for waiting operations where you want to continue trying until success occurs.

Basic Syntax:

until [ condition ]; do
    commands
done

Components Explained:

  • condition: A test expression that’s evaluated before each iteration
  • Loop continues as long as the condition returns false (non-zero exit status)
  • Loop terminates when the condition returns true (exit status 0)
  • This is the logical opposite of while loops

When to Use Until:

  • Waiting for services to start: until systemctl is-active service
  • Retrying failed operations: until command_succeeds
  • Polling for file creation: until [ -f expected_file ]
  • Network connectivity checks: until ping -c1 server
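
Here's a minimal waiting sketch in that spirit (the service name is hypothetical):

# Wait up to 60 seconds for the service to become active
elapsed=0
until systemctl is-active --quiet myservice; do
    sleep 5
    elapsed=$((elapsed + 5))
    if [ "$elapsed" -ge 60 ]; then
        echo "myservice did not start within 60 seconds" >&2
        exit 1
    fi
done
echo "myservice is active"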

Reading Files Line by Line

A common and powerful use of while loops in system administration involves processing files line by line. This technique allows you to handle configuration files, user lists, or any structured text data.

Basic Pattern:

while IFS= read -r line; do
    commands
done < filename

Components Explained:

  • IFS=: Prevents read from trimming leading/trailing whitespace
  • read -r: Prevents backslash escaping, reading lines literally
  • line: Variable that receives each line of input
  • < filename: Redirects the file as input to the while loop

Advanced Reading:

while IFS=: read -r field1 field2 field3; do
    commands
done < filename

This splits each line on colons and assigns fields to separate variables, perfect for processing /etc/passwd or similar structured files.

Loop Control Statements

Break Statement: Immediately exits the current loop, regardless of the loop condition. Useful for stopping loops when you meet specific conditions.

Continue Statement: Skips the remaining commands in the current iteration and jumps to the next iteration. Useful for skipping certain items while you process others.

Example Usage:

for file in *.txt; do
    [ ! -r "$file" ] && continue    # Skip unreadable files
    [ "$(wc -l < "$file")" -eq 0 ] && continue    # Skip empty files
    
    if grep -q "ERROR" "$file"; then
        echo "Found errors in $file"
        break    # Stop processing after first error file
    fi
done

Loop Performance Considerations

Avoiding Subshells: Be aware that pipes create subshells, which can affect variable scope:

# This won't work as expected - counter stays 0
counter=0
cat file | while read line; do
    counter=$((counter + 1))
done
echo $counter  # Still 0!

# This works correctly
counter=0
while read line; do
    counter=$((counter + 1))
done < file
echo $counter  # Shows actual count

Efficient File Processing: For large files, avoid repeatedly calling external commands inside loops. Process data in batches or use tools designed for bulk operations.
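
As a minimal illustration (assuming hosts.txt holds one search pattern per line):

# Slow: spawns one grep process per line of hosts.txt
while IFS= read -r host; do
    grep "$host" /var/log/access.log
done < hosts.txt

# Fast: a single grep reads all the patterns at once
grep -f hosts.txt /var/log/access.log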

Here’s a practical for loop example that processes multiple log files:

#!/usr/bin/env bash
# Process multiple log files for error analysis

LOG_DIRECTORY="/var/log/applications"
ERROR_REPORT="/tmp/error-summary.txt"
ERROR_COUNT=0

# Initialize report
echo "Error Analysis Report - $(date)" > "$ERROR_REPORT"
echo "=================================" >> "$ERROR_REPORT"

# Process all log files in the directory
for logfile in "$LOG_DIRECTORY"/*.log; do
    # Skip if no log files match the pattern
    [ ! -f "$logfile" ] && continue
    
    echo "Processing: $(basename "$logfile")"
    
    # Count errors in this file
    # grep -c prints 0 (and exits non-zero) when nothing matches, so only
    # default to 0 when the file itself could not be read
    file_errors=$(grep -c "ERROR" "$logfile" 2>/dev/null)
    file_errors=${file_errors:-0}
    ERROR_COUNT=$((ERROR_COUNT + file_errors))
    
    # Add details to report
    echo "" >> "$ERROR_REPORT"
    echo "File: $(basename "$logfile")" >> "$ERROR_REPORT"
    echo "Error count: $file_errors" >> "$ERROR_REPORT"
    
    # Include recent errors if any exist
    if [ "$file_errors" -gt 0 ]; then
        echo "Recent errors:" >> "$ERROR_REPORT"
        tail -50 "$logfile" | grep "ERROR" | tail -5 >> "$ERROR_REPORT"
    fi
done

echo "Analysis complete. Total errors: $ERROR_COUNT"
echo "Report saved to: $ERROR_REPORT"

This log analysis script demonstrates practical for loop usage in system administration. The loop processes all .log files in a directory, using the [ ! -f "$logfile" ] && continue pattern to handle cases where no files match the glob pattern. Each iteration counts errors, accumulates totals, and generates detailed reports. The script includes error handling for files that might be unreadable and provides a comprehensive summary when processing completes.

Here's a while loop applied to an ongoing monitoring task:

#!/usr/bin/env bash
# Monitor disk space and wait for cleanup

DISK_THRESHOLD=85
CHECK_INTERVAL=300  # 5 minutes

while true; do
    DISK_USAGE=$(df / | tail -1 | awk '{print $5}' | sed 's/%//')
    
    if [ "$DISK_USAGE" -gt "$DISK_THRESHOLD" ]; then
        echo "$(date): Disk usage critical: ${DISK_USAGE}%"
        
        # Attempt automatic cleanup
        echo "Attempting automatic cleanup..."
        find /tmp -type f -atime +1 -delete
        find /var/log -name "*.log.*" -mtime +30 -delete
        
        # Wait and check again
        sleep "$CHECK_INTERVAL"
    else
        echo "$(date): Disk usage normal: ${DISK_USAGE}%"
        break
    fi
done

This disk space monitoring script shows how while loops handle ongoing monitoring tasks. The infinite loop (while true) continues checking disk usage at regular intervals. When disk usage exceeds the threshold, the script attempts automatic cleanup by removing old temporary and log files. The loop includes a sleep interval to prevent excessive system load and breaks when disk usage returns to acceptable levels.

The next script applies line-by-line processing to a realistic administration task, creating user accounts from a structured list:

#!/usr/bin/env bash
# Process user list for account creation

USER_LIST="/tmp/new-users.txt"
FAILED_USERS="/tmp/failed-creations.txt"

# Clear previous failure log
> "$FAILED_USERS"

# Read user list line by line
while IFS=: read -r username fullname department; do
    # Skip empty lines and comments
    [[ "$username" =~ ^#.*$ ]] && continue
    [ -z "$username" ] && continue
    
    echo "Creating user: $username ($fullname) - $department"
    
    # Create user account
    if useradd -c "$fullname" -m "$username"; then
        echo "Successfully created: $username"
        
        # Set initial password (user must change on first login)
        echo "$username:temppass123" | chpasswd
        passwd -e "$username"
    else
        echo "Failed to create: $username" | tee -a "$FAILED_USERS"
    fi
    
done < "$USER_LIST"

# Report results
if [ -s "$FAILED_USERS" ]; then
    echo "Some user creations failed. See: $FAILED_USERS"
else
    echo "All users created successfully"
fi

This user creation script demonstrates line-by-line file processing using a while loop with read. The IFS=: sets the field separator for parsing colon-delimited user data. The script skips comment lines and empty lines, then attempts to create each user account with useradd. Error handling tracks failed creations in a separate log file, and the script sets temporary passwords while forcing users to change them on first login. The final summary reports any failures that occurred during processing.

Functions provide the final organizational tool for creating maintainable and reusable shell scripts.

Functions for Code Organization and Reusability

Functions transform shell scripts from simple command sequences into structured, maintainable programs. They allow you to organize complex logic into named, reusable blocks that make scripts easier to understand, debug, and modify. Well-designed functions eliminate code duplication and create building blocks that you can use across multiple scripts.

Understanding Function Definition

A function definition creates a named command that you can call anywhere in your script. Functions can accept parameters, perform complex operations, and return exit codes to indicate success or failure.

Basic Syntax:

function_name() {
    commands
}

Alternative Syntax:

function function_name {
    commands
}

Components Explained:

  • function_name: A descriptive name following variable naming rules (letters, numbers, underscores)
  • Parentheses (): Required in the first syntax, indicate this is a function definition
  • Curly braces {}: Contain the function body (commands to execute)
  • commands: The operations the function performs when called

Function Parameters and Arguments

Functions can accept parameters just like shell scripts, using positional parameters $1, $2, $3, etc.

Parameter Access:

  • $1, $2, $3, …: Individual positional parameters
  • $@: All parameters as separate words
  • $*: All parameters as a single word
  • $#: Number of parameters passed to the function
  • $0: Name of the script (not the function name)

Parameter Handling Best Practices:

my_function() {
    local param1="$1"
    local param2="$2"
    local param3="${3:-default_value}"
    
    # Check if required parameters are provided
    if [ $# -lt 2 ]; then
        echo "Usage: my_function param1 param2 [param3]" >&2
        return 1
    fi
    
    # Function logic here
}

Local Variables in Functions

Local variables inside functions prevent naming conflicts with global variables and make functions more predictable and reusable.

Local Variable Declaration:

my_function() {
    local var1="value1"
    local var2="$1"
    local result
    
    # Function operations
    result="computed value"
    
    echo "$result"
}

Global vs Local Scope:

  • Variables declared without local are global and affect the entire script
  • Variables declared with local exist only within the function
  • Local variables hide global variables with the same name
  • Function parameters ($1, $2, etc.) are automatically local to the function
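
A minimal demonstration of these scoping rules:

var="global value"

show_scope() {
    local var="local value"
    echo "Inside function: $var"    # prints "local value"
}

show_scope
echo "Outside function: $var"       # prints "global value"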

Function Return Values

Functions can return data in two ways: through exit codes and through output.

Exit Codes:

  • Use return n to set the function’s exit status (0-255)
  • return 0 indicates success
  • return 1 (or any non-zero value) indicates failure
  • If no return statement is used, the function returns the exit status of the last command

Output Return:

  • Use echo or printf to output data that callers can capture
  • Combine with command substitution: result=$(my_function)
  • Be careful not to mix output and error messages (use stderr for errors)

Example of Both Methods:

validate_and_process() {
    local input="$1"
    
    # Validation
    if [ -z "$input" ]; then
        echo "Error: No input provided" >&2
        return 1
    fi
    
    # Processing
    local processed=$(echo "$input" | tr '[:lower:]' '[:upper:]')
    
    # Return processed data
    echo "$processed"
    return 0
}

# Usage
if result=$(validate_and_process "hello"); then
    echo "Success: $result"
else
    echo "Function failed"
fi

Function Organization and Best Practices

Function Naming: Use descriptive names that clearly indicate the function’s purpose:

  • check_disk_space instead of check
  • backup_database instead of backup
  • validate_ip_address instead of validate

Single Responsibility: Each function should have one clear purpose:

  • Functions that do one thing well are easier to test and debug
  • Avoid functions that perform multiple unrelated operations
  • Break complex operations into smaller, focused functions

Error Handling: Functions should handle errors gracefully:

  • Validate input parameters
  • Check for required files, permissions, or conditions
  • Return appropriate exit codes
  • Provide meaningful error messages to stderr

Documentation: Document complex functions with comments:

# Function: backup_file
# Purpose: Creates a timestamped backup copy of a file
# Parameters: $1 - source file path
# Returns: 0 on success, 1-4 on various error conditions
# Output: Path to backup file on success
backup_file() {
    # Function implementation
}
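
One possible implementation matching that documentation, as a sketch (the specific exit codes are illustrative):

backup_file() {
    local source="$1"
    local backup

    [ $# -lt 1 ] && { echo "Usage: backup_file <path>" >&2; return 1; }
    [ ! -e "$source" ] && { echo "Error: $source does not exist" >&2; return 2; }
    [ ! -f "$source" ] && { echo "Error: $source is not a regular file" >&2; return 3; }
    [ ! -r "$source" ] && { echo "Error: cannot read $source" >&2; return 4; }

    backup="${source}.$(date +%Y%m%d-%H%M%S).bak"
    cp "$source" "$backup" || return 4
    echo "$backup"    # output the backup path for the caller to capture
    return 0
}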

Here’s a practical example showing function organization in a system maintenance script:

#!/usr/bin/env bash
# System maintenance script with modular functions

# Global configuration
LOG_FILE="/var/log/maintenance.log"
TEMP_CLEANUP_DAYS=7
LOG_RETENTION_DAYS=30

# Logging function
log_message() {
    local level="$1"
    local message="$2"
    echo "$(date '+%Y-%m-%d %H:%M:%S') [$level] $message" | tee -a "$LOG_FILE"
}

# Check disk space function
check_disk_space() {
    local mount_point="${1:-/}"
    local threshold="${2:-90}"
    
    local usage=$(df "$mount_point" | tail -1 | awk '{print $5}' | sed 's/%//')
    
    if [ "$usage" -gt "$threshold" ]; then
        log_message "WARNING" "Disk usage on $mount_point: ${usage}%"
        return 1
    else
        log_message "INFO" "Disk usage on $mount_point: ${usage}% (OK)"
        return 0
    fi
}

# Clean temporary files function
cleanup_temp_files() {
    local temp_dir="${1:-/tmp}"
    local days="${2:-$TEMP_CLEANUP_DAYS}"
    
    log_message "INFO" "Cleaning temporary files older than $days days in $temp_dir"
    
    local count=$(find "$temp_dir" -type f -atime +$days | wc -l)
    
    if [ "$count" -gt 0 ]; then
        find "$temp_dir" -type f -atime +$days -delete
        log_message "INFO" "Cleaned $count temporary files"
    else
        log_message "INFO" "No temporary files to clean"
    fi
}

# Rotate log files function
rotate_logs() {
    local log_dir="${1:-/var/log}"
    local retention_days="${2:-$LOG_RETENTION_DAYS}"
    
    log_message "INFO" "Rotating logs older than $retention_days days in $log_dir"
    
    # Compress logs older than 1 day
    find "$log_dir" -name "*.log" -mtime +1 -exec gzip {} \;
    
    # Remove compressed logs older than retention period
    local deleted_count=$(find "$log_dir" -name "*.log.gz" -mtime +$retention_days | wc -l)
    find "$log_dir" -name "*.log.gz" -mtime +$retention_days -delete
    
    log_message "INFO" "Removed $deleted_count old log files"
}

# Main execution
main() {
    log_message "INFO" "Starting system maintenance"
    
    # Check disk space on critical mount points
    for mount in "/" "/var" "/home"; do
        if [ -d "$mount" ]; then
            check_disk_space "$mount" 85
        fi
    done
    
    # Perform cleanup tasks
    cleanup_temp_files "/tmp" 7
    cleanup_temp_files "/var/tmp" 14
    rotate_logs "/var/log" 30
    
    log_message "INFO" "System maintenance completed"
}

# Run main function
main "$@"

This comprehensive maintenance script shows excellent function organization. Each function has a single, clear responsibility: logging messages, checking disk space, cleaning temporary files, and rotating logs. Functions use local variables to avoid conflicts and accept parameters for flexibility. The main function coordinates all operations, providing a clear execution flow. This modular approach makes the script easy to test, debug, and modify individual components without affecting others.

Function parameters work similarly to script parameters, using $1, $2, etc. for positional arguments and $@ for all parameters. Local variables inside functions prevent naming conflicts with global variables and make functions more predictable and reusable.

Functions can return exit codes to indicate success or failure, and allow calling code to make decisions based on function results. The return statement sets the exit code, while echo or printf can output data that the caller can capture.

Here’s an example of a function that both returns data and exit codes:

# Function that validates and normalizes IP addresses
validate_ip() {
    local ip="$1"
    local normalized_ip
    
    # Check basic format
    if [[ ! "$ip" =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
        return 1
    fi
    
    # Check each octet is valid (0-255)
    IFS='.' read -ra octets <<< "$ip"
    for octet in "${octets[@]}"; do
        if [ "$octet" -gt 255 ] || [ "$octet" -lt 0 ]; then
            return 2
        fi
    done
    
    # Normalize by removing leading zeros (10# forces base 10 so "010" is not read as octal)
    normalized_ip=$(printf '%d.%d.%d.%d' \
        "$((10#${octets[0]}))" "$((10#${octets[1]}))" \
        "$((10#${octets[2]}))" "$((10#${octets[3]}))")
    echo "$normalized_ip"
    return 0
}

# Usage example
if result=$(validate_ip "192.168.001.100"); then
    echo "Valid IP address: $result"
else
    echo "Invalid IP address format"
fi

This IP validation function demonstrates advanced function techniques including parameter handling, data validation, and return values. The function checks IP address format using pattern matching, validates each octet’s numeric range, and normalizes the address by removing leading zeros. The function returns both data (via echo) and status codes (via return), allowing the caller to capture the normalized IP while checking for validation success. The usage example shows how to capture function output while testing the return code.

Functions make your scripts more testable because they allow you to verify individual components separately. You can create simple test scripts that call your functions with known inputs and verify the outputs, which makes it easier to catch bugs and ensure reliability.
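
A minimal test harness sketch (assuming validate_ip is saved in a file named functions.sh):

#!/usr/bin/env bash
source ./functions.sh

test_validate_ip() {
    local input="$1" expected="$2" actual
    actual=$(validate_ip "$input")
    if [ "$actual" = "$expected" ]; then
        echo "PASS: $input -> ${actual:-<no output>}"
    else
        echo "FAIL: $input (got '${actual}', expected '${expected}')"
    fi
}

test_validate_ip "192.168.001.100" "192.168.1.100"
test_validate_ip "10.0.0.1" "10.0.0.1"
test_validate_ip "999.1.1.1" ""    # invalid octet: expect no output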

Common Pitfalls

Shell scripting presents several common traps that can frustrate beginners and create subtle bugs in production scripts. When you understand these pitfalls, you avoid them and debug problems when they occur.

Quoting and Variable Expansion Issues: The most frequent problems involve improper quoting, especially with variables containing spaces or special characters. Always quote variable references in conditions: [ "$variable" = "value" ] instead of [ $variable = value ]. Unquoted variables can cause test conditions to fail unexpectedly or create security vulnerabilities.

Exit Status Confusion: Remember that successful commands return 0, while failure returns non-zero values. This is opposite to many programming languages where 0 means false. In conditions, success (0) evaluates as true, while failure (non-zero) evaluates as false. Always test your conditions carefully and understand what constitutes success for each command.

Loop and File Processing Problems: When you process files with loops, be careful about filenames containing spaces or special characters. Use proper quoting and consider using find with -print0 and while read -d '' for robust file processing. Also, remember that loops run in subshells when part of a pipeline, so variable changes inside the loop might not persist.
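
A robust pattern, sketched with a hypothetical /data directory:

# NUL-delimited names survive spaces and newlines; process substitution keeps
# the loop in the current shell, so the counter persists after the loop
count=0
while IFS= read -r -d '' file; do
    echo "Processing: $file"
    count=$((count + 1))
done < <(find /data -type f -name '*.csv' -print0)
echo "Processed $count files"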

Function Scope and Variable Issues: Variables inside functions are global by default unless you declare them with local. This can cause unexpected side effects when functions modify variables that exist in the main script. Always use local variables in functions unless you specifically need global scope.

Error Handling Neglect: Many beginners write scripts that don’t handle errors properly. Commands can fail for various reasons - files might not exist, permissions might be incorrect, or system resources might be exhausted. Use set -e to exit on errors, check return codes for critical operations, and provide meaningful error messages to users.
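
A minimal sketch of this defensive style (the paths are hypothetical):

#!/usr/bin/env bash
set -euo pipefail

# Test the critical operation explicitly and report a meaningful message
if ! cp /etc/myapp/app.conf /backup/app.conf; then
    echo "Error: failed to back up app.conf" >&2
    exit 1
fi
echo "Backup complete"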

Security Considerations: Be cautious with user input and external data. Variables containing user input should be validated and quoted properly. Avoid using eval with untrusted input, and be careful about temporary file creation in world-writable directories. When scripts run with elevated privileges, security becomes even more critical.
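
For example, safe temporary file handling typically looks like this:

# mktemp creates an unpredictable, exclusively-owned file; the trap removes it on exit
tmpfile=$(mktemp) || exit 1
trap 'rm -f "$tmpfile"' EXIT

echo "sensitive data" > "$tmpfile"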

Further Reading

  1. “Classic Shell Scripting” by Arnold Robbins and Nelson Beebe - This comprehensive guide covers bash scripting fundamentals with excellent coverage of portable scripting techniques and best practices. The authors’ focus on real-world examples makes it perfect for system administrators.

  2. “Learning the bash Shell” by Cameron Newham - An O’Reilly classic that provides thorough coverage of bash features including advanced scripting techniques. The book’s progression from basic to advanced concepts matches well with practical learning needs.

  3. “The Linux Command Line” by William Shotts - Chapters 24-37 cover shell scripting from fundamentals through advanced techniques. The free online version at linuxcommand.org provides excellent supplementary examples and exercises.

  4. “Bash Pocket Reference” by Arnold Robbins - A concise reference that’s invaluable for quick lookup of syntax, built-in commands, and scripting patterns. Perfect for keeping nearby while writing scripts.

  5. Advanced Bash-Scripting Guide (http://tldp.org/LDP/abs/html/) - A comprehensive online resource covering bash scripting in extensive detail. While sometimes overwhelming for beginners, it provides thorough coverage of advanced topics and edge cases.

Assessment

Multiple Choice Questions

  1. What is the purpose of the shebang line in a shell script?
  • a) To add comments to the script
  • b) To specify which interpreter should execute the script
  • c) To make the script executable
  • d) To set environment variables
  2. Which conditional structure is most appropriate for comparing a single variable against multiple specific values?
  • a) if/then/else
  • b) case statement
  • c) while loop
  • d) for loop
  3. What does the test condition [ -f "$filename" ] check for?
  • a) If the file exists and is executable
  • b) If the file exists and is a regular file
  • c) If the file exists and is a directory
  • d) If the file exists and is readable
  4. In the condition [ "$disk_usage" -gt 90 ] && [ -w "$log_file" ], when will the overall condition be true?
  • a) When either condition is true
  • b) When both conditions are true
  • c) When the first condition is false
  • d) When exactly one condition is true
  5. Which loop type would be most appropriate for processing each file in a directory?
  • a) while loop
  • b) until loop
  • c) for loop
  • d) case statement
  6. What is the difference between local variable="value" and variable="value" inside a function?
  • a) There is no difference
  • b) Local variables are only accessible within the function
  • c) Local variables are read-only
  • d) Local variables cannot be passed to other functions
  7. Which command makes a shell script executable?
  • a) sudo script.sh
  • b) chmod +x script.sh
  • c) exec script.sh
  • d) run script.sh
  8. What does the condition [ -z "$variable" ] test for?
  • a) If the variable contains only zeros
  • b) If the variable is empty or unset
  • c) If the variable is a number
  • d) If the variable contains spaces
  9. In a while loop, when does the loop stop executing?
  • a) When the condition becomes true
  • b) When the condition becomes false
  • c) After a fixed number of iterations
  • d) When the script encounters an error
  10. What is the recommended shebang line for maximum portability across different systems?
  • a) #!/bin/bash
  • b) #!/usr/bin/env bash
  • c) #!/bin/sh
  • d) #!/usr/local/bin/bash

Short Answer Questions

  1. Explain the difference between a while loop and an until loop. Provide a scenario where each would be most appropriate.

  2. Write a function called backup_file that takes a filename as a parameter, checks if the file exists and is readable, and creates a backup copy with a timestamp. Include proper error handling.

  3. Describe three different ways to handle multiple conditions in shell scripts. Give an example of when you would use each approach.

