Bash Scripting Basics
Shebang and safe defaults
The first line tells the OS which interpreter to use. The set options below prevent the most common categories of silent failure in shell scripts.
#!/usr/bin/env bash
set -euo pipefail
-e — exit immediately if any command returns non-zero. -u — treat unset variables as an error (prevents typos like $HOEM silently expanding to empty). -o pipefail — if any command in a pipeline fails, the whole pipeline fails (otherwise only the last command's exit code counts).
set -e caveats
set -e has several well-known cases where it does not exit on failure, which can surprise you:
set -e
# DOES exit on failure — normal command
cp missing.txt /tmp/
# Does NOT exit — commands after || are considered "handled"
grep "pattern" file.txt || echo "not found" # no exit even if grep fails
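# DOES exit: a bare [[ ]] test that evaluates to false is a failing command like any other
[[ -f missing.txt ]] # false, so the script exits here
# Does NOT exit when the test is the left-hand side of && or ||
[[ -f missing.txt ]] && echo "exists" # false, but no exit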
# Does NOT exit in if conditions (by design — if checks the exit code)
if grep "pattern" file.txt; then
echo "found"
fi
# Pipeline failure DOES exit when pipefail is set
cat file.txt | grep "pattern" | head -1 # if grep finds nothing, the pipeline fails and the script exits
# Workaround: capture exit code without triggering -e
grep "pattern" file.txt && found=true || found=false
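The most common trap: grep returns exit code 1 when it finds no matches. In a script with set -e, a grep that finds nothing will exit the script unless you add || true or use it in an if statement. Note that || true also hides exit code 2 (a real error such as an unreadable file), so prefer an if statement or an explicit check of the exit code when the difference matters.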
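The grep case can be handled in a few ways, depending on whether you need a yes/no answer, a count, or the actual exit code. A minimal sketch, assuming a throwaway sample file; the variable names (sample, has_error, warn_count, rc) are illustrative only:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Throwaway sample file so the example is self-contained.
sample=$(mktemp)
printf 'INFO start\nERROR disk full\nINFO done\n' > "$sample"

# 1. Yes/no answer: use grep as an if condition (-q suppresses output).
if grep -q "ERROR" "$sample"; then
  has_error=true
else
  has_error=false
fi

# 2. Count matches: grep -c prints 0 but still exits 1 when nothing matches,
#    so "|| true" keeps set -e from ending the script.
warn_count=$(grep -c "WARN" "$sample" || true)

# 3. Keep the real exit code so "no match" (1) and "error" (2) stay distinct.
rc=0
grep -q "pattern" /nonexistent/file 2>/dev/null || rc=$?

echo "has_error=$has_error warn_count=$warn_count rc=$rc"
rm -f "$sample"
```

The third form is the one to reach for when a missing or unreadable file should be treated differently from an empty result.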
Variables
# Assignment — no spaces around =
name="alice"
count=42
path="/etc/nginx/nginx.conf"
# Reading a variable — always quote to prevent word splitting
echo "$name"
echo "User: $name, count: $count"
# Command substitution — capture output of a command
hostname=$(hostname -f)
today=$(date +%Y-%m-%d)
echo "Running on $hostname at $today"
# Arithmetic
total=$((count + 10))
echo "$total"
# Default value — use fallback if variable is unset or empty
log_dir="${LOG_DIR:-/var/log/myapp}"
user="${DEPLOY_USER:-deploy}"
${VAR:-default} is the most useful substitution: use the value of VAR if set and non-empty, otherwise use the default. It does not change VAR itself.
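Two closely related forms are worth knowing alongside :- and are shown here for comparison: dropping the colon (only an unset variable triggers the default) and := (which also assigns). A short sketch; EMPTY_VAR and the other names are made up for the demonstration:

```bash
#!/usr/bin/env bash
set -euo pipefail

unset LOG_DIR            # start from a clean state for the demo
EMPTY_VAR=""

# :- substitutes the default but leaves the variable itself untouched.
dir="${LOG_DIR:-/var/log/myapp}"
echo "dir=$dir, LOG_DIR is ${LOG_DIR-<still unset>}"

# Without the colon, only an UNSET variable triggers the default;
# a variable set to the empty string is used as-is.
with_colon="${EMPTY_VAR:-fallback}"
without_colon="${EMPTY_VAR-fallback}"
echo "with_colon=$with_colon without_colon='$without_colon'"

# := substitutes the default AND assigns it to the variable.
: "${LOG_DIR:=/var/log/myapp}"
echo "LOG_DIR=$LOG_DIR"
```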
Special variables
$0 # script name
$1 $2 # positional arguments (first arg, second arg)
"$@" # all arguments as separate words (always quote it); use this in loops
$# # number of arguments
$$ # PID of the current shell
$? # exit code of the last command (0 = success)
Conditionals
# Basic if / elif / else
# ${1:-} avoids an "unbound variable" error under set -u when no argument is given
if [[ "${1:-}" == "start" ]]; then
echo "Starting service"
elif [[ "${1:-}" == "stop" ]]; then
echo "Stopping service"
else
echo "Usage: $0 start|stop"
exit 1
fi
# File tests
if [[ -f "/etc/nginx/nginx.conf" ]]; then
echo "Config exists"
fi
if [[ ! -d "/var/log/myapp" ]]; then
mkdir -p /var/log/myapp
fi
# String tests
if [[ -z "$name" ]]; then echo "name is empty"; fi
if [[ -n "$name" ]]; then echo "name is set: $name"; fi
# Number comparison
if [[ $count -gt 10 ]]; then echo "more than 10"; fi
if [[ $count -eq 0 ]]; then echo "zero"; fi
Common test operators
# Files
-f FILE # exists and is a regular file
-d DIR # exists and is a directory
-r FILE # exists and is readable
-w FILE # exists and is writable
-x FILE # exists and is executable
-s FILE # exists and has size > 0
# Strings
-z STRING # string is empty (zero length)
-n STRING # string is non-empty
== # strings are equal
!= # strings are not equal
# Numbers (integer comparison)
-eq -ne -lt -le -gt -ge
Always use [[ ]] (double bracket) in bash scripts, not [ ] (single bracket). Double bracket is safer: no word splitting inside, supports && and ||, handles empty variables without errors.
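The difference is easiest to see with an empty variable. The sketch below deliberately runs failing tests, so set -e is left off; the variable names are illustrative:

```bash
#!/usr/bin/env bash
set -uo pipefail   # -e is left off: the tests below are meant to fail

empty=""
file="my report.txt"

# [ ] with an unquoted empty variable collapses to "[ = value ]",
# which is a syntax error (exit status 2, message on stderr).
[ $empty = "value" ] 2>/dev/null
single_status=$?

# [[ ]] does not word-split, so the same test is simply false (status 1).
[[ $empty == "value" ]]
double_status=$?

# [[ ]] also supports glob patterns on the right-hand side of ==
# and && / || inside the brackets.
if [[ $file == *.txt && -n $file ]]; then
  matched=yes
else
  matched=no
fi

echo "single=$single_status double=$double_status matched=$matched"
```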
Loops
# Loop over a list of items
for host in web01 web02 web03; do
echo "Checking $host"
ssh "$host" uptime
done
# Loop over files
for conf in /etc/nginx/conf.d/*.conf; do
echo "Validating $conf"
nginx -t -c "$conf"
done
# Loop over command output (process substitution keeps the loop in the current shell)
while IFS= read -r line; do
echo "Processing: $line"
done < <(grep "ERROR" /var/log/app.log)
# Loop with a counter
for i in $(seq 1 5); do
echo "Attempt $i"
done
# While loop
attempts=0
while [[ $attempts -lt 3 ]]; do
# try something
attempts=$((attempts + 1))
done
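Use while IFS= read -r line; do ...; done < <(command) when looping over command output: it handles lines with spaces correctly, and because the loop runs in the current shell rather than a pipeline subshell, variables set inside it are still available afterwards. Be aware that set -e and pipefail do not see the exit status of the command inside <( ), so a failure there goes unnoticed. Avoid for line in $(command), which splits on whitespace and breaks on filenames with spaces.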
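The three loop styles can be compared side by side with a counter. In this sketch list_items is a hypothetical stand-in for any command that prints one item per line:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for a real command that prints one item per line.
list_items() {
  printf '%s\n' "first item" "second item" "third item"
}

# Piping into while runs the loop in a subshell: the counter is lost.
piped_count=0
list_items | while IFS= read -r line; do
  piped_count=$((piped_count + 1))
done

# Process substitution keeps the loop in the current shell: the counter survives.
procsub_count=0
while IFS= read -r line; do
  procsub_count=$((procsub_count + 1))
done < <(list_items)

# for ... in $(command) splits on every space, so three lines become six words.
word_count=0
for word in $(list_items); do
  word_count=$((word_count + 1))
done

echo "piped=$piped_count procsub=$procsub_count words=$word_count"
```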
Functions
#!/usr/bin/env bash
set -euo pipefail
# Define before use
log() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*"
}
die() {
echo "ERROR: $*" >&2
exit 1
}
require_root() {
if [[ $EUID -ne 0 ]]; then
die "This script must be run as root"
fi
}
# Functions take positional args just like scripts
deploy_service() {
local service="$1" # local — variable is scoped to this function
local env="${2:-prod}"
log "Deploying $service to $env"
systemctl restart "$service"
}
# Call them
require_root
deploy_service nginx
deploy_service postfix staging
Use local for all variables inside functions. Without it, variables are global and will leak into the rest of the script. Use "$*" to join all arguments into a single string (as the log function does); use "$@" to pass them on as separate, properly quoted words. Unquoted $* or $@ re-splits arguments on whitespace and is almost never what you want.
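Both points can be checked with a small experiment. The helper names here (count_args, demo, leak_demo) are hypothetical and exist only for the demonstration:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: reports how many arguments it received.
count_args() {
  echo "$#"
}

demo() {
  local via_at via_star via_unquoted
  via_at=$(count_args "$@")        # each original argument stays intact
  via_star=$(count_args "$*")      # everything joined into one string
  via_unquoted=$(count_args $*)    # re-split on whitespace (usually a bug)
  echo "$via_at $via_star $via_unquoted"
}

leak_demo() {
  local scoped="inside"    # disappears when the function returns
  leaked="inside"          # no local: this becomes a global
}

result=$(demo "one arg" "two")
leak_demo
echo "counts: $result"
echo "leaked=${leaked:-<unset>} scoped=${scoped:-<unset>}"
```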
stdin / stdout / stderr
# stdout — normal output (file descriptor 1)
echo "This goes to stdout"
# stderr — error/status messages (file descriptor 2)
echo "This is an error" >&2
# Redirect stdout to a file
command > /tmp/output.txt
# Append stdout to a file
command >> /tmp/output.txt
# Redirect stderr to a file
command 2> /tmp/errors.txt
# Redirect both stdout and stderr to the same file
command > /tmp/all.txt 2>&1
# Bash shorthand for the same thing (the append form &>> needs bash 4+):
command &> /tmp/all.txt
# Discard output (send to /dev/null)
command > /dev/null # discard stdout
command > /dev/null 2>&1 # discard both
# Pipe stdout to another command
command | grep "pattern"
# Pipe both stdout and stderr
command 2>&1 | grep "ERROR"
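Redirections are processed left to right, which is why > file 2>&1 and 2>&1 > file behave differently. A small sketch, where noisy is a hypothetical command that writes one line to each stream:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical command that writes one line to each stream.
noisy() {
  echo "to stdout"
  echo "to stderr" >&2
}

out=$(mktemp)

# Correct order: point stdout at the file, then point stderr at stdout.
noisy > "$out" 2>&1
both_lines=$(wc -l < "$out" | tr -d ' ')

# Reversed order: stderr is duplicated from the ORIGINAL stdout first,
# so only stdout ends up in the file. (The outer redirect just keeps
# the stray stderr line off the screen for this demo.)
{ noisy 2>&1 > "$out"; } > /dev/null
stdout_only_lines=$(wc -l < "$out" | tr -d ' ')

# The reversed order is useful on purpose when capturing only stderr.
err_text=$(noisy 2>&1 > /dev/null)

echo "both=$both_lines stdout_only=$stdout_only_lines err_text='$err_text'"
rm -f "$out"
```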
Exit codes and $?
# Every command returns an exit code: 0 = success, non-zero = failure
grep "pattern" file.txt
echo "grep returned: $?" # 0 if found, 1 if not found, 2 if error
# Common exit code pattern — stop on failure with a message
systemctl start nginx || { echo "Failed to start nginx" >&2; exit 1; }
# Run a command and check result without set -e stopping the script
if ! systemctl is-active --quiet nginx; then
echo "nginx is not running" >&2
exit 1
fi
# Return a value from a function via exit code
is_port_open() {
nc -z -w2 "$1" "$2" > /dev/null 2>&1
# nc returns 0 if port is open, 1 if not
}
if is_port_open web01 80; then
echo "Port 80 is open"
fi
# Explicit exit codes in your script
exit 0 # success
exit 1 # general failure (most common)
exit 2 # misuse of shell builtins (usage error)
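Two details about $? trip people up: it is overwritten by every command, and local var=$(cmd) reports the status of local rather than of cmd. A sketch with hypothetical function names (is_even, masked, unmasked); set -e is left off so each status can be inspected:

```bash
#!/usr/bin/env bash
set -uo pipefail   # -e left off so each status can be inspected

# Hypothetical check: succeeds only for even numbers.
is_even() {
  (( $1 % 2 == 0 ))
}

# A function's exit status is that of its last command,
# so it can be used directly as a condition.
if is_even 4; then even4=yes; else even4=no; fi

# $? is overwritten by every command, so save it right away.
is_even 3
status=$?

# Pitfall: "local var=$(cmd)" reports the status of local, not of cmd.
masked() {
  local out=$(false)
  echo $?
}

# Declaring first and assigning second preserves the command's status.
unmasked() {
  local out
  out=$(false)
  echo $?
}

echo "even4=$even4 status=$status masked=$(masked) unmasked=$(unmasked)"
```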
String operations
str="hello-world-2024"
# Length
echo "${#str}" # 16
# Substring (offset, length)
echo "${str:6:5}" # world
# Strip prefix (shortest match)
echo "${str#hello-}" # world-2024
# Strip prefix (longest match — greedy)
echo "${str##*-}" # 2024
# Strip suffix
echo "${str%-*}" # hello-world
# Replace first occurrence
echo "${str/world/WORLD}" # hello-WORLD-2024
# Replace all occurrences
echo "${str//l/L}" # heLLo-worLd-2024
# Uppercase / lowercase (bash 4+)
echo "${str^^}" # HELLO-WORLD-2024
echo "${str,,}" # hello-world-2024
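These expansions combine well for taking a path apart without calling basename, dirname, or sed. A sketch using a made-up path:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical path used only for illustration.
path="/var/backups/db-2024-01-15.sql.gz"

filename="${path##*/}"     # strip longest prefix ending in "/"
dir="${path%/*}"           # strip shortest suffix starting at the last "/"
base="${filename%%.*}"     # strip longest suffix starting at the first "."
ext="${filename#*.}"       # strip shortest prefix ending at the first "."
date_part="${base#db-}"    # strip a literal prefix

echo "dir=$dir"
echo "filename=$filename"
echo "base=$base ext=$ext date=$date_part"
```

One # or % removes the shortest match and a doubled ## or %% removes the longest, which is what decides whether sql.gz is treated as one extension or two.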
Arrays
# Declare and populate
hosts=("web01" "web02" "db01")
fruits=("apple" "banana" "cherry")
# Access element
echo "${hosts[0]}" # web01
# All elements (preserve quoting)
echo "${hosts[@]}"
# Number of elements
echo "${#hosts[@]}" # 3
# Loop over array — always quote [@]
for host in "${hosts[@]}"; do
ssh "$host" uptime
done
# Append to array
hosts+=("db02")
# Slice (elements 1 and 2)
echo "${hosts[@]:1:2}"
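Arrays are also the safe way to build up a command line piece by piece. In this sketch the flags and paths are hypothetical, and printf stands in for the real command so that nothing is executed:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical settings; in a real script these might come from flags.
verbose=true
exclude="*.tmp"

# Build the argument list piece by piece. Each element stays one argument,
# even when it contains spaces or glob characters.
args=(-a --delete)
if [[ "$verbose" == "true" ]]; then
  args+=(-v)
fi
args+=(--exclude "$exclude" "/srv/my data/" "backup:/srv/")

# printf stands in for the real command so nothing is executed.
printf 'arg: %s\n' "${args[@]}"
echo "total: ${#args[@]}"
```

Building the same command in a plain string and expanding it unquoted would split "/srv/my data/" in two and let the shell expand *.tmp against the current directory.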
Here-docs
Here-docs let you write multi-line strings inline. Useful for generating config files or sending multi-line input to commands.
# Write a multi-line file
cat > /etc/motd << 'EOF'
Welcome to production.
Unauthorised access is prohibited.
EOF
# Indented here-doc (strip leading tabs with <<-)
# Note: must use actual tab characters, not spaces
cat <<- EOF
This line has a leading tab stripped.
So does this one.
EOF
# Pass multi-line input to a command
ssh deploy@server bash << 'REMOTE'
cd /opt/myapp
git pull
systemctl restart myapp
REMOTE
# Here-string (pass a string as stdin; no pipe or temp file needed)
grep "root" <<< "$(cat /etc/passwd)"
Use << 'EOF' (quoted delimiter) to prevent variable expansion inside the here-doc. Use << EOF (unquoted) when you want variables like $hostname to expand.
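The effect of quoting the delimiter can be seen by capturing the same body both ways. A minimal sketch; app_name is a hypothetical variable:

```bash
#!/usr/bin/env bash
set -euo pipefail

app_name="myapp"   # hypothetical value for illustration

# Unquoted delimiter: $app_name is expanded when the here-doc is read.
expanded=$(cat << EOF
name=$app_name
EOF
)

# Quoted delimiter: the body is passed through literally.
literal=$(cat << 'EOF'
name=$app_name
EOF
)

echo "expanded: $expanded"
echo "literal:  $literal"
```

This matters most for the ssh pattern: with an unquoted delimiter, variables expand on the local machine before the text is sent; with a quoted one, they expand on the remote side.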
Common patterns
Require arguments
#!/usr/bin/env bash
set -euo pipefail
usage() {
echo "Usage: $0 <service> <environment>"
echo " service — systemd service name"
echo " environment — prod|staging|dev"
exit 1
}
[[ $# -lt 2 ]] && usage
SERVICE="$1"
ENV="$2"
Temporary files and cleanup
#!/usr/bin/env bash
set -euo pipefail
# Create a temp file and clean it up on exit (even if script fails)
tmpfile=$(mktemp /tmp/deploy.XXXXXX)
trap 'rm -f "$tmpfile"' EXIT
# Work with the temp file
ansible-playbook site.yml --check > "$tmpfile" 2>&1
grep "changed=" "$tmpfile"
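To confirm that an EXIT trap really runs on failure, the pattern can be exercised inside a subshell that bails out early. This is a sketch only: the subshell stands in for a whole script, and marker, workdir, and the exit code 3 are arbitrary choices for the demo:

```bash
#!/usr/bin/env bash
set -euo pipefail

marker=$(mktemp)   # holds the temp dir path so it can be checked afterwards
job_status=0

# The subshell stands in for a whole script: its EXIT trap runs when it ends,
# whether it finishes normally or bails out early.
(
  workdir=$(mktemp -d)
  trap 'rm -rf "$workdir"' EXIT
  echo "$workdir" > "$marker"
  touch "$workdir/partial-output"
  exit 3             # simulated failure halfway through
) || job_status=$?

workdir_path=$(cat "$marker")
rm -f "$marker"

if [[ -d "$workdir_path" ]]; then
  cleaned=no
else
  cleaned=yes
fi
echo "job exited with $job_status, temp dir cleaned: $cleaned"
```

Using mktemp -d with a single rm -rf trap is convenient when a script needs several scratch files, since one trap covers all of them.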
Lock file — prevent concurrent runs
LOCKFILE=/var/run/myscript.lock
if ! mkdir "$LOCKFILE" 2>/dev/null; then
echo "Script already running (lockdir: $LOCKFILE)" >&2
exit 1
fi
trap 'rmdir "$LOCKFILE"' EXIT
# rest of script
Parse simple flags
dry_run=false
verbose=false
while [[ $# -gt 0 ]]; do
case "$1" in
-n|--dry-run) dry_run=true ;;
-v|--verbose) verbose=true ;;
-h|--help) usage ;;
*) echo "Unknown option: $1" >&2; usage ;;
esac
shift
done
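The same loop extends to flags that take a value and to positional arguments. The sketch below is one possible arrangement, not the only one; the -e/--env flag, the parse_args wrapper, and the variable names are all hypothetical:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical parser wrapped in a function so it can be exercised inline.
parse_args() {
  dry_run=false
  env_name="prod"
  positional=()
  while [[ $# -gt 0 ]]; do
    case "$1" in
      -n|--dry-run) dry_run=true ;;
      -e|--env)
        [[ $# -ge 2 ]] || { echo "Missing value for $1" >&2; return 2; }
        env_name="$2"
        shift              # consume the value; the shift below consumes the flag
        ;;
      --) shift; positional+=("$@"); break ;;   # everything after -- is positional
      -*) echo "Unknown option: $1" >&2; return 2 ;;
      *) positional+=("$1") ;;
    esac
    shift
  done
}

parse_args --env staging -n nginx -- --not-a-flag
echo "dry_run=$dry_run env=$env_name positional=${positional[*]}"
```

For scripts with many options, the getopts builtin handles short flags with less code, at the cost of not supporting long options.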
Retry loop
wait_for_service() {
local host="$1" port="$2" retries=10 delay=3 i
for i in $(seq 1 "$retries"); do
if nc -z -w2 "$host" "$port" > /dev/null 2>&1; then
return 0
fi
echo "Waiting for $host:$port (attempt $i/$retries)..."
sleep "$delay"
done
echo "Timed out waiting for $host:$port" >&2
return 1
}
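The retry logic can be made generic so that it wraps any command rather than only a port check. A sketch under that assumption; retry and flaky are hypothetical names, and flaky simulates a command that fails twice before succeeding:

```bash
#!/usr/bin/env bash
set -euo pipefail

# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs the command until it succeeds or attempts run out.
retry() {
  local max="$1" delay="$2" attempt
  shift 2
  for ((attempt = 1; attempt <= max; attempt++)); do
    if "$@"; then
      return 0
    fi
    if (( attempt < max )); then
      echo "Attempt $attempt/$max failed, retrying in ${delay}s" >&2
      sleep "$delay"
    fi
  done
  return 1
}

# Stand-in for a flaky command: fails twice, then succeeds.
calls=0
flaky() {
  calls=$((calls + 1))
  (( calls >= 3 ))
}

if retry 5 0 flaky; then
  echo "succeeded after $calls calls"
fi
```

Because the command is passed as "$@", arguments with spaces survive intact, for example retry 10 3 nc -z -w2 "$host" "$port".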
Debugging scripts
# Run with trace output — prints each command before executing
bash -x script.sh
# Or add to the top of the script
set -x
# Print only specific sections
set -x
some_complex_function
set +x # turn off trace
# Dry-run pattern — check what would happen
DRY_RUN="${DRY_RUN:-false}"
run() {
if [[ "$DRY_RUN" == "true" ]]; then
echo "[DRY RUN] $*"
else
"$@"
fi
}
run systemctl restart nginx
run rm -rf /tmp/old-deploy
# shellcheck — static analysis for bash scripts
# Install: dnf install ShellCheck / apt install shellcheck
shellcheck script.sh
shellcheck catches common mistakes like unquoted variables, deprecated syntax, and portability issues. Run it on any non-trivial script before deploying.
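Trace output is easier to follow when each line says where it came from. Bash prints the PS4 variable before every traced command, so it can carry the function name and line number. A small sketch; greet is a hypothetical function:

```bash
#!/usr/bin/env bash
set -euo pipefail

# PS4 is the prefix bash prints before each traced command.
# Single quotes delay expansion until each command is traced.
export PS4='+ ${FUNCNAME[0]:-main}:${LINENO}: '

greet() {
  local who="$1"
  echo "hello $who"
}

set -x          # trace on
greet "world"
set +x          # trace off
```

The :-main fallback is there because FUNCNAME is unset outside functions, which would otherwise be an error under set -u.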
Security in shell scripts
- curl https://example.com/install.sh | bash (runs remote code without inspection)
- eval "$user_input" (executes arbitrary code from external input)
- Building commands with unvalidated external data (command injection)
# Bad: unquoted variable allows word splitting / globbing attacks
rm -rf $user_directory
# Good: quote everything
rm -rf "$user_directory"
# Bad: splicing user input into a string that the shell parses again
eval "find /var/log -name $search_term" # if $search_term is "x; rm -rf /"...
# Good: validate and restrict input first
if [[ "$search_term" =~ ^[a-zA-Z0-9._-]+$ ]]; then
find /var/log -name "$search_term"
else
echo "Invalid input" >&2
exit 1
fi
# Never use -x (trace) in scripts that handle passwords or tokens
# set -x prints expanded values to stderr, which CI systems usually capture in their logs; check before enabling in CI
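The difference between passing input as data and re-parsing it as code can be demonstrated safely. In this sketch the "hostile" input only prints a marker; user_input and the marker text are made up for the demo:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Harmless stand-in for hostile input: the injected command only prints a marker.
user_input='report.log; echo INJECTED'

# Quoted expansion: the whole value reaches the command as ONE argument.
# The semicolon is just a character; nothing extra runs.
as_data=$(printf '%s\n' "$user_input")

# eval (like bash -c "...$var...") re-parses the value as shell code,
# so the part after the semicolon runs as a second command.
as_code=$(eval "printf '%s\n' $user_input")

echo "as_data: $as_data"
echo "as_code: $as_code"
```

Quoting protects against word splitting and globbing, but nothing protects a string that is handed back to the shell parser, which is why validation has to happen before any eval, bash -c, or ssh command string is built.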
cron environment
Cron runs with a minimal, stripped-down environment. Scripts that work perfectly in your shell often fail silently in cron because PATH is different and environment variables are missing.
# Cron's default PATH is roughly:
# /usr/bin:/bin (no /usr/local/bin, no ~/bin, no custom paths)
# Fix 1: Set PATH at the top of the crontab
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# Fix 2: Use full paths in cron commands
0 2 * * * /usr/local/bin/backup.sh
# Fix 3: Source the user environment in the script
#!/usr/bin/env bash
source /etc/profile
source ~/.bash_profile
# Debug cron environment: capture what cron sees
* * * * * env > /tmp/cron-env.txt
Most cron implementations set only a handful of variables themselves (commonly SHELL, LOGNAME, HOME, and a minimal PATH). DISPLAY, SSH_AUTH_SOCK, and anything exported by your shell profile are absent. If your script relies on any of these, set them explicitly or source the appropriate profile file. The env > /tmp/cron-env.txt trick lets you see exactly what environment cron provides on your specific system.
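A defensive preamble makes a script behave the same under cron as in a terminal. The sketch below shows one way to do it; require_env is a hypothetical helper and APP_CONFIG with its fallback path is a made-up setting:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Set a known PATH rather than relying on what cron provides.
export PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

# Hypothetical helper: fail early, with a clear message, if a variable the
# script depends on is missing or empty.
require_env() {
  local name missing=0
  for name in "$@"; do
    # ${!name} is indirect expansion: the value of the variable named by $name.
    if [[ -z "${!name:-}" ]]; then
      echo "Missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Hypothetical setting with an explicit fallback.
APP_CONFIG="${APP_CONFIG:-/etc/myapp/config.env}"

if require_env PATH APP_CONFIG; then
  echo "environment ok"
fi
```

Failing loudly at the top is preferable to a cron job that half-runs: cron typically mails or discards output, so an early, explicit error message is often the only clue.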