String Manipulation

Parsing filenames, extracting data, formatting output: strings are everywhere in shell scripts. Bash handles most of this work with built-in parameter expansion, without calling external tools.

String Length

#!/bin/bash

str="Hello World"
echo "${#str}"  # 11
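
A common use of the length expansion is input validation. The sketch below is only an illustration: the helper name `is_long_enough` and the sample values are hypothetical, not part of the original tutorial.

```bash
#!/bin/bash

# Hypothetical helper: succeed only if the string has at least min_len characters.
is_long_enough() {
    local value="$1" min_len="$2"
    (( ${#value} >= min_len ))
}

if is_long_enough "hunter2" 8; then
    echo "ok"
else
    echo "too short"   # printed: "hunter2" has 7 characters
fi
```

Because `${#value}` is expanded by the shell itself, the check needs no call to `wc` or `expr`.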

Substring Extraction

#!/bin/bash

str="Hello World"

echo "${str:0:5}"   # Hello (5 chars from position 0)
echo "${str:6}"     # World (from position 6 to end)
echo "${str: -5}"   # World (last 5 chars - note the space!)
echo "${str:0:-3}"  # Hello Wo (all but last 3; negative length needs Bash 4.2+)
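
As a small illustration of offsets and lengths, the following sketch slices a compact date stamp into its parts. The variable names and the `YYYYMMDD` layout are assumptions made for this example.

```bash
#!/bin/bash

# Hypothetical input: a date stamp in YYYYMMDD form.
stamp="20240315"

year="${stamp:0:4}"    # 4 chars from position 0
month="${stamp:4:2}"   # 2 chars from position 4
day="${stamp: -2}"     # last 2 chars (the space before the minus is required)

echo "$year-$month-$day"  # 2024-03-15
```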

Remove Prefix/Suffix

Remove Shortest Match

#!/bin/bash

file="document.backup.tar.gz"

echo "${file#*.}"    # backup.tar.gz (remove up to first .)
echo "${file%.*}"    # document.backup.tar (remove from last .)

Remove Longest Match

#!/bin/bash

file="document.backup.tar.gz"

echo "${file##*.}"   # gz (remove up to last .)
echo "${file%%.*}"   # document (remove from first .)

Remember: # front, % back

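# removes from the front and % removes from the back. On a US keyboard, # (Shift+3) sits to the left of $ (Shift+4) and % (Shift+5) sits to the right. A single character removes the shortest match; a doubled character removes the longest match.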
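
One way to put the suffix rule to work is swapping a file extension. This is a minimal sketch; the helper name `change_ext` and the sample filenames are hypothetical.

```bash
#!/bin/bash

# Hypothetical helper: swap the final extension of a filename.
change_ext() {
    local name="$1" new_ext="$2"
    echo "${name%.*}.${new_ext}"   # single % strips the shortest ".*" suffix
}

change_ext "photo.jpeg" "jpg"        # photo.jpg
change_ext "archive.tar.gz" "bz2"    # archive.tar.bz2
```

Using a single % removes only the last extension, so `archive.tar.gz` keeps its `.tar` part. A name with no dot at all is left unchanged by the expansion and simply gains the new extension.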

Practical: Parse Filenames

#!/bin/bash

filepath="/home/user/documents/report.pdf"

# Extract filename
filename="${filepath##*/}"     # report.pdf

# Extract directory
directory="${filepath%/*}"     # /home/user/documents

# Extract extension
extension="${filename##*.}"    # pdf

# Filename without extension
basename="${filename%.*}"      # report

echo "Path: $filepath"
echo "Directory: $directory"
echo "Filename: $filename"
echo "Basename: $basename"
echo "Extension: $extension"
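
One caveat with this approach: when a filename contains no dot, `${filename##*.}` matches nothing and returns the whole name. The sketch below guards against that case. The helper name `get_ext` is hypothetical, and dotfiles such as `.bashrc` are deliberately not special-cased.

```bash
#!/bin/bash

# Hypothetical helper: print the extension, or nothing if there is none.
get_ext() {
    local filename="${1##*/}"      # drop any directory part
    if [[ "$filename" == *.* ]]; then
        echo "${filename##*.}"
    fi
}

get_ext "/home/user/report.pdf"   # pdf
get_ext "/home/user/Makefile"     # (prints nothing)
```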

Search and Replace

#!/bin/bash

str="hello world world"

# Replace first occurrence
echo "${str/world/universe}"   # hello universe world

# Replace all occurrences
echo "${str//world/universe}"  # hello universe universe

# Replace at beginning
echo "${str/#hello/hi}"        # hi world world

# Replace at end
echo "${str/%world/earth}"     # hello world earth
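
The search pattern is a glob, not just a fixed string, and an empty replacement deletes the match. The following sketch shows both; the sample values are made up for illustration.

```bash
#!/bin/bash

csv="alpha,beta,gamma"

# Replace every comma with a wider separator
spaced="${csv//,/ | }"     # alpha | beta | gamma

# Delete all digits: bracket expression as pattern, empty replacement
id="order-2024-0042"
letters="${id//[0-9]/}"    # order--

echo "$spaced"
echo "$letters"
```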

Case Conversion

#!/bin/bash

str="Hello World"

# Case conversion requires Bash 4.0 or later
echo "${str^^}"    # HELLO WORLD (uppercase all)
echo "${str,,}"    # hello world (lowercase all)
echo "${str^}"     # Hello World (uppercase first character only; already uppercase here)
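
Case conversion is handy for case-insensitive comparisons, for example when reading a yes/no answer. This is a sketch under the assumption of Bash 4.0 or later; the helper name `is_yes` is hypothetical.

```bash
#!/bin/bash

# Hypothetical helper: treat y, Y, yes, YES, Yes, ... as agreement.
is_yes() {
    local answer="${1,,}"          # lowercase the whole argument
    [[ "$answer" == "y" || "$answer" == "yes" ]]
}

is_yes "YES" && echo "confirmed"   # confirmed
is_yes "No"  || echo "declined"    # declined
```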

Default Values

#!/bin/bash

# Use default if unset or empty
name="${1:-Guest}"
echo "Hello, $name"

# Use default if unset (not if empty)
name="${1-Guest}"

# Assign default if unset or empty
: "${CONFIG_FILE:=/etc/app.conf}"
echo "Using config: $CONFIG_FILE"

Syntax              Behavior
${var:-default}     Use default if unset or empty
${var-default}      Use default if unset only
${var:=default}     Assign default if unset or empty
${var:+alternate}   Use alternate if set and non-empty
${var:?error}       Exit with error if unset or empty
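
The `:+` and `:?` forms have no example above, so here is a minimal sketch of each. The helper name `build_flags` and the variable `TARGET` are hypothetical names chosen for illustration.

```bash
#!/bin/bash

# Hypothetical helper: append a flag only when the argument is non-empty.
build_flags() {
    local verbose="$1"
    # ${verbose:+ --verbose} expands to " --verbose" only if $verbose is set and non-empty
    echo "run${verbose:+ --verbose}"
}

build_flags ""      # run
build_flags "1"     # run --verbose

# ${var:?message} makes a non-interactive shell exit with the message when var
# is unset or empty. It runs in a subshell here so the demo does not end the script.
( unset TARGET; : "${TARGET:?TARGET must be set}" ) 2>/dev/null \
    || echo "TARGET missing"   # TARGET missing
```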

String Conditionals

#!/bin/bash

str="Hello"

# Contains substring
if [[ "$str" == *ell* ]]; then
    echo "Contains 'ell'"
fi

# Starts with
if [[ "$str" == H* ]]; then
    echo "Starts with H"
fi

# Ends with
if [[ "$str" == *lo ]]; then
    echo "Ends with 'lo'"
fi

# Regex match
if [[ "$str" =~ ^H.*o$ ]]; then
    echo "Matches pattern"
fi
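
When a regex match succeeds, Bash stores the capture groups in the `BASH_REMATCH` array. The sketch below uses that to pull fields out of a line; the log format and variable names are assumptions made for this example.

```bash
#!/bin/bash

# Hypothetical input: a log line with a date and a level.
line="2024-03-15 ERROR disk full"

# Keep the regex in a variable and expand it unquoted,
# so it is treated as a regex rather than a literal string.
re='^([0-9]{4})-([0-9]{2})-([0-9]{2}) ([A-Z]+)'

if [[ "$line" =~ $re ]]; then
    year="${BASH_REMATCH[1]}"    # first capture group
    level="${BASH_REMATCH[4]}"   # fourth capture group
    echo "year=$year level=$level"   # year=2024 level=ERROR
fi
```

`BASH_REMATCH[0]` holds the entire matched text; groups are numbered from 1 in order of their opening parenthesis.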

Practical Examples

Parse URL

#!/bin/bash

url="https://user:pass@example.com:8080/path?query=value"

# Extract protocol
protocol="${url%%://*}"  # https

# Remove protocol
rest="${url#*://}"  # user:pass@example.com:8080/path?query=value

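# Extract host (simplified: no IPv6 literals)
host="${rest%%/*}"   # user:pass@example.com:8080
host="${host##*@}"   # example.com:8080 (drop credentials first)
host="${host%%:*}"   # example.com (then drop the port)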

echo "Protocol: $protocol"
echo "Host: $host"
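
The same expansions can recover the port, path, and query string. This is a sketch, not a general URL parser: it assumes the URL contains a path and a query, and it does not handle IPv6 hosts. The intermediate variable names are hypothetical.

```bash
#!/bin/bash

url="https://user:pass@example.com:8080/path?query=value"

rest="${url#*://}"            # user:pass@example.com:8080/path?query=value
authority="${rest%%/*}"       # user:pass@example.com:8080
hostport="${authority##*@}"   # example.com:8080

# Port: only present if hostport contains a colon
port=""
if [[ "$hostport" == *:* ]]; then
    port="${hostport##*:}"    # 8080
fi

# Path and query: everything after the authority
path_query="/${rest#*/}"      # /path?query=value
path="${path_query%%\?*}"     # /path
query="${path_query#*\?}"     # query=value

echo "Port: $port"
echo "Path: $path"
echo "Query: $query"
```

The `?` is escaped in the patterns because an unescaped `?` is a glob that matches any single character.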

Clean Filename

#!/bin/bash

filename="My Document (final) [v2].pdf"

# Replace spaces with underscores
clean="${filename// /_}"

# Remove parentheses
clean="${clean//(/}"
clean="${clean//)/}"

# Remove brackets (escaped so they cannot be read as a glob bracket expression)
clean="${clean//\[/}"
clean="${clean//\]/}"

echo "$clean"  # My_Document_final_v2.pdf
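
An alternative to removing characters one by one is an allow-list: delete everything that is not in a set of permitted characters. The sketch below takes that approach; the helper name `clean_name` and the choice of permitted characters are assumptions for illustration.

```bash
#!/bin/bash

# Hypothetical helper: make a filename shell-friendly.
clean_name() {
    local name="$1"
    name="${name// /_}"                 # spaces -> underscores
    name="${name//[^A-Za-z0-9._-]/}"    # drop everything outside the allow-list
    echo "$name"
}

clean_name "My Document (final) [v2].pdf"   # My_Document_final_v2.pdf
```

The allow-list also strips characters the original example does not mention, such as quotes or ampersands, which may or may not be what a given script wants.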

Version Parsing

#!/bin/bash

version="v1.2.3-beta"

# Remove 'v' prefix
version="${version#v}"  # 1.2.3-beta

# Extract major version
major="${version%%.*}"  # 1

# Extract minor version
rest="${version#*.}"    # 2.3-beta
minor="${rest%%.*}"     # 2

echo "Major: $major, Minor: $minor"
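
The patch number and the pre-release label can be extracted the same way. This sketch assumes a `MAJOR.MINOR.PATCH[-label]` layout with the `v` prefix already removed; the variable names `core`, `patch`, and `prerelease` are hypothetical.

```bash
#!/bin/bash

version="1.2.3-beta"

core="${version%%-*}"        # 1.2.3 (drop "-beta")
patch="${core##*.}"          # 3

# Pre-release label: only present if the version contains a hyphen
prerelease=""
if [[ "$version" == *-* ]]; then
    prerelease="${version#*-}"   # beta
fi

echo "Patch: $patch, Pre-release: $prerelease"   # Patch: 3, Pre-release: beta
```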
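
Knowledge Check

What does ${file##*.} return for file='doc.backup.tar.gz'?

Answer: gz. The double ## removes the longest prefix matching *., which is everything up to and including the last dot.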

Quick Reference

Syntax             Purpose
${#var}            String length
${var:pos:len}     Substring
${var#pattern}     Remove shortest prefix
${var##pattern}    Remove longest prefix
${var%pattern}     Remove shortest suffix
${var%%pattern}    Remove longest suffix
${var/old/new}     Replace first
${var//old/new}    Replace all
${var^^}           Uppercase
${var,,}           Lowercase

Key Takeaways

  • ${#var} for length, ${var:0:5} for substring
  • # removes from front, % removes from back
  • Single #/% = shortest match, double = longest
  • ${var/old/new} for search-replace
  • ${var^^} / ${var,,} for case conversion
  • Use ${var:-default} for default values
  • These expansions replace many calls to sed, cut, and awk, and avoid starting a separate process each time
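
To make the last point concrete, the sketch below runs two external-tool pipelines next to their built-in equivalents and checks that the results agree. It assumes `cut` and `sed` are installed; the variable names are hypothetical.

```bash
#!/bin/bash

file="document.backup.tar.gz"
str="hello world world"

# External tools: each pipeline starts at least one extra process
first_cut="$(echo "$file" | cut -d. -f1)"
replaced_sed="$(echo "$str" | sed 's/world/universe/g')"

# Built-in expansions: handled by the shell itself
first_builtin="${file%%.*}"
replaced_builtin="${str//world/universe}"

[[ "$first_cut" == "$first_builtin" ]] \
    && echo "cut matches: $first_builtin"         # cut matches: document
[[ "$replaced_sed" == "$replaced_builtin" ]] \
    && echo "sed matches: $replaced_builtin"      # sed matches: hello universe universe
```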

Next: best practices for production scripts.