A robust, production-ready file-based locking utility using flock(1) for safe concurrent script execution with stale lock detection and flexible waiting modes.
- Features
- Installation
- Usage
- Options
- Exit Codes
- Examples
- How It Works
- Use Cases
- Testing
- Troubleshooting
- Best Practices
- Exclusive Locking: Prevents multiple instances of the same operation from running simultaneously
- Stale Lock Detection: Automatically removes locks left behind by crashed processes
- Flexible Waiting Modes:
  - Non-blocking (default): Fail immediately if lock is held
  - Blocking: Wait indefinitely for lock to become available
  - Timeout: Wait up to a specified number of seconds
- PID Tracking: Tracks which process holds each lock
- Clean Exit Handling: Automatic lock cleanup on normal exit or signal termination
- Lock Stealing: Administrative override to break held or abandoned locks (--steal)
- Safe for Automation: Ideal for cron jobs, systemd services, and CI/CD pipelines
- Comprehensive Error Messages: Clear, actionable error reporting
- Battle-tested: 127 comprehensive test cases
Install the script, manpage, and bash completion using either the Makefile or installation script.
# Install to /usr/local (default) - may require sudo
make install
# Install to /usr - requires sudo
sudo make PREFIX=/usr install
# Install to user directory (no sudo needed)
make PREFIX=~/.local install
# Uninstall
make uninstall
# Install to /usr/local (default) - may require sudo
./install.sh install
# Install to /usr - requires sudo
sudo ./install.sh --prefix /usr install
# Install to user directory (no sudo needed)
./install.sh --prefix ~/.local install
# Skip confirmation prompts
./install.sh -y install
# Uninstall
./install.sh uninstall
Install only specific components:
# Using Makefile
make install-script
# Using install.sh
./install.sh install-script
# Using Makefile
make install-man
# Using install.sh
./install.sh install-man
# Using Makefile
make install-completion
# Using install.sh
./install.sh install-completion
If you prefer manual installation:
# Copy script
sudo cp shlock /usr/local/bin/
sudo chmod +x /usr/local/bin/shlock
# Build and install manpage (requires pandoc)
pandoc --standalone --to man -o shlock.1 shlock.1.md
sudo cp shlock.1 /usr/local/share/man/man1/
sudo mandb -q
# Install bash completion
sudo cp shlock.bash_completion /usr/share/bash-completion/completions/shlock
Use directly from the repository without installing:
/ai/scripts/lib/shlock/shlock [OPTIONS] [LOCKNAME] -- COMMAND [ARGS...]
Script requirements:
- Bash 5.0 or later
- flock utility (usually from the util-linux package)
- /run/lock directory (standard on most Linux distributions)
Manpage build requirements (optional, only needed for make install-man):
- pandoc - Document converter
Install pandoc:
# Debian/Ubuntu
sudo apt install pandoc
# Fedora/RHEL
sudo dnf install pandoc
# macOS
brew install pandoc
Bash completion is automatically installed with make install or ./install.sh install. It provides intelligent tab-completion for:
- Options: -m, -w, -t, -s, --max-age, --wait, --timeout, --steal, --help, --version
- Lock names: Existing locks from /run/lock/*.lock
- Commands: After --, completes available commands and files
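As a rough illustration of the lock-name completion described above, the sketch below gathers candidate names from a lock directory. list_lock_names is a hypothetical helper, not the shipped shlock.bash_completion; it only assumes that lock files follow the <LOCKNAME>.lock pattern.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of shlock): print the lock names found in a
# directory, one per line, by stripping the directory and the .lock suffix.
list_lock_names() {
  local dir=${1:-/run/lock} path name
  for path in "$dir"/*.lock; do
    [[ -e $path ]] || continue      # the glob matched nothing
    name=${path##*/}                # drop the directory part
    printf '%s\n' "${name%.lock}"   # drop the .lock suffix
  done
}
```

A completion function could pass this output to compgen -W to filter it by the word being typed.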
Manual activation (if not using system-wide installation):
# Source completion for current shell
source shlock.bash_completion
# Or add to ~/.bashrc for permanent activation
echo 'source /path/to/shlock.bash_completion' >> ~/.bashrc
Usage examples:
shlock --<TAB> # Shows: --help --max-age --steal --timeout --version --wait
shlock -m <TAB> # Suggests hours values
shlock -t <TAB> # Suggests seconds values
shlock backup<TAB> # Shows existing lock names starting with 'backup'
shlock mylock -- <TAB> # Completes available commands
If installing to a custom prefix (e.g., ~/.local), add to your ~/.bashrc or ~/.profile:
export PATH="$HOME/.local/bin:$PATH"
export MANPATH="$HOME/.local/share/man:$MANPATH"
# Bash completion directory (if needed)
export BASH_COMPLETION_USER_DIR="$HOME/.local/share/bash-completion"
After installation to a custom prefix, restart your shell or run:
source ~/.bashrc
You can rename the script to any name you prefer without affecting functionality. This is useful to avoid name conflicts with other programs:
# Rename to avoid conflicts
mv shlock sherlock
chmod +x sherlock
# Use with new name
sherlock backup -- /usr/local/bin/backup.sh
The script name is not referenced internally, so renaming has no effect on its operation.
shlock [OPTIONS] [LOCKNAME] -- COMMAND [ARGS...]
- LOCKNAME: Unique identifier for the lock (e.g., backup, deployment, sync)
  - Optional: If omitted, auto-generated from basename of COMMAND
  - Example: shlock -- /usr/local/bin/backup.sh uses lockname "backup.sh"
- COMMAND: Command to execute while holding the lock
- ARGS: Optional arguments passed to COMMAND
Important: The -- separator is required to separate options from the command.
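The LOCKNAME rule above can be sketched in a few lines of Bash. Both functions are hypothetical illustrations, not shlock's actual parser: they ignore options and assume the arguments start at LOCKNAME or at the -- separator.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive a lock name from the basename of COMMAND.
derive_lockname() {
  local command=$1
  printf '%s\n' "${command##*/}"
}

# Hypothetical helper: print the lock name that applies to
# "[LOCKNAME] -- COMMAND [ARGS...]". Returns 2 (usage error) when the
# -- separator or COMMAND is missing.
resolve_lockname() {
  if [[ ${1-} == -- ]]; then
    (( $# >= 2 )) || return 2
    derive_lockname "$2"
  elif [[ ${2-} == -- ]]; then
    (( $# >= 3 )) || return 2
    printf '%s\n' "$1"
  else
    return 2
  fi
}

resolve_lockname -- /usr/local/bin/backup.sh          # prints backup.sh
resolve_lockname backup -- /usr/local/bin/backup.sh   # prints backup
```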
| Option | Argument | Description |
|---|---|---|
| -m, --max-age | HOURS | Maximum lock age before considered stale (default: 24) |
| -w, --wait | - | Wait indefinitely for lock to become available |
| -t, --timeout | SECONDS | Maximum time to wait for lock |
| -s, --steal | - | Forcefully remove existing lock (prompts if holder is running) |
| -h, --help | - | Display help message |
| -V, --version | - | Display version information |
Breaking change in v2.0.0: exit codes have been remapped to BCS-canonical values. Callers that branch on $? MUST update. See commit message and CHANGELOG below.
| Code | Meaning |
|---|---|
| 0 | Command executed successfully (wrapped command exit code passed through) |
| 1 | Lock held (non-timeout acquisition failure) or steal cancelled by user |
| 2 | Usage error (missing COMMAND or -- separator) |
| 13 | Permission denied (no writable lock directory) |
| 22 | Invalid argument (unknown option, bad LOCKNAME, non-numeric value) |
| 24 | Timeout (--timeout N expired) |
| other | Propagated from the wrapped command (standard Unix-wrapper convention, matches nice(1), timeout(1), sudo(1), env(1)) |
Fail immediately if lock is already held:
# Explicit lock name
shlock backup -- /usr/local/bin/backup.sh
# Auto-generated lock name (from command basename)
shlock -- /usr/local/bin/backup.sh
# Lock with arguments
shlock sync -- rsync -av /src /dest
# Lock with custom stale threshold
shlock --max-age 12 critical -- /path/to/critical.sh
Wait until the lock becomes available:
# Wait for deployment lock
shlock --wait deployment -- ./deploy.sh production
# Wait with custom stale threshold
shlock --max-age 6 --wait database-backup -- /usr/local/bin/db-backup.sh
Wait up to a specified time:
# Wait up to 30 seconds
shlock --timeout 30 sync -- rsync -av /src /dest
# Wait up to 5 minutes (300 seconds)
shlock --timeout 300 report -- /usr/local/bin/generate-report.sh
# Critical task with short timeout
shlock --timeout 10 healthcheck -- curl -f http://localhost/health
Break held or abandoned locks:
# Steal lock from dead process (automatic, no prompt)
shlock --steal backup -- /usr/local/bin/backup.sh
# Steal lock from running process (prompts for confirmation)
shlock --steal deployment -- ./deploy.sh production
Prevent overlapping executions:
# In crontab with explicit lock name
*/5 * * * * /usr/local/bin/shlock backup -- /usr/local/bin/backup.sh 2>&1 | logger -t backup
# Using auto-generated lock name
*/5 * * * * /usr/local/bin/shlock -- /usr/local/bin/backup.sh 2>&1 | logger -t backup
# With timeout for long-running tasks
0 2 * * * /usr/local/bin/shlock --timeout 3600 nightly-job -- /usr/local/bin/nightly.sh
# In your script or ExecStart
ExecStart=/usr/local/bin/shlock --wait service-name -- /usr/local/bin/your-service
#!/bin/bash
# Ensure only one deployment runs at a time
# Capture the status first: inside "if ! cmd; then", $? holds the negated status (always 0)
rc=0
shlock --timeout 60 deploy-prod -- ./deploy.sh production || rc=$?
if ((rc != 0)); then
  case $rc in
    1) echo "Deployment already in progress (lock held)" ;;
    24) echo "Timed out waiting for deployment lock" ;;
    *) echo "shlock or deployment failed with code $rc" ;;
  esac
  exit 1
fi
#!/bin/bash
if shlock database-maintenance -- /usr/local/bin/maintenance.sh; then
  echo "Maintenance completed successfully"
else
  exit_code=$?
  case $exit_code in
    1) echo "Lock is held by another process" ;;
    2) echo "Usage error (missing COMMAND or -- separator)" ;;
    13) echo "Permission denied: no writable lock directory" ;;
    22) echo "Invalid argument" ;;
    24) echo "Timeout waiting for lock" ;;
    *) echo "Maintenance script exited with code $exit_code" ;;
  esac
  exit $exit_code
fi
- LOCKNAME Resolution: If LOCKNAME is omitted, derives it from the basename of COMMAND
- Lock Directory Determination: Automatically selects lock directory:
  - Tries /run/lock (standard tmpfs location)
  - Falls back to /var/lock if /run/lock unavailable
  - Falls back to /tmp/locks (created if needed)
  - Fails if no directory is writable
- Lock File Path: Constructs path <LOCK_DIR>/<LOCKNAME>.lock
- Stale Lock Check: If the lock file exists, cleans it up when either (a) older than --max-age with dead holder, or (b) within --max-age but holder process is dead. Refuses if an over-age lock is held by a running process (see Stale Lock Detection)
- Lock Stealing (optional): If --steal is specified, removes existing lock (auto-cleans dead-process locks, prompts for running processes)
- Lock Acquisition: Uses flock(1) for atomic, kernel-level locking
- PID Tracking: Writes the script's PID to <LOCK_DIR>/<LOCKNAME>.pid
- Command Execution: Runs the specified command while holding the lock
- Cleanup: Automatically removes PID file on exit; lock file persists for reuse
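The acquisition, PID tracking, execution, and cleanup steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not shlock's source: run_locked is a hypothetical name, the lock directory is passed in explicitly, and stale-lock checks, --steal, waiting modes, and signal traps are all left out.

```bash
#!/usr/bin/env bash
# Sketch only: run_locked is a hypothetical helper, not shlock's source code.
# Usage: run_locked LOCK_DIR LOCKNAME COMMAND [ARGS...]
run_locked() {
  local lock_dir=$1 name=$2
  shift 2
  local lock_file="$lock_dir/$name.lock" pid_file="$lock_dir/$name.pid"
  local fd rc=0

  [[ -d $lock_dir && -w $lock_dir ]] || return 13   # no writable lock directory

  # Open (or create) the lock file on a fresh descriptor without truncating it.
  exec {fd}>>"$lock_file"

  # Non-blocking attempt: flock -n fails at once if the lock is already held.
  if ! flock -n "$fd"; then
    exec {fd}>&-
    return 1
  fi

  printf '%s\n' "$$" > "$pid_file"   # record the holder's PID
  "$@" || rc=$?                      # run the command, keep its exit status
  rm -f -- "$pid_file"               # the PID file goes, the lock file stays
  exec {fd}>&-                       # closing the descriptor releases the lock
  return "$rc"
}
```

For example, run_locked /tmp demo echo hello runs echo while holding /tmp/demo.lock. Because the kernel drops a flock(2) lock when the last descriptor on it closes, the lock is released even if the process is killed before the cleanup lines run.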
Lock files are stored in the first writable directory from this list:
- /run/lock/ (preferred) - tmpfs filesystem, cleared on reboot
- /var/lock/ (fallback) - persistent across reboots only where it is a real directory; on many current distributions it is a symlink to /run/lock
- /tmp/locks/ (last resort) - created automatically if needed, cleared on reboot
File patterns:
- Lock files: <LOCK_DIR>/<LOCKNAME>.lock
- PID files: <LOCK_DIR>/<LOCKNAME>.pid
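The "first writable directory" rule can be sketched as a small loop. pick_lock_dir is a hypothetical helper written for illustration; it assumes that only the last candidate is created on demand, which mirrors the /tmp/locks rule, and it reuses exit code 13 from the exit code table.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first usable directory among the candidates.
# Only the last candidate is created on demand (the /tmp/locks rule).
# Example: lock_dir=$(pick_lock_dir /run/lock /var/lock /tmp/locks)
pick_lock_dir() {
  local dir last=''
  if (( $# > 0 )); then
    last=${!#}                       # last positional parameter
  fi
  for dir in "$@"; do
    if [[ $dir == "$last" && ! -d $dir ]]; then
      mkdir -p -- "$dir" 2>/dev/null || true
    fi
    if [[ -d $dir && -w $dir ]]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 13                          # no writable lock directory
}
```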
Before attempting to acquire the lock, shlock examines the existing lock file and applies one of two cleanup paths:
- Age-based (--max-age exceeded): If the lock file mtime is older than --max-age hours (default: 24):
  - If the PID in the PID file is dead → lock is removed (stale, cleaned).
  - If the PID is still running → lock acquisition fails with exit code 1 (long-running process, not stale).
- Holder-based (within --max-age): If the lock file is younger than --max-age but the PID file holder is no longer running, the lock is reclaimed automatically with a warning. This covers crashes where the process died before releasing the lock.
Both paths leave an existing running holder's lock untouched.
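The two cleanup paths can be expressed as one decision function. The sketch below is an illustration, not shlock's code: classify_lock and its output labels are hypothetical, a missing or non-numeric PID file is assumed to mean "no recorded holder", stat -c %Y assumes GNU coreutils, and kill -0 reports a process owned by another user as dead when run without privileges.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating the age-based and holder-based paths.
pid_alive() { kill -0 "$1" 2>/dev/null; }

# Usage: classify_lock LOCK_FILE PID_FILE MAX_AGE_HOURS
# Prints one of: absent, free, held, refuse-running, reclaim-aged, reclaim-dead
classify_lock() {
  local lock_file=$1 pid_file=$2 max_age_hours=$3
  local pid age_s

  [[ -e $lock_file ]] || { echo absent; return 0; }

  pid=$(cat "$pid_file" 2>/dev/null) || pid=''
  [[ $pid =~ ^[0-9]+$ ]] || { echo free; return 0; }   # assumption: no holder

  age_s=$(( $(date +%s) - $(stat -c %Y "$lock_file") ))
  if (( age_s > max_age_hours * 3600 )); then
    # Age-based path: over --max-age
    if pid_alive "$pid"; then echo refuse-running; else echo reclaim-aged; fi
  else
    # Holder-based path: within --max-age
    if pid_alive "$pid"; then echo held; else echo reclaim-dead; fi
  fi
}
```

In this sketch only reclaim-aged and reclaim-dead lead to removal; held and refuse-running both leave a running holder's lock untouched.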
Non-blocking (default):
- Attempts to acquire lock once
- Fails immediately if lock is held
- Best for: Cron jobs where you want to skip if already running
Blocking (--wait):
- Waits indefinitely for lock to become available
- Acquires lock as soon as it's released
- Best for: Sequential tasks that must eventually run
Timeout (--timeout SECONDS):
- Waits up to specified seconds for lock
- Fails with exit code 24 if timeout expires
- Works independently; does not require --wait. If both are specified, --timeout takes priority
- Best for: Tasks with time constraints
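One plausible way to map the three modes onto flock(1) is shown below. This is an assumption about the mapping, not a description of shlock's internals: flock_mode_flags is a hypothetical helper, -n and -w are the standard util-linux flock flags for non-blocking and timed waits, and a plain flock call blocks indefinitely.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the flock(1) flags for a waiting mode.
# Usage: flock_mode_flags WAIT(0|1) TIMEOUT_SECONDS(or empty)
flock_mode_flags() {
  local wait=$1 timeout=$2
  if [[ -n $timeout ]]; then
    printf '%s\n' "-w $timeout"   # timeout mode; takes priority over --wait
  elif (( wait )); then
    printf '\n'                   # blocking mode: no flag, wait indefinitely
  else
    printf '%s\n' "-n"            # non-blocking mode (default)
  fi
}

flock_mode_flags 0 ''    # prints -n
flock_mode_flags 1 30    # prints -w 30
```

Note that flock itself reports both a held lock and an expired timeout as a failure, so a wrapper has to track which mode was requested in order to return exit code 24 for timeouts.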
# In crontab - runs every 5 minutes but skips if previous run is still active
*/5 * * * * shlock sync -- /usr/local/bin/sync-data.sh
# Or use auto-generated lock name
*/5 * * * * shlock -- /usr/local/bin/sync-data.sh
# Multiple scripts accessing the same database
shlock --wait database -- /usr/local/bin/db-operation-1.sh
shlock --wait database -- /usr/local/bin/db-operation-2.sh
# Ensure only one deployment runs at a time
shlock --timeout 300 deployment -- ./deploy.sh "$ENVIRONMENT"
# Prevent multiple instances of CPU/IO-heavy operations
shlock backup -- /usr/local/bin/full-backup.sh
shlock indexing -- /usr/local/bin/rebuild-search-index.sh
# Prevent multiple restart attempts
shlock --timeout 30 service-restart -- systemctl restart myservice
# When a lock was left behind by a crashed process
shlock --steal backup -- /usr/local/bin/backup.sh
The utility includes a comprehensive test suite with 127 test cases covering all functionality.
# Run all tests
cd /ai/scripts/lib/shlock/tests
./run_tests.sh
# Run specific test file
./test_basic.sh
./test_wait_timeout.sh
- test_basic.sh (21 tests): Basic functionality, argument handling, exit codes, short option bundling, wrapped command propagation
- test_concurrent.sh (13 tests): Concurrent lock acquisition, race conditions
- test_edge_cases.sh (24 tests): Edge cases, stress tests, special characters
- test_errors.sh (36 tests): Error handling, invalid inputs, signal handling, LOCKNAME sanitization
- test_stale_locks.sh (11 tests): Stale lock detection, max-age thresholds
- test_steal.sh (9 tests): Lock stealing, dead/running process handling, steal combinations
- test_wait_timeout.sh (13 tests): Blocking mode, timeout behavior, queuing
Symptom: Lock appears held even though no process is running
Solutions:
# Use --steal to break a held lock
shlock --steal YOUR_LOCKNAME -- your-command
# Check for lock files
ls -la /run/lock/YOUR_LOCKNAME.*
# Check which process holds the lock
cat /run/lock/YOUR_LOCKNAME.pid
ps -p "$(cat /run/lock/YOUR_LOCKNAME.pid)"
# Force remove stale lock manually (use with caution)
rm -f /run/lock/YOUR_LOCKNAME.lock /run/lock/YOUR_LOCKNAME.pid
Symptom: Cannot create lock files
Solutions:
# Check /run/lock permissions
ls -ld /run/lock
# Ensure your user can write to /run/lock
# Typically this requires being in the appropriate group or running as root
Symptom: --timeout flag not recognized or failing
Check:
- Verify flock supports -w option: flock --help | grep -e '-w'
- Update util-linux if needed: apt-get update && apt-get install util-linux
- Ensure timeout value is numeric and positive
Symptom: Lock is removed even when process is running
Check:
# Verify timestamp on lock file
stat /run/lock/YOUR_LOCKNAME.lock
# Check if system time is correct
date
# Good - explicit lock names
shlock database-backup -- ...
shlock customer-data-sync -- ...
shlock nightly-reports -- ...
# Also good - auto-generated from descriptive script names
shlock -- /usr/local/bin/database-backup.sh
shlock -- /usr/local/bin/customer-data-sync.sh
# Avoid
shlock lock1 -- ...
shlock temp -- ...
# Short-running tasks (< 1 hour)
shlock --max-age 2 quick-sync -- ...
# Medium tasks (few hours)
shlock --max-age 12 backup -- ...
# Long-running tasks (overnight)
shlock --max-age 48 monthly-report -- ...
# Don't let deployments wait forever
shlock --timeout 300 deployment -- ./deploy.sh
# Quick healthchecks should timeout fast
shlock --timeout 5 healthcheck -- ./check-health.sh
if ! shlock backup -- /usr/local/bin/backup.sh; then
  # Alert, log, or take corrective action
  echo "Backup failed or locked" | mail -s "Backup Alert" admin@example.com
fi
# In cron
* * * * * shlock task -- /path/to/script.sh 2>&1 | logger -t task-lock
# In scripts
shlock task -- /path/to/script.sh 2>&1 | tee -a /var/log/task.log
#!/bin/bash
# Check if lock is held too long
# The .lock file persists after release, so check the PID file, which exists only while the lock is held
PID_FILE="/run/lock/backup.pid"
MAX_AGE_SECONDS=7200 # 2 hours
if [[ -f "$PID_FILE" ]]; then
  AGE=$(($(date +%s) - $(stat -c %Y "$PID_FILE")))
  if ((AGE > MAX_AGE_SECONDS)); then
    echo "Warning: backup lock held for ${AGE} seconds" | \
      mail -s "Lock Alert" admin@example.com
  fi
fi
# README or comment in script
# This script uses locks:
# - "database-backup" - Exclusive access to database during backup
# - "file-sync" - Prevents concurrent rsync operations
#
# Dependencies:
# - database-backup must complete before file-sync can run
#!/bin/bash
# Outer operation
shlock operation-a -- bash -c '
  echo "Running operation A"
  # Inner operation with different lock
  shlock operation-b -- echo "Running operation B"
'
#!/bin/bash
if [[ "$FORCE" == "yes" ]]; then
  # Skip lock for forced execution
  /usr/local/bin/task.sh
else
  # Normal locked execution
  shlock task -- /usr/local/bin/task.sh
fi
[Unit]
Description=My Locked Service
After=network.target
[Service]
Type=oneshot
ExecStart=/usr/local/bin/shlock --wait my-service -- /usr/local/bin/my-service.sh
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
- Lock file creation: Negligible overhead (< 1ms)
- Lock acquisition: Atomic kernel operation (< 1ms)
- Stale lock check: Single file stat + process check (< 10ms)
- Lock release: Automatic on process exit
The utility adds minimal overhead to command execution, making it suitable for frequent operations and time-sensitive tasks.
- File Permissions: Lock files inherit permissions from /run/lock (typically world-writable with sticky bit)
- PID Spoofing: The utility validates process existence but doesn't verify process identity
- Race Conditions: flock provides atomic locking, preventing race conditions
- Symlink Attacks: Lock files are created with > redirection, which follows symlinks
For security-critical applications, consider:
- Running with appropriate user permissions
- Using dedicated lock directories with restricted permissions
- Implementing additional process validation
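Two of these precautions can be sketched in Bash. Both functions are hypothetical hardening helpers, not features of shlock: one creates a lock directory that only its owner can use, the other refuses a lock path that is a symlink. The symlink check alone is still racy in a world-writable directory, which is why it is paired with the private directory.

```bash
#!/usr/bin/env bash
# Hypothetical helper: create a lock directory usable only by its owner.
make_private_lock_dir() {
  local dir=$1
  mkdir -p -- "$dir" && chmod 700 -- "$dir"
}

# Hypothetical helper: fail if the lock path is a symlink, so that a later
# ">" redirection cannot be steered to another file.
refuse_symlink() {
  local lock_file=$1
  if [[ -L $lock_file ]]; then
    echo "refusing symlinked lock file: $lock_file" >&2
    return 1
  fi
  return 0
}
```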
Q: What happens if the system crashes while holding a lock?
A: The lock file persists but becomes stale. On the next acquisition attempt, shlock removes it if the recorded PID is no longer running, whether or not the lock is older than --max-age. Lock directories on tmpfs (such as /run/lock) are also cleared on reboot.
Q: Can I use the same lock name from different scripts?
A: Yes, that's the intended use. The same lock name ensures mutual exclusion across all scripts using it.
Q: What if /run/lock doesn't exist?
A: shlock automatically falls back through multiple directories: /run/lock → /var/lock → /tmp/locks (created if needed). If none are writable, the script fails with an error message.
Q: Is it safe to use in containers?
A: Yes, but note that locks are container-scoped. Different containers don't share locks unless they share the same /run/lock volume.
Q: Can I use this with non-Bash scripts?
A: Yes, you can lock any executable: shlock task -- python3 script.py or shlock task -- /usr/bin/my-binary
Q: How many locks can I have?
A: Practically unlimited. Each lock is just two small files in the lock directory.
Contributions are welcome! Please ensure:
- All tests pass: ./tests/run_tests.sh
- Shellcheck compliance: shellcheck shlock
- Documentation updates for new features
This utility is part of the Okusi Group bash scripting standard library.
Exit codes remapped to BCS-canonical values. No compatibility flag provided — callers that branch on $? MUST update.
Migration:
| Old | New | Meaning |
|---|---|---|
| 1 | 1 | Lock held, steal cancelled (unchanged) |
| 1 | 13 | No writable lock directory |
| 1 | 24 | --timeout expired |
| 2 | 2 | Missing COMMAND or -- separator (unchanged) |
| 2 | 22 | Unknown option, invalid LOCKNAME, non-numeric -m/-t |
| 3 | propagated | Wrapped command's own exit code |
The "Command failed with exit code N" message has been removed; the wrapped command's own stderr is authoritative. shlock is now a transparent wrapper matching nice(1), timeout(1), sudo(1), and env(1).
See git log.
- flock(1) - Linux manual page
- fcntl(2) - POSIX file locking
- Bash Coding Standard: /ai/scripts/Okusi/bash-coding-standard/
Version: 2.0.0 Last Updated: 2026-04-19 Maintainer: Gary Dean (Biksu Okusi)