Version: v2.2
Core Philosophy: "Read, Understand, Verify."
Cerebro is not a simple "copy-paste" backup script. It is an intelligent state-monitor wrapped in a backup utility.
Unlike standard tools (like rsync or simple tar cron jobs) that blindly copy data on a schedule, Cerebro operates on a "Verify First" architecture. It stages data, compares it against the last known valid state across multiple storage destinations, and only commits a new backup if meaningful changes are detected.
It is designed for complex environments (Self-hosted stacks, Docker containers, System configurations) where you need to distinguish between critical configuration changes and meaningless noise.
Cerebro solves three critical problems inherent in standard backup solutions:
Most backup tools trigger a new archive if a file's timestamp changes. Cerebro inspects the content.
- The Problem: A Docker container rotates a log file. A standard backup tool sees a "change" and creates a 2GB duplicate archive.
- The Cerebro Solution: Through the `[NOLOG]` and `[TARDISCARD]` protocols, Cerebro detects that only a log file changed. It can record the event in the system log but discard the heavy archive, saving gigabytes of storage over time while keeping your backup timeline clean.
Standard backups are opaque; you don't know what changed until you restore.
- The Cerebro Solution: The `[LOGDIFF]` feature allows you to define text files (scripts, configs, `.env` files). When these change, Cerebro diffs the content and writes the specific added/removed lines directly into `cerebro.log`.
- Result: You can read your log file like a commit history, seeing exactly when and how a configuration was altered without ever touching a `.tar.gz` file.
Cerebro does not assume a single storage location is available.
- The Cerebro Solution: It supports an array of `[DESTINATIONS]`. Before every run, it scans all defined locations (NAS, USB drives, cloud mounts) to find the true "Latest" backup. After a run, it synchronizes the new backup to all healthy destinations, ensuring your off-site and on-site copies are identical.
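A minimal sketch of how such a discovery pass can work, assuming the timestamped filenames shown later in this guide (`backup_YYYYMMDD_HHMMSS-*.tar.gz`) and example destination paths; the script's actual logic may differ:

```bash
# Hypothetical destinations; substitute your own [DESTINATIONS].
destinations=(/media/G-Drive/backup/cerebro /mnt/nas/backup/cerebro)
latest=""
for dest in "${destinations[@]}"; do
  for f in "$dest/$(hostname)"/backup_*.tar.gz; do
    [ -e "$f" ] || continue            # skip unmounted/empty destinations
    # lexical comparison is enough because names embed YYYYMMDD_HHMMSS
    [ "$(basename "$f")" \> "$(basename "$latest")" ] && latest="$f"
  done
done
echo "Latest backup across all destinations: ${latest:-none found}"
```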
Cerebro follows a strict execution pipeline to ensure data integrity:
- Staging: Creates a temporary, isolated environment in `/tmp` for comparison work.
- Latest Backup Discovery: Scans ALL `[DESTINATIONS]` to find the most recent valid backup (by timestamp).
- Comparison: Extracts the previous backup and performs a file-by-file diff against live data.
- Decision Matrix:
  - No Change: Abort, delete the staged tar, log "No changes detected."
  - LOGDIFF Change: Extract and log the actual text differences; tag the backup as `LOGDIFF`.
  - NOLOG Change: Note the change but don't log content; tag the backup as `NOLOG` or `DISCARD` based on rules.
- Tagging & Metadata: Create an entry in `.tar_meta_data.txt` with the backup type (LOGDIFF/NOLOG/DISCARD).
- Creation: Build the final `tar.gz` archive with all excludes applied.
- Verification: Run `tar -tzf` to ensure the archive is not corrupted.
- Distribution: Use `rsync` with a timeout to copy the valid archive to all reachable `[DESTINATIONS]`.
- Cleanup: If `[TARDISCARD]` `DISCARD=1`, remove old DISCARD-tagged backups. Prune the log file if `[LOGPRUNE]` is enabled.
- Self-Maintenance: Update the cron job, remove temp files, release the lock.
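The Decision Matrix step can be pictured as a small classification pass. Below is a hedged sketch with assumed variable names (`logdiff_changes`, `nolog_changes`, `staged_tar`), not the script's actual code; the real rules also produce the mixed `NOLOG` tag:

```bash
# Example inputs (assumed): counts of changed files per pattern group.
logdiff_changes=1
nolog_changes=0
staged_tar="backups/backup_20240215_040000-1.tar.gz"

if (( logdiff_changes > 0 )); then
  tag="LOGDIFF"                        # meaningful change: always kept
elif (( nolog_changes > 0 )); then
  tag="DISCARD"                        # noise-only change: prune-eligible
else
  echo "$(date '+%F %T') - No changes detected." >> cerebro.log
  rm -f "$staged_tar"                  # abort and delete the staged tar
  exit 0
fi
echo "$(basename "$staged_tar"):$tag" >> assets/.tar_meta_data.txt
```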
Cerebro is controlled entirely via cerebro.cfg. This file defines the "Intelligence" of the system.
Standard path definitions. Supports wildcards (*).
How it works:
- `[INCLUDE]`: Absolute paths to files or directories you want backed up.
- `[EXCLUDE]`: Patterns or paths to skip (logs, temp files, cache directories).
Built-in Protection:
- Automatic Backup Directory Exclusion: Cerebro automatically excludes its own `backups/` folder to prevent recursive backup loops. You do NOT need to manually add `/path/to/cerebro/backups` to `[EXCLUDE]` - the script handles this internally regardless of where you install it.
Example:
```
[INCLUDE]
/home/pi/Docker
/home/pi/.config/important-app
/etc/nginx/nginx.conf
[EXCLUDE]
*.log
*.tmp
/home/pi/Docker/container-logs
```

Files matching these patterns get their CONTENT read and diffed.
- Use Case: Configuration files, scripts, docker-compose files - anything where you need to see WHAT changed.
- Behavior: When a file matching this pattern changes, Cerebro extracts the previous version, runs `diff`, and writes the added/removed lines directly into `cerebro.log`.
- Benefit: Your log becomes a version control system. You can see exactly when `docker-compose.yml` changed and what lines were modified without extracting any tar files.
What gets logged:
```
FILE: /home/pi/Docker/docker-compose.yml
DIFFERENCE: Content changed between old and new backup
< OLD LINE: image: nginx:1.20
> NEW LINE: image: nginx:1.21
< OLD LINE: - "8080:80"
> NEW LINE: - "8081:80"
```
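Output in this shape can be produced with standard `diff` plus a little relabeling. A hypothetical sketch (paths and log format are examples, not the script's actual implementation):

```bash
old="/tmp/cerebro-staging/home/pi/Docker/docker-compose.yml"  # from previous tar
new="/home/pi/Docker/docker-compose.yml"                      # live file
if ! diff -q "$old" "$new" >/dev/null 2>&1; then
  {
    echo "FILE: $new"
    echo "DIFFERENCE: Content changed between old and new backup"
    # relabel diff's '<'/'>' markers to the OLD/NEW style shown above
    diff "$old" "$new" | sed -n 's/^< /< OLD LINE: /p; s/^> /> NEW LINE: /p'
  } >> cerebro.log
fi
```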
Patterns:
```
[LOGDIFF]
*.yml
*.sh
*.conf
*.env
*.py
*.json
```

Files here trigger backups but DON'T log content differences.
- Use Case: Binary files, databases, frequently changing data where the diff output would be useless noise.
- Behavior: Cerebro detects the file changed (via hash/size comparison, sketched below) and triggers a backup, but it does NOT write the diff to the log file. Instead it just notes: "File X changed."
- Why? A 500MB `.db` file changing would create a 500MB diff in your log. This keeps logs readable while still capturing state changes.
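A hedged sketch of what size/hash change detection can look like (assumed paths and logic; the script's internals may differ):

```bash
old="/tmp/cerebro-staging/home/pi/Apps/app-state.db"  # extracted from last backup
new="/home/pi/Apps/app-state.db"                      # live file
# Compare size first (cheap), then checksum (thorough) - no content is logged.
if [ "$(stat -c%s "$old")" != "$(stat -c%s "$new")" ] \
   || [ "$(sha256sum < "$old")" != "$(sha256sum < "$new")" ]; then
  echo "$(date '+%F %T') - File changed (content not logged): $new" >> cerebro.log
fi
```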
Common Use Cases:
- SQLite databases (`*.db`)
- Application state files
- Binary configuration blobs
- Docker volume data that changes frequently but where you don't need content-level visibility
Example:
```
[NOLOG]
/home/pi/Docker/pi-hole/data/*
/home/pi/Apps/app-state.db
*.sqlite
```

Intelligently prunes backups that only contain noise-level changes.
- `DISCARD=0`: Disabled. All backups are kept.
- `DISCARD=1`: Enabled. "Smart Prune" logic activates.
How it works: When a backup run completes, Cerebro checks the metadata to see WHAT triggered the backup:
- If the backup contains LOGDIFF changes → Keep it (tagged as `LOGDIFF`).
- If the backup contains ONLY NOLOG changes → Tag it as `DISCARD`.
- On the NEXT run, if a new backup is created, Cerebro deletes all old `DISCARD`-tagged backups, keeping only the most recent `DISCARD` backup.
Why this matters:
- You have a Docker container that writes to a log file every 5 minutes.
- Without TARDISCARD: You'd have 288 backups per day (one every 5 minutes), all containing the same data except for log rotation.
- With TARDISCARD: You have 1 backup representing the "last known NOLOG state" and all your meaningful backups (when actual configs changed).
Storage saved: In a typical homelab, this can save 80-90% of backup storage over a year.
Define multiple backup storage locations.
Cerebro treats this as an array of equals - no "primary" or "secondary". Before every run, it scans ALL destinations to find the latest backup across all of them.
Features:
- Automatic hostname appending: If you define `/mnt/nas/backup/cerebro`, Cerebro will actually write to `/mnt/nas/backup/cerebro/hostname` - this allows multiple machines to use the same NAS share without conflicts.
- Resilience: If one destination is offline (unmounted NAS), Cerebro continues with the available ones.
- Sync logic: After creating a new backup, Cerebro attempts to copy it to ALL reachable destinations using `rsync` with timeout protection.
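An illustrative distribution pass, assuming the 300-second timeout described later under Network Considerations (destination paths and rsync flags are examples, not the script's exact invocation):

```bash
archive="backups/backup_20240215_040000-1.tar.gz"   # example staged archive
host=$(hostname)
for dest in /media/G-Drive/backup/cerebro /mnt/nas/backup/cerebro; do
  if ! mkdir -p "$dest/$host" 2>/dev/null; then
    echo "Destination unreachable, skipping: $dest"  # resilience: keep going
    continue
  fi
  timeout 300 rsync -a "$archive" "$dest/$host/" \
    || echo "Transfer failed or timed out: $dest"
done
```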
Example:
```
[DESTINATIONS]
/media/G-Drive/backup/cerebro
/mnt/nas/backup/cerebro
/mnt/cloud-mount/backups
```

What happens:
- Machine hostname is `raspberrypi`
- Cerebro will write to:
  - `/media/G-Drive/backup/cerebro/raspberrypi/backup_20240215_040000-1.tar.gz`
  - `/mnt/nas/backup/cerebro/raspberrypi/backup_20240215_040000-1.tar.gz`
  - `/mnt/cloud-mount/backups/raspberrypi/backup_20240215_040000-1.tar.gz`
Control when Cerebro runs automatically.
```
[SCHEDULE]
cron=1
schedule=00 04 * * 1
```

- `cron=0`: Disable automatic scheduling. Run manually only.
- `cron=1`: Enable automatic scheduling.
- `schedule=`: Standard cron syntax. Use https://crontab.guru/ to generate.
Common schedules:
- `00 04 * * *` - Daily at 4:00 AM
- `00 04 * * 1` - Weekly on Monday at 4:00 AM
- `00 */6 * * *` - Every 6 hours
- `*/30 * * * *` - Every 30 minutes (aggressive monitoring)
Important: Cron uses the system's local timezone. If your system timezone is UTC but you want backups at 4 AM local time, you need to convert:
```bash
# Check your timezone
timedatectl
# Or
date +%Z
```

How it works:
When you run ./cerebro.sh for the first time (or any time config changes), it automatically installs/updates the cron job. You never need to manually edit crontab.
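Idempotent cron installation can be done along these lines. A hypothetical sketch (the script's real crontab handling may differ):

```bash
# Replace any existing Cerebro entry, then append the current schedule.
schedule="00 04 * * 1"                              # from [SCHEDULE]
entry="$schedule /opt/cerebro/cerebro.sh --update"
( crontab -l 2>/dev/null | grep -v 'cerebro\.sh'; echo "$entry" ) | crontab -
```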
Cron vs Manual backups:
- Cron backups are named with time-based suffixes: `backup_20240215_040000-1.tar.gz` (1 = 00:00-05:59, 2 = 06:00-11:59, etc.)
- Manual backups are named with letter suffixes: `backup_20240215_153000-a.tar.gz`, `backup_20240215_154500-b.tar.gz` (cycling a-z)
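The 1-4 suffix maps directly onto six-hour blocks of the day. A minimal sketch of how such a suffix can be derived (illustrative, not the script's actual code):

```bash
hour=$(date +%-H)               # current hour, 0-23 (GNU date, no leading zero)
window=$(( hour / 6 + 1 ))      # 1 = 00:00-05:59 ... 4 = 18:00-23:59
echo "backup_$(date +%Y%m%d_%H%M%S)-${window}.tar.gz"
```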
Control what gets written to cerebro.log.
```
[LOGTYPE]
DEBUG=0
INFO=1
NOFIRSTRUN=0
```

- `DEBUG=1`: Ultra-verbose. Logs every file being processed, timing info, decision trees. Use for troubleshooting.
- `DEBUG=0`: Production mode. Only logs significant events.
- `INFO=1`: Log normal operational messages (backup created, files transferred, etc.)
- `INFO=0`: Silent mode. Only log errors and file differences.
- `NOFIRSTRUN=1`: Suppress the "First run, no previous backup to compare" message. Useful if you're running Cerebro on a new system and don't want log noise.
Recommended settings:
- Development/Testing: `DEBUG=1`, `INFO=1`
- Production: `DEBUG=0`, `INFO=1`
- High-frequency monitoring: `DEBUG=0`, `INFO=0` (only log real changes)
Automatically clean old entries from cerebro.log.
```
[LOGPRUNE]
ENABLED=1
DISCARD_MAX_AGE_DAYS=1
```

- `ENABLED=1`: Active. Cerebro will delete log entries older than the specified age.
- `DISCARD_MAX_AGE_DAYS=1`: Keep only the last 1 day of logs.
Why this exists:
If you run Cerebro every 5 minutes with DEBUG=1, your log file will grow to gigabytes in a week. This feature keeps the log manageable while retaining recent history.
What gets deleted: Only the timestamped log entries. The current run's log header is always kept.
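Age-based pruning is also why `gawk` is listed as a hard requirement later in this guide (its `mktime()` is a GNU extension). A hedged sketch of the idea, assuming log entries begin with a `YYYY-MM-DD HH:MM:SS` timestamp as in the sample output shown later:

```bash
cutoff=$(date -d "1 day ago" +%s)     # DISCARD_MAX_AGE_DAYS=1
gawk -v cutoff="$cutoff" '
  match($0, /^([0-9]{4})-([0-9]{2})-([0-9]{2}) ([0-9]{2}):([0-9]{2}):([0-9]{2})/, m) {
    ts = mktime(m[1] " " m[2] " " m[3] " " m[4] " " m[5] " " m[6])
    if (ts < cutoff) next             # drop entries older than the cutoff
  }
  { print }                           # keep everything else (headers, recent lines)
' cerebro.log > cerebro.log.tmp && mv cerebro.log.tmp cerebro.log
```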
Recommendation:
- High-frequency backups (< 1 hour): `DISCARD_MAX_AGE_DAYS=1`
- Daily backups: `DISCARD_MAX_AGE_DAYS=30`
- Weekly backups: `DISCARD_MAX_AGE_DAYS=365`
Cerebro maintains a hidden metadata file at $SCRIPT_DIR/assets/.tar_meta_data.txt. This file tracks every backup and its classification.
Format:
```
backup_20240215_040000-1.tar.gz:LOGDIFF
backup_20240215_100000-2.tar.gz:NOLOG
backup_20240215_160000-3.tar.gz:DISCARD
backup_20240215_220000-4.tar.gz:DISCARD:LATEST
```
Tags:
- `LOGDIFF`: This backup contains changes to files in `[LOGDIFF]`. Always kept.
- `NOLOG`: This backup contains changes to files in `[NOLOG]` AND `[LOGDIFF]`. Always kept.
- `DISCARD`: This backup contains ONLY `[NOLOG]` changes. Eligible for deletion.
- `DISCARD:LATEST`: The most recent DISCARD backup. Protected until a newer backup is created.
TARDISCARD Logic:
Current state:
```
backup_001.tar.gz:LOGDIFF
backup_002.tar.gz:DISCARD
backup_003.tar.gz:DISCARD
backup_004.tar.gz:DISCARD:LATEST
```
New backup created (backup_005.tar.gz):
- If it's LOGDIFF → Delete all DISCARD except LATEST
- If it's DISCARD → Promote backup_005 to DISCARD:LATEST, delete backup_002 and backup_003
Result:
```
backup_001.tar.gz:LOGDIFF
backup_004.tar.gz:DISCARD
backup_005.tar.gz:DISCARD:LATEST
```
This ensures you always have:
- All meaningful configuration changes (LOGDIFF/NOLOG)
- The two most recent states (current + previous)
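A simplified sketch of that prune pass, driven by the metadata format above (assumed paths; not the script's actual implementation):

```bash
meta="assets/.tar_meta_data.txt"
dest="/mnt/nas/backup/cerebro/$(hostname)"          # repeat per destination
# Plain DISCARD entries are deleted; DISCARD:LATEST does not match and survives.
grep ':DISCARD$' "$meta" | cut -d: -f1 | while read -r tarball; do
  rm -f "$dest/$tarball"
done
# Drop the pruned entries from the metadata file as well.
grep -v ':DISCARD$' "$meta" > "$meta.tmp" && mv "$meta.tmp" "$meta"
```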
Setup:
```
[INCLUDE]
/home/pi/Docker
[EXCLUDE]
*.log
*.tmp
[LOGDIFF]
*.yml
*.sh
*.conf
[NOLOG]
/home/pi/Docker/*/data/*
[SCHEDULE]
cron=1
schedule=00 04 * * *
[TARDISCARD]
DISCARD=1
```

Result:
You sleep. Cerebro wakes up daily at 4 AM, checks your containers. If nothing changed, it goes back to sleep (no backup created). If an app auto-updated and changed a config, Cerebro snapshots it, logs the version change in cerebro.log, syncs it to your NAS, and you can review it in the morning. If just a database file changed, it creates a backup but marks it as DISCARD to save space.
Setup:
```
[INCLUDE]
/home/user/scripts
/home/user/projects
[LOGDIFF]
*.sh
*.py
*.js
*.json
*.md
[SCHEDULE]
cron=1
schedule=*/30 * * * *
[TARDISCARD]
DISCARD=0
```

Result:
Every 30 minutes, Cerebro checks your work. Every time you edit a script, it creates a versioned backup. The cerebro.log effectively becomes a "git commit history" showing exactly what code you tweaked, when, and what the diff was. You can trace bugs back to specific edits without needing to remember to commit.
Setup:
```
[INCLUDE]
/etc
/var/www
/opt/production-app
[LOGDIFF]
*.conf
*.ini
*.yml
[DESTINATIONS]
/mnt/local-raid/backup
/mnt/nas/backup
/mnt/cloud-sync/backup
[SCHEDULE]
cron=1
schedule=00 */4 * * *
[TARDISCARD]
DISCARD=1
[LOGPRUNE]
ENABLED=1
DISCARD_MAX_AGE_DAYS=7
```

Result:
High frequency backups (every 4 hours). Multiple redundant destinations (local RAID + NAS + cloud). If your local drive dies, the NAS copy is up to date. If you accidentally break a config file, you can review cerebro.log to see exactly what changed in the last backup, or restore from the previous tar. Log pruning keeps the log file under control.
Setup:
```
[INCLUDE]
/home/pi/.ssh
/etc/passwd
/etc/shadow
/etc/sudoers
/var/log/auth.log
[LOGDIFF]
*
[SCHEDULE]
cron=1
schedule=*/5 * * * *
[LOGTYPE]
DEBUG=1
INFO=1
[TARDISCARD]
DISCARD=0
```

Result: Every 5 minutes, Cerebro checks critical system files. If someone adds an SSH key, modifies sudoers, or changes user passwords, you'll have a timestamped backup and a full diff of exactly what changed. This is forensics-level monitoring - if your system is compromised, you'll know exactly when and what was altered.
Example cerebro.log output after a run where docker-compose.yml changed:
```
========== BACKUP RUN STARTED ==========
2024-02-15 04:00:01 - Run Type: cron
2024-02-15 04:00:01 - Timestamp: 2024-02-15 04:00:01
2024-02-15 04:00:01 - Backup Name: backup_20240215_040001-1.tar.gz
[Backup Creation]
2024-02-15 04:00:02 - Starting backup creation...
2024-02-15 04:00:05 - Backup tar created successfully.
[Comparison]
2024-02-15 04:00:06 - Comparing with latest backup: backup_20240214_040000-1.tar.gz
2024-02-15 04:00:10 - FILE: /home/pi/Docker/docker-compose.yml
2024-02-15 04:00:10 - DIFFERENCE: Content changed between old and new backup
< OLD: image: nginx:1.20
> NEW: image: nginx:1.21
< OLD: - "8080:80"
> NEW: - "8081:80"
2024-02-15 04:00:11 - INFO: Backup tagged as: LOGDIFF
2024-02-15 04:00:11 - Changes detected between:
2024-02-15 04:00:11 - New: /path/to/backup_20240215_040001-1.tar.gz
2024-02-15 04:00:11 - Previous: backup_20240214_040000-1.tar.gz
[Transfer]
2024-02-15 04:00:15 - INFO: Backup copied to /media/G-Drive/backup/cerebro/raspberrypi/backup_20240215_040001-1.tar.gz via rsync.
2024-02-15 04:00:20 - INFO: Backup copied to /mnt/nas/backup/cerebro/raspberrypi/backup_20240215_040001-1.tar.gz via rsync.
[Cleanup]
2024-02-15 04:00:21 - INFO: Backup removed from /path/to/cerebro/backups/backup_20240215_040001-1.tar.gz
[Tar Removal]
2024-02-15 04:00:21 - Removed backup_20240214_100000-2.tar.gz from /media/G-Drive/backup/cerebro/raspberrypi.
2024-02-15 04:00:21 - Removed backup_20240214_100000-2.tar.gz from /mnt/nas/backup/cerebro/raspberrypi.
========== BACKUP RUN ENDED ==========
```
Key takeaways:
- You can see EXACTLY what changed (nginx version and port mapping)
- You know which backup contains the change
- You can see the transfer was successful to both destinations
- You can see old DISCARD backups were cleaned up
Cerebro creates standard .tar.gz files. Restoring is straightforward:
```bash
# Find the backup you want
ls /mnt/nas/backup/cerebro/raspberrypi/

# Extract everything
cd /
sudo tar -xzf /mnt/nas/backup/cerebro/raspberrypi/backup_20240215_040000-1.tar.gz
# This restores all files to their original locations
```

```bash
# List contents
tar -tzf backup_20240215_040000-1.tar.gz | grep docker-compose

# Extract specific file
tar -xzf backup_20240215_040000-1.tar.gz home/pi/Docker/docker-compose.yml
# File is now in ./home/pi/Docker/docker-compose.yml (relative path)

# Copy it to the actual location
sudo cp home/pi/Docker/docker-compose.yml /home/pi/Docker/docker-compose.yml
```

```bash
# Extract to a temp directory
mkdir /tmp/restore-check
tar -xzf backup_20240215_040000-1.tar.gz -C /tmp/restore-check
# Now you can browse /tmp/restore-check to see the backed up state
# without overwriting live data
```

Use the log file:

```bash
# Search for when a specific file changed
grep "docker-compose.yml" cerebro.log
# Look for the backup name associated with that change
# Then extract that specific backup
```

Recall is an optional companion utility for Cerebro. Everything in section 8 works without it. Recall is for when you want a faster, safer, and more human-friendly recovery experience.
recall.sh is a dedicated extraction and restoration tool for the Cerebro backup suite. Rather than manually identifying the correct .tar.gz, extracting with the correct relative path, and copying the file back — Recall does all of that in a single command.
Key behaviours:
- Reads your `cerebro.cfg` automatically — knows your destinations, finds the latest backup itself
- Fuzzy search with disambiguation — find a file by name fragment. If multiple matches exist, you get an interactive selector
- Safe extraction by default — files are always placed at `.bak` paths (e.g. `smb.conf.bak`) so you can inspect before committing. No live file is ever silently overwritten
- Folder-aware — append a trailing slash to extract an entire directory tree
- Multi-term — restore multiple files or folders in a single call
- Multi-destination aware — searches across all your configured `[DESTINATIONS]` for the latest valid backup
recall.sh lives alongside cerebro.sh in the same directory. No additional dependencies beyond what Cerebro already requires.
```bash
chmod +x recall.sh
./recall.sh [search_term_1] [search_term_2] ...
```

Restore a single config file:
```bash
./recall.sh smb.conf
# Output: [SUCCESS] Saved to -> /etc/samba/smb.conf.bak
# Review it, then: sudo mv /etc/samba/smb.conf.bak /etc/samba/smb.conf
```

Restore your rclone auth:

```bash
./recall.sh rclone.conf
# Output: [SUCCESS] Saved to -> /home/pi/.config/rclone/rclone.conf.bak
```

Restore multiple files in one call:

```bash
./recall.sh smb.conf rclone.conf
# Both extracted and placed as .bak files simultaneously
```

Restore an entire folder:

```bash
./recall.sh Apps/tutor/
# Output: [SUCCESS] Saved to -> /home/pi/Apps/tutor_bak/
```

Ambiguous search — interactive picker:

```bash
./recall.sh compose
# Found 4 matching files for 'compose':
# 1) home/pi/Docker/stack-a/docker-compose.yml
# 2) home/pi/Docker/stack-b/docker-compose.yml
# ...
# Select the file to extract for 'compose' (or Cancel):
```

The .bak convention is intentional — it gives you a chance to diff before you commit:
```bash
# Extract
./recall.sh smb.conf

# Review what you are getting back
diff /etc/samba/smb.conf /etc/samba/smb.conf.bak

# If satisfied, apply it
sudo mv /etc/samba/smb.conf.bak /etc/samba/smb.conf
sudo systemctl restart smbd
```

Recall writes its own log to recall.log in the same directory as cerebro.sh. Each session is delimited with RECALL SESSION STARTED / ENDED markers.
When installed, Cerebro creates the following structure:
```
/opt/cerebro/                     # Installation directory (example)
├── cerebro.sh                    # Main script
├── cerebro.cfg                   # Configuration file
├── cerebro.log                   # Log file (all events)
├── backups/                      # Temporary staging area
│   └── (empty - backups are moved to destinations immediately)
└── assets/                       # Metadata directory
    ├── .tar_meta_data.txt        # Backup classification tracking
    └── .manual_backup_counter    # Cycles through a-z for manual backups
```
Destination directories (configured in [DESTINATIONS]):
```
/mnt/nas/backup/cerebro/
└── hostname/                     # Auto-appended based on system hostname
    ├── backup_20240215_040000-1.tar.gz
    ├── backup_20240215_100000-2.tar.gz
    └── backup_20240216_040000-1.tar.gz
```
Key points:
- The `backups/` directory is always empty after successful runs (files are moved to destinations)
- If all destinations fail, backups remain in `backups/` until the next successful run
- The `assets/` directory is critical - losing it breaks TARDISCARD logic (but doesn't affect backup data)
- Each machine backing up to the same NAS gets a separate subdirectory based on hostname
- Linux environment (Bash 4.0+)
- Standard GNU tools (Cerebro self-checks and can install if missing):
  - `tar` - Archive creation
  - `rsync` - File transfer
  - `diff` - Content comparison
  - `gawk` (GNU Awk) - CRITICAL: Required for log pruning. Standard `awk` implementations won't work.
  - `grep`, `sed` - Text processing
  - `find`, `wc` - File operations
  - `timeout` - Process management
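A self-check along these lines is easy to sketch (illustrative; the install hint assumes a Debian-style system):

```bash
missing=()
for tool in tar rsync diff gawk grep sed find wc timeout; do
  command -v "$tool" >/dev/null || missing+=("$tool")
done
if ((${#missing[@]})); then
  echo "Missing tools: ${missing[*]} - try: sudo apt install ${missing[*]}" >&2
fi
```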
You can install Cerebro using curl or wget. The installer will automatically detect your environment, download the script and a template configuration, and set up the directory structure in ~/cerebro.
Option A: Using curl
```bash
mkdir -p ~/cerebro && curl -sL https://raw.githubusercontent.com/Arelius-D/Cerebro/main/install.sh | bash
```

Option B: Using wget

```bash
mkdir -p ~/cerebro && wget -qO - https://raw.githubusercontent.com/Arelius-D/Cerebro/main/install.sh | bash
```

- Place the script:

```bash
mkdir -p /opt/cerebro
cd /opt/cerebro
# Copy cerebro.sh and cerebro.cfg here
chmod +x cerebro.sh
```

- Edit the config:

```bash
nano cerebro.cfg
# Set your [INCLUDE] paths
# Set your [DESTINATIONS]
# Configure [SCHEDULE]
```

- First run (manual):

```bash
./cerebro.sh
```

What happens on first run:
- Cerebro checks for required tools, offers to install if missing
- Reads `cerebro.cfg`
- Creates the `backups/` directory
- Creates the `assets/` directory for metadata
- Since there's no previous backup, it creates the first one and tags it as `LOGDIFF`
- Installs the cron job (if `cron=1` in config)
- Logs to `cerebro.log`
Suppressing "first run" messages:
Set NOFIRSTRUN=1 in [LOGTYPE] if you don't want the "No previous backup found" message in the log.
- Understanding Manual vs. Cron Execution:
When you run Cerebro, it operates in one of two modes:
Manual Mode (default):
```bash
./cerebro.sh
```

- Uses letter suffixes (a-z, cycling): `backup_20240215_153022-a.tar.gz`
- Each manual run increments the letter (a→b→c...→z, then wraps back to a)
- Useful for ad-hoc backups before making risky changes

Cron Mode (--update flag):

```bash
./cerebro.sh --update
```

- Uses number suffixes (1-4, based on time of day):
  - 1 = 00:00-05:59
  - 2 = 06:00-11:59
  - 3 = 12:00-17:59
  - 4 = 18:00-23:59
- Prevents creating dozens of backups per day if cron runs frequently
- The cron job automatically uses this flag (it's appended in the crontab entry)
Example: If your cron runs every hour, you'll get at most 4 backups per day (one per time window), not 24.
- Verify cron installation:
```bash
crontab -l | grep cerebro
```

You should see your schedule.

- Test a cron run manually:

```bash
./cerebro.sh --update
```

This simulates a cron-triggered run.
- Compression: `tar -czf` uses gzip. On modern systems, negligible impact.
- Diffing: Only happens when changes are detected. Text files are fast; large binaries in `[NOLOG]` are skipped.
- Typical homelab load: < 5% CPU for 30 seconds during backup creation.
- Staging: Uses `/tmp` for extraction and comparison. Ensure `/tmp` has space for 2x your largest backup.
- Typical usage: Extracts old tar, compares with live data, creates new tar. Peak memory = size of largest single file being processed.
- Local staging: `$SCRIPT_DIR/backups/` is temporary. Backups are moved to `[DESTINATIONS]` immediately after creation.
- Destination storage: Depends on your data size and retention policy.
- With TARDISCARD: Expect 10-20% of "all backups ever created" due to smart pruning.
- Without TARDISCARD: All backups are kept. Plan accordingly.
Example:
- 10GB of Docker data
- Daily backups
- 1 real config change per week
- 6 log file changes per day (NOLOG events)
Without TARDISCARD: 7 backups/week × 10GB = 70GB/week = 3.6TB/year
With TARDISCARD: 1 LOGDIFF backup/week × 10GB + 1 DISCARD backup = 20GB/week = 1TB/year
- rsync with timeout: Cerebro uses `rsync` with a 300-second timeout. If a transfer hangs (NAS offline, network issue), it will abort and try the next destination.
- Compression: Backups are gzipped, reducing network transfer size.
- Cloud storage: If using cloud mounts (rclone, etc.), ensure stable connectivity. Cerebro will skip unreachable destinations.
Q: Why isn't Cerebro creating backups when files have changed?
A: Check your `[EXCLUDE]` patterns. You might be excluding the files that changed. Enable `DEBUG=1` to see which files are being processed.
Q: How do I reduce backup storage usage?
A:
- Enable `TARDISCARD` `DISCARD=1`
- Move frequently-changing data (logs, temp files, cache) to `[NOLOG]`
- Add patterns to `[EXCLUDE]` for unnecessary data
Q: How do I keep cerebro.log from growing too large?
A: Enable `[LOGPRUNE]` and set `DISCARD_MAX_AGE_DAYS=1` (or your preferred retention).
Q: My scheduled backups aren't running. How do I troubleshoot?
A:

```bash
# Check if cron job exists
crontab -l | grep cerebro
# Check cron service
systemctl status cron
# Check permissions
ls -la /path/to/cerebro.sh
# Check the log file for errors
tail -f cerebro.log
```

Q: How do I migrate Cerebro to a new machine?
A:
- Copy the entire Cerebro directory (script, config, assets folder)
- Update paths in `cerebro.cfg` to match the new system
- Run `./cerebro.sh` once to install the cron job
- Your existing backups in `[DESTINATIONS]` will be discovered automatically
Q: What happens if all destinations are offline during a run?
A: Cerebro will keep the backup in `$SCRIPT_DIR/backups/` and log an error. The backup is not lost, just not transferred. On the next successful run, it will be synced.
Q: Can multiple machines share the same backup destination?
A: Yes. Cerebro automatically appends the hostname to the destination path, so each machine gets its own subdirectory:
```
/mnt/nas/backup/cerebro/
├── machine1/
├── machine2/
└── machine3/
```
Q: How do I temporarily disable Cerebro?
A: Set `cron=0` in `[SCHEDULE]`, then run `./cerebro.sh` once to remove the cron job. Cerebro is now dormant but all settings are preserved.
Q: Why are backup runs slow?
A: If you have large binary files in `[INCLUDE]`, move them to `[NOLOG]`. Cerebro will still back them up but won't try to diff them.
Q: What happens if I delete the assets/ metadata file?
A: Cerebro will rebuild it on the next run. You'll lose the LOGDIFF/NOLOG/DISCARD tags, so TARDISCARD won't work correctly until new backups are created. Your actual backup data is unaffected.
Q: What does "tar: file changed as we read it" mean?
A: This is a warning, not an error. It happens when a file is being written to while tar is reading it (common with active log files or databases).
- Solution 1: Add actively-written files to `[EXCLUDE]` if they're not important
- Solution 2: Run Cerebro during low-activity periods
- Solution 3: Use pre-backup scripts to stop services temporarily (see Advanced Use Cases)
- The backup will still be created; only the actively-written file may be incomplete
Q: Can Cerebro back up remote servers?
A: Not directly. Cerebro operates on local filesystems. However:
- Option 1: Mount the remote filesystem (NFS, SMBFS, SSHFS) and add it to `[INCLUDE]`
- Option 2: Use rsync to sync remote data locally first, then back up the local copy
- Option 3: Run Cerebro on the remote server and use shared storage for `[DESTINATIONS]`
Q: How do I verify that a backup run succeeded?
A: Check three things:
- Exit code: `./cerebro.sh; echo $?` (0 = success)
- Log file: `tail cerebro.log` (should end with "BACKUP RUN ENDED")
- Destination: `ls -lh /path/to/destination/hostname/` (newest file should match timestamp)
Q: What happens if two Cerebro instances run at once?
A: The second instance will detect the lock file (/tmp/cerebro.lock) and exit immediately with an error message. This prevents corruption from concurrent backups.
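Lock-file protection of this kind is commonly built with `flock(1)`. A minimal sketch (illustrative; Cerebro's actual mechanism may differ):

```bash
exec 9>/tmp/cerebro.lock
if ! flock -n 9; then
  echo "Another Cerebro instance is running. Exiting." >&2
  exit 1
fi
# ... backup run proceeds; the lock is released when the script exits
```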
Q: How do I move the Cerebro installation to a new path?
A:
- Stop any running cron jobs: `crontab -e` and comment out the Cerebro line
- Move the entire directory: `mv /old/path/cerebro /new/path/cerebro` then `cd /new/path/cerebro`
- Run `./cerebro.sh` once - this updates the cron job with the new path
- Verify: `crontab -l | grep cerebro` should show the new path
Your backups in [DESTINATIONS] are unaffected; Cerebro will find them automatically.
Pipe Cerebro's output to a mail command in your cron job:
```bash
00 04 * * * /opt/cerebro/cerebro.sh --update 2>&1 | mail -s "Cerebro Backup Report" admin@example.com
```

Parse cerebro.log for the string "Backup tagged as: LOGDIFF" to trigger alerts:
```bash
if grep -q "Backup tagged as: LOGDIFF" cerebro.log; then
  curl -X POST https://monitoring.example.com/webhook -d "Critical config changed"
fi
```

Add a script that runs before Cerebro:
```bash
#!/bin/bash
# pre-cerebro.sh
docker-compose -f /home/pi/Docker/docker-compose.yml down
/opt/cerebro/cerebro.sh --update
docker-compose -f /home/pi/Docker/docker-compose.yml up -d
```

This ensures you're backing up a consistent state.
Encrypt backups before sending to cloud:
```bash
# In your cron job
/opt/cerebro/cerebro.sh --update
gpg --encrypt --recipient admin@example.com /mnt/nas/backup/cerebro/hostname/*.tar.gz
rclone sync /mnt/nas/backup/cerebro/hostname/ remote:encrypted-backup/
```

- `cerebro.sh`: Should be `700` (only owner can read/write/execute)
- `cerebro.cfg`: Should be `600` (contains paths, might contain sensitive info)
- Backups: Inherit permissions from the destination directory
- If backing up `/etc/shadow`, `/home/user/.ssh`, or other sensitive files, ensure:
  - Destination directories are encrypted or access-controlled
  - Backups are not world-readable
  - Log file (`cerebro.log`) doesn't expose sensitive content (use `[NOLOG]` for these files)
- Cerebro can run as a regular user if it has read access to all `[INCLUDE]` paths
- If backing up system files (`/etc`, `/var`), run as root or use sudo
- Cron jobs inherit the user context - ensure the user running Cerebro has appropriate permissions
| Feature | Cerebro | rsync | Duplicity | Borg | tar+cron |
|---|---|---|---|---|---|
| Smart Change Detection | ✅ Content-aware | ❌ Timestamp-based | ✅ Block-level | ✅ Block-level | ❌ Time-based |
| Diff Logging | ✅ Built-in | ❌ Manual | ❌ No | ❌ No | ❌ Manual |
| Multi-Destination Sync | ✅ Automatic | ❌ Manual | ❌ Single | ❌ Single | ❌ Manual |
| Noise Filtering | ✅ TARDISCARD | ❌ No | ❌ No | ❌ No | ❌ No |
| Configuration Format | ✅ Simple INI | ❌ CLI args | ❌ Complex | ❌ Complex | ❌ Scripts |
| Human-Readable Backups | ✅ tar.gz | ✅ Files | ❌ Encrypted | ❌ Repo format | ✅ tar.gz |
| Setup Complexity | ✅ Simple | ✅ Simple | ❌ Moderate | ❌ Moderate | ✅ Simple |
| Incremental | ❌ No | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No |
| Deduplication | ❌ No | ❌ No | ✅ Yes | ✅ Yes | ❌ No |
When to use Cerebro:
- You need visibility into WHAT changed, not just THAT something changed
- You want simple, readable backups (tar.gz) you can extract with standard tools
- You need multi-destination redundancy without complex scripts
- You want to filter out noise (log rotations, temp files) from your backup history
When NOT to use Cerebro:
- You need incremental/differential backups (use Borg or Duplicity)
- You have terabytes of data with minor changes (use rsync or Borg)
- You need encryption at rest (use Duplicity or add GPG to Cerebro workflow)
- You need deduplication (use Borg)
Final Note: Cerebro is built on the premise that data is useless without context. By providing deep visibility into what changed and why a backup occurred, it transforms backups from a "storage chore" into a "system administration asset."
Philosophy: A backup system should answer three questions:
- What do I have? (Latest state)
- What changed? (Diffs and logs)
- When did it change? (Timestamped history)
Cerebro answers all three without requiring you to extract archives or maintain external version control.
Questions? Issues? Contributions? https://github.com/Arelius-D/Cerebro
License: MIT License