My file server is simply a Linux box running Samba and NFS servers, as well as password-protected HTTPS so I can easily grab any files I need from remote locations, so long as there is internet connectivity. My files are stored on external USB drives: one is the master, which is the drive I actively use, and the second is the backup. Every night a cron job runs a script (backup_files.sh) which uses rsync to mirror the master drive onto the backup drive. In addition, the script removes any junk files that may have been left behind on the drive, such as backup files from vi and the metadata that my Mac likes to spew all over the place (which I can't stand!). The script also keeps a log, which is important to check every so often to make sure everything is actually working. Before doing anything, the script makes sure both drives are mounted, so nothing bad happens in the event one of the drives is missing (like the master drive crashing, and the backup script erasing everything on the backup to "synchronize" them). I will describe how this works later.
Now simply having a drive and its backup is not enough in my opinion. If one drive fails, then you are left with only one good copy of everything. If there is a fire, then both drives are gone. So, every month or so I'll swap the backup drive with another, and I keep the unused drive at a remote location. Now my server and the entire building it is in can be destroyed with everything in it, but at least my files will be safe! In addition to the backup I keep on my home server, I also back up all my files to my computer at work with a separate script (backup_files_remotely.sh). This backup also runs nightly, and executes after the local backup has completed. The script requires the use of an RSA key pair, with the private key on my home server, and the public key in $HOME/.ssh/authorized_keys on my work server.
For the most part, using the scripts I describe below, I have a fully automated backup system. The only things I need to do are check the logs once in a while to make sure nothing broke, and swap out my backup drives.
This script first checks for the presence of a file named .identity in the root directory of both the master and backup paths. This ensures that you do not delete the contents of the backup drive when the master drive is not mounted, and that you do not copy the contents of the master drive onto the root filesystem of your system (potentially filling it to capacity) when the backup drive is not mounted. These are both problems I encountered before I finally added this drive-checking feature to the script. The USB drive mounting script shown below also makes use of the .identity file to determine where to mount the drive, based on the name of the drive stored in the .identity file.
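The .identity check can be tried out on its own. Below is a minimal sketch, assuming two hypothetical helper functions (label_drive and drives_ready are names made up for this illustration, they are not part of the scripts in this article). The first writes a drive's name into its .identity file, and the second refuses to continue unless every directory it is given contains one.

```shell
#!/bin/sh
# Sketch only: label_drive and drives_ready are hypothetical helper names.

# Write a drive's name into the .identity file at the top of its filesystem.
# Usage: label_drive MOUNT_POINT NAME   (e.g. label_drive "$HOME/files" files0)
label_drive() {
    printf '%s\n' "$2" > "$1/.identity"
}

# Return 0 only if every directory given contains a .identity file;
# otherwise name each directory that fails the check and return 3.
drives_ready() {
    missing=0
    for dir in "$@"; do
        if [ ! -e "$dir/.identity" ]; then
            echo "$dir is not mounted"
            missing=1
        fi
    done
    if [ "$missing" -eq 1 ]; then
        return 3
    fi
    return 0
}
```

A backup script could then start with something like `drives_ready "$HOME/files" "$HOME/backup_files" || exit 3` before letting rsync touch anything.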
| backup_files.sh |
#!/bin/bash
# Note: FILES_DIR and BACKUP_FILES_DIR must both contain a file with filename .identity for this script to run
FILES_DIR="$HOME/files/"
BACKUP_FILES_DIR="$HOME/backup_files/"
EXCLUDE="lost+found .identity"
LOG_FILE="$HOME/logs/backup_files_log.txt"
KEEP_LOG="1" # set to 0 to disable, 1 to keep a running log, 2 to delete the log and record only current session
TMP_FILE="$HOME/.backup_files_running" # lock file, present only while a backup is running
if [ ! -e "$TMP_FILE" ]; then
    touch "$TMP_FILE"
    # Turn the space-separated EXCLUDE list into rsync --exclude options
    EXCLUDED=""
    for i in $EXCLUDE; do
        EXCLUDED="$EXCLUDED --exclude=$i"
    done
    if [[ $KEEP_LOG -eq 1 || $KEEP_LOG -eq 2 ]]; then
        if [ "$KEEP_LOG" -eq 2 ]; then
            rm -f "$LOG_FILE"
        fi
        if [ -e "$FILES_DIR/.identity" ] && [ -e "$BACKUP_FILES_DIR/.identity" ]; then
            date +%F\ %T\ %A | tee -a "$LOG_FILE"
            ID_FILES=$(cat "$FILES_DIR/.identity")
            ID_BACKUP_FILES=$(cat "$BACKUP_FILES_DIR/.identity")
            echo "" | tee -a "$LOG_FILE"
            echo "Starting rsync backup, from $ID_FILES to $ID_BACKUP_FILES" | tee -a "$LOG_FILE"
            # $EXCLUDED is left unquoted on purpose so it splits into separate options
            rsync $EXCLUDED --delete-after -av "$@" "$FILES_DIR" "$BACKUP_FILES_DIR" | tee -a "$LOG_FILE"
            ERROR=${PIPESTATUS[0]} # exit status of rsync ($? would be the status of tee)
            echo "" | tee -a "$LOG_FILE"
            date +%F\ %T\ %A | tee -a "$LOG_FILE"
            echo "--------------------------------------------------------------------------------" | tee -a "$LOG_FILE"
            rm -f "$TMP_FILE"
            exit $ERROR
        else
            date +%F\ %T\ %A | tee -a "$LOG_FILE"
            echo "" | tee -a "$LOG_FILE"
            echo "Drives are not mounted (or no .identity file exists on drive)" | tee -a "$LOG_FILE"
            if [ ! -e "$FILES_DIR/.identity" ]; then
                echo "$FILES_DIR is not mounted" | tee -a "$LOG_FILE"
            fi
            if [ ! -e "$BACKUP_FILES_DIR/.identity" ]; then
                echo "$BACKUP_FILES_DIR is not mounted" | tee -a "$LOG_FILE"
            fi
            echo "" | tee -a "$LOG_FILE"
            echo "--------------------------------------------------------------------------------" | tee -a "$LOG_FILE"
            rm -f "$TMP_FILE"
            exit 3
        fi
    else
        # Logging disabled: same steps, printed to the terminal only
        if [ -e "$FILES_DIR/.identity" ] && [ -e "$BACKUP_FILES_DIR/.identity" ]; then
            date +%F\ %T\ %A
            ID_FILES=$(cat "$FILES_DIR/.identity")
            ID_BACKUP_FILES=$(cat "$BACKUP_FILES_DIR/.identity")
            echo ""
            echo "Starting rsync backup, from $ID_FILES to $ID_BACKUP_FILES"
            rsync $EXCLUDED --delete-after -av "$@" "$FILES_DIR" "$BACKUP_FILES_DIR"
            ERROR=$?
            echo ""
            date +%F\ %T\ %A
            echo "--------------------------------------------------------------------------------"
            rm -f "$TMP_FILE"
            exit $ERROR
        else
            date +%F\ %T\ %A
            echo ""
            echo "Drives are not mounted (or no .identity file exists on drive)"
            if [ ! -e "$FILES_DIR/.identity" ]; then
                echo "$FILES_DIR is not mounted"
            fi
            if [ ! -e "$BACKUP_FILES_DIR/.identity" ]; then
                echo "$BACKUP_FILES_DIR is not mounted"
            fi
            echo ""
            echo "--------------------------------------------------------------------------------"
            rm -f "$TMP_FILE"
            exit 3
        fi
    fi
else
    echo "Backup is already running"
    exit 2
fi
|
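The "already running" check above tests for the lock file and then creates it in two separate steps, so two copies started at almost the same moment could both get past the test. Here is a minimal sketch of one alternative, assuming a lock directory instead of a lock file (run_with_lock and the lock path are hypothetical names for this illustration). mkdir either creates the directory or fails, in a single step, so only one caller can win.

```shell
#!/bin/sh
# Sketch only: run_with_lock is a hypothetical helper, not part of the scripts above.

# Usage: run_with_lock LOCK_DIR COMMAND [ARGS...]
# Runs COMMAND only if LOCK_DIR could be created, and removes LOCK_DIR
# again once COMMAND returns, passing COMMAND's exit status back.
run_with_lock() {
    lock_dir=$1
    shift
    # mkdir fails if the directory already exists, so testing for the
    # lock and taking it happen in one step
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "Backup is already running"
        return 2
    fi
    status=0
    "$@" || status=$?
    rmdir "$lock_dir"
    return "$status"
}
```

It might be called as `run_with_lock "$HOME/.backup_files.lock" rsync -av "$HOME/files/" "$HOME/backup_files/"`. Note that, just like the lock file, the directory is left behind if the script is killed part-way through, so it would still need to be removed by hand in that case.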
| backup_files_remotely.sh |
#!/bin/bash
# To backup multiple source dirs into the backup dir, separate dirs with a space and do not end dir paths with a slash
# To copy the contents of the source dir into the backup dir, end with a slash
USERNAME="nick"
SSH_KEY="$HOME/.ssh/rsa_key"
SOURCE_DIR="$HOME/files/"
BACKUP_DIR="remote.server_address.net:files/"
EXCLUDE="lost+found .identity"
LOG_FILE="$HOME/logs/backup_files_remotely_log.txt"
KEEP_LOG="2" # set to 0 to disable, 1 to keep a running log, 2 to delete the log and record only current session
TMP_FILE="$HOME/.backup_files_remotely_running" # lock file, present only while a backup is running
# Turn the space-separated EXCLUDE list into rsync --exclude options
EXCLUDED=""
for i in $EXCLUDE; do
    EXCLUDED="$EXCLUDED --exclude=$i"
done
if [ ! -e "$TMP_FILE" ]; then
    touch "$TMP_FILE"
    if [[ $KEEP_LOG -eq 1 || $KEEP_LOG -eq 2 ]]; then
        if [ "$KEEP_LOG" -eq 2 ]; then
            rm -f "$LOG_FILE"
        fi
        date +%F\ %T\ %A | tee -a "$LOG_FILE"
        echo "" | tee -a "$LOG_FILE"
        # $EXCLUDED and $SOURCE_DIR are left unquoted on purpose so they can hold several items
        rsync -e "ssh -i $SSH_KEY" $EXCLUDED --delete-after -av "$@" $SOURCE_DIR "$USERNAME@$BACKUP_DIR" | tee -a "$LOG_FILE"
        ERROR=${PIPESTATUS[0]} # exit status of rsync ($? would be the status of tee)
        echo "" | tee -a "$LOG_FILE"
        date +%F\ %T\ %A | tee -a "$LOG_FILE"
        echo "--------------------------------------------------------------------------------" | tee -a "$LOG_FILE"
    else
        date +%F\ %T\ %A
        echo ""
        rsync -e "ssh -i $SSH_KEY" $EXCLUDED --delete-after -av "$@" $SOURCE_DIR "$USERNAME@$BACKUP_DIR"
        ERROR=$?
        echo ""
        date +%F\ %T\ %A
        echo "--------------------------------------------------------------------------------"
    fi
    rm -f "$TMP_FILE"
    exit $ERROR
else
    echo "Backup is already running"
    exit 2
fi
|
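One subtlety when piping rsync through tee: right after a pipeline, $? holds the exit status of the last command in it (tee), not the command whose output is being logged. In bash the PIPESTATUS array can be used to get at the earlier status. Below is a minimal sketch of a portable alternative that also works under a plain /bin/sh, assuming a hypothetical helper named run_logged that parks the command's status in a temporary file while its output flows through tee.

```shell
#!/bin/sh
# Sketch only: run_logged is a hypothetical helper, not part of the scripts above.

# Usage: run_logged LOG_FILE COMMAND [ARGS...]
# Shows COMMAND's output on the terminal, appends it to LOG_FILE,
# and returns COMMAND's own exit status rather than tee's.
run_logged() {
    log_file=$1
    shift
    status_file=$(mktemp) || return 1
    # The left side of the pipe runs in its own subshell, so the status
    # is written to a file where it can be read back afterwards
    {
        status=0
        "$@" || status=$?
        echo "$status" > "$status_file"
    } | tee -a "$log_file"
    status=$(cat "$status_file")
    rm -f "$status_file"
    return "$status"
}
```

For example, `run_logged "$HOME/logs/backup_files_log.txt" rsync -av "$HOME/files/" "$HOME/backup_files/"` followed by `ERROR=$?` would record rsync's result, so a failed transfer is not reported as a success.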
Another option is to directly compare the files with diff. Twice a month I have a cron job run the script below to compare the master and backup drives with diff. If a file differs it will let me know, and I will compare the file with another backup (like my backup at work, using md5sum to compare checksums if the files are large), and recopy the file to the drive with the altered copy (and re-check that file with diff to make sure it took).
| backup_diff.sh |
#!/bin/bash
FILES_DIR="$HOME/files/"
BACKUP_FILES_DIR="$HOME/backup_files/"
LOG="$HOME/logs/backup_diff.txt"
KEEP_LOG="1" # set to 0 to disable, 1 to keep a running log, 2 to delete the log and record only current session
if [[ $KEEP_LOG -eq 1 || $KEEP_LOG -eq 2 ]]; then
    if [ "$KEEP_LOG" -eq 2 ]; then
        rm -f "$LOG"
    fi
    date +%F\ %T\ %A | tee -a "$LOG"
    echo "Starting backup diff between $(cat "$FILES_DIR/.identity") and $(cat "$BACKUP_FILES_DIR/.identity")" | tee -a "$LOG"
    # -r compares directories recursively, -q only reports which files differ
    diff -rq "$FILES_DIR" "$BACKUP_FILES_DIR" | tee -a "$LOG"
    ERROR=${PIPESTATUS[0]} # exit status of diff ($? would be the status of tee)
    date +%F\ %T\ %A | tee -a "$LOG"
    echo "--------------------------------------------------------------------------------" | tee -a "$LOG"
else
    date +%F\ %T\ %A
    echo "Starting backup diff between $(cat "$FILES_DIR/.identity") and $(cat "$BACKUP_FILES_DIR/.identity")"
    diff -rq "$FILES_DIR" "$BACKUP_FILES_DIR"
    ERROR=$?
    date +%F\ %T\ %A
    echo "--------------------------------------------------------------------------------"
fi
exit $ERROR
|
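The checksum comparison mentioned above for large files can be sketched as a small helper. This is only an illustration, assuming GNU md5sum is installed and using a hypothetical function name (same_checksum). It succeeds when two files have the same MD5 checksum.

```shell
#!/bin/sh
# Sketch only: same_checksum is a hypothetical helper name.

# Usage: same_checksum FILE_A FILE_B
# Returns 0 if both files have the same MD5 checksum, 1 if they differ,
# and 2 if either file cannot be read.
same_checksum() {
    if [ ! -r "$1" ] || [ ! -r "$2" ]; then
        echo "Cannot read $1 or $2"
        return 2
    fi
    # Reading from stdin keeps the file name out of md5sum's output,
    # so only the checksum field is compared
    sum_a=$(md5sum < "$1" | cut -d ' ' -f 1)
    sum_b=$(md5sum < "$2" | cut -d ' ' -f 1)
    [ "$sum_a" = "$sum_b" ]
}
```

To check a suspect file against the copy on another machine, the same md5sum command can be run on each side and the two checksums compared by eye, which avoids copying a large file across the network just to compare it.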
This script acts like a service (although once it mounts or unmounts the drives, it is finished and no longer resident in memory), and accepts start, stop, and restart commands. When you "start" the script (the default if no argument is passed to the script), the script steps through the devices listed in DRIVES and looks for drives to mount, and mounts them in the locations specified. Sending the "stop" argument to the script will have the script look at all of the directories where a drive may be mounted, and check if they contain a .identity file (indicating a drive is mounted), and then unmount any mounted drives. Sending "restart" to the script simply runs "stop" then "start" with a 2 second delay in between.
I have this set up as a service on my server, so that my external USB drives are mounted to the correct locations upon startup. In Ubuntu, this may be done by placing the script in /etc/init.d/, and running "sudo update-rc.d mount_drives.sh defaults" to create links in the /etc/rcX.d directories (which will automatically send start/stop commands to the script when changing run levels), where X is the run level. The script may be removed as a service with "sudo update-rc.d mount_drives.sh remove". When changing drives out, I'll manually run the script (as root) with the appropriate commands.
| mount_drives.sh |
#!/bin/sh
DRIVE_1_DIR="/home/nick/files/"
DRIVE_1_ID="files0"
DRIVE_2_DIR="/home/nick/backup_files/"
DRIVE_2_ID="files1"
DRIVE_3_DIR="/home/nick/backup_files/"
DRIVE_3_ID="files2"
DRIVE_4_DIR="/home/nick/backup_files/"
DRIVE_4_ID="files3"
DRIVE_5_DIR="/home/nick/backup_files/"
DRIVE_5_ID="files4"
TEMP_MOUNT_DIR="/mnt/"
DRIVES="/dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1"
mount_drives_start() {
    for DRIVE in $DRIVES; do
        # Mount the drive temporarily so its .identity file can be read
        mount -t ext3 "$DRIVE" "$TEMP_MOUNT_DIR"
        if [ -e "$TEMP_MOUNT_DIR/.identity" ]; then
            ID=$(cat "$TEMP_MOUNT_DIR/.identity")
            echo "The ID for $DRIVE is $ID"
            umount "$TEMP_MOUNT_DIR"
            # Compare with = rather than ==, which not every /bin/sh understands
            if [ "$ID" = "$DRIVE_1_ID" ]; then
                if [ ! -e "$DRIVE_1_DIR/.identity" ]; then
                    echo "Mounting $DRIVE to $DRIVE_1_DIR"
                    mount -t ext3 "$DRIVE" "$DRIVE_1_DIR"
                else
                    echo "Something is already mounted to $DRIVE_1_DIR"
                    echo "$DRIVE will not be mounted"
                fi
            elif [ "$ID" = "$DRIVE_2_ID" ]; then
                if [ ! -e "$DRIVE_2_DIR/.identity" ]; then
                    echo "Mounting $DRIVE to $DRIVE_2_DIR"
                    mount -t ext3 "$DRIVE" "$DRIVE_2_DIR"
                else
                    echo "Something is already mounted to $DRIVE_2_DIR"
                    echo "$DRIVE will not be mounted"
                fi
            elif [ "$ID" = "$DRIVE_3_ID" ]; then
                if [ ! -e "$DRIVE_3_DIR/.identity" ]; then
                    echo "Mounting $DRIVE to $DRIVE_3_DIR"
                    mount -t ext3 "$DRIVE" "$DRIVE_3_DIR"
                else
                    echo "Something is already mounted to $DRIVE_3_DIR"
                    echo "$DRIVE will not be mounted"
                fi
            elif [ "$ID" = "$DRIVE_4_ID" ]; then
                if [ ! -e "$DRIVE_4_DIR/.identity" ]; then
                    echo "Mounting $DRIVE to $DRIVE_4_DIR"
                    mount -t ext3 "$DRIVE" "$DRIVE_4_DIR"
                else
                    echo "Something is already mounted to $DRIVE_4_DIR"
                    echo "$DRIVE will not be mounted"
                fi
            elif [ "$ID" = "$DRIVE_5_ID" ]; then
                if [ ! -e "$DRIVE_5_DIR/.identity" ]; then
                    echo "Mounting $DRIVE to $DRIVE_5_DIR"
                    mount -t ext3 "$DRIVE" "$DRIVE_5_DIR"
                else
                    echo "Something is already mounted to $DRIVE_5_DIR"
                    echo "$DRIVE will not be mounted"
                fi
            else
                echo "The .identity file does not match any known drive"
                echo "$DRIVE will not be mounted"
            fi
        else
            umount "$TEMP_MOUNT_DIR"
            echo "$DRIVE does not exist, or does not have a .identity file"
        fi
    done
}
mount_drives_stop() {
    if [ -e "$DRIVE_1_DIR/.identity" ]; then
        echo "Unmounting $DRIVE_1_DIR"
        umount "$DRIVE_1_DIR"
    else
        echo "No drive mounted at $DRIVE_1_DIR, or no .identity file is on drive"
    fi
    if [ -e "$DRIVE_2_DIR/.identity" ]; then
        echo "Unmounting $DRIVE_2_DIR"
        umount "$DRIVE_2_DIR"
    else
        echo "No drive mounted at $DRIVE_2_DIR, or no .identity file is on drive"
    fi
    if [ -e "$DRIVE_3_DIR/.identity" ]; then
        echo "Unmounting $DRIVE_3_DIR"
        umount "$DRIVE_3_DIR"
    else
        echo "No drive mounted at $DRIVE_3_DIR, or no .identity file is on drive"
    fi
    if [ -e "$DRIVE_4_DIR/.identity" ]; then
        echo "Unmounting $DRIVE_4_DIR"
        umount "$DRIVE_4_DIR"
    else
        echo "No drive mounted at $DRIVE_4_DIR, or no .identity file is on drive"
    fi
    if [ -e "$DRIVE_5_DIR/.identity" ]; then
        echo "Unmounting $DRIVE_5_DIR"
        umount "$DRIVE_5_DIR"
    else
        echo "No drive mounted at $DRIVE_5_DIR, or no .identity file is on drive"
    fi
}
mount_drives_restart() {
    mount_drives_stop
    sleep 2
    mount_drives_start
}
case "$1" in
    'start')
        mount_drives_start
        ;;
    'stop')
        mount_drives_stop
        ;;
    'restart')
        mount_drives_restart
        ;;
    *)
        # No (or unknown) argument: default to start
        mount_drives_start
        ;;
esac
|
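The five branches in mount_drives_start differ only in which ID maps to which directory, so the mapping could be pulled out into one lookup. Here is a minimal sketch of that idea, assuming a hypothetical function named dir_for_id and the same drive names and directories as in the script above. It is a possible refactoring, not a drop-in replacement.

```shell
#!/bin/sh
# Sketch only: dir_for_id is a hypothetical helper name.

# Usage: dir_for_id ID
# Prints the mount point for a drive ID read from a .identity file,
# or returns 1 if the ID is not a known drive.
dir_for_id() {
    case "$1" in
        files0)
            echo "/home/nick/files/"
            ;;
        files1|files2|files3|files4)
            echo "/home/nick/backup_files/"
            ;;
        *)
            return 1
            ;;
    esac
}
```

With this in place, the mount loop would only need one branch: look up the directory with `TARGET=$(dir_for_id "$ID")`, skip the drive if the lookup fails or if `$TARGET/.identity` already exists, and otherwise mount the drive there. Adding another backup drive then means adding one name to the case pattern.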
| delete_metadata.sh |
#!/bin/sh
# Usage: delete_metadata.sh DIRECTORY
# Refuse to run without a directory, so an empty argument can never point find or rm at the wrong place
if [ -z "$1" ]; then
    echo "Usage: $0 DIRECTORY"
    exit 1
fi
echo "Searching for Thumbs.db Windows thumbnail metadata"
find "$1" -name 'Thumbs.db' -exec rm -vf {} \;
echo "Searching for .DS_Store Macintosh metadata"
find "$1" -name '.DS_Store' -exec rm -vf {} \;
echo "Searching for ._* Macintosh metadata"
find "$1" -name '._*' -exec rm -vf {} \;
echo "Searching for *~ backups"
find "$1" -name '*~' -exec rm -vf {} \;
echo "Checking for .TemporaryItems/ in $1"
if [ -e "$1/.TemporaryItems/" ]; then
    rm -rfv "$1/.TemporaryItems/"
fi
|
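Since this script deletes files, it can be reassuring to see what it would remove before letting it loose on a drive. Below is a minimal sketch of a dry run, assuming a hypothetical function named list_junk. It combines the four name patterns from the script above into a single find pass and only prints the matches. The restriction to regular files (-type f) is an extra precaution added for this sketch.

```shell
#!/bin/sh
# Sketch only: list_junk is a hypothetical helper name.

# Usage: list_junk DIRECTORY
# Prints every regular file under DIRECTORY whose name matches one of
# the junk patterns, without deleting anything.
list_junk() {
    find "$1" -type f \( \
        -name 'Thumbs.db' -o \
        -name '.DS_Store' -o \
        -name '._*' -o \
        -name '*~' \
    \) -print
}
```

Running `list_junk "$HOME/files"` shows the candidates. Once the list looks right, replacing `-print` with `-exec rm -vf {} +` would delete them in the same single pass over the drive.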