For any computer user, data loss or corruption is a bad experience, but for professionals such as sysadmins, it's a nightmare. Backing up data is essential so that it can be restored when something goes wrong. Copying files manually is tedious, so several tools exist to ease the task.
Rsync (remote sync) is a remote and local file synchronisation tool. It can copy locally, to or from another host over any remote shell, and to or from a remote Rsync daemon. It supports copying links, devices, owners, groups and permissions. It is also fast, because its remote-update protocol transfers only the differences between two sets of files; that is, only the portions of a file that have changed are sent. It also provides many options that let us control the synchronisation. In this article, we will explore this powerful utility.
Rsync is a popular tool and comes installed by default on most Linux and UNIX-like systems.
To install Rsync on Debian or Ubuntu Linux, type the following command:
$ sudo apt-get install rsync
To install Rsync on an RPM-based Linux distribution, type the following command:
# yum install rsync
Note: All the examples in this tutorial were run on Ubuntu MATE 14.10.
If you have used file copying utilities such as FTP or scp before, you will find the Rsync syntax very familiar. The general syntax is as follows:
# rsync options source destination
Copying/syncing files and directories locally
Copying a file
The following command will sync a single file on a local machine from one location to another. Here the source file test_bk.zip is being synced to /tmp/backups.
$: rsync -zvh test_bk.zip /tmp/backups/
In the above command, -zvh are three different options where z compresses file data, v stands for verbose and h outputs numbers in a human-readable format. After running the command, you will get an output similar to what follows:
$: rsync -zvh test_bk.zip /tmp/backups/
test_bk.zip
sent 2.82M bytes received 35 bytes 5.64M bytes/sec
total size is 3.73M speedup is 1.32
Copying/syncing a directory
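The following command will transfer or sync all the files from one directory to another. Here, we are copying the directory test to /tmp/backups. The additional -a (archive) option copies the directory recursively and preserves symbolic links, permissions, timestamps, owner and group.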
$: rsync -azvh test /tmp/backups/
sending incremental file list
test/
test/dropbox_test.py
test/flask_app.py
test/flask_app_bk.py
sent 6.38K bytes received 77 bytes 12.91K bytes/sec
total size is 25.05K speedup is 3.88
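Whether the source path ends in a slash changes what lands in the destination: without it, Rsync recreates the directory itself inside the target, as in the output above; with it, only the directory's contents are copied. The following is a minimal local sketch of that difference. The scratch directory and its file are made up for illustration and are not part of the examples above.

```shell
#!/bin/sh
# Scratch directories so nothing outside the temp area is touched.
work=$(mktemp -d)
mkdir -p "$work/test" "$work/with_slash" "$work/without_slash"
echo 'print("hello")' > "$work/test/flask_app.py"

# No trailing slash: the directory itself is recreated inside the destination.
rsync -a "$work/test" "$work/without_slash/"

# Trailing slash: only the contents of the directory are copied.
rsync -a "$work/test/" "$work/with_slash/"

# Compare the two results side by side.
ls -R "$work/without_slash" "$work/with_slash"
```

The first destination should contain test/flask_app.py, the second flask_app.py directly.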
Copying/syncing files and directories to or from a remote server
Copying a directory to a remote machine
The following command will sync a directory from the local machine to a remote machine. Because the source is written with a trailing slash (test/), the contents of the local test directory are copied into /home/ on the remote machine.
$: rsync -avz test/ user@xxx.xxx.xxx.xxx:/home/
Here xxx.xxx.xxx.xxx is the IP address of the remote machine and user stands for your login name on it.
$: rsync -avz test/ user@xxx.xxx.xxx.xxx:/home/
user@xxx.xxx.xxx.xxx's password:
sending incremental file list
./
dropbox_test.py
flask_app.py
flask_app_bk.py
sent 6,364 bytes received 76 bytes 858.67 bytes/sec
total size is 25,049 speedup is 3.89
Copying a remote directory to a local machine
To copy a directory from a remote server to a local machine, use the following command:
$: rsync -avzh user@xxx.xxx.xxx.xxx:/home/test /tmp/backup_remote
user@xxx.xxx.xxx.xxx's password:
receiving incremental file list
test/
test/test.css
test/test.html
test/test.js
sent 3.79K bytes received 10.71k bytes 228.42K bytes/sec
total size is 76.99M speedup is 44.94
Using Rsync and SSH together
SSH, which stands for Secure Shell, is a cryptographic network protocol for starting text-based shell sessions on remote machines securely. It was designed to provide strong security when accessing another computer remotely. SSH encrypts the session, so when you use Rsync over SSH, all of your data is encrypted in transit and cannot easily be read by others.
Rsync allows us to copy files recursively, with compression, over an encrypted channel. To do so, we use the -e option, which names the remote shell program that Rsync should use (recent versions of Rsync already default to SSH). The following command copies the file test_bk.zip from the local machine to the remote server over SSH:
$: rsync -vzhe ssh test_bk.zip user@xxx.xxx.xxx.xxx:/home/
user@xxx.xxx.xxx.xxx's password:
sent 45 bytes received 12 bytes 8.77 bytes/sec
total size is 3.73M speedup is 65,426.14
Similarly, files and directories can be copied from remote servers over SSH with the appropriate syntax.
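The -e option accepts a full command line, not just the program name, so SSH settings such as a non-standard port or a dedicated key can be passed along with it. The following is a minimal sketch of that idea. The function name sync_over_ssh, port 2222, the key path and the login name user are all hypothetical, and the key path is assumed to contain no spaces.

```shell
#!/bin/sh
# sync_over_ssh PORT KEY SRC DEST
# Runs rsync over ssh using the given port and private key file.
# (Hypothetical helper; the key path must not contain spaces.)
sync_over_ssh() {
    rsync -avzh -e "ssh -p $1 -i $2" "$3" "$4"
}

# Example call, left commented out because the host is a placeholder:
# sync_over_ssh 2222 "$HOME/.ssh/id_backup" test_bk.zip user@xxx.xxx.xxx.xxx:/home/
```

Wrapping the command in a function keeps the SSH settings in one place when several directories are synced to the same server.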
Using Rsync with some useful options
--exclude and --include
As the flag names suggest, we can use these options to specify which files to include and which to exclude during the sync.
The following command demonstrates this. It specifies the inclusion of all files with a .py extension and the exclusion of all files with a .js extension. The patterns are quoted so that the shell passes them to Rsync unexpanded.
$: rsync -avze ssh --include '*.py' --exclude '*.js' test/ user@xxx.xxx.xxx.xxx:/home/
user@xxx.xxx.xxx.xxx's password:
sending incremental file list
./
sent 157 bytes received 28 bytes 33.64 bytes/sec
total size is 25,049 speedup is 135.40
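Rsync tests each file name against the --include and --exclude rules in the order they appear on the command line, and the first rule that matches decides. A common pattern built on this is to include what you want and then exclude everything else. The following is a minimal local sketch of that pattern. The scratch directory and file names are made up for illustration, and it covers only a flat directory (nested directories would also need an --include '*/' rule).

```shell
#!/bin/sh
# Scratch area with hypothetical file names; nothing outside it is touched.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/dst"
touch "$demo/src/flask_app.py" "$demo/src/test.js" "$demo/src/notes.txt"

# Rules are tested in order: *.py matches the include first,
# everything else falls through to the catch-all exclude.
rsync -av --include '*.py' --exclude '*' "$demo/src/" "$demo/dst/"

# List what actually arrived.
ls "$demo/dst"
```

Only flask_app.py should end up in the destination.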
--delete
If a file or directory no longer exists at the source but is still present at the destination, you may want to delete it at the destination while syncing, so that the backup mirrors the latest state of the source. To do this, use the --delete option, as follows:
$: rsync -avze ssh --delete test/ user@xxx.xxx.xxx.xxx:/home/
user@xxx.xxx.xxx.xxx's password:
sending incremental file list
deleting flask_app_bk.py
./
sent 122 bytes received 253 bytes 68.18 bytes/sec
total size is 13,298 speedup is 35.46
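Because --delete removes data at the destination, it can be worth previewing a run first with --dry-run, which reports what would happen without changing anything. The following is a minimal local sketch. The scratch directories and file contents are made up for illustration.

```shell
#!/bin/sh
# Scratch directories with made-up file contents.
scratch=$(mktemp -d)
mkdir -p "$scratch/src" "$scratch/dst"
echo "current" > "$scratch/src/flask_app.py"
echo "stale"   > "$scratch/dst/flask_app_bk.py"

# Preview: --dry-run lists the planned deletions but changes nothing.
preview=$(rsync -avh --delete --dry-run "$scratch/src/" "$scratch/dst/")
echo "$preview"

# The stale file is still present after the dry run.
ls "$scratch/dst"

# Real run: the stale file is now removed from the destination.
rsync -avh --delete "$scratch/src/" "$scratch/dst/"
ls "$scratch/dst"
```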
Deleting source files after a successful transfer
The --remove-source-files option deletes the source files after a successful transfer, as shown below. It removes files only; directories on the sending side are left in place, even if they end up empty.
$: rsync -avze ssh --remove-source-files test/ user@xxx.xxx.xxx.xxx:/home/
user@xxx.xxx.xxx.xxx's password:
sending incremental file list
./
flask_app_bk.py
sent 2,853 bytes received 46 bytes 527.09 bytes/sec
total size is 11,751 speedup is 4.05
$: cd test/
$: ls
$:
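When the source contains subdirectories, the transferred files disappear from it but the directories stay behind, empty. One way to tidy up is a follow-up find command. The following is a minimal local sketch; the directory names are made up, and -mindepth, -empty and -delete are assumed to be available, as they are in GNU and BSD find.

```shell
#!/bin/sh
# Scratch directories with made-up names.
tmp=$(mktemp -d)
mkdir -p "$tmp/test/old_scripts" "$tmp/backup"
echo "data" > "$tmp/test/old_scripts/flask_app_bk.py"

# Move the files: they are copied, then removed from the source.
rsync -a --remove-source-files "$tmp/test/" "$tmp/backup/"

# The source file is gone, but its directory is still there (now empty).
# Remove empty subdirectories, keeping the top-level source directory.
find "$tmp/test" -mindepth 1 -type d -empty -delete
```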
Displaying the progress while transferring data with Rsync
To display the progress of the data being transferred from one machine to another, we can use the --progress option. The following command displays each file along with its size, transfer rate and the time remaining to complete the transfer:
$: rsync -avze ssh --progress test/ user@xxx.xxx.xxx.xxx:/home/
user@xxx.xxx.xxx.xxx's password:
sending incremental file list
./
binjs.js
176 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=3/5)
hook.js
3,124 100% 2.98MB/s 0:00:00 (xfr#2, to-chk=2/5)
test.js
3,876 100% 3.70MB/s 0:00:00 (xfr#3, to-chk=1/5)
test_bk.zip
3,729,290 100% 487.08kB/s 0:00:07 (xfr#4, to-chk=0/5)
sent 2,824,460 bytes received 95 bytes 77,385.07 bytes/sec
total size is 3,736,466 speedup is 1.32
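The --progress option reports on each file separately. For transfers with many files, a single overall figure can be easier to read, and Rsync 3.1.0 and later offer --info=progress2 for this. The following is a minimal local sketch, assuming such a version is installed. The scratch directories and the 1 MiB dummy files are made up for illustration.

```shell
#!/bin/sh
# Scratch directories and a few made-up files of 1 MiB each.
area=$(mktemp -d)
mkdir -p "$area/src" "$area/dst"
for n in 1 2 3; do
    dd if=/dev/zero of="$area/src/file$n.bin" bs=1024 count=1024 2>/dev/null
done

# One running summary line for the whole transfer instead of one per file.
rsync -a --info=progress2 "$area/src/" "$area/dst/"
```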
Automating Rsync backups with Cron
The Cron utility is used to automate command execution at a specific time or after fixed intervals. We can combine Rsync and Cron to automate backup tasks.
To edit the Cron table file for the user you are logged in as, run the following command:
$ crontab -e
If the file opens in vi, press i to enter insert mode and edit the Cron table. Each entry uses the following field order: minute of the hour, hour of the day, day of the month, month of the year, day of the week, and the command to run.
Now, if we want to sync the directory /home/user/programs to /tmp/backups every day at midnight, the following line in the Cron table will do it with Rsync:
0 00 * * * rsync -avh --delete /home/user/programs/ /tmp/backups
The first 0 specifies the minute of the hour, and 00 specifies the hour (midnight). Since we want the command to run daily, the remaining three fields (day of the month, month and day of the week) are left as asterisks, followed by the Rsync command to be executed.
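A job started by Cron has no terminal attached, so it helps to keep a record of each run. The following is a minimal sketch of a wrapper that appends Rsync's output and exit code to a log file. The function name run_backup and the log path are hypothetical; the source and destination paths are the ones from the Cron entry above.

```shell
#!/bin/sh
# run_backup SRC DEST LOG: sync SRC to DEST and append a dated record to LOG.
# (Hypothetical helper, not part of rsync or cron.)
run_backup() {
    src=$1; dest=$2; log=$3
    {
        echo "=== backup started: $(date) ==="
        rsync -avh --delete "$src" "$dest"
        status=$?
        echo "=== backup finished with exit code $status: $(date) ==="
    } >> "$log" 2>&1
    # Pass rsync's exit code on to the caller.
    return "$status"
}

# Hypothetical call, matching the paths of the Cron entry above:
# run_backup /home/user/programs/ /tmp/backups /tmp/rsync-backup.log
```

Saved as an executable script with the call uncommented, it could be named in the Cron table in place of the inline Rsync command.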