The Complete Magazine on Open Source

Use Sync for Stress-free Data Backup

A disk crash or a corrupted file can have disastrous consequences. A safe bet is to back up one’s data. The author narrates his personal experience with data backup issues and gives a tutorial on the use of Sync, which he recommends for home users.

Most of you are probably active users of services like Gmail, Flickr, Google Photos, Apple iPhoto and Instagram or, at the very least, are aware of them. Many of you might also use digital storage options like Dropbox. While these services make your data available anywhere in the world (assuming you have the bandwidth to access it), you still need to maintain a local copy (on your laptop or desktop) of all your data, much as vendors may try to convince you that all the data is safe with them. If the local copy gets corrupted, you can then fall back on cloud storage.
While data stored in the cloud is mostly safe, there have been cases where a cloud service provider has accidentally deleted user data. Data can also be lost if the provider’s service is breached or passwords are stolen. Considering these possibilities, some of us are not comfortable relying entirely on cloud services.
Personally, I make limited use of cloud-based storage services and mostly keep data in local storage. Having decided to ‘go local’, I need to take steps to ensure that the data is properly backed up, so that I have a copy if the primary storage fails. For important things like photographs, email and financial documents, I maintain two backup copies. In other words, I have the actual data in local storage, as well as backups on two different portable HDDs (hard disk drives). With this approach, I need a well-defined and simple backup process.

A survey of personal backup solutions
Over the years, I have used many methods to maintain backup copies of data. The first method, obviously, was to maintain a copy of each file. Then I started storing compressed versions of the data, with a timestamp for each backup. Just as enterprises have had to address the Big Data explosion, I too have not been immune to the problem of large volumes of data. Given the data volumes and the frequency of changes, the timestamp method of maintaining backups is no longer practical. Additionally, each zip archive ends up with its own copy of a file, and when a particular version is needed, it is difficult to figure out which archive holds the most relevant copy.
I also tried using version-control software like SVN. The problem with version control is that it’s sub-optimal for binary data. In other words, version control is best suited for text files, where versions are stored using the ‘difference’ method. In the case of binary files, most such solutions simply copy the binary file into the repository, as finding a ‘difference’ between two binary files is not simple and straightforward. Ideally, version control software, when used, should be hosted on a separate server, which is not an option at home. Additionally, maintaining the repository on an external HDD was next to impossible.
Then I started searching for personal backup software that would allow me to maintain backups on external media like a portable HDD. I tried multiple solutions, two of them being Microsoft SyncToy and Cobian Backup. While I liked both tools (with a preference for Cobian Backup), I found that the GUI, and the way these tools stored the backup commands, created a problem that needed additional effort each time I wished to take a backup.
While using these tools, I faced a problem with the backup commands due to the way Windows uses external media. Whenever external media is connected to a computer, there is no guarantee that it will be assigned the same drive letter as the one assigned the last time the media was connected. For example, if the connected media is assigned the drive letter H: today, there is no guarantee that it will be assigned the same drive letter tomorrow. If you have other media already connected, the portable HDD will get assigned the next available drive letter. As the backup commands configured in these tools refer to an absolute path that includes the drive letter, the command does not work if the drive letter is different. If the external HDD is assigned a different drive letter, you will have no choice but to edit the backup command to point it to the present drive letter. Now imagine the effort needed if such changes have to be applied to multiple backup commands, each time you wish to take a backup.
While searching for more tools, I came across a simple backup tool, namely, Sync (alternately named Syncdir).

Introducing Sync
Sync is a Java application that needs to be invoked from the command line. I can almost hear you grumble: “Ew! The command line. How can you even recommend a command line application in these days of GUI systems?” But, in my experience, this command line interface provides me the maximum flexibility.
On my local machine, I always have a JVM available, either as a JDK or a JRE, along with properly configured PATH and JAVA_HOME variables. To make the task of backing up easier, I have created batch files that contain the backup command (using Sync). The batch files are stored on the portable device. Thus, whenever I wish to take a backup, I simply connect the portable device and execute the backup script. As the source directory does not change (being local storage) and the destination directory is specified using a relative path, the batch scripts do not need to be edited, irrespective of the drive letter assigned to the portable device. When launched, the script executes in a dedicated command prompt and I can continue with other activities. I only need to check on the progress once in a while. Once the backup is complete, I go through the log file to check for errors and my backup work is done.
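The reason a relative destination keeps working can be sketched in a few lines of Java. This is only an illustration of path resolution, not part of Sync: the class RelativeTargetSketch and the method resolveTarget are hypothetical names, and the working directory stands in for the folder on the portable device from which the script is started.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative sketch only: shows why a relative target survives a drive-letter change. */
final class RelativeTargetSketch {

    /** Resolves a relative target against the directory the script was started in. */
    static Path resolveTarget(Path startDirectory, String relativeTarget) {
        // normalize() removes redundant "." segments such as the one in "./doc"
        return startDirectory.resolve(relativeTarget).normalize();
    }

    public static void main(String[] args) {
        // Assumption: the working directory is the folder on the portable
        // device that holds the backup script.
        Path startDirectory = Paths.get("").toAbsolutePath();
        System.out.println("Backup target: " + resolveTarget(startDirectory, "doc"));
    }
}
```

Whatever drive letter the portable device receives, only the start directory changes; the command text containing the relative path stays the same.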
By using a Windows batch file, my backup activity is reduced to a double click or an ‘Enter’ key-press, which is all that’s needed to launch the script. The scripts have made the task so simple that even after maintaining two backup copies of my personal photographs, email and other documents, the only trouble I face is that of maintaining a regular schedule to keep the backup copies suitably fresh.

Figure 1: Taking backups for the first time

Figure 2: Updating the backup. New files added

Using Sync
To use Sync, the command format is as follows:

java -jar Sync.jar <switches> ["Source"] ["Target"]

Using this command synchronises the [“Target”] to match the [“Source”]. It should be noted that only the [“Target”] is modified and, by default, the file name, size, last-modified time and a CRC-32 checksum of the file are used to match files. If [“Source”] is a directory, the source and target directories are matched recursively. Matched target files are time-synced and renamed if necessary; unmatched source files are copied to the target directory and unmatched target files and/or directories are deleted. If [“Source”] is a file, source and target files are matched, while ignoring the file name. If the files match, the target file is time-synced and renamed if necessary; if the target does not exist, the source file is copied to the target.
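To make the default matching rule more concrete, here is a minimal Java sketch that compares two files by size, CRC-32 checksum and last-modified time, while ignoring the file name. It is not taken from Sync's source code: the class FileMatchSketch and its methods are hypothetical names, and comparing timestamps at millisecond precision is an assumption made for the illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

/** Illustrative sketch only; not Sync's actual implementation. */
final class FileMatchSketch {

    /** Streams the file through java.util.zip.CRC32 and returns the checksum. */
    static long crc32(Path file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                crc.update(buffer, 0, read);
            }
        }
        return crc.getValue();
    }

    /** Compares content only: the cheap size check first, the checksum last. */
    static boolean sameContent(Path source, Path target) throws IOException {
        if (!Files.isRegularFile(source) || !Files.isRegularFile(target)) {
            return false;
        }
        if (Files.size(source) != Files.size(target)) {
            return false;
        }
        return crc32(source) == crc32(target);
    }

    /** True when the last-modified times agree, so no time-sync would be needed. */
    static boolean sameTime(Path source, Path target) throws IOException {
        return Files.getLastModifiedTime(source).toMillis()
                == Files.getLastModifiedTime(target).toMillis();
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("match-sketch");
        Path source = Files.write(dir.resolve("source.txt"),
                "same bytes".getBytes(StandardCharsets.UTF_8));
        Path renamed = Files.write(dir.resolve("renamed.txt"),
                "same bytes".getBytes(StandardCharsets.UTF_8));
        Path edited = Files.write(dir.resolve("edited.txt"),
                "same bytez".getBytes(StandardCharsets.UTF_8));

        System.out.println("renamed copy matches: " + sameContent(source, renamed));
        System.out.println("edited copy matches:  " + sameContent(source, edited));
    }
}
```

Checking the size before the checksum avoids reading both files in full when they cannot possibly match, which matters when most of the data consists of large photographs.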

A few switches
Some of the switches that can be used with Sync are:

Table

Batch script: sync-generic.bat
Given below is a ‘generic’ script that can be invoked by other backup scripts. The reason for using such a script is that the actual backup script ends up with only minimal details and is thus easier to maintain. Additionally, changes to the commands, if any, when done in a central script work for all dependent scripts, reducing maintenance efforts.

@echo off
rem Usage: sync-generic.bat <name> <source> <target>
set SYNC_HOME=./Sync.jar
rem Delete the previous log file, if there is one
if exist log\%1.log del log\%1.log
java -jar "%SYNC_HOME%" --log:log/%1.log --force %2 %3
echo Backup for %1 done.

Figure 3: Updating the backup. A file has been updated

Batch script: backup-doc.bat
Given below is the script that takes a backup of the ‘doc’ directory and uses the generic script for this purpose.

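@echo off
rem Run from the folder that holds this script, so that the relative
rem target path and sync-generic.bat are found on the portable drive
cd /d "%~dp0"
set NAME=doc
call sync-generic.bat %NAME% D:\%NAME% .\%NAME%
pause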

While there are many personal backup solutions, my tool of choice is Sync, a Java-based command-line tool. Though it does not have a fancy GUI, it helps me maintain backups with little effort. To make the task of backing up easier, I use Windows batch files, so that double-clicking the script (or selecting it and pressing Enter) is all it takes to kick off the backup process.

References
[1] Syncdir: http://syncdir.sourceforge.com/
[2] Best free backup software: http://www.in.techradar.com/news/software/applications/Best-free-backup-software-11-programs-we-recommend/articleshow/38877922.cms
[3] 13 Best backup software: http://www.pcadvisor.co.uk/test-centre/software/13-best-backup-software-2015-2016-uk-3263573/