Amanda is probably one of the best open source network backup solutions available today. The Advanced Maryland Automatic Network Disk Archiver (Amanda), as the name suggests, was developed at the University of Maryland. It allows the administrator to set up a single master backup server to back up multiple hosts over the network to tape drives, disks or optical media, and even to the cloud (with the help of Amazon S3 Web services). It supports a wide range of backup media and a multitude of client environments.
Amanda is an enterprise-grade solution that does require a bit of configuration initially, but after that, it’s a breeze. Don’t let the initial effort put you off, because a tool like Amanda brings great flexibility and reliability, which allow it to work efficiently in the most complex of architectural environments. A variety of client operating systems are supported, ranging from UNIX and all flavours of Linux to Solaris, Mac OS X and Windows. Overall, it’s a great utility, which you will fall in love with once you get used to it.
Why Amanda?
The most basic questions are:
- Why should you spend so much time struggling with Amanda, when you can use simple command-line automation scripts for your Linux servers?
- Even if you do decide to use a backup utility, why should you spend so much time and resources to set up software like Amanda?
The answer lies in three factors: scalability, flexibility and architectural complexity. Together, these decide whether you need Amanda or not. Basically, Amanda is suited to those who have a multitude of systems with varying server environments, for whom it would be really expensive to maintain a separate backup facility for each.
Amanda provides you with a dedicated backup server for the whole kit, making the backup process reliable, well-planned and efficient. Also, it allows for really good scalability and freedom when it comes to expanding your network. After adding new clients to the network, all you need to do is make a few changes in the configuration and you’re done; you can let Amanda handle the rest.
This essentially makes it quite future-proof. Otherwise, if your network is not that complex, I would frankly recommend a simpler solution for you, because Amanda might prove to be a bit of overkill in such scenarios.
One big advantage of Amanda is that backed-up data can easily be recovered with native tools, even if the Amanda server is rendered completely unusable. (That, of course, does not mean that the data is unprotected; Amanda does have really good security features.)
Features
- Client-server architecture: Obviously, this is one of the most important features. A unique feature is that it is the server that schedules and decides which client is to be backed up, and requests clients for data, instead of clients requesting the server. Also, all configuration related to the backup plan is done on the server. Amanda also has a really good media interface, without having any device driver-based dependencies.
- Automatic backup level selection: Amanda uses the concept of a “backup level” to distinguish different kinds of backups. Each backup type has a level number assigned to it; for example, a full backup is Level 0. Backing up the system at any level means saving all the files that have changed since the last backup at the previous level. Thus, a Level 1 backup saves all the files that have changed since the last full (Level 0) backup; a Level 2 backup saves all the files that have been changed since the last Level 1 backup, and so on. It determines the backup level automatically, rather than making the administrator determine this ahead of time.
- A consistent backup window and resource utilisation: Amanda provides a consistent plan for all your backups, so that there are no spikes in media and server resource utilisation. You can also set a specific time-frame within which backups are to happen.
- An intelligent backup scheduler: It determines the amount of data changed for a client, and schedules accordingly. The administrator only specifies a few parameters according to which backups are to occur. It basically distributes full backups with incremental ones over the backup cycle, to balance the amount of data that is backed up at a time. The scheduler skips any clients that could not be backed up, or were not available at the instant, and reschedules when they are available again.
- Data encryption and compression: Data compression options are plentiful. Compression can happen on the client or the server, depending on the configuration. Data can also be encrypted, and communication between the server and its clients can be secured with OpenSSH or Kerberos.
- Reporting and verification: The amreport tool provides reports on each backup run, along with detailed statistics. It also sends overnight email notifications to the administrators. The amverify tool checks the Amanda format on a drive, and whether it can be restored to a healthy state.
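The backup-level scheme described in the list above can be modelled in a few lines. The following Python sketch is purely illustrative and is not how Amanda is implemented: the function name files_for_level, the dictionaries and the use of plain day numbers as timestamps are all assumptions made for the example.

```python
def files_for_level(level, last_backup_at, mtimes):
    """Return the files that a backup at `level` would save.

    last_backup_at maps a level number to the time of the most recent
    backup taken at that level; mtimes maps a file path to the time it
    was last modified. Times are plain day numbers to keep things simple.
    """
    if level == 0:
        # A full backup saves everything.
        return sorted(mtimes)
    # Reference point: the most recent backup at the previous level.
    reference = last_backup_at[level - 1]
    return sorted(path for path, mtime in mtimes.items() if mtime > reference)


# Hypothetical history: a full backup on day 0 and a Level 1 backup on day 3.
last_backup_at = {0: 0, 1: 3}
mtimes = {"/etc/hosts": -5, "/home/a.txt": 2, "/home/b.txt": 4}

level0 = files_for_level(0, last_backup_at, mtimes)  # every file
level1 = files_for_level(1, last_backup_at, mtimes)  # changed since day 0
level2 = files_for_level(2, last_backup_at, mtimes)  # changed since day 3
```

Here the Level 1 backup picks up both files under /home, while the Level 2 backup only picks up /home/b.txt, which is the one file modified after the Level 1 run.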
Installation and configuration
As far as installation is concerned, it turns out to be pretty easy; it’s the setting up and initial configuration that takes time. For most Linux systems, Amanda is available in repositories as the packages amanda-common and amanda-server, which, as the names suggest, need to be installed at their respective places. Some of the main dependency requirements for Amanda, which must be installed on the system before trying to install Amanda, are:
- GNU Tar 1.15 or later
- Samba for communication with Windows clients
- Perl 5.6 or later
- Glib version 2.2 or later
- Awk and Gnuplot for the amplot utility
The installation creates a new user, ‘amanda’ (or something similar), to run the Amanda backup and other tools. Configuration files are created in /etc/amanda, including an example configuration to play around with; this can serve as an overview, but is not of much practical use. To create a new configuration, we need to create a folder under /etc/amanda, the name of which will represent a particular configuration for Amanda. In this directory will be the amanda.conf file, which contains the following pieces of major information:
- org: The email subject, to differentiate between various backups.
- mailto: Administrator email address(es) to which to send reports (multiple addresses to be separated with spaces).
- tapecycle: The number of tapes that are available, and to be circulated.
- dumpcycle: The number of days in the total dump cycle.
- runspercycle: The number of backup runs to be made within each dump cycle.
- tapedev /dev/null: This should be changed to tapedev /dev/nst0, which is the non-rewinding tape device on Linux.
- tapetype: The configuration of the tape drive that Amanda will be using. There is a utility called amtapetype, which performs writes to the tape to determine the capacity and speed.
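To give a feel for what these entries look like in the file, here is a small Python sketch that renders them as text. The helper render_amanda_conf and all the values are hypothetical (the configuration name, the email addresses, the cycle numbers and the tapetype name HP-DAT are placeholders); a real amanda.conf contains many more settings, and the tapetype named here would need a matching definition of its own.

```python
# Template for the handful of parameters discussed above.
AMANDA_CONF_TEMPLATE = """\
org "{org}"
mailto "{mailto}"
dumpcycle {dumpcycle} days
runspercycle {runspercycle}
tapecycle {tapecycle} tapes
tapedev "{tapedev}"
tapetype {tapetype}
"""


def render_amanda_conf(**values):
    """Fill the template with the given parameter values."""
    return AMANDA_CONF_TEMPLATE.format(**values)


# Placeholder values for illustration only.
conf_text = render_amanda_conf(
    org="DailySet1",
    mailto="admin@example.com backup@example.com",
    dumpcycle=7,
    runspercycle=5,
    tapecycle=10,
    tapedev="/dev/nst0",
    tapetype="HP-DAT",
)
print(conf_text)
```

The printed text is only a starting point; it should be checked against the example configuration shipped with your Amanda packages before use.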
The values for tapecycle, dumpcycle and runspercycle depend on the backup plan you choose, and the strategy you plan to undertake. There are a couple of other parameters in the file, which should be self-explanatory.
Now, we need to label the tapes. This is very important, because tapes are rejected if they are found to be improperly labelled during a backup run. Labelling should be done with the amlabel utility, as the ‘amanda’ user, and the labels must match the regular expression specified in that configuration’s amanda.conf file. Next, we add entries to the disklist file in the configuration directory, i.e., tell Amanda which directories on each client need to be backed up. Finally, the amcheck utility is used to check the validity of the configuration; if it reports no errors, you are good to go!
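Since a mislabelled tape is rejected at backup time, it can help to test planned label names against the pattern beforehand. The Python sketch below does this with the standard re module. The pattern and the label names are hypothetical, and Python’s regular expression dialect may differ in details from the one Amanda uses, so treat it as a rough pre-check only.

```python
import re


def unmatched_labels(labels, label_pattern):
    """Return the labels that do not match the given regular expression."""
    pattern = re.compile(label_pattern)
    return [label for label in labels if not pattern.search(label)]


# Hypothetical pattern: the configuration name, a dash, then digits.
label_pattern = r"^DailySet1-[0-9][0-9]*$"
labels = ["DailySet1-01", "DailySet1-02", "Dailyset1-03", "DailySet1-"]

bad = unmatched_labels(labels, label_pattern)
print(bad)
```

In this example, the third label fails because of the lower-case "s", and the fourth because it has no digits after the dash.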
Go ahead and run amdump for a trial run. The command won’t print anything, but a report of the backup can be generated with amreport (it will also probably send an email to report everything when it is done). Once we have everything set up well, it’s time for cron to take over and automate the process every night.
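The nightly job that cron runs could be as simple as a wrapper that checks the configuration and only then starts the backup. The following Python sketch shows the idea. It assumes that amcheck and amdump are on the PATH of the ‘amanda’ user, that each takes the configuration name as its argument, and that a non-zero exit status signals a problem; the function name nightly_backup and the configuration name are hypothetical.

```python
import subprocess


def nightly_backup(config, run=subprocess.run):
    """Run amcheck, and run amdump only if the check succeeded.

    `run` defaults to subprocess.run; it is a parameter so that the
    logic can be tried out without Amanda installed.
    """
    check = run(["amcheck", config])
    if check.returncode != 0:
        # Assumption: a non-zero exit status means the check failed.
        return False
    dump = run(["amdump", config])
    return dump.returncode == 0
```

A script calling nightly_backup("DailySet1") could then be scheduled from the crontab of the ‘amanda’ user.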
Recovering data from Amanda is pretty easy, though this time, the client needs to be set up to pull data from the server. This can be done in the amanda-client.conf file of the configuration directory, after which the amrecover utility handles everything.
In conclusion, this article has tried to show what it takes to set up Amanda and get a backup running. It is recommended that you go through the official documentation while actually setting up Amanda, since doing so involves taking care of a lot of the nitty-gritties of the configuration. Though the set-up process might look easy after reading this article, it is actually a bit harder, and might require more than one attempt. But once everything falls into place, it becomes pretty easy to run such a powerful tool on your server.