Preparing to use edmbakup for filesystem backups

This is an HTML-format version of a document from August 1992. Some of the ideas described as future enhancements are now in use at this court. A few current notes have been added below in [bracketed italics].

The binary edmbakup is the very generic nucleus of a number of possible backup schemes. The binary itself simply creates an environment, for the duration of one command, in which a specified file system can be backed up without affecting any of its timestamps, and possibly by an operator without superuser privilege. The environment also prevents access to other file systems. Details are in the edmbakup comments.

The command executed can be anything. The edmbakup binary runs any specified command (provided it's available on the target file system) and passes along any arguments and open file descriptors unchanged.

The edmbakup binary itself is so general that it's not a convenient backup system by itself. Rather, new programs or scripts such as our nightly and full are layered over it to provide whatever functionality the backup operators should actually see. While our nightly and full scripts really are the way we currently do backups, they are best viewed as examples. The underlying edmbakup can serve, without modification, as the basis for other scripts that might be largely similar to our nightly/full, or considerably different, reflecting local requirements.

The few restrictions built into edmbakup, and therefore common to any backup arrangement built around it, are:

The filesystem to be backed up must be unmounted at the time edmbakup is run. It is left unmounted when edmbakup exits.
The command edmbakup is told to run, plus any other commands or files it may need, must reside on the filesystem to be backed up.
The file courtshare/mied/backup/backup.gate must exist, with exactly that pathname relative to the root of the filesystem to be backed up. This is a copy of the appgate component of our invoke facility.
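
The last two restrictions can be checked ahead of time, while the filesystem is still mounted for normal use. The following Python sketch is illustrative only and is not part of edmbakup; the function name preflight and its arguments are hypothetical, and it assumes the caller knows the directory where the filesystem is currently mounted.

```python
import os
import stat

# Pathname required by edmbakup, relative to the root of the filesystem.
GATE_PATH = "courtshare/mied/backup/backup.gate"

def preflight(fs_root, command):
    """Return a list of problems; an empty list means both checks passed.

    fs_root: directory where the filesystem is mounted right now (assumption).
    command: pathname of the backup command, relative to the filesystem root.
    """
    problems = []
    if not os.path.isfile(os.path.join(fs_root, GATE_PATH)):
        problems.append("missing " + GATE_PATH)
    cmd = os.path.join(fs_root, command.lstrip("/"))
    try:
        mode = os.stat(cmd).st_mode
    except OSError:
        mode = 0  # treat an absent file as having no permissions at all
    # The command must be a regular file with at least one execute bit set.
    if not (stat.S_ISREG(mode) and mode & 0o111):
        problems.append("command missing or not executable: " + command)
    return problems
```

A script could call preflight("/u", "courtshare/mied/backup/sh") before the system is taken down and refuse to proceed if the list is not empty. The check that the filesystem is unmounted is left out because it depends on the local system.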

The edmbakup binary itself is setuid-superuser so that it can mount and unmount the specified filesystem on behalf of a non-superuser operator. The specified command, however, is not run as the superuser; it is run with a specified user and/or group id configured in advance by setting the owner, group, and set-id properties of backup.gate. Any special abilities the operator will need while performing the backup are supplied by executables installed in advance on that file system, executable only by the gate IDs. The operator is able to run these executables when using edmbakup, but not at other times.
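
One way to picture how the gate IDs follow from the properties of backup.gate is the Python sketch below. It models an assumed rule, not the actual edmbakup code: a set-id bit on the gate file substitutes the file's owner or group for the operator's, and a clear bit leaves the operator's own id in place. The function name gate_ids and the numeric ids in the usage note are hypothetical.

```python
import stat

def gate_ids(gate_mode, gate_uid, gate_gid, caller_uid, caller_gid):
    """Return the (uid, gid) a command would run under.

    Assumed rule: the setuid bit selects the gate file's owner, the setgid
    bit selects the gate file's group; otherwise the caller's id is kept.
    """
    uid = gate_uid if gate_mode & stat.S_ISUID else caller_uid
    gid = gate_gid if gate_mode & stat.S_ISGID else caller_gid
    return uid, gid
```

With a gate file that is setgid only, for example gate_ids(0o2755, 0, 77, 1001, 100), the result is (1001, 77): the operator keeps uid 1001 but picks up the special group 77.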

The nightly/full scripts we use depend on a traditional shell command piping find into pax (a Portable Archive Exchange utility that supports cpio and tar formats in conformance with the POSIX standard). For this to work, a directory /courtshare/mied/backup is created on each filesystem we expect to back up. This directory must contain backup.gate, a simple copy of the shell /bin/sh, and the find and pax programs. [On systems with static shared libraries or dynamic shared objects, each filesystem must also contain copies of any shared library or object needed to run sh or the other binaries. Typically, that will mean adding /shlib/libc_s and/or /usr/lib/libc.so.]

The find program should be able to see the name and status of every file on the file system, and the pax program should be able to read any file to be backed up, on behalf of the non-superuser operator. Therefore, these copies of find and pax are made setuid-superuser. They are also made executable only by a special group with no members; ordinary users can't run these copies of find and pax, and neither can the backup operators under ordinary circumstances. The backup.gate is made setgid to the special group, so that find and pax are available to commands run under edmbakup.
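
The permission scheme just described is easy to get subtly wrong, so it may be worth auditing after installation. The Python sketch below checks mode bits against the rules stated above; the function names audit_tool and audit_gate are hypothetical, and the sketch deliberately checks only what the text requires.

```python
import stat

def audit_tool(mode, owner_uid):
    """Check a find or pax copy: setuid-superuser, runnable by the special group only."""
    problems = []
    if owner_uid != 0:
        problems.append("not owned by the superuser")
    if not mode & stat.S_ISUID:
        problems.append("setuid bit not set")
    if not mode & stat.S_IXGRP:
        problems.append("not executable by the special group")
    if mode & stat.S_IXOTH:
        problems.append("executable by users outside the special group")
    return problems

def audit_gate(mode):
    """Check backup.gate: it must be setgid so commands pick up the special group."""
    return [] if mode & stat.S_ISGID else ["backup.gate is not setgid"]
```

A mode such as 04710 on a root-owned copy of pax passes audit_tool with no problems reported, while an ordinary 0755 copy is flagged twice: it lacks the setuid bit and is executable by everyone. Verifying that the special group really has no members requires a look at the group file and is not shown.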

While edmbakup has the privilege it needs to mount and unmount the filesystem for the backup, the filesystem must still be unmounted before edmbakup is run and remounted afterward for normal use. Both of these operations require privilege the backup operator does not have. The approach in our district has been that the operator does backups in a single-user run level where the filesystems are already unmounted. The operator only needs a way to tell init to change run levels. Unmounting, mounting, and other shutdown/startup operations are all done by init and the rc scripts, which have the required privilege.

To allow non-superuser operators to initiate a change of run level, all that's needed is a copy of /etc/init under a new name, which is made setuid-superuser and executable only by a special group which contains only the operators.

We chose to define a new run-level, 4, for the making of backups. This just entails adding a 4 to the run-levels in /etc/inittab for each terminal that should be active for the backup operators. If the operators don't mind having to work at the console, it should be possible to just use level 1 and avoid adding a new level.

To do backups, our operators can use init 5 or init 6, wait for the system to shut down and restart, enter 4 for the run level, and log in. After using edmbakup to back up the file systems, they simply use init 2 and watch the system come back up.

So that our operators can log in under their own accounts while the /u filesystem is unmounted, copies of their home directories are placed in the /u directory of the root filesystem. These are invisible in normal operation, since the /u filesystem is mounted over them. Other users cannot log in during backups, even on the active terminals, because their home directories cannot be found.

POSSIBLE ENHANCEMENTS

Some known weaknesses of our current backup procedures can be addressed without modification to edmbakup itself or to the scripts that have already been developed. New functionality can be added simply by layering on top of what is available now.

Prevention of exposure

The arrangement described above, currently in use at this court, allows a non-superuser backup operator to back up all files on a filesystem, by using setuid copies of find and pax. It also prevents others from using those setuid executables for any purpose, and even the operators cannot use them to modify a file system; only read access is permitted.

However, a backup operator could use these abilities to read directories and files for purposes other than backup. By examining what the nightly/full scripts do, the operator could execute edmbakup directly using a similar command, but choose to direct the pax output someplace other than a backup tape, for example, piping it into pax -r to extract a certain file. For that matter, the operator could simply extract any desired file from the backup tape after creating it. This situation is a major improvement over giving operators superuser privilege, which was once standard procedure in this court, but it is clearly far from ideal.

It is easy to require the operators to use the approved backup scripts with edmbakup, and to prevent the use of edmbakup for other purposes. The edmbakup binary can be made executable only by a special group with no members, and a setgid invocation arrangement can be set up around the approved backup scripts. The destination tape drive should be hard-coded into the script so the backup output cannot be diverted. The invocation front-end need not be developed from scratch; our invoke front-end will suffice.

It is also possible to dedicate a tape drive to backups and change its permissions so it is accessible only by the IDs picked up from invoke. This way, the operator will not be able to simply read back the backup tape after creating it, except for purposes allowed by the front end. This measure is of little use, however, if another compatible tape drive is easily available to the operator.

Prevention of spoofing/tampering

If the backup operators work unsupervised (as ours do) and have unrestricted access to the tape drive, an operator could prepare a "backup" tape containing new or modified files. If a system problem later forced restoration from the prepared tape, system integrity would be compromised.

This risk can be reduced by setting up a front-end and restricting access to the tape drive as suggested above. It can be further reduced if the front-ended backup script produces, say, cryptographic signatures of the files being backed up, storing those signatures in a location inaccessible to the operator. The signatures could later be used to check a backup tape for authenticity.
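
The signature idea can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses keyed HMAC-SHA256 digests rather than public-key signatures, the function names are hypothetical, and both the key and the resulting manifest are assumed to live somewhere readable only by the IDs picked up from the front end.

```python
import hashlib
import hmac

def sign_file(path, key):
    """Return the HMAC-SHA256 of a file's contents as a hex string."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    with open(path, "rb") as f:
        # Read in chunks so large files need not fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            mac.update(chunk)
    return mac.hexdigest()

def build_manifest(paths, key):
    """Map each pathname to its keyed digest at backup time."""
    return {p: sign_file(p, key) for p in paths}

def verify(path, key, manifest):
    """True if a restored file still matches the digest recorded at backup time."""
    expected = manifest.get(path)
    return expected is not None and hmac.compare_digest(expected, sign_file(path, key))
```

After a restore, any file for which verify returns False either was altered on the tape or was never part of the original backup. Because the digests are keyed, an operator who cannot read the key cannot forge matching entries even with access to the manifest format.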

If security requirements outweigh performance concerns, it is possible to design a front-ended backup script that encrypts the entire backup tape using a key that is not accessible to the operator. This is a strong defense against both exposure and tampering; with this technique, the operator cannot even take the backup tape to another tape drive or system to read it or modify it. This is most practical with tape drive hardware that supports encryption; it can be done in software, but with a performance penalty.

Implementation would require nothing more than invoke, edmbakup, and an encryption filter; the tricky bit is to design a key management scheme so that the key can be protected from the operator and changed regularly, but old keys can be obtained as needed to restore from old backups.
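
One possible shape for such a scheme, offered as a sketch and not as a worked-out design, is a key ring that never discards a key and identifies each one by a non-secret id recorded with the backup. The class KeyRing and its methods are hypothetical, and the sketch ignores the harder question of where the ring is stored and who may read it.

```python
import secrets

class KeyRing:
    """Keeps every key ever issued, indexed by a non-secret key id."""

    def __init__(self):
        self._keys = {}       # key id -> key bytes
        self._current = None  # id of the key used for new backups

    def rotate(self):
        """Issue a fresh random key, make it current, and return its id."""
        key_id = len(self._keys) + 1
        self._keys[key_id] = secrets.token_bytes(32)
        self._current = key_id
        return key_id

    def current(self):
        """Return (key id, key) for a new backup; only the id is recorded with it."""
        return self._current, self._keys[self._current]

    def lookup(self, key_id):
        """Return the key for an old backup so it can be restored."""
        return self._keys[key_id]
```

Calling rotate on a schedule changes the key regularly, while lookup with the id noted on an old tape still yields the key needed to restore it.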