This is an HTML-format version of a document from August 1992. Some of the ideas described as future enhancements are now in use at this court. A few current notes have been added below in [bracketed italics].
The binary edmbakup is the very generic nucleus of a number of possible backup schemes. The binary itself simply creates an environment, for the duration of one command, in which a specified file system can be backed up without affecting any of its timestamps, and possibly by an operator without superuser privilege. The environment also prevents access to other file systems. Details are in the edmbakup comments.
The command executed can be anything. The edmbakup binary runs any specified command (provided it's available on the target file system) and passes along any arguments and open file descriptors unchanged.
The edmbakup binary is so general that it is not a convenient backup system on its own. Rather, new programs or scripts such as our nightly and full are layered over it to provide whatever functionality the backup operators should actually see.
While our nightly and full scripts really are
the way we currently do backups, they are best
viewed as examples.
The underlying edmbakup can serve, without
modification, as the basis for other scripts that
might be largely similar to our nightly/full, or
considerably different, reflecting local
requirements.
The few restrictions built into edmbakup, and therefore common to any backup arrangement built around it, are:
courtshare/mied/backup/backup.gate must exist, with exactly that pathname relative to the root of the filesystem to be backed up. This is a copy of the appgate component of our invoke facility.
backup.gate.
Any special abilities the operator will need while performing the backup are supplied by executables installed in advance on that file system, executable only by the gate IDs. The operator is able to run these executables when using edmbakup, but not at other times.
The nightly/full scripts we use depend on a traditional shell command piping find into pax (a Portable Archive Exchange utility that supports the cpio and tar formats in conformance with the POSIX standard). For this to work, a directory /courtshare/mied/backup is created on each filesystem we expect to back up. This directory must contain backup.gate, a simple copy of the shell /bin/sh, and the find and pax programs.
[On systems with static shared libraries or dynamic shared objects, each filesystem must also contain copies of any shared library or object needed to run sh or the other binaries. Typically, that will mean adding /shlib/libc_s and/or /usr/lib/libc.so.]
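The find-into-pax pipeline mentioned above can be sketched roughly as follows. This is only an illustration, not the text of our nightly or full scripts: the function name archive_tree, the choice of cpio format, and writing to an ordinary file rather than a tape device are all assumptions made for the example.

```shell
# Hypothetical helper (not part of edmbakup or the nightly/full scripts):
# archive every file under DIRECTORY into ARCHIVE using find piped into pax.
# Usage: archive_tree DIRECTORY ARCHIVE
archive_tree() {
    dir=$1
    archive=$2
    # With -w and no file operands, pax reads the pathnames to archive from
    # standard input, one per line. -x selects the archive format; cpio is
    # an arbitrary choice here, ustar would work the same way.
    (cd "$dir" && find . -print | pax -w -x cpio) > "$archive"
}
```

In a real backup the same pipeline would be run under edmbakup, using the copies of sh, find, and pax kept on the filesystem being backed up, with the output directed to the tape device instead of a file.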
The find program should be able to see the name
and status of every file on the file system, and
the pax program should be able to read any file to
be backed up, on behalf of the non-superuser
operator.
Therefore, these copies of find and pax are made
setuid-superuser.
They are also made executable only by a special
group with no members; ordinary users can't run
these copies of find and pax, and neither can the
backup operators under ordinary circumstances.
The backup.gate is made setgid to the special
group, so that find and pax are available to
commands run under edmbakup.
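The ownership and mode settings just described might be prepared along the following lines. The sketch only prints the commands instead of running them, since they need superuser privilege. The function name, the group name passed in, and the exact mode 4010 (one literal reading of "setuid-superuser, executable only by the group") are assumptions, and the binaries are assumed to have been copied into place already.

```shell
# Hypothetical helper: print (do not run) the commands that would set
# ownership and modes in the backup directory of one filesystem.
# Usage: show_backup_dir_setup FS_ROOT GATE_GROUP
show_backup_dir_setup() {
    group=$2
    # Strip a trailing slash so that FS_ROOT "/" does not yield "//...".
    dir=${1%/}/courtshare/mied/backup
    for prog in find pax; do
        echo "chown root $dir/$prog"
        echo "chgrp $group $dir/$prog"
        # 4000 = setuid (to the owner, root); 010 = execute for group only.
        echo "chmod 4010 $dir/$prog"
    done
    # backup.gate is setgid to the special group, so commands started
    # through it can execute the restricted find and pax.
    echo "chgrp $group $dir/backup.gate"
    echo "chmod g+s $dir/backup.gate"
}
```

For example, `show_backup_dir_setup /u bkgate` (bkgate being a made-up group name) prints the eight commands for the /u filesystem, which an administrator could review before running them as superuser.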
While edmbakup has the privilege it needs to mount and unmount the filesystem for the backup, the filesystem must still be unmounted before edmbakup is run and remounted afterward for normal use. Both of these operations require privilege the backup operator does not have.
The approach in our district has been that the operator does backups in a single-user run level where the filesystems are already unmounted. The operator only needs a way to tell init to change run levels. Unmounting, mounting, and other shutdown/startup operations are all done by init and the rc scripts, which have the required privilege.
To allow non-superuser operators to initiate a
change of run level, all that's needed is a copy
of /etc/init under a new name, which is made
setuid-superuser and executable only by a special
group which contains only the operators.
We chose to define a new run level, 4, for making backups. This just entails adding a 4 to the run levels in /etc/inittab for each terminal that should be active for the backup operators. If the operators don't mind having to work at the console, it should be possible to just use level 1 and avoid adding a new level.
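Adding the run level to a terminal's inittab entry could be done by hand or with a small filter such as the one below. It is a sketch under assumptions: the function name is made up, and the entry ids in the usage example are invented for illustration. It relies only on the standard inittab layout of colon-separated id, run-levels, action, and process fields.

```shell
# Hypothetical helper: add a run level to one inittab entry.
# Usage: add_run_level ENTRY_ID LEVEL < inittab > inittab.new
add_run_level() {
    # inittab fields are id:run-levels:action:process. Only the entry whose
    # id matches is changed, and only if it does not already list the level.
    # Appending "" forces a string comparison of the ids.
    awk -F: -v OFS=: -v id="$1" -v lvl="$2" '
        ($1 "") == (id "") && index($2, lvl) == 0 { $2 = $2 lvl }
        { print }
    '
}
```

Given an invented entry such as `t1:23:respawn:/etc/getty tty01 m`, running `add_run_level t1 4` turns the run-levels field into 234 and leaves every other line untouched. Writing to a new file and reviewing it before replacing /etc/inittab is the safer way to use it.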
To do backups, our operators can use init 5 or init 6, wait for the system to shut down and restart, enter 4 for the run level, and log in. After using edmbakup to back up the file systems, they simply use init 2 and watch the system come back up.
So that our operators can log in under their own
accounts while the /u filesystem is unmounted,
copies of their home directories are placed in the
/u directory of the root filesystem.
These are invisible in normal operation, since the
/u filesystem is mounted over them.
Other users cannot log in during backups, even on
the active terminals, because their home
directories cannot be found.
Some known weaknesses of our current backup procedures can be addressed without modification to edmbakup itself or scripts that have already been developed. New functionality can be added simply by layering on top of what is available now.
The arrangement described above, currently in use
at this court, allows a non-superuser backup
operator to back up all files on a filesystem, by
using setuid copies of find and pax.
It also prevents others from using those setuid
executables for any purpose, and even the
operators cannot use them to modify a file system;
only read access is permitted.
However, a backup operator could use these abilities to read directories and files for purposes other than backup. By examining what the nightly/full scripts do, the operator could execute edmbakup directly using a similar command, but direct the pax output somewhere other than a backup tape, for example by piping it into pax -r to extract a particular file. For that matter, the operator could simply extract any desired file from the backup tape after creating it.
This situation is a major improvement over giving operators superuser privilege, which was once standard procedure in this court, but it is clearly far from ideal.
It is easy to require the operators to use the approved backup scripts with edmbakup, and to prevent them from using edmbakup for other purposes. The edmbakup binary can be made executable only by a special group with no members, and a setgid invocation arrangement can be set up around the approved backup scripts. The destination tape drive should be hard-coded into the script so that the backup output cannot be diverted. The invocation front-end need not be developed from scratch; our invoke front-end will suffice.
It is also possible to dedicate a tape drive to backups and change its permissions so it is accessible only by the IDs picked up from invoke. This way, the operator will not be able to simply read back the backup tape after creating it, except for purposes allowed by the front end. This measure is of little use if another compatible tape drive is easily available to the operator.
If the backup operators work unsupervised (as ours do) and have unrestricted access to the tape drive, an operator could prepare a "backup" tape containing new or modified files. If a system problem later forced restoration from the prepared tape, system integrity would be compromised.
This risk can be reduced by setting up a front-end and restricting access to the tape drive as suggested above. It can be further reduced if the front-ended backup script produces, say, cryptographic signatures of the files being backed up, storing those signatures in a location inaccessible to the operator. The signatures could later be used to check a backup tape for authenticity.
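One way the signature idea might look in a script is sketched below. It is an assumption-laden illustration: the function names are made up, it uses the sha256sum utility (which must be available), and it records plain digests, not keyed signatures. The protection therefore comes entirely from keeping the manifest where the operator cannot alter it.

```shell
# Hypothetical helpers: record and later verify SHA-256 digests of every
# regular file under a directory.

# Usage: write_manifest DIRECTORY MANIFEST
write_manifest() {
    # Paths are recorded relative to DIRECTORY and sorted by pathname.
    (cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2) > "$2"
}

# Usage: check_manifest DIRECTORY MANIFEST
# Succeeds only if every listed file exists and matches its recorded digest.
check_manifest() {
    # With -c and no file operand, sha256sum reads the manifest from stdin.
    (cd "$1" && sha256sum -c) < "$2" > /dev/null 2>&1
}
```

To check a suspect tape, its contents would be restored into a scratch directory and compared with `check_manifest`. Note that this check catches modified or missing files but not files added to the tape; comparing the restored file list against the manifest would be needed for that.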
If security requirements outweigh performance concerns, it is possible to design a front-ended backup script that encrypts the entire backup tape using a key that is not accessible to the operator. This is a strong defense against both exposure and tampering; with this technique, the operator cannot even take the backup tape to another tape drive or system to read it or modify it. This is most practical with tape drive hardware that supports encryption; it can be done in software, but with a performance penalty.
Implementation would require nothing more than invoke, edmbakup, and an encryption filter; the tricky bit is to design a key management scheme so that the key can be protected from the operator and changed regularly, but old keys can be obtained as needed to restore from old backups.
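A software encryption filter of the kind described could be as small as the pair of functions below. This is a sketch, not a recommendation: it assumes the openssl command (version 1.1.1 or later, for the -pbkdf2 option) is available, the function names are made up, and the key file is assumed to be readable only by the IDs the front end runs under. It does not address the key management problem at all.

```shell
# Hypothetical filters: encrypt or decrypt standard input to standard output,
# taking the passphrase from the first line of KEYFILE.

# Usage: some_archiver | encrypt_stream KEYFILE > tape_device
encrypt_stream() {
    openssl enc -aes-256-cbc -salt -pbkdf2 -pass "file:$1"
}

# Usage: decrypt_stream KEYFILE < tape_device | some_extractor
decrypt_stream() {
    openssl enc -d -aes-256-cbc -pbkdf2 -pass "file:$1"
}
```

In a backup script the archive stream from pax would be piped through `encrypt_stream` on its way to the tape. To restore from old backups, the front end would need some way of selecting the key file that was current when the tape was written. CBC encryption by itself hides the contents but does not authenticate them, so detecting a deliberately altered tape would still depend on something like the signatures discussed earlier.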