The ICMS Docket Fiche system in use at the Eastern District of Michigan was initially developed in May of 1992 to address four concerns:
fichelastdate file should for any reason contain the wrong date.
The design chosen was a directory structure of individual docket files, each one compressed (.Z). The UNIX timestamp on each file reflects the currency of that docket. The directory structure has a few levels of branching based on parts of the case number, to reduce directory search time by limiting the number of entries in any single directory. The structure can be split over two or more filesystems if the number of docket files is expected to exceed the limit of roughly 65,000 i-nodes in one filesystem.
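As an illustration of branching on parts of the case number, here is a minimal Python sketch. The actual levels used in the fiche tree are not spelled out here, so the layout below (office, then year plus docket type, then the leading three digits of a five-digit number) and the name docket_path are assumptions chosen only to show the idea.

```python
from pathlib import PurePosixPath


def docket_path(root, office, year, dtype, number):
    """Map a case number to a docket file path.

    Hypothetical layout: root/office/YYtype/NNN/NNNNN.Z
    Branching on the leading three digits of a five-digit number
    keeps each leaf directory to at most 100 docket files.
    """
    num = f"{number:05d}"
    return (PurePosixPath(root) / str(office)
            / f"{year:02d}{dtype}" / num[:3] / f"{num}.Z")


print(docket_path("/fiche", 2, 92, "cv", 71234))
```

Any scheme of this kind works so long as the mapping from case number to path is fixed, so that a client can locate a docket without searching.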
The solution implemented includes the following components:
cpioed (preserving modification times) to produce the fiche tree itself.
If this process is repeated for several existing fiche files, and the proper options are given to cpio (or pax) to update existing files only if their timestamps are older, the result will be a merged fiche tree containing the most recent version of each docket, as required in (1) above.
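The merge rule can be sketched in Python as follows. This is not the cpio or pax invocation itself, only an illustration of the same policy: a file is copied into the tree only when the tree has no copy or an older one, and the modification time is preserved. The function name merge_newer is hypothetical.

```python
import os
import shutil


def merge_newer(src_root, dst_root):
    """Copy files from src_root into dst_root, keeping the newer copy.

    A file already present in dst_root is left alone when its
    modification time is equal to or later than the source's.
    Returns the sorted relative paths that were copied.
    """
    copied = []
    for dirpath, _dirs, files in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        for name in files:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_root, rel, name)
            if (os.path.exists(dst)
                    and os.path.getmtime(dst) >= os.path.getmtime(src)):
                continue  # existing copy is at least as current
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)  # copy2 preserves the modification time
            copied.append(os.path.normpath(os.path.join(rel, name)))
    return sorted(copied)
```

Because the rule depends only on per-file timestamps, the order in which several source trees are merged does not change which version of each docket survives.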
The output of split.awk can be piped directly into cpio (or pax); there is no need to have enough storage for the intermediate cpio archive. If the existing fiche file is on tape, the data flows directly from tape to the final compressed files in the fiche tree. The intermediate storage used is on the order of the largest single docket.
dkt.rpt whose output, in the format of a fiche archive, is then split and un-cpioed just as in (1). These processes run concurrently using pipes; here again, negligible intermediate storage is needed.
As regards the performance concern (2), we find that this implementation performs adequately; some quantitative details appear further below.
Because each file's timestamp is considered individually, the process does not involve any separate file with a last-run date. It is not prone to the misbehavior possible if such a file were incorrect. If the process is killed, it will pick up on its next run with only those cases that still require updating. This satisfies the robustness consideration (3). If a court is turning on automatic updates for the first time and many dockets must be generated, it is perfectly legitimate to let the process run till it gets in the way, kill it, run it again the next night, kill it, etc., until a run produces no further updates. Once the vast majority of dockets have been brought up to date, ongoing nightly maintenance runs are brief. Should any fiche file disappear for any reason, it is simply replaced on the next run.
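The restartable behavior follows from a per-file test that can be sketched as below. How the maintenance script obtains each case's last-activity time is not described here, so the sketch assumes the database can supply it as an epoch timestamp; needs_update, pending, and path_for are hypothetical names.

```python
import os


def needs_update(fiche_file, last_activity):
    """True if the docket file is missing or older than the case activity."""
    try:
        return os.path.getmtime(fiche_file) < last_activity
    except FileNotFoundError:
        return True  # a vanished file is simply regenerated


def pending(cases, path_for):
    """Return the case ids whose docket files still need regenerating.

    cases:    iterable of (case_id, last_activity_epoch) pairs (assumed input)
    path_for: function mapping a case_id to its docket file path
    """
    return [cid for cid, ts in cases if needs_update(path_for(cid), ts)]
```

A run that is killed partway leaves the already-written files with fresh timestamps, so the next call to pending returns only the cases not yet reached.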
Because updating a single docket requires overwriting just one small file, single-case updates on the fly are practical (4 above).
One sample UNIX client, sdlpinq, is provided; its user interface is very much like that of PACER/CHASER flatinq.
respawn entry in /etc/inittab so it is always present and reading a FIFO in the root of the docket tree.
Another process, such as sdlpd, requests an update by simply writing the case office, year, docket type, and number on the FIFO, and ondemand takes care of firing up the maintain script (2 above) to update the file.
If an error in the database has been corrected, a DBA can echo the affected case number onto this FIFO to cause the docket file to be replaced with a corrected version.
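The request loop on the reading side of the FIFO can be sketched as below. The exact format written on the FIFO is not given here, so the sketch assumes one request per line with four whitespace-separated fields (office, year, docket type, number); Request, parse_request, and serve are hypothetical names, and the update callback stands in for launching the maintenance script.

```python
from typing import NamedTuple


class Request(NamedTuple):
    office: str
    year: str
    dtype: str
    number: str


def parse_request(line):
    """Parse one request line; the four-field format is an assumption."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    return Request(*fields)


def serve(stream, update):
    """Read requests from stream and call update(request) for each.

    Malformed or blank lines are skipped so that one bad write
    cannot stop the loop. Returns the number of requests handled.
    """
    handled = 0
    for line in stream:
        try:
            req = parse_request(line)
        except ValueError:
            continue
        update(req)
        handled += 1
    return handled
```

In use, stream would be the FIFO opened for reading; any text stream, such as a file object or io.StringIO, behaves the same way for testing.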
While performance was one of the design concerns for this system, we can state no conclusions on how it compares to the stock ICMS fiche software. We have used our system exclusively since 1992, first on the U5000/95, later on the 486. Our only experience with the stock software has been on the U5000/90, pre-1992, and with only partial fiche files. We have never been adventuresome enough to try it on our full, merged fiche collection.
We can provide the data on the size and configuration of our system at the time of this writing and the performance figures for our own implementation; perhaps another court which has figures for the stock software can then do a rough comparison.
The Eastern District of Michigan maintains a complete fiche tree for purposes of public access: all dockets, live or archived, which have ever been on-line. In the summer of 1993, a local project converted all Courtran criminal dockets received on tape from DC into compatible formats and merged them into this tree.
The tree is split over two filesystems on a single Micropolis 1528 disk on a Dell 486/33SE configured with three such disks. These are 1 KB filesystems; every file occupies an integer multiple of two 512-byte blocks. The loss of usable space when small files are stored in larger fixed allocation units is known as internal fragmentation.
Our tree contains about 94,000 individual dockets and consumes about 266 MB total, including space lost to internal fragmentation. The total uncompressed size, for comparison with a stock ICMS fiche file, is roughly 536 MB.
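The figures above can be checked with a little arithmetic. The sketch below rounds a file size up to whole 1 KB allocation units and derives the overall on-disk ratio and the average space per docket; it assumes 1 MB = 1024 KB, and allocated_bytes is a hypothetical helper name.

```python
def allocated_bytes(size, block=1024):
    """Bytes consumed on disk by a file of `size` bytes when space is
    handed out in whole `block`-byte units (ceiling division)."""
    return -(-size // block) * block


DOCKETS = 94_000        # approximate number of docket files
ON_DISK_MB = 266        # includes internal fragmentation
UNCOMPRESSED_MB = 536   # rough uncompressed total

# On-disk size relative to uncompressed size, fragmentation included.
ratio = ON_DISK_MB / UNCOMPRESSED_MB

# Average disk space per docket, in KB (assuming 1 MB = 1024 KB).
avg_kb = ON_DISK_MB * 1024 / DOCKETS

print(f"on-disk/uncompressed: {ratio:.2f}")
print(f"average per docket:   {avg_kb:.1f} KB")
print(allocated_bytes(1025))  # a 1025-byte file occupies 2048 bytes
```

Since the 266 MB figure includes the space lost to fragmentation, the ratio computed here somewhat understates the compression achieved on the file contents themselves.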
As of revision 3.2 of split.awk, the Free Software Foundation's gzip is used rather than compress to compress the docket files. This has improved the compression ratio. Because gzip can transparently uncompress the older compressed files as well as gzipped files, we made no effort to recompress existing files at the time of the change. Rather, those dockets which have been created or updated since the change are gzipped, while many other files in the tree remain in compress format. Therefore, our overall compression ratio is better than when only compress was in use, but not as good as would likely be observed in a court that uses split.awk 3.2 or higher from the outset.
Each filesystem is configured with 420,000 512-byte blocks and 65,488 (the maximum) i-nodes. The number of blocks was chosen according to the ratio (~ 6.4 blocks / i-node) which roughly described our fiche tree in practice at the time the filesystems were set up. The figure is now closer to 5.7 blocks per i-node, attributable to two factors:
split.awk, and
gzip format, the figure would likely drop further.
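The relationship between the block count and the i-node count can be worked through as below. The constants come from the filesystem configuration described above; max_files and limiting_resource are hypothetical helper names, and the sketch ignores the i-nodes and blocks consumed by the directories themselves.

```python
BLOCKS = 420_000   # 512-byte blocks per filesystem
INODES = 65_488    # i-nodes per filesystem (the maximum)

# Blocks available per i-node as configured (about 6.4).
configured = BLOCKS / INODES


def max_files(blocks_per_file):
    """Files that fit in one filesystem at a given average size in blocks,
    limited by whichever of blocks or i-nodes runs out first."""
    return min(INODES, int(BLOCKS // blocks_per_file))


def limiting_resource(blocks_per_file):
    """Name the resource that is exhausted first at this average size."""
    return "i-nodes" if blocks_per_file < configured else "blocks"


print(f"configured: {configured:.2f} blocks per i-node")
print(max_files(5.7), limiting_resource(5.7))
print(max_files(7.0), limiting_resource(7.0))
```

At the current average of about 5.7 blocks per file, the i-nodes are used up before the blocks, so a lower blocks-to-i-node ratio would waste less disk space for a tree like ours.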
Of the 94,000 dockets in the tree, a typical nightly maintenance run updates 600 to 1000, and completes in an hour and a half to three hours, varying somewhat with the load of batched ICMS reports and other competition.
A vendorrun is an infrequent operation and has not been carefully timed; it seems to take two to three hours on an unloaded system, with output going directly to an Exabyte 8500 SCSI 8mm tape.
Interactive performance of the SDLP server and client when looking up individual dockets varies with network speed, and mostly reflects the time to transmit the compressed docket. Within our own building LAN, a typical docket is retrieved in under a second. The current server and client are both written in an interpreted language and have not been heavily optimized for speed; nor is that a priority, as we find current performance acceptable.