The UNIFY® 4.0 Database Management System

UNIFY 4.0 is a database management system developed by UNIFY Corporation of Sacramento, California in the early-to-mid-1980s. The 4.0 version was released circa 1987. It was selected as the DBMS on which all nationally-supported court UNIX applications developed at the Administrative Office should be built. The applications depending on UNIFY 4.0 include the Integrated Case Management System, the Records Management System, the Case Assignment System, the Court Financial System-1, and the Court Reporter system. An appropriately-licensed UNIFY 4.0 run-time distribution is included, pre-installed, on every UNIX host purchased by the AO and provided to a court for the purpose of hosting nationally-supported UNIFY applications.

UNIFY Corporation is still in business and has current database products in the marketplace, which by now scarcely resemble the 4.0 version used in the courts. To the best of my knowledge, UNIFY Corporation no longer offers support for UNIFY 4.0. The Administrative Office and the Training and Support Centers provide support and training to the courts. The Administrative Office has a source-code license and the sources for UNIFY 4.0; the DBMS as delivered to the courts includes modifications made by the Administrative Office.

Review of common DBMS capabilities

In the contemporary market, popular production-quality DBMS offerings are similar in basic structure: they tend to be composed of a back-end database server combined with client tools, to provide a sporting approximation of Dr. Codd's relational model, and to incorporate some sub- or superset of the Structured Query Language (SQL) as both Data Definition Language (DDL) and Data Manipulation Language (DML), and as the fundamental abstraction in terms of which the clients, application programs, and server communicate. They support transaction processing and recovery, ensure database integrity in the presence of multiple concurrent access and update sessions, and can recover from local failures, system failures, and media failures up to the last completed transaction before failure, using essentially principles described by Haerder and Reuter in the December 1983 ACM Computing Surveys. Recovery from local and system failure is generally automatic, nearly transparent, and faster than recovery from media failure, with local-failure recovery being fastest.

Specifics of UNIFY 4.0

Assuming some experience with the other DBMS products and the capabilities reviewed above, this overview will now highlight the ways in which UNIFY 4.0 differs from the familiar products, and the implications of each difference for systems staff supporting UNIFY 4.0 applications.

Not a Client-Server Architecture

UNIFY 4.0 has no single "back-end," "engine," or "server" process that coordinates all access to the database files. Rather, each application program and user tool opens the database file(s) and manipulates the contents directly. The tools coordinate concurrent access and update requests among themselves using a combination of locking facilities provided by the OS and a shared memory segment for exchanging lock status. This design has implications in the areas of database integrity, automatic recovery from local failures, and enforcement of access controls.

Programming Model Not Independent of Storage Details

In Codd's relational model, programmers write, and programs interact with the database, strictly in terms of relations and the defined relational operators (project, restrict, join, etc.) applied to them; this makes programs independent of how the data are actually stored on media, and makes possible programs which continue to work even if those underlying details are changed. In UNIFY 4.0, by contrast, programs make function calls which assume a specific organization and access method. If a particular column is changed from hash to b-tree organization, for example, application programs which call acckey must be rewritten to call btsrch instead, recompiled, and tested. If the b-tree is then changed to an "explicit relationship," the btsrch calls must again be changed to faccess or makeset. This design limits an administrator's ability to tweak application performance by optimizing storage structures, particularly when application sources are not at hand. The example should also illustrate the difficulty of porting existing UNIFY 4.0 applications to modern, relational DBMS products, something the Administrative Office has long contemplated.

Treatment of NULL

UNIFY 4.0 does not support the NULL keyword or IS NULL predicate in SQL. Some data types, such as the DATEs, can take on a roughly-analogous state which is distinguishable from any valid value; other data types have no such state. This can be determined for each data type by observing the effect of inserting a row into a table via SQL, where the table has a column with the data type of interest, and that column is omitted from the SQL insert statement. In the case of a numeric field, that effect is the same as setting the column to zero; with a string field, the effect is the same as assigning the empty string (and different from the effect of assigning any positive number of spaces). A time field is set to midnight. These values may be interpreted as "null" by certain utilities such as REPOINT, even where they happen to represent actual data.

Even the unspecified DATE state (which is entered and displayed as **/**/**) does not behave as NULL behaves in the relational model or the SQL standard. NULL is defined to be rigorously distinguished from (and not comparable to) any data value; that is, none of the conditions A < B, A = B, A > B is allowed to evaluate to true if A or B is null. In UNIFY 4.0, by contrast, a comparison of an unspecified date to any specified date will indicate the unspecified date is earlier; a comparison of two unspecified dates will indicate they are equal.

These considerations can lead to surprises and complications in formulating UNIFY 4.0 queries, particularly for a person accustomed to the relational model and the SQL standard.

SQL as an interactive tool

UNIFY 4.0 includes a program named SQL with which the database can be queried or updated using commands that resemble the Data Manipulation Language portion of the Structured Query Language; it does not support any of SQL's Data Definition Language commands or the transaction-processing commands COMMIT WORK and ROLLBACK WORK. The lack of COMMIT and ROLLBACK reflect UNIFY 4.0's underlying inability to provide the corresponding functions.

This program operates at the level of an application or tool; internally, it operates on the database using the UNIFY Host Language Interface, which is not SQL-based, but depends instead on UNIFY-specific C-language function calls. Therefore, while SQL is available for ad-hoc queries and can be used fairly easily in shell scripts and the like--a trusted procedure makes it even easier--a programmer working in another language has two less convenient choices:

Learn and use the UNIFY HLI function calls, producing a program which is relatively low-level, often row-oriented, and mysterious to anyone who does not know the UNIFY HLI.
Code SQL queries and updates in the program and execute them by running the SQL tool in a coprocess, piping commands to it, and parsing the results piped back.

The second option may result in a clearer program, as long as the mechanics of managing the coprocess do not themselves muck it up to an equal or greater extent. We have developed function libraries unifysql.gawk and unifysql.tcl to make such operations simple and natural in gawk and Tcl programs, respectively; a demo illustrates how to use them. However, these libraries (and the coprocess approach in general) are quite limited in their ability to detect and respond appropriately to the various kinds of things that could go wrong with an update in a sophisticated application; also, they depend on SQL's query optimizer, where a better strategy can sometimes be devised and coded with HLI calls. Finally, there are some HLI calls that just have no SQL counterpart.

UNIFY 4.0 SQL does not include the Data Definition Language; tables, columns, permissions and so on are defined using UNIFY screens such as schent, emp, grp, etc.. However, all that information can be found as tables in the UNIFY data dictionary and another trusted procedure allows those tables to be queried and updated via SQL, subject to authorization.

Transaction Processing not supported

Transaction processing, associated with the COMMIT WORK and ROLLBACK WORK statements in the SQL standard, is not supported in UNIFY 4.0 through SQL or any other mechanism. In transaction-processing databases, an application program needing to make a group of related updates can be assured that they will either all complete successfully or none will; that is, the database cannot assume a state where some of the related updates have been made and some have not. A TP database will ensure this even if the application program or the computer system should fail in mid-transaction; it does so by detecting the failure and rolling back any of the updates which had been made but not committed. During the progress of the transaction, locking is usually used to prevent other processes from seeing the incompletely-updated records.

UNIFY 4.0 does provide a locking scheme to keep transactions invisible until they complete, but lacks the other basics of transaction processing. This reflects two details of the UNIFY 4.0 design:

Because each application program instance opens and manipulates the data files directly (using an object library of UNIFY routines linked into its own address space), there is no single coordinating "engine" to detect and clean up after a program that has failed to complete a transaction.
Because the UNIFY 4.0 transaction log entries include after-images but not before-images, the system would be incapable of automatically rolling back incomplete transactions even if it could detect them.

A program which may need to abort a transaction in progress does not have recourse to the standard SQL ROLLBACK statement. If an abort capability is needed, the programmer must include explicit program logic to remember the state of the data before a transaction began and to restore that state if the transaction is aborted. This works only when failure of the program itself is not the reason for abort. If the program fails without either completing or undoing its updates, the incomplete transaction will remain in the database.