Isaac Newton Institute for Mathematical Sciences

Semantics of Computation

1 July - 31 December 1995

Organisers: S Abramsky (Imperial College, London), G Kahn (INRIA, Sophia-Antipolis), J C Mitchell (Stanford), A M Pitts (Cambridge)

Semantics of Computation Seminar

Thursday 14 December, 2.15pm

Abstracting dependencies between software configuration items

Carl A. Gunter (Univ. Pennsylvania)


This project studies an abstract model of dependencies between software configuration items based on a theory of concurrent computation over a class of enriched Petri nets. The primary goal is to illustrate the descriptive power of the model and lay theoretical groundwork for using it to design software configuration maintenance tools. As a start in this direction, the work analyzes and addresses certain limitations in make description files using a form of abstract interpretation.

Even modest software projects entail the production of what are sometimes called *software configuration items*. Some of the items, the *source* items, are directly modified by programmers or the environment of the system, while others, the *produced* items, are derived from operations that work on other source or produced items. For example, a C source file is written by a programmer and used to produce an object file, which, in turn, is linked with other object files to produce an executable. From this, it is clear that changes in source files must be perculated to the produced files that depend on them. The key question thus arises: given a collection of changes to sources, what sequence of commands can be evaluated to restore the desired relationships?

A recognition of the ubiquity of this problem, and the insight that a tool could address it in a wide range of cases, led Stuart Feldman to develop the Unix tool, make. In this limited application domain, the special-purpose make *description files* were easier to write and maintain than general purpose programs, so the tool quickly gained widespread use. Particular advantages of make are:

  1. It is usable for a wide range of systems.
  2. It enables programmers to directly express dependencies between items and to describe actions to establish desired relationships.

As a tool, make has seen a great deal of growth since its introduction in the 1970's and it has been found to have some noteworthy disadvantages:

  1. Extracting dependencies is a difficult and error-prone process for programmers. Moreover, such errors can be difficult to detect.
  2. Although programmers are allowed to express and change dependencies between files as they will, they have limited ability to express finer dependencies based on the semantics of the underlying systems.

The primary competition for the general-purpose make tools are *Integrated Development Environments (IDE's)*. An example is Microsoft's Visual C++ development environment, which maintains its own automatically-generated make-like description file. Typical advantages of IDE's include:

  1. They can extract dependencies automatically.
  2. They can exploit fine dependencies based on the semantics of the systems for which they are designed.

However, generally,

  1. They are not sufficiently `open' to be easily linked with other systems.
  2. They significantly limit programmer control over dependencies and actions.

One approach to reconciling the advantages of general make description files with IDE's is to automate the generation of make files from families of software items in specific systems. For instance Unix *makedepend* does this for C-programming files. Such systems offer prospects for some measure of interoperability with other systems because they work with standard make description files. The approaches are limited to the domains for which they are designed: for instance makedepend works only for C-programming items. However, one hopes for a more general mechanism. For example, consider the issue of documenting functions in libraries. For a language like Standard ML, which has a rich type system for interfaces, it is desirable to closely couple the functional and performance specifications with the type specifications. However, if these are kept as separate files, the process of hand-tracking the effects of modifications is tedious and error-prone. An alternative, now being pursued by Emden Gansner and John Reppy for the SML/NJ libraries, is to concentrate the signatures and their documentation into a single file written in Standard Generalized Markup Language (SGML). A combination of LaTeX, SML, and SGML tools are used to generate the desired targets. For instance, an ML document file for sets can be used to generate SML, LaTeX, or hypertext using the respective tools.

In general, no matter how much is integrated into an IDE, it is likely that it will be desirable to use the inputs and products of the environment in a project with other tools. Thus one would like to have a means whereby an IDE can report its dependencies and actions in a standard manner to its broader environment. The initial work on this project is based on the idea that a reasonable first step toward solving this problem is to develop an abstract approach to representation of relationships between software configuration items. In work so far, a representation known as a *production net* is formulated and applied to a collection of issues in determining which operations must be performed to restore production consistency to a system which has been changed. Production net *models* and *abstractions* are used to formulate precise correctness conditions and provide a framework for unifying a class of optimizations which have been shown by experience to have significant practical benefit. The optimizations are based on the idea of retaining `abstracted' information about software configuration items as an aid to determining whether a production operation really needs to be performed. The resulting theory may thus be seen as an application of `abstract interpretation' to software configuration maintenance.

Copyright © Isaac Newton Institute