Active Object Models and object representations

Michel Tilman, System Architect, Unisys


Introduction

How we represent objects in Active Object Models is driven by several forces, such as:

  • Space usage considerations
  • Run-time performance
  • Dynamic nature of object representation and object type
  • The need or wish for uniformity of object representation, meta-model and persistence scheme across the active object model(s)
  • The application domain
  • The degree of freedom offered by end-user tools to change various aspects of the active object model

In this paper we explore some of these forces and show how they influence several design issues of the Argo framework. In particular, we view dynamic, hybrid strategies as a natural phenomenon.

Scenario

As a basis for the following discussion, we present a simple but typical scenario for using our framework. The user logs on to the system and selects an application. In this application we expose the part of the object model we are interested in, in particular the object types and properties to be used. In an application, the user may execute a query, list some or all of the instances, display the results in a list view and open a form on a selection of objects to get a more detailed view.

In practice there are many object types that are used in only one or two applications; other object types are more common, but then again, not all of their properties (for instance associations) are relevant in all applications. Depending on the object type, the number of instances may vary considerably. Since users are free to define their own queries and their own list and form layouts, there is in general no way to specify at development time which properties we need at a given time for a particular object or group of objects.

Class-based (or code-based) approaches

The most obvious object representation in an object-oriented environment is by means of classes. Object properties are represented by means of instance variables; access is handled by get- and set-accessors. In the remainder of this paper we use properties in a broad sense: they denote basic attributes (e.g. strings and dates) as well as references to other objects (arbitrary relationships, type-subtype relationships, part-whole relationships, ...).

Analysis of the characteristics of this approach depends obviously on the semantics of the underlying implementation language, such as typing and reflective facilities, but in general we make the following observations:

  • Performance is usually optimal. Whether space usage is optimal really depends on the application domain or case.
  • These approaches are usually less suited to achieving the run-time behavior often associated with Active Object Models. Even Smalltalk, despite its very dynamic nature, is used in a rather static way in many existing complex applications. Run-time (for instance, just-in-time) generation of classes and accessor methods is typically achieved by compiling code in the browser tools. For that matter, until not so long ago, using the compiler at run-time in VisualWorks (the Smalltalk used in the Argo project) required (very expensive) development licenses rather than run-time licenses. The reflective facilities of VisualWorks make it easy to (re)define classes without the compiler, even for classes with existing instances, but the same cannot be said for such a basically simple problem as creating accessor methods on the fly. Cloning prototype methods, using wrappers and similar techniques often require more advanced knowledge than many developers are able or willing to deploy in a commercial setting.
  • Class-based approaches may quickly lead to a huge overhead in just the code and maintenance necessary to define the object structures and accessor methods (cf. the UDP framework). If the class library can be factored into, say, fine-grained parcel-like structures that are loaded on demand at run-time, this may alleviate some of the problems, but it brings its own problems regarding deployment, development, and even design.
  • An Active Object Model is used in a particular context, and there are usually several aspects that need to be addressed in a generic way in that context, e.g. when we want to store the objects in the database in a uniform way, when we want lazy or eager retrieval of selected groups of properties or when we want accessor-level authorization control. These issues are not always handled satisfactorily in a code-based approach (again, depending upon the language of choice).

Variable-state pattern

Using the well-known variable state pattern (often used in combination with the type object pattern) and a generic accessor protocol, we can set up a much more dynamic environment, with appropriate support for the various generic aspects. A typical implementation will use a property list, i.e. a dictionary mapping properties to the values for these properties. In its simplest form the dictionary keys are names representing properties. Often, when we add typing, when we want to re-use property and type definitions or when we model meta-information explicitly, it makes more sense to use the actual properties as keys and add a parallel mapping: at the type level, properties basically map names onto types; at the object level, the property lists map properties onto actual values.
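As an illustration only, the two parallel mappings can be sketched in Python (the Argo framework itself is written in Smalltalk). All names below (Property, ObjectType, DynamicObject, get, set) are hypothetical and are not taken from the framework; the type check is an assumption about what 'adding typing' might involve.

```python
class Property:
    """A property definition at the type level: maps a name onto a value type."""

    def __init__(self, name, value_type):
        self.name = name
        self.value_type = value_type


class ObjectType:
    """Type object: maps property names onto Property definitions."""

    def __init__(self, name, properties):
        self.name = name
        self.properties = {p.name: p for p in properties}

    def property_named(self, name):
        return self.properties[name]  # KeyError for unknown properties


class DynamicObject:
    """Instance whose state is a property list keyed by Property objects."""

    def __init__(self, object_type):
        self.object_type = object_type
        self.values = {}  # Property -> actual value

    def get(self, name):
        # Generic get-accessor; properties that were never set read as None.
        return self.values.get(self.object_type.property_named(name))

    def set(self, name, value):
        # Generic set-accessor with a simple type check (an assumption).
        prop = self.object_type.property_named(name)
        if not isinstance(value, prop.value_type):
            raise TypeError(f"{name} expects {prop.value_type.__name__}")
        self.values[prop] = value


person_type = ObjectType("Person", [Property("name", str), Property("age", int)])
alice = DynamicObject(person_type)
alice.set("name", "Alice")
alice.set("age", 42)
```

Because every access goes through the generic get and set, aspects such as authorization or lazy retrieval could be hooked in at a single place, at the price of one dictionary lookup per level.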

Using dictionaries throughout instead of a slot-based approach has its drawbacks: in general we pay a rather hefty space penalty, and performance is likewise affected. This is particularly important if, as in our case, meta-level information is represented in the same way as instance-level objects, stored in the database using the same persistence component, and if the meta-information needs to be consulted at run-time (as is usual in Active Object Models).

Property roles

Not all properties play the same role. This becomes clear when looking at our meta-model:

  • Some of these properties are part of the kernel model: they are needed for the bootstrap, e.g. when reading an object model from the database.
  • Some properties play a house-keeping role, e.g. modification date of an object type.

We could handle each of these property roles in a different way, for instance by using regular instance variables and code to represent and access key properties. The price we pay is a loss of uniformity and additional adaptor (or glue) code to make meta-level objects behave as regular objects, e.g. when opening a form on an object type or when using them in queries and authorization rules. The solution we opted for was simple: cache references to key properties and add optimized behavior in specific classes representing these objects, while adhering to the generic object representation. Removing these optimizations will not break the system; it will only run slower. This transparency is an important asset in keeping a framework design clean.
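The following Python sketch illustrates the idea of a transparent cache on top of the generic representation. It is not the Argo implementation; the class names and the "typeName" property are hypothetical.

```python
class GenericObject:
    """Generic representation: all state lives in a property list."""

    def __init__(self):
        self._values = {}

    def get(self, name):
        return self._values.get(name)

    def set(self, name, value):
        self._values[name] = value


class ObjectTypeObject(GenericObject):
    """Meta-level object with a cached shortcut to one key property."""

    def __init__(self):
        super().__init__()
        self._cached_type_name = None  # cache only, never the source of truth

    def set(self, name, value):
        super().set(name, value)
        if name == "typeName":
            self._cached_type_name = value  # keep the cache in step

    def type_name(self):
        # Optimized path; falls back to the generic lookup if the cache is empty.
        if self._cached_type_name is None:
            self._cached_type_name = self.get("typeName")
        return self._cached_type_name


meta = ObjectTypeObject()
meta.set("typeName", "Invoice")
```

The object type remains a regular object for forms, queries and authorization rules, since get and set still work on it; deleting the subclass-specific code only removes the shortcut.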

Stable vs. volatile property and object type definitions

Defining an object type starts with placing the new object type in the type hierarchy, specifying its basic attribute properties and its relationships to the existing object types. Additional 'virtual' properties may be derived from these 'real' properties, for instance by representing chains of associations as a logical property that may essentially be used as any other property. We distinguish two types of usage:

  • Virtual properties defined in the object model editor (by a knowledgeable user), with primary emphasis on re-use.
  • Virtual properties defined on the fly in queries and layouts by the regular end-user.

Clearly, the second type of usage is more 'volatile' than the first, and certainly more dynamic than 'real' properties. Hence it makes sense to find out if the representation of object properties can be fine-tuned using these observations.
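A virtual property that represents a chain of associations can be sketched as follows. This is a simplified Python illustration under our own assumptions: the names ModelObject and resolve_path are hypothetical, and the path definitions are kept on the object rather than on its type to keep the example short.

```python
def resolve_path(obj, path):
    """Follow a chain of property names, stopping at the first missing link."""
    current = obj
    for name in path:
        if current is None:
            return None
        current = current.get(name)
    return current


class ModelObject:
    def __init__(self, values=None, virtual_paths=None):
        self.values = dict(values or {})                # 'real' properties
        self.virtual_paths = dict(virtual_paths or {})  # name -> chain of names

    def get(self, name):
        # Virtual properties are read through the same accessor as real ones.
        if name in self.virtual_paths:
            return resolve_path(self, self.virtual_paths[name])
        return self.values.get(name)


city = ModelObject({"name": "Brussels"})
address = ModelObject({"city": city})
customer = ModelObject(
    {"address": address},
    {"cityName": ["address", "city", "name"]},
)
```

Here customer.get("cityName") walks address, city and name in turn, so a query or list layout can treat "cityName" like any other property.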

When retrieving and displaying lists of objects, the user is initially only interested in a limited set of properties. Once the user has located the objects of interest, more information is needed. As with properties, we can view this as a sort of implicit object type 'definition'.

Object representation in the Argo framework

In our framework we conceptually represent objects in the database as objects with two instance variables:

  • One instance variable represents volatile properties.
  • The other instance variable represents stable properties, i.e. real (and, for now, reusable virtual) properties.

We use the variable state pattern to represent volatile properties, such as virtual properties defined on the fly in list layouts. The dictionary keys are symbols and uniquely identify the virtual property. The virtual property itself does not exist as an object.
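Conceptually, this two-part representation might look like the Python sketch below. The names (TwoPartObject, slot_index, volatile) are hypothetical, plain strings stand in for Smalltalk symbols, and in a real system the name-to-offset table would live on the object type rather than be passed to each instance.

```python
class TwoPartObject:
    """Stable properties in slots, volatile ones in a dictionary."""

    def __init__(self, slot_index):
        self.slot_index = slot_index           # stable property name -> offset
        self.slots = [None] * len(slot_index)  # stable (real) properties
        self.volatile = {}                     # symbol -> value

    def get(self, name):
        if name in self.slot_index:
            return self.slots[self.slot_index[name]]
        return self.volatile.get(name)

    def set(self, name, value):
        if name in self.slot_index:
            self.slots[self.slot_index[name]] = value
        else:
            self.volatile[name] = value


invoice = TwoPartObject({"number": 0, "total": 1})
invoice.set("number", 1001)
invoice.set("customerCity", "Brussels")  # defined on the fly, e.g. in a list layout
```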

Other properties are mapped onto slots in an array. The number of slots varies in the following cases:

  • When the object type definition changes, we re-compute the offsets, akin to the VisualWorks VM when a class definition changes.
  • When retrieving a list of objects from the database, we initially allocate just enough slots to accommodate the properties needed. We add an extra indirection to consult a small array mapping the actual property slot indexes onto the compact representation. If additional properties are needed later on, we unfold the object to its normal size, and the indirection is removed.
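The compact representation and its unfolding can be sketched in Python as follows. This is an illustration under our own assumptions, not the framework's code: CompactObject and its methods are hypothetical names, a dictionary stands in for the small mapping array, and slots that were not initially loaded are simply left empty on unfolding, whereas a real system would presumably fetch their values from the database.

```python
class CompactObject:
    def __init__(self, full_size, needed_slots):
        self.full_size = full_size
        # Indirection: full slot index -> position in the compact array.
        self.index_map = {slot: pos for pos, slot in enumerate(needed_slots)}
        self.slots = [None] * len(needed_slots)

    def is_compact(self):
        return self.index_map is not None

    def unfold(self):
        """Grow to the normal size and drop the indirection."""
        if self.index_map is None:
            return
        full = [None] * self.full_size
        for slot, pos in self.index_map.items():
            full[slot] = self.slots[pos]
        self.slots = full
        self.index_map = None

    def get_slot(self, slot):
        if self.index_map is not None:
            if slot in self.index_map:
                return self.slots[self.index_map[slot]]
            self.unfold()  # property outside the initial selection
        return self.slots[slot]

    def set_slot(self, slot, value):
        if self.index_map is not None:
            if slot in self.index_map:
                self.slots[self.index_map[slot]] = value
                return
            self.unfold()
        self.slots[slot] = value


obj = CompactObject(full_size=6, needed_slots=[1, 4])
obj.set_slot(1, "a")
obj.set_slot(4, "b")
```

As long as only slots 1 and 4 are touched, the object occupies two slots instead of six; the first access to any other slot unfolds it.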

The need for an extra indirection (and the array instance variable) when accessing the array vanishes if we use the Smalltalk reflective facilities: combining indexed instance variables with the "become:" operator yields the desired behavior. This is, in fact, how objects are implemented in our framework.

Caching

As we already hinted, we use caching throughout our framework, not only to cache objects retrieved from the database, but also to optimize usage of critical parts of the Active Object Model.

Just-in-time generation

The techniques described above have enabled us to achieve a good compromise between space and time requirements. Even so, some types of applications may require us to go one step further, for instance when we need to support complex computations.

Just-in-time generation of the necessary classes and accessor protocols (and only these) would enable us to achieve these goals. As already mentioned, dynamic languages such as Smalltalk make implementation of this approach feasible. The (somewhat) harder part will be to achieve the right degree of transparency, i.e. the generation process must be totally unobtrusive, just like caching techniques. After all, we are interested in instances of our Active Object Model(s), not in instances of classes.
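To give a flavor of what such generation could look like in a dynamic language, here is a minimal Python sketch that creates a class with slot-backed accessors the first time an object type is needed. It only illustrates the general technique; class_for and _make_accessor are hypothetical names, and nothing here reflects an actual Argo design.

```python
_generated_classes = {}


def _make_accessor(slot_name):
    """Build a get/set accessor pair for one slot."""

    def getter(self):
        return getattr(self, slot_name, None)  # unset slots read as None

    def setter(self, value):
        setattr(self, slot_name, value)

    return property(getter, setter)


def class_for(type_name, property_names):
    """Return a class for the object type, generating it on first use."""
    key = (type_name, tuple(property_names))
    if key not in _generated_classes:
        # One fixed slot per property, so instances carry no dictionary.
        namespace = {"__slots__": tuple("_" + n for n in property_names)}
        for name in property_names:
            namespace[name] = _make_accessor("_" + name)
        _generated_classes[key] = type(type_name, (object,), namespace)
    return _generated_classes[key]


Invoice = class_for("Invoice", ["number", "total"])
inv = Invoice()
inv.number = 7
```

The caller only ever asks for the class of an object type; whether that class already existed or was generated on the spot stays hidden, which is the kind of transparency discussed above.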

Hybrid approaches

The techniques we have been applying in our framework are fairly simple and well-known, but they seem to work well in our context. Even so, several variations and hybrid combinations on the themes presented here can further optimize our framework and broaden its scope.