Entity Modelling

www.entitymodelling.org - entity modelling introduced from first principles - relational database design theory and practice - dependent type theory


The Distinction between Composition and Reference

The entity modelling notation, in one form or another, is part of the core syllabus in the information sciences, yet the distinction made here between composition and reference is hardly ever made [1]. This is surprising, because one of the prime motivations for teaching the entity relationship model is as a precursor technique to database design and, as we shall see later, concepts of context, reference and scope are of prime importance to database design; without them there remains a fudged step of the 'here is one I prepared earlier' variety. The concept of relationship scope is the key missing concept and it is introduced in this chapter.

Consider these two superficially similar types of relationship:

  • the relationship between a play and the characters within the play,
  • the relationship between a play and performances of that play.
The first of these would generally be classified as a composition relationship, for we can say that a play is in part composed of all the characters within it. The second would generally be classified as a reference relationship, for we would not say that a play is in part composed of all of its performances. For this reason an entity model describing just these three entity types, play, performance and character, contains both vertical composition relationships and an orthogonal reference relationship, as shown below in figure 1.

  • a play is composed of one or more characters
  • a performance is a performance of exactly one play
Figure 1: Composition and reference
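
The two relationships of figure 1 can be sketched in code. The following is a minimal illustration only, assuming Python dataclasses; the class names, the attribute names (`title`, `name`, `date`) and the sample data are hypothetical and are not part of the entity model itself. A play holds its characters as parts, whereas a performance merely points at the play it is a performance of.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Character:
    name: str


@dataclass
class Play:
    title: str
    # Composition: a play is composed of one or more characters,
    # so the characters are held inside the play.
    characters: List[Character] = field(default_factory=list)


@dataclass
class Performance:
    date: str
    # Reference: a performance is a performance of exactly one play,
    # but the play does not contain its performances.
    play: Play


# Hypothetical sample data.
hamlet = Play("Hamlet", [Character("Hamlet"), Character("Ophelia")])
opening_night = Performance("2024-05-01", hamlet)
```

Note that `Play` has no attribute listing its performances: in this sketch a performance knows its play, but the play is complete without any performance.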

Since it is a distinction rarely made, many readers may be sceptical of whether there is a credible distinction between composition and reference; in the circumstances such doubts are reasonable, and much of this chapter will be devoted to examples and implications of the distinction. So far much weight has rested on an appeal to a sense of what constitutes a part and of what parts something can reasonably be said to be composed. We said in the introduction that entity modelling was concerned with what could be known of an entity; now, another way of asking what can be known of an entity is to ask what description can be given of an entity or what of an entity can be communicated.

If focusing on parts and composition does not clarify the distinction between composition and reference or, for that matter, convince the reader of its credibility, then another way of clarifying it is to focus on full description or communication. This in turn leads to the idea of copying the full description of an entity, for to communicate an entity is to copy it in some way from source to destination.

Therefore we ask what would be communicated in a full description of a play. Surely it would include a full description of each of the characters? The play-characters relationship therefore passes the full description test and is classified as a composition relationship. The play-performances relationship, on the other hand, fails this test, since it is not necessary to describe every performance of a play in order to fully describe the play, and it is therefore classified as a reference relationship.
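
The full description test can be illustrated with a small sketch. It is an assumption-laden illustration rather than a definitive treatment: the dictionary layout, the function names `describe_play` and `describe_performance`, and the choice of identifying a play by its title are all hypothetical. The point is only that a full description copies composed parts in full but merely identifies referenced entities.

```python
# Hypothetical sample data: the characters are nested inside the play.
play = {
    "title": "Hamlet",
    "characters": [{"name": "Hamlet"}, {"name": "Ophelia"}],
}
# A performance holds a reference to its play rather than containing it.
performance = {"date": "2024-05-01", "play": play}


def describe_play(p):
    """Full description of a play: each composed part is copied in full."""
    return {
        "title": p["title"],
        "characters": [dict(c) for c in p["characters"]],
    }


def describe_performance(perf):
    """Full description of a performance: the play is identified, not copied.

    Identifying the play by its title is an assumption made for this sketch.
    """
    return {"date": perf["date"], "play": perf["play"]["title"]}
```

Here `describe_play(play)` includes every character yet mentions no performance, while `describe_performance(performance)` names the play without describing its characters.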

The matter will not rest there, however: many relationships can be modelled either way, and the resulting models are subtly different and appropriate in different circumstances.

Figure 2
The folder example is an excellent illustration of a composition relationship. I cannot delete a folder on my computer without deleting all the folders and files contained within it (of course I can move the contained items first and then delete the parent folder). Shortcuts are different: I can delete a shortcut to a file or folder without deleting the file or folder. Therefore the relationship between a shortcut and that which it is a shortcut to is a reference relationship.
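
The deletion behaviour described for folders and shortcuts can be sketched as follows. This is a toy in-memory model under stated assumptions, not a description of any real file system API: the classes `Node`, `File`, `Folder` and `Shortcut` and the `deleted` flag are hypothetical names introduced for illustration.

```python
class Node:
    """Anything that can sit inside a folder."""

    def __init__(self, name):
        self.name = name
        self.deleted = False

    def delete(self):
        self.deleted = True


class File(Node):
    pass


class Folder(Node):
    def __init__(self, name):
        super().__init__(name)
        self.children = []  # composition: contained files and folders

    def add(self, node):
        self.children.append(node)
        return node

    def delete(self):
        # Deleting a folder deletes everything composed within it.
        for child in self.children:
            child.delete()
        self.deleted = True


class Shortcut(Node):
    def __init__(self, name, target):
        super().__init__(name)
        self.target = target  # reference: the target is not a part

    # The inherited delete() marks only the shortcut itself as deleted;
    # the target is left untouched.


# Hypothetical usage.
docs = Folder("docs")
report = docs.add(File("report.txt"))
notes = File("notes.txt")  # lives outside the docs folder
link = docs.add(Shortcut("notes-link", notes))
docs.delete()
```

After `docs.delete()` the folder, the file `report` and the shortcut `link` are all marked as deleted, but `notes`, which was only referred to by the shortcut, is not.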

[1] The one exception to this would be in the teaching of the UML notation, wherein there is a further classification of composition relationships, resulting in three subclasses of the core relationship concept rather than two as here.