Entity Modelling

www.entitymodelling.org - entity modelling introduced from first principles - relational database design theory and practice - dependent type theory


Physical ER Models

Data representations are described by physical ER models, as discussed in the earlier perspective on data modelling. A physical ER model is an ER model that includes, for each relationship, a means of representing it in data.

Previously (in the foundations of data section) we introduced the idea that databases are message systems¹ and that specifying a data structure is specifying a system of messages. According to this view there are a number of types of subject entity and there is a hierarchy of messages, each of them describing a subject entity. Furthermore, there are different types of message corresponding to the different types of subject entity, and each type determines:

  • the content of that type of message as a tuple of attributes of the subject entity which are to be attributed in each message of the type
  • the types of message which may have this type of message as their contextual position within the hierarchy
  • which combinations of attributes of a message are referential, to what type of entity they are referential and how, and, of these,
  • which particular combination of attributes referentially identifies the subject entity of the message.
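
The four things that a message type determines can be sketched as a small data structure. This is only an illustration in Python, not part of the model notation itself: the class and field names (`MessageType`, `Reference`, `child_types`, `identifier`) are assumptions made for this sketch, and the identifier is simplified to a plain tuple of attribute names. The two example values follow the description of types A and B in figure 1.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """A combination of attributes that refers to an entity of some type."""
    attributes: tuple[str, ...]  # the referential attributes of the message
    target_type: str             # the type of entity referred to
    relationship: str            # label of the relationship being represented


@dataclass(frozen=True)
class MessageType:
    """What a type of message determines (hypothetical field names)."""
    name: str
    attributes: tuple[str, ...]             # content of each message of the type
    child_types: tuple[str, ...] = ()       # types that may have this type as context
    references: tuple[Reference, ...] = ()  # referential attribute combinations
    identifier: tuple[str, ...] = ()        # combination identifying the subject entity


# Types A and B of figure 1: no message context is used, only attributes.
A = MessageType("A", ("a0", "a1", "a2"), identifier=("a0",))
B = MessageType(
    "B",
    ("b0", "b1", "b2", "a0"),
    references=(Reference(("a0",), "A", "R3"),),
    identifier=("b0",),
)
```

A type that nests other types would list them in `child_types`; leaving that field empty everywhere corresponds to a model with no reliance on context.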

Relationships are represented either by message context, or by explicit message attributes, or by a combination of the two. Models that prescribe message representations relying wholly or partly on context are called hierarchical. Those that place no reliance on message context are called relational². The ER model shown in figure 1 is a physical model because it includes a specification of how each relationship is to be represented. It is relational because none of these representations relies on context.

Figure 1
Example of a physical ER Model that is purely relational.

These representations can be explained as follows:

  • Messages of type A have attributes a0, a1 and a2. The underlining indicates that a0 is an identifying attribute; it alone identifies subject entities of type A.
  • Messages of type B have attributes b0, b1 and b2. Again, as indicated by the underlining, b0 is an identifying attribute; it alone identifies subject entities of type B. Attribute a0, as indicated by the parenthetical (R3), is a referential attribute: it communicates the relationship that a subject entity of type B has with a subject entity of type A through relationship h, which is here labelled R3. In short, the attribute a0 required of messages of type B represents relationship h. It has the same name as the identifying attribute of the destination entity type A, but more often such a referential attribute will be given a different name; it is the parenthetical (R3) that distinguishes it as a referential attribute, not the coincidence of its naming. As we discuss later, referential attributes that represent relationships of one subject entity to another are, in relational database design, traditionally called foreign key attributes or foreign key columns.
  • Messages of types C and D are similar to those of B.
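
As a rough illustration of how the representations of A and B might be realised in a relational database, the following sketch uses Python's standard `sqlite3` module. The column type `TEXT`, the function name `build_figure1_schema` and the sample values are assumptions made for the example. Types C and D are omitted because their attributes are not listed here.

```python
import sqlite3


def build_figure1_schema() -> sqlite3.Connection:
    """Create tables for entity types A and B of figure 1 (column types assumed)."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless asked to.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE A (
            a0 TEXT PRIMARY KEY,               -- identifying attribute of A
            a1 TEXT,
            a2 TEXT
        );
        CREATE TABLE B (
            b0 TEXT PRIMARY KEY,               -- identifying attribute of B
            b1 TEXT,
            b2 TEXT,
            a0 TEXT NOT NULL REFERENCES A(a0)  -- referential attribute (R3)
        );
        """
    )
    return conn


conn = build_figure1_schema()
conn.execute("INSERT INTO A VALUES ('a-1', 'x', 'y')")
conn.execute("INSERT INTO B VALUES ('b-1', 'p', 'q', 'a-1')")

# A row of B that refers to a non-existent A is rejected.
try:
    conn.execute("INSERT INTO B VALUES ('b-2', 'p', 'q', 'no-such-a')")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True
```

Here the foreign key column `a0` of table `B` plays the role of the referential attribute; nothing about the position of a row within the database carries the relationship.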

Relational Models

In a relational model relationships are represented by attributes. For instance, in this example:

the relationship a is represented by the attribute pname, which would not appear in a purely logical account:

Let us add a further entity type 'C' to our example and a further relationship c: B⋺━C

then the appropriate relational model will be like this:

A more general situation regards types of entity that are identified by attributes only within the context of one or more relationships with other entities. Figure 2 is a case in point.

Figure 2
Example of a physical ER Model that is relational and has cascading identifiers.

In this example the attribute b0 of entity type B is shown as being identifying only when taken in conjunction with the relationship h. For this reason the referential attribute a0, which represents h, is shown as an identifying attribute of B. The values of a0 and b0 are to be taken in conjunction to identify a unique entity of type B. This in turn means that the relationship g is represented by a pair of attributes of entity type C, both shown as identifying because relationship g is identifying. Finally, as a consequence, the relationship f of the entity type D is by necessity represented by a triple of attributes, here shown named a0, b0 and c0, each annotated with the parenthetical text (R1) to show its role in the model.
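
The way identifiers cascade down the chain A, B, C, D can be sketched with composite keys, again using Python's `sqlite3` module. This is an illustration under assumptions rather than a transcription of figure 2: the column types are assumed, and the identifying attribute `d0` of D is hypothetical, since the text only states that D carries the triple a0, b0, c0.

```python
import sqlite3

CASCADING_SCHEMA = """
CREATE TABLE A (a0 TEXT PRIMARY KEY);
CREATE TABLE B (
    a0 TEXT NOT NULL REFERENCES A(a0),   -- represents h, part of B's identifier
    b0 TEXT NOT NULL,
    PRIMARY KEY (a0, b0)
);
CREATE TABLE C (
    a0 TEXT NOT NULL,
    b0 TEXT NOT NULL,                    -- (a0, b0) together represent g
    c0 TEXT NOT NULL,
    PRIMARY KEY (a0, b0, c0),
    FOREIGN KEY (a0, b0) REFERENCES B(a0, b0)
);
CREATE TABLE D (
    d0 TEXT PRIMARY KEY,                 -- hypothetical identifier for D
    a0 TEXT NOT NULL,
    b0 TEXT NOT NULL,
    c0 TEXT NOT NULL,                    -- (a0, b0, c0) together represent f
    FOREIGN KEY (a0, b0, c0) REFERENCES C(a0, b0, c0)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
conn.executescript(CASCADING_SCHEMA)

conn.executemany("INSERT INTO A VALUES (?)", [("a-1",), ("a-2",)])
# The same b0 value may occur under different A entities.
conn.executemany("INSERT INTO B VALUES (?, ?)", [("a-1", "b-1"), ("a-2", "b-1")])
conn.execute("INSERT INTO C VALUES ('a-1', 'b-1', 'c-1')")
conn.execute("INSERT INTO D VALUES ('d-1', 'a-1', 'b-1', 'c-1')")

# Repeating a (a0, b0) pair violates B's composite identifier.
try:
    conn.execute("INSERT INTO B VALUES ('a-1', 'b-1')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Each step down the chain adds one column to the key, which is why a reference to C needs three columns even though C contributes only c0 of its own.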

Identifying attributes are often called key attributes or just keys, and those that individually or in combination reference entities other than the subject entity are sometimes called cascaded keys.

Hierarchical ER Models

In a hierarchical data representation another possibility is available: the relationship between B and A can be represented by nesting the data for each B within the data for its related A. With this choice of data representation, the hierarchical physical model is this:

We can represent the nested hierarchical data representation textually like this:


          B(name)
          A(name,B*)
        

This is an abbreviated form of what may be used to represent a document structure in a DTD. The former representation, i.e. the relational one, may be represented in a DTD like this:


          <!ELEMENT A (name)>
          <!ELEMENT name (#PCDATA)>
          <!ELEMENT B (name, aname)>
          <!ELEMENT aname (#PCDATA)>
        
and the nested hierarchical representation, like this:

          <!ELEMENT A (name, B*)>
          <!ELEMENT name (#PCDATA)>
          <!ELEMENT B (name)>
        

In the first case the XPath expression to navigate from a B element to its related A element (written for use within XSLT, which supplies the current() function) is:


          //A[name=current()/aname]
        
in the second case the XPath expression is simply the navigation to the parent element:

          ..
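
The two navigations can be tried out with Python's standard `xml.etree.ElementTree` module. The sample documents, the wrapper element `data` and the function names are assumptions made for this sketch. ElementTree supports only a limited subset of XPath: it has no `current()` function and its `..` step cannot climb above the element a search starts from, so both navigations are written out explicitly.

```python
import xml.etree.ElementTree as ET

# Relational form: each B carries an aname naming its related A.
RELATIONAL_XML = """
<data>
  <A><name>a1</name></A>
  <A><name>a2</name></A>
  <B><name>b1</name><aname>a2</aname></B>
</data>
"""

# Hierarchical form: each B is nested inside its related A.
HIERARCHICAL_XML = """
<data>
  <A><name>a1</name></A>
  <A><name>a2</name><B><name>b1</name></B></A>
</data>
"""


def related_a_relational(root, b):
    """Follow the referential attribute: find the A whose name equals b's aname."""
    wanted = b.findtext("aname")
    for a in root.iter("A"):
        if a.findtext("name") == wanted:
            return a
    return None


def related_a_hierarchical(root, b):
    """Follow context: the related A is the parent element of b."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    return parent_of.get(b)


rel_root = ET.fromstring(RELATIONAL_XML)
hier_root = ET.fromstring(HIERARCHICAL_XML)

a_rel = related_a_relational(rel_root, rel_root.find("B"))
a_hier = related_a_hierarchical(hier_root, hier_root.find(".//B"))
```

In the relational case the lookup compares values; in the hierarchical case it only asks where the B element sits.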
        

When we consider possibilities for a hierarchical data representation, we note that entity type B can potentially be represented as a child of A or as a child of C.

The hierarchical physical model, following our feeling that B is a child of A, is like this:

or alternatively, if we feel that B is a child of C:


1 The idea here is to describe from first principles rather than from past experience, in which, for example, authors such as Richard Barker quote C.J. Date, who describes things very much as Codd did, who in turn prescribed relational, in part at least, as a ruling out of certain features that would be natural to and understood by the programmers of his time, many of whom would have programmed in COBOL.
2 It is usual to think of relational and hierarchical as contrasting terms, and we use the terms so here, but conceptually at least, message systems which are relational are a special case of those that are hierarchical. In the message system that is a relational database, the messages, i.e. the individual rows, rather than being entirely context free, generally rely on the database itself for their interpretation.