Entity Modelling

www.entitymodelling.org - entity modelling introduced from first principles - relational database design theory and practice - dependent type theory

Network or Matrix Structure

Marriages and Directed Graphs

If we model the marriage entity, we make it doubly dependent on the person entity. Since, at the time of writing, same-sex marriage is not legal in my part of the world, I can model it as dependent on a male entity on the one hand and on a female entity on the other:

In this model, unlike those of figures 11 and 12 to which it is superficially similar, the subordinate entity, marriage, has no exclusion arc between its dependencies, which is to say that the model shows each marriage entity to be dependent on two superordinate entities: those shown as male and female. For the first time we see that the modelling notation is not constrained to situations in which every entity is dependent on at most one other. It is common practice to describe situations that do have this characteristic as hierarchical: it is the defining principle of hierarchy that every entity within a hierarchical system is subordinate to at most one other entity.
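The double dependency can be sketched in code. This is only an illustration under stated assumptions: the class names, the `year` attribute and the sample people are hypothetical and not part of the entity model itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Male:
    name: str


@dataclass(frozen=True)
class Female:
    name: str


@dataclass(frozen=True)
class Marriage:
    # Both references are required: a marriage is subordinate to one male
    # AND one female entity (there is no exclusion arc between the two).
    husband: Male
    wife: Female
    year: int  # hypothetical attribute, for illustration only


def superordinates(marriage: Marriage) -> tuple:
    """Return the two entities that the marriage depends on."""
    return (marriage.husband, marriage.wife)


adam = Male("Adam")    # hypothetical sample data
eve = Female("Eve")
marriage = Marriage(husband=adam, wife=eve, year=1990)
print(superordinates(marriage))
```

Because the marriage has two superordinate entities rather than at most one, this structure is not hierarchical in the sense just described.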

In other instances of matrix structure the branch types (by which I mean for example the types male and female above) are not distinct types. We'll see this in the next example - the type in question being arc.

Mathematicians define and use various notions, each of which abstracts the idea of a network of points and connecting lines independently of how, or whether, it is physically realised. They define such abstract notions using the language of sets and relations; they use the term graph for the most abstract concept of such a network, and they variously use the terms point, node or vertex for the things connected; generally they use the term edge, directed edge, arc or arrow for the connections. There is no single terminology, so we have to plump for one of the several available; our definitions will clarify the choices made. For an entity modeller, and therefore for a database designer, the most straightforward of the graph notions is that of a directed graph.
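As a minimal sketch of a directed graph in these terms, assume a vertex type and an arc type in which each arc depends on two vertices, its source and its target. The attribute names `source` and `target`, the arc names and the sample data are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vertex:
    label: str


@dataclass(frozen=True)
class Arc:
    name: str        # arcs need their own name here, because nothing
    source: Vertex   # prevents two arcs from joining the same vertices
    target: Vertex


a, b, c = Vertex("a"), Vertex("b"), Vertex("c")
arcs = [
    Arc("e1", a, b),
    Arc("e2", a, b),  # a second arc from a to b is allowed
    Arc("e3", b, c),
    Arc("e4", c, c),  # an arc from a vertex to itself is allowed
]


def successors(vertex, arcs):
    """Vertices reachable from `vertex` by following a single arc."""
    return {arc.target for arc in arcs if arc.source == vertex}


print(successors(a, arcs))
```

Both dependencies of an arc are on the same type, vertex, which is what distinguishes this case from the marriage example.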

Simple Graphs and Identifying Relationships

One of the restrictions often placed on graph structures is that there be at most one edge between any two vertices. The term simple graph is used for graphs having this property and having the further property that vertices are not linked to themselves. In such cases, the fact that there is at most one edge between any two vertices means that an edge can be uniquely identified by identifying the vertices that it connects.

The arc type, as used in models of simple directed graphs, is an example of an entity type each of whose instances may be uniquely identified by its relationships with other entities. Relationships such as these are said to be identifying relationships, though this terminology lacks rigour, for it is the set of them that is identifying, not any one of them individually.
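The idea can be sketched as follows. The sketch assumes that, for a directed graph, "at most one edge between any two vertices" means at most one arc per ordered (source, target) pair; the function name and the sample data are hypothetical.

```python
def is_simple(arcs):
    """Check a list of (source, target) pairs against the two restrictions:
    no vertex is linked to itself, and no ordered pair occurs twice."""
    seen = set()
    for source, target in arcs:
        if source == target:
            return False  # a vertex linked to itself
        if (source, target) in seen:
            return False  # a second arc between the same two vertices
        seen.add((source, target))
    return True


# In a simple directed graph the pair of vertices identifies the arc,
# so the pair can be used directly as the key for the arc's details.
arc_details = {
    ("a", "b"): {"weight": 3},  # "weight" is a hypothetical attribute
    ("b", "c"): {"weight": 5},
}

print(is_simple(arc_details.keys()))
print(arc_details[("a", "b")])
```

Neither vertex alone is enough to look up an arc; it is the pair taken together that serves as the identifier, which is the point made above about the set of identifying relationships.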

Identifying relationships are represented by a mark across the relationship in the form of the letter 'I'. Adding these marks to the model of a directed graph, we get the model of a simple directed graph:

The terms network and matrix are commonly used in contrast to the term hierarchical to refer to arrangements of entities not constrained to be hierarchical; for example, in organisational structure the term matrix management is used in situations where different dimensions are managed by different management hierarchies and in which individuals therefore have multiple reporting lines. The term hierarchical derives etymologically from the Greek for 'sacred ruler' and emerged in its modern sense via its medieval use in relation to church organisation.

Matrices and Tabular Displays

In mathematics, a matrix is a rectangular array of numeric elements, as for example the matrix
23 15 29 22
31 6 9 8
-1 8 17 52

having 3 rows and 4 columns. Though this is essentially 2 dimensional, the content can be communicated in a linear message, either row by row as [23,15,29,22] followed by [31,6,9,8] then [-1,8,17,52], which we can describe by the message structure

         matrix => row*
         row => element*
or column by column as [23,31,-1] then [15,6,8] then [29,9,17] then [22,8,52], which we can describe by the message structure

         matrix => column*
         column => element*
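Both linearisations can be produced from the same matrix. The following is a minimal sketch; the function names are assumptions made for illustration.

```python
matrix = [
    [23, 15, 29, 22],
    [31, 6, 9, 8],
    [-1, 8, 17, 52],
]


def row_by_row(m):
    """matrix => row*, row => element*"""
    return [list(row) for row in m]


def column_by_column(m):
    """matrix => column*, column => element*"""
    # zip(*m) pairs up the i-th element of every row, i.e. it yields columns.
    return [list(column) for column in zip(*m)]


print(row_by_row(matrix))
print(column_by_column(matrix))
```

Either message carries all twelve elements; only the grouping, and therefore the order of transmission, differs.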

As you see with this example, each element of any matrix is both part of a row and part of a column and so, as with the previous marriage entity, the element entity is modelled as a subordinate to two others of different types as shown in figure 25. The two ways of communicating the matrix correspond to the two branches of this entity model. The row by row communication is described by this model:

and the column by column communication by this one:

  • every matrix has one or more rows
  • every matrix has one or more columns
  • every matrix has one or more elements
  • every element is part of a row and part of a column
Figure 25
A mathematical matrix
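One way to reflect the double dependency of an element in code is to key each element by the row and the column it belongs to. This is a sketch under that assumption; the zero-based indices and the helper names are illustrative choices, not part of the model.

```python
matrix = [
    [23, 15, 29, 22],
    [31, 6, 9, 8],
    [-1, 8, 17, 52],
]

# Each element is identified by the pair (row index, column index).
elements = {
    (r, c): value
    for r, row in enumerate(matrix)
    for c, value in enumerate(row)
}

n_rows = len(matrix)
n_cols = len(matrix[0])


def row(r):
    """All elements that are part of row r, in column order."""
    return [elements[(r, c)] for c in range(n_cols)]


def column(c):
    """All elements that are part of column c, in row order."""
    return [elements[(r, c)] for r in range(n_rows)]


print(row(1))
print(column(0))
```

The same collection of elements can thus be reached along either branch of the model, through its row or through its column.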

There is a similar shape to the models representing the structure¹ of rectangular tables of data. An example is given in figure 26. In the HTML language, and in other computer markup languages, such data tables are communicated row by row rather than column by column.

In other tabular displays the rows or columns of a table, or both, may be grouped together to represent some grouping of the subjects. The structure then has different branches that are hierarchical and joined at the detail level into the recognisable shape of the 2 dimensional matrix structure. One such is illustrated in figure 27.
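As a rough sketch of row grouping, the snippet below uses a few rows from the team sheet shown with the figures. The dictionary layout and the function name are assumptions made for illustration.

```python
# Groups of rows: each group label maps to its (position, player) rows.
team_sheet = {
    "Goalkeeper": [("GK", "Paul Robinson")],
    "Midfielders": [
        ("MC", "David Batty"),
        ("MC", "Eirik Bakke"),
        ("MC", "Jody Morris"),
    ],
    "Strikers": [("ST", "Alan Smith"), ("ST", "Mark Viduka")],
}


def display_rows(groups):
    """Flatten the grouped rows into the rows of a tabular display,
    printing the group label only on the first row of each group."""
    rows = []
    for group, members in groups.items():
        for i, (position, player) in enumerate(members):
            rows.append((group if i == 0 else "", position, player))
    return rows


for display_row in display_rows(team_sheet):
    print("{:<12} {:<3} {}".format(*display_row))
```

The rows form a hierarchy (group, then row), while every cell still belongs to both a row and a column, so the branches meet at the level of detail.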

Figure 26
Model of the structure of a tabular display. Note that what is modelled is the structure of the display rather than the structure of the subject entities. That said, there is an important meta-relationship between the two, for of necessity the structure of a system of subject entities is related to the structure of any medium through which details of such a system may be communicated or visualised.
Team sheet
Goalkeeper   GK  Paul Robinson
Defenders    LB  Lucas Radebe
             DC  Michael Duberry
             DC  Dominic Matteo
             RB  Didier Domi
Midfielders  MC  David Batty
             MC  Eirik Bakke
             MC  Jody Morris
Forward      FW  Jamie McMaster
Strikers     ST  Alan Smith
             ST  Mark Viduka
Figure 27
Tabular structure in which rows are grouped - part of dataset from U.S. Bureau of the Census²
Rank by Population of the 100 Largest Urban Places
State       Urban Place     1960  1970  1980  1990
ARIZONA     Phoenix            -     -     -    85
            Tucson            61    23    22    15
CALIFORNIA  Fresno            24    27    29    36
            Long Beach         -     -     -     -
            Los Angeles        -    90    88    93
            Oakland           83     -     -     -
TEXAS       Dallas            43    44    36    44
            Los Angeles        -    90    88    93
            Oakland           83     -     -     -
VIRGINIA    Virginia Beach     2     2     2     3
WASHINGTON  Chicago            2     2     2     3
WISCONSIN   Milwaukee          2     2     2     3

Communicating Hierarchy - XML

Whereas matrix structure, as considered in the previous section, is essentially 2 dimensional, hierarchically structured information may be flattened into a linear, i.e. 1 dimensional, structure in which nesting of detail represents the hierarchy. XML is a language designed for this purpose.

When a hierarchically structured entity is communicated in the XML language, each entity of type X is enclosed in its own pair of brackets, which take the form of the character sequences for the start and the end of an element X; this message follows these conventions³:
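A minimal sketch of how a message of this kind can be produced with Python's standard `xml.etree.ElementTree` module. The element names `team`, `group` and `player`, the `name` attribute and the sample data are assumptions made for illustration; they are not taken from the author's own message.

```python
import xml.etree.ElementTree as ET


def to_xml(groups):
    """Serialise a two-level hierarchy (group, then player) as nested XML."""
    root = ET.Element("team")
    for group_name, players in groups.items():
        # Each group is enclosed by its own start and end element ...
        group = ET.SubElement(root, "group", name=group_name)
        for player_name in players:
            # ... and so is each player nested inside it.
            ET.SubElement(group, "player").text = player_name
    return ET.tostring(root, encoding="unicode")


message = to_xml({"Strikers": ["Alan Smith", "Mark Viduka"]})
print(message)
```

The output is a single line of text in which the nesting of start and end elements carries the hierarchy; indentation and line breaks would only be added for readability.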


We continue this subject of hierarchy versus matrix structure in the next section: Representing Graphs

1 I am distinguishing here between the structure of the tabular display and the structure of the subject entities.
2 http://www.census.gov/population/www/documentation/twps0027/tab01.txt
Internet release date: June 15, 1998
3 I have written the message on many lines and indented it to make it more readable.