Entity Modelling

www.entitymodelling.org - entity modelling introduced from first principles - relational database design theory and practice - dependent type theory

If we model the *marriage* entity then we make it doubly dependent on the *person* entity. Since at the time of writing same-sex marriage is not legal in my part of the world, I can model it as dependent on a *male* entity on the one hand and on a *female* entity on the other:

In this model, unlike in those of figures 11 and 12
to which it is superficially similar, the subordinate
entity, *marriage*, has no exclusion arc between its dependencies -- which is to say that the
model shows each subordinate entity *marriage* to be dependent on two superordinate entities -
those shown as *male* and *female*.
For the first time we see that the modelling notation is not constrained to situations
where every entity has at most one other that it is dependent on.
It is common practice to describe situations that do have this characteristic as *hierarchical* - it is the defining principle of
hierarchy that in a hierarchical system every entity within the system is subordinate
to at most one other entity.
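
As a minimal sketch of this distinction, assuming each entity simply records the entities it depends on: the class names follow the text, while the field names `husband` and `wife` and the helpers `superordinates` and `is_hierarchical` are illustrative choices, not part of the modelling notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Male:
    name: str


@dataclass(frozen=True)
class Female:
    name: str


@dataclass(frozen=True)
class Marriage:
    husband: Male   # first superordinate entity (hypothetical field name)
    wife: Female    # second superordinate entity (hypothetical field name)


def superordinates(entity) -> list:
    """Return the entities that the given entity directly depends on."""
    if isinstance(entity, Marriage):
        return [entity.husband, entity.wife]
    return []


def is_hierarchical(entities) -> bool:
    """True when every entity is subordinate to at most one other entity."""
    return all(len(superordinates(e)) <= 1 for e in entities)


a, b = Male("A"), Female("B")
marriage = Marriage(a, b)
print(is_hierarchical([a, b]))            # True: no entity has two superordinates
print(is_hierarchical([a, b, marriage]))  # False: marriage depends on two entities
```

The check fails as soon as a single *marriage* is present, which is exactly the sense in which this model is not hierarchical.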

In other instances of matrix structure the branch types (by which I mean for example
the types *row* and *column* above) are not
distinct types. We'll see this in the next example - the type in question being *arc*.

Mathematicians define and use various notions, each of which abstracts the idea of a network
of points and connecting lines independently of
how or whether it is physically realised.
They define such abstract notions using the language of sets and relations;
they use the term *graph* for the most abstract concept of such a network and they variously use the terms
*point*,
*node* or *vertex* for the
things connected; generally they use the term *edge*, *directed edge*, *arc* or *arrow* for the connections.
There is not a single terminology and so we have to plump for one of several available;
our definitions will clarify the choices made.
For an entity modeller, and therefore for a database designer, the most straightforward
of the graph notions is that of a *directed graph*.
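
One way to read "the language of sets and relations" is sketched below, assuming a directed graph is given by a set of vertices, a set of arcs, and two functions sending each arc to its source and target vertex. The class `DirectedGraph`, its field names, and the sample identifiers are illustrative assumptions, not definitions taken from the text.

```python
from dataclasses import dataclass


@dataclass
class DirectedGraph:
    vertices: set[str]
    arcs: set[str]            # arcs have their own identity here
    source: dict[str, str]    # arc -> the vertex it leaves
    target: dict[str, str]    # arc -> the vertex it enters

    def is_well_formed(self) -> bool:
        """Every arc has exactly one source and one target among the vertices."""
        return all(
            arc in self.source
            and arc in self.target
            and self.source[arc] in self.vertices
            and self.target[arc] in self.vertices
            for arc in self.arcs
        )


# Two parallel arcs from p to q and a loop on q are all permitted
# in a general directed graph under this reading.
g = DirectedGraph(
    vertices={"p", "q"},
    arcs={"a1", "a2", "a3"},
    source={"a1": "p", "a2": "p", "a3": "q"},
    target={"a1": "q", "a2": "q", "a3": "q"},
)
print(g.is_well_formed())  # True
```

In this form each arc is dependent on two vertices, one as source and one as target, and both dependencies are of the same type.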

One of the restrictions often placed on graph structures is that there be at most
one edge between any two vertices.
The term *simple graph* is used for graphs having this property and having the further property that vertices
are not
linked to themselves. Now the fact that there is at most one edge between any two
vertices means that an edge can be uniquely
identified by identifying the vertices that it connects.

The *arc* type as used in models of simple directed graphs is an example of a type each of whose
instances may be uniquely
identified by relationships with other entities. Such relationships are said, in entity modelling
terminology,
to be *identifying relationships*, though this terminology lacks rigour, for it is the set of them that
is identifying, not the individual relationships.
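
A minimal sketch of this idea, assuming the arcs of a simple directed graph are held as a set of ordered (source, target) pairs so that the pair itself serves as the arc's identity. The helper `add_arc` and the vertex names are hypothetical, and treating "at most one edge between any two vertices" as "at most one arc per ordered pair" is an assumption of the sketch.

```python
# Arcs of a simple directed graph as (source, target) pairs.
vertices = {"p", "q", "r"}
arcs = {("p", "q"), ("q", "r"), ("p", "r")}


def add_arc(arcs: set, source: str, target: str) -> set:
    """Return the arcs with (source, target) added, rejecting self-links."""
    if source == target:
        raise ValueError("a simple graph has no arc from a vertex to itself")
    # Set union: an arc with the same source and target is the same arc.
    return arcs | {(source, target)}


# Adding an arc that is already present changes nothing, because the
# two identifying relationships together determine the arc.
print(add_arc(arcs, "p", "q") == arcs)  # True

# Neither relationship identifies an arc on its own:
# two different arcs share the source "p".
print(sorted(arc for arc in arcs if arc[0] == "p"))  # [('p', 'q'), ('p', 'r')]
```

In relational terms this corresponds to a composite key made up of the two relationships, rather than to a separate arc identifier.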

Identifying relationships are represented with a mark across the relationship having the form of the letter 'I'. Adding these marks to the model of a directed graph, we get the model of a simple directed graph:

The terms *network* and *matrix* are commonly used in contrast to the term hierarchical to refer to arrangements of
entities not
constrained to be hierarchical; for example in organisational structure the term *matrix management* is used in
situations where different dimensions are managed by different management hierarchies
and in which individuals therefore have multiple reporting lines.
The term hierarchical derives etymologically from the Greek for *sacred ruler* and emerged in its modern sense via its use
in medieval times in relation to church organisation.

In mathematics, a matrix is a rectangular array of numeric elements, for example

| 23 | 15 | 29 |
| 31 | 6 | 9 |
| -1 | 8 | 17 |

is a matrix having 3 rows and 3 columns. Though, as with all matrix structure, this is essentially 2 dimensional, the content can be communicated linearly either row by row ([23,15,29], [31,6,9] and [-1,8,17]) or column by column ([23,31,-1],[15,6,8] and [29,9,17]).
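
The two linear readings can be sketched in a few lines, assuming the matrix is held as a list of rows; the variable names are illustrative.

```python
# The 3 x 3 matrix from the text, stored row by row.
matrix = [
    [23, 15, 29],
    [31, 6, 9],
    [-1, 8, 17],
]

# Row by row: the stored order.
rows = [list(row) for row in matrix]

# Column by column: zip(*matrix) pairs up the i-th entries of every row.
columns = [list(column) for column in zip(*matrix)]

print(rows)     # [[23, 15, 29], [31, 6, 9], [-1, 8, 17]]
print(columns)  # [[23, 31, -1], [15, 6, 8], [29, 9, 17]]
```

Both lists carry the same nine elements; only the order of communication differs.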

As you see with this example, each element of any matrix is both part of a row and
part of a column and so,
as with the *marriage* entity, the *element* entity is
modelled as a subordinate to two others of different types as shown in figure 23.
The two ways of communicating the matrix correspond to the two branches of this entity
model.
The row by row communication is described by this model:

- every *matrix* has one or more *rows*
- every *matrix* has one or more *columns*
- every *matrix* has one or more *elements*
- every *element* is part of a *row* and part of a *column*
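
The statements above can be sketched as data types, assuming rows and columns are identified by position and each element refers to exactly one row and one column. The class names follow the entity types in the text; the `index` and `value` fields and the `build` helper are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    index: int


@dataclass(frozen=True)
class Column:
    index: int


@dataclass(frozen=True)
class Element:
    row: Row        # first superordinate entity
    column: Column  # second superordinate entity
    value: int


def build(values):
    """Create the rows, columns and elements for a rectangular list of lists."""
    rows = [Row(i) for i in range(len(values))]
    columns = [Column(j) for j in range(len(values[0]))]
    elements = [
        Element(rows[i], columns[j], value)
        for i, line in enumerate(values)
        for j, value in enumerate(line)
    ]
    return rows, columns, elements


rows, columns, elements = build([[23, 15, 29], [31, 6, 9], [-1, 8, 17]])

# Following one branch of the model or the other gives the two readings.
print([e.value for e in elements if e.row == rows[1]])        # [31, 6, 9]
print([e.value for e in elements if e.column == columns[0]])  # [23, 31, -1]
```

As with *marriage*, the *element* type here has two dependencies and no exclusion between them.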

There is a similar shape to the models representing the structure^{1} of a rectangular table of data. An example is given in figure 24.
In the HTML language, and in other computer markup languages, data tables are communicated
row by row rather than column by column.
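
To illustrate the row by row convention, here is a small sketch that writes a rectangular table as HTML, one `tr` element per row with one `td` element per cell. The function name `to_html_table` is hypothetical and the output is deliberately bare, without headers or attributes.

```python
from html import escape


def to_html_table(rows) -> str:
    """Serialise a list of rows as an HTML table, one <tr> per row."""
    lines = ["<table>"]
    for row in rows:
        cells = "".join(f"<td>{escape(str(cell))}</td>" for cell in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


print(to_html_table([[23, 15, 29], [31, 6, 9], [-1, 8, 17]]))
```

The columns are never stated explicitly in this form; a reader recovers them by counting cell positions within each row.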

In other tabular displays the rows or columns of a table, or both, may be grouped together to represent some grouping of the subjects. The structure then has different branches that are hierarchical and joined at the detail level into the recognisable shape of the 2 dimensional matrix structure. One such is illustrated in figure 25.

| Team sheet | | |
|---|---|---|
| Goalkeeper | GK | Paul Robinson |
| Defenders | LB | Lucus Radebe |
| | DC | Michael Duberry |
| | DC | Dominic Matteo |
| | RB | Didier Domi |
| Midfielders | MC | David Batty |
| | MC | Eirik Bakke |
| | MC | Jody Morris |
| Forward | FW | Jamie McMaster |
| Strikers | ST | Alan Smith |
| | ST | Mark Viduka |

Rank by Population of the 100 Largest Urban Places

| State | Urban Place | 1960 | 1970 | 1980 | 1990 |
|---|---|---|---|---|---|
| ARIZONA | Phoenix | - | - | - | 85 |
| | Tucson | 61 | 23 | 22 | 15 |
| CALIFORNIA | Fresno | 24 | 27 | 29 | 36 |
| | Long Beach | - | - | - | - |
| | Los Angeles | - | 90 | 88 | 93 |
| | Oakland | 83 | - | - | - |
| • | | | | | |
| • | | | | | |
| TEXAS | Dallas | 43 | 44 | 36 | 44 |
| | Los Angeles | - | 90 | 88 | 93 |
| | Oakland | 83 | - | - | - |
| VIRGINIA | Virginia Beach | 2 | 2 | 2 | 3 |
| WASHINGTON | Chicago | 2 | 2 | 2 | 3 |
| WISCONSIN | Milwaukee | 2 | 2 | 2 | 3 |

Whereas matrix structure, as considered in the previous section, is essentially 2 dimensional, hierarchically structured information may be flattened into a linear, i.e. 1 dimensional, structure in which nesting of detail represents the hierarchy. XML is a language designed for this purpose.
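
As a minimal sketch of such flattening, the snippet below serialises a small hierarchy with Python's standard `xml.etree.ElementTree` module. The entity types `department`, `team` and `person`, the `name` attribute, and the sample names are all hypothetical, chosen only to show nesting.

```python
import xml.etree.ElementTree as ET

# A hierarchy as nested (entity type, name, children) tuples.
hierarchy = ("department", "Sales", [
    ("team", "North", [("person", "A", []), ("person", "B", [])]),
    ("team", "South", [("person", "C", [])]),
])


def to_element(node) -> ET.Element:
    """Turn one node and, recursively, its subordinates into an XML element."""
    entity_type, name, children = node
    element = ET.Element(entity_type, {"name": name})
    for child in children:
        element.append(to_element(child))
    return element


xml_text = ET.tostring(to_element(hierarchy), encoding="unicode")
print(xml_text)
```

The printed text is a single linear string in which each subordinate entity appears between the start and end of its one superordinate entity, which is possible only because the structure is hierarchical.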

When a hierarchically structured entity is communicated in XML, each entity of type *X* is enclosed by its own parentheses, in the form of the character sequences that mark the start and the end of an element; this message follows these conventions^{3}:

We continue this subject of hierarchy versus matrix structure in the next section: Representing Graphs