Entity Modelling

www.entitymodelling.org - entity modelling introduced from first principles - relational database design theory and practice - dependent type theory


Attributes

To use entity modelling in the description and construction of information systems, a third and absolutely vital meta-concept is needed alongside the meta-concepts of entity type and relationship: that of attribute. In the literature an attribute is variously defined as

any detail that serves to qualify, identify, classify, quantify or express an entity1,
the smallest discrete component of the system information that is meaningful2,
the abstraction of a single characteristic possessed by all the entities that were, themselves, abstracted as an [entity type]3.

Whatever you make of these definitions, it is clear that examples are needed; our first examples are given below and others follow in subsequent sections. Shlaer and Mellor4 illustrate these concepts with examples in which entities correspond to the rows of example tables and attributes correspond to the columns; other authors do the same, and at least one treats column of a table and attribute as synonymous terms5. We will also rely on examples in tabular form but, slightly unusually, we shall have examples in which some rows have other rows nested within them; in our examples the nested rows represent dependent entities: they are a visible representation of compositional structure within an entity model, i.e. of composition relationships between entities.

An Aside - Attributes as Shorthand for Certain Relationships

You can consider what I now say as a parenthetical remark about the foundations of what we are engaged in. Believe me too that it is worth working from foundations, because without that, mathematics, philosophy, statistics, computer science and linguistics are muddied and fragmented6.

Put aside attributes for a moment. Pretend that we only sculpt with entity types and relationships. Consider that entity types are only as useful as our ability to communicate their instances. When we model a type of entity in an entity model we are specifying how entities of that type can be communicated by communicating their relationships with other entities. These other entities can only be communicated by communicating their relationships with others, and so on. This has to end somewhere, or else there is an infinite regress. Therefore, for practical application in computer science, there have to be some types of entity which a priori we have a means of communicating, so that instances of these givens will be the atoms in all communications of complex entities. In practice, as used in computer science, but not in theory, the atoms are certain universal types that are not domain specific, such as character, character string, number (integer or floating point), date and boolean, and these usually occur many times in relation to each domain-specific entity type, as shown in the example in figure 31. Because they are universal types it profits us little to show them on the entity model diagram as we have in figure 31; for this reason, and also because relationships with universal types are sufficiently different from other relationships (as we shall see later), they are not usually recognised as relationships at all; instead the word attribute is used in place of the word relationship.
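
The regress argument can be sketched in code: communicating an entity means following its relationships until every path ends at an atom. This is only an illustrative sketch; the function name `communicate`, the use of Python dictionaries for complex entities and the example person are all assumptions made here, not part of the entity modelling method.

```python
from datetime import date

# Python stand-ins for the universal types (an assumption of this sketch).
ATOMIC_TYPES = (str, int, float, bool, date)


def communicate(entity) -> list[str]:
    """Flatten an entity into the atoms needed to communicate it."""
    if isinstance(entity, ATOMIC_TYPES):
        return [str(entity)]              # an atom: the regress stops here
    atoms: list[str] = []
    for related in entity.values():       # a complex entity: follow each relationship
        atoms.extend(communicate(related))
    return atoms


# Hypothetical entity, represented as nested dictionaries.
person = {
    "name": "Ada",
    "born": date(1815, 12, 10),
    "address": {"city": "London"},
}
print(communicate(person))
```

Without the first branch the recursion would have nowhere to stop, which is the infinite regress described above.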

For practical purposes then, an attribute is not a kind of relationship (though really it is!); it is a named property of a type of entity, it can be optional or mandatory, and it can have a type which is one of character, string (meaning character string), integer, float or boolean. The attributes are shown itemised in the boxes, nested or otherwise, representing the entity types on which they occur. See figure 32 for an illustration.
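
As a minimal sketch of this working definition, an attribute can be recorded as a name, a type drawn from the universal types, and a flag saying whether it is optional. The class names `UniversalType`, `Attribute` and `EntityType`, and the example entity type `person`, are illustrative assumptions rather than part of the notation.

```python
from dataclasses import dataclass, field
from enum import Enum


class UniversalType(Enum):
    """The universal types listed in the text."""
    CHARACTER = "character"
    STRING = "string"
    INTEGER = "integer"
    FLOAT = "float"
    BOOLEAN = "boolean"


@dataclass(frozen=True)
class Attribute:
    name: str
    type: UniversalType
    optional: bool = False       # mandatory unless stated otherwise
    identifying: bool = False    # shown underlined on a diagram


@dataclass
class EntityType:
    name: str
    attributes: list[Attribute] = field(default_factory=list)

    def mandatory_names(self) -> list[str]:
        return [a.name for a in self.attributes if not a.optional]


# Hypothetical entity type, for illustration only.
person = EntityType("person", [
    Attribute("name", UniversalType.STRING, identifying=True),
    Attribute("nickname", UniversalType.STRING, optional=True),
    Attribute("age", UniversalType.INTEGER),
])
print(person.mandatory_names())
```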

Figure 31
How a model might look with types on the diagram for character string, number and date.

Relationships to the universal types string, number and date are represented as attributes. Optional attributes are depicted with a leading circle; these correspond to the cardinality zero-or-one relationships in the previous diagram, whilst mandatory attributes (those shown with a leading square) replace relationships of cardinality exactly-one. Identifying attributes, i.e. those replacing identifying relationships, are distinguished by having their names underlined.

Figure 32
The same model as figure 31 but now using the attribute notation.
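
The correspondence between the two notations can be sketched as a small conversion. Everything named here is an assumption made for illustration: the `Relationship` class, the function `as_attribute`, and the plain-text stand-ins "o" for the leading circle, "#" for the leading square and surrounding underscores for underlining.

```python
from dataclasses import dataclass

UNIVERSAL_TYPES = {"string", "number", "date"}


@dataclass(frozen=True)
class Relationship:
    name: str
    target: str                 # name of the type at the far end
    cardinality: str            # "zero-or-one" or "exactly-one"
    identifying: bool = False


def as_attribute(rel: Relationship) -> str:
    """Render a relationship to a universal type as one attribute line."""
    if rel.target not in UNIVERSAL_TYPES:
        raise ValueError(f"{rel.target!r} is not a universal type")
    markers = {"zero-or-one": "o", "exactly-one": "#"}
    if rel.cardinality not in markers:
        raise ValueError(f"unsupported cardinality {rel.cardinality!r}")
    label = f"_{rel.name}_" if rel.identifying else rel.name
    return f"{markers[rel.cardinality]} {label} : {rel.target}"


print(as_attribute(Relationship("name", "string", "exactly-one", identifying=True)))
print(as_attribute(Relationship("date of birth", "date", "zero-or-one")))
```

Relationships to domain-specific entity types are rejected by this sketch, since they stay on the diagram as relationships.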

Example - Attributes in a model of a Personal Computer

In the following example based on the personal computer all files and folders are shown as having ‘name’ and ‘date modified’ attributes. In addition files, but not folders, have ‘size’ and ‘content’ attributes:

Figure 33
Model of a Personal Computer.
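
One way to read this model in code is sketched below. The base class name `Node`, the Python types chosen for each attribute, and the `children` list on folders are assumptions of the sketch; the text itself only states which attributes files and folders have.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Node:
    """Attributes shared by files and folders."""
    name: str
    date_modified: date


@dataclass
class File(Node):
    size: int          # assumed here to be a byte count
    content: bytes


@dataclass
class Folder(Node):
    # Assumption: a folder is composed of files and folders.
    children: list[Node] = field(default_factory=list)


readme = File("readme.txt", date(2017, 11, 1), size=5, content=b"hello")
docs = Folder("docs", date(2017, 11, 2), [readme])

# Files have a size attribute; folders do not.
print(hasattr(readme, "size"), hasattr(docs, "size"))
```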

Example - Chemical Formulae

In chemistry the chemical formula of a compound is one of the ways of expressing information about the numbers of the different kinds of atoms that constitute a particular molecular compound. For example water, famously, has formula H2O, common salt, which is also known as sodium chloride or halite, has formula NaCl, and aspirin, which is also known as acetylsalicylic acid, has formula C9H8O4. We can express these formulae in tabular form like this:
name           formula
               symbol   number
water
               H        2
               O
common salt
               Na
               Cl
aspirin
               C        9
               H        8
               O        4
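
The nested rows of this table can be sketched as data: a compound has a name and an ordered sequence of dependent formula entries, each with a symbol and an optional number. The class names `Compound` and `FormulaEntry` and the method `formula_text` are illustrative assumptions, and the sketch relies on the usual convention that an omitted number means one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FormulaEntry:
    symbol: str
    number: Optional[int] = None    # blank in the table; conventionally means one


@dataclass(frozen=True)
class Compound:
    name: str
    formula: tuple[FormulaEntry, ...]

    def formula_text(self) -> str:
        """Join the nested rows back into the familiar written formula."""
        return "".join(
            e.symbol + ("" if e.number is None else str(e.number))
            for e in self.formula
        )


water = Compound("water", (FormulaEntry("H", 2), FormulaEntry("O")))
common_salt = Compound("common salt", (FormulaEntry("Na"), FormulaEntry("Cl")))
aspirin = Compound(
    "aspirin",
    (FormulaEntry("C", 9), FormulaEntry("H", 8), FormulaEntry("O", 4)),
)

for compound in (water, common_salt, aspirin):
    print(compound.name, compound.formula_text())
```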
We can model the formula of a chemical compound like this:

Example - Chemical Elements

In any given situation, getting to the right blend of entity types, attributes and relationships may be an iterative process, as demonstrated in this next example in which we explore two entries for chemical elements from a scientific dictionary7:

oxygen (Chem.). A nonmetallic element, symbol O, at. no. 8, r.a.m. 15.994, valency 2. It is a colourless, odourless gas which supports combustion and is essential for the respiration of most forms of life. M.p. -218℃, b.p. -183℃, density 1.42904 g/dm3 at s.t.p., formula O2. An unstable form is ozone, O3. Oxygen is the most abundant element, etc.
chlorine (Chem.). Element, symbol Cl, at. no. 17, r.a.m. 35.453, valencies 1-, 3+, 5+, 7+, m.p. -101℃, b.p. -34.6℃. The second halogen, chlorine is a greenish yellow gas, with an irritating smell etc.

From these entries, and with some expansion of abbreviations, I first surmise that each element has a name, a symbol, an atomic number, a relative atomic mass, one or more valencies, a melting point, a boiling point and, optionally maybe, a density. I am left unsure whether all elements have a formula, or whether they may have forms which may have formulae, which is probably closer to the truth. I provisionally model this situation on an entity model diagram like so:

Reading the entry for sulphur, I find this preliminary model to be inadequate:

sulphur (Chem.). A nonmetallic element occurring in several allotropic forms. Symbol S, at. no. 16, r.a.m. 32.06, valencies 2, 4, 6. Rhombic (α-) sulphur is a lemon yellow powder; m.p. 112.8 ℃, rel. d. 2.07. Monoclinic (β-) sulphur has a deeper colour than the rhombic form; m.p. 119 ℃, rel. d. 1.96, b.p. 444.6 ℃. Chemically, sulphur resembles oxygen etc.

Reading this third entry, I learn that different forms of sulphur have different melting points and that my provisional model has mispositioned the ‘melting point’ attribute; it turns out that chemical elements, in and by themselves, cannot be attributed melting points. After a little further research I learn that these forms taken by chemical elements are, technically speaking, allotropes and that it is these that have melting points, boiling points and densities. I reach this model:

In tabular form the dictionary entries for oxygen, chlorine and sulphur can be structured like this8:

name      symbol  atomic no  r.a.m.  valency  allotrope
                                     number   name        m.p.    b.p.    density
oxygen    O       8          15.994
                                     2
                                              dioxygen    -218    -183    .0014
                                              Ozone       -192.5  -119.5  .0032
chlorine  Cl      17         35.453
                                     1-
                                     3+
                                     5+
                                     7+
                                              Dichlorine  -101    -34.6   ??
sulphur   S       16         32.06
                                     2
                                     4
                                     6
                                              Rhombic     112.8   ???     2.07
                                              Monoclinic  119     444.6   1.96

Think of the rows of this table as messages communicating entities:

  • the columns correspond to attributes — each column heading is the name of an attribute,
  • the rows correspond to subject entities,
  • each cell presents the value of an attribute of a subject entity,
  • some rows have other rows nested within them and these represent subordinate entities in the sense defined by the composition structure.
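
The reading of rows as messages can be sketched as follows, using the sulphur entry. The class names `Element` and `Allotrope`, the function `rows`, and the use of `None` for a value the table leaves unknown are assumptions of this sketch, not the author's model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Allotrope:
    name: str
    melting_point: Optional[float] = None    # degrees Celsius
    boiling_point: Optional[float] = None    # degrees Celsius
    density: Optional[float] = None


@dataclass
class Element:
    name: str
    symbol: str
    atomic_no: int
    ram: float
    valencies: list[str] = field(default_factory=list)
    allotropes: list[Allotrope] = field(default_factory=list)


def rows(element: Element) -> list[tuple]:
    """One row per entity: the element first, then its dependent entities."""
    out = [("element", element.name, element.symbol, element.atomic_no, element.ram)]
    out += [("valency", v) for v in element.valencies]
    out += [
        ("allotrope", a.name, a.melting_point, a.boiling_point, a.density)
        for a in element.allotropes
    ]
    return out


sulphur = Element("sulphur", "S", 16, 32.06, ["2", "4", "6"], [
    Allotrope("Rhombic", 112.8, None, 2.07),     # boiling point unknown in the table
    Allotrope("Monoclinic", 119, 444.6, 1.96),
])

for row in rows(sulphur):
    indent = "" if row[0] == "element" else "    "   # nested rows are indented
    print(indent + " ".join(str(cell) for cell in row))
```

Each tuple plays the part of one row, and the indented rows are the dependent valency and allotrope entities of the element row above them.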

Isotopes - Relative Atomic Mass as a Derived Attribute

Entity models may have greater or lesser amounts of detail, depending on the purpose at hand, and the model of chemical elements that we have just presented isn't the whole story, either intellectually, so as to do justice to the physics, or practically, for the purposes of scientists detecting and analysing samples using mass spectroscopy. For them the fact that each chemical element has a number of isotopes comes into play: the relative abundances of the isotopes become significant and also, to a mass spectroscopist, the mass of the most abundant isotope. Once we add isotope and its relative isotopic mass attribute to the entity model, the relative atomic mass attribute can no longer be regarded as a core attribute. Instead of removing it from the diagram altogether I have shown it with a hollow marker and followed it with parentheses. This is a reminder that this attribute isn't core; it is an attribute which can be calculated from the core by following a rule. The diagram doesn't specify the rule, so that will need to be documented separately.

  • an element has one or more isotopes, one or more allotropes and one or more valencies,
  • uniqueness: an isotope is uniquely identified by its parent element and its numberOfNeutrons,
  • uniqueness: an allotrope is uniquely identified by its name,
  • uniqueness: a valency is uniquely identified by its parent element and its number,
  • rule: the relative atomic mass of an element is equal to the sum of the relative isotopic masses of its isotopes, each weighted by its relative abundance.
Figure 34
Chemical element — the relative atomic mass is an example of a derived attribute9.
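
The rule for the derived attribute, together with the uniqueness constraint on isotopes, can be sketched as below. The class and method names are illustrative assumptions, and the isotope masses and abundances for chlorine are approximate published values brought in for the example; they do not come from the dictionary entries quoted above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Isotope:
    number_of_neutrons: int
    relative_isotopic_mass: float
    abundance: float                 # fraction of atoms, between 0 and 1


@dataclass
class Element:
    name: str
    isotopes: list[Isotope] = field(default_factory=list)

    def add_isotope(self, isotope: Isotope) -> None:
        # Uniqueness: within its parent element an isotope is
        # identified by its number of neutrons.
        if any(i.number_of_neutrons == isotope.number_of_neutrons
               for i in self.isotopes):
            raise ValueError("an isotope with this number of neutrons already exists")
        self.isotopes.append(isotope)

    @property
    def relative_atomic_mass(self) -> float:
        """Derived attribute: abundance-weighted sum of isotopic masses."""
        return sum(i.relative_isotopic_mass * i.abundance for i in self.isotopes)


chlorine = Element("chlorine")
chlorine.add_isotope(Isotope(18, 34.96885, 0.7576))   # chlorine-35, approximate values
chlorine.add_isotope(Isotope(20, 36.96590, 0.2424))   # chlorine-37, approximate values
print(round(chlorine.relative_atomic_mass, 3))
```

Because `relative_atomic_mass` is a computed property rather than a stored field, it cannot drift out of step with the isotope data, which is the sense in which it is not core. With these figures the result agrees closely with the r.a.m. of 35.453 given in the dictionary entry for chlorine.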

Summary

In summary we use the following notation for attributes:

In addition we underline those attributes which are identifying.


1 Barker, Richard. Case Method: Entity Relationship Modelling. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1990.
2 Goodland, Mike and Slater, Caroline. SSADM Version 4: A Practical Approach. McGraw Hill Publishing Co., 1995.
3 From Shlaer, Sally and Mellor, Stephen J. Object-Oriented Systems Analysis: Modeling the World in Data. Yourdon Press, Upper Saddle River, NJ, USA, 1988, but with my term ‘entity type’ in place of their word ‘object’.
4 Shlaer, Sally and Mellor, Stephen J. Object-Oriented Systems Analysis: Modeling the World in Data. Yourdon Press, Upper Saddle River, NJ, USA, 1988.
5 Date, C.J. An Introduction to Database Systems. Addison-Wesley systems programming series, Addison-Wesley Publishing Company Inc., 1995.
6 This word attribute is a case in point: Wikipedia (November 2017) has four entries. One definition purports to speak for philosophy, mathematics and logic, and a second purports to speak for science and research. Entries three and four are in the areas of linguistics and language processing.
7 Chambers Dictionary of Science and Technology, 1974, ISBN 0 550 13202 3
8 In fact, the dictionary, in an appendix, has such a table, albeit one less detailed than the one shown here.
9 Because it is derived from children in the composition hierarchy some computer scientists would refer to it as a synthetic attribute.