Entity Modelling

www.entitymodelling.org - entity modelling introduced from first principles - relational database design theory and practice - dependent type theory


A Quick Introduction to Attributes

As stored within information systems, the individually represented items (the names, colours, quantities, monetary amounts, dates, etc.) are, in the language of entity modelling, said to be the values of attributes. Thus an actual name like "John Smith" is said to be the value of a ‘name’ attribute of a ‘person’ entity. To express that information about a person is communicated or stored as a message with two components, name and date of birth, we might say that a ‘person’ entity may be attributed a ‘name’ and a ‘date of birth’, or we might show an entity model diagram representing the type ‘person’ with ‘name’ and ‘date of birth’ attribute annotations, like this:

Alternatively, to say that date of birth is optional, we may use a circle in place of the square:

To show that the name attribute within a message is the identifying attribute we underline the annotation in the diagram:

If it is necessary to give both a person's name and their date of birth to uniquely identify them, then we underline both of these attributes on the diagram:

Generally, systems will hold and communicate many different attributes of each type of entity and these attributes are shown beneath the identifying attributes:

It is clear that computer programs are effective only in so far as the data items they manipulate are intended and understood as attributes of subject entities. It follows that to have an effective information system we must first have agreed the types of subject entity and also what may be attributed to entities of these types. In this agreement we agree the data content of the program or system, i.e. its subject matter.

In a message about a person two or more phone numbers may be communicated, but it is a rule of entity modelling that, for a single attribute, an entity may only be attributed a single value. For this reason, if a person can have multiple phone numbers then ‘phone_number’ is not an attribute of the ‘person’ entity type per se but an attribute of a ‘phone’ entity type that stands in relation to (is owned by) the ‘person’ entity:
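The single-value rule can also be sketched in code. The following is a minimal illustration in Python, not part of the entity modelling notation itself: the class and field names are assumptions chosen to mirror the ‘person’ and ‘phone’ entity types, and the sample values are made up.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Phone:
    # 'phone_number' is an attribute of the 'phone' entity type, not of 'person'.
    phone_number: str


@dataclass
class Person:
    name: str                             # mandatory, identifying attribute: exactly one value
    date_of_birth: Optional[date] = None  # optional attribute: at most one value
    # A relationship, not an attribute: a person owns zero or more phones.
    phones: List[Phone] = field(default_factory=list)


# Made-up sample data for illustration only.
john = Person(
    name="John Smith",
    date_of_birth=date(1980, 1, 1),
    phones=[Phone("555-0100"), Phone("555-0101")],
)
print(len(john.phones))
```

Each attribute field holds a single value, while the multiplicity lives in the relationship between the two entity types.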

A More Considered View - Attributes as a Shorthand for Certain Relationships

Entity types, for practical purposes, are only as useful as our ability to communicate their instances, and to communicate such an instance, i.e. an entity, we need to communicate its relationships with other entities. A subset of these relationships serves to identify the entity within the chosen perspective. These other entities, those we might wish to communicate relationships with, need in turn to be identified, which in turn implies communicating their relationships with other entities. This has to end somewhere, or else there is an infinite regress. Therefore, for practical application in computer science, there need to be types of entity which, a priori, we are given a means of communicating, so that instances of these givens will be the atoms in all communications of complex entities. In practice, as used in computer science, but not in theory, the atoms are certain universal types that are not domain-specific, such as the types character, character string, number (integer or floating point), date and boolean; these usually occur many times in relation to each domain-specific entity type, as shown in the example in figure 1.

It profits us little to show these relationships with the atomic types on the entity model diagram, and therefore a different notation is used, as illustrated in figure 2. Because relationships with the atomic types are sufficiently different from other relationships (as we shall see later), a third term, attribute, is used for them in place of the term relationship.

To conclude: in theory, an attribute is a kind of relationship. To a practising entity modeller, however, this is not the way we think; we think instead of an attribute as a named property of a type of entity. It can be optional or mandatory, and it can have a type, which is one of character, string (meaning character string), integer, float or boolean. The attributes are shown itemised in the boxes, nested or otherwise, representing the entity types on which they occur. See figure 2 for an illustration.
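The practitioner's view of an attribute, a named property that is optional or mandatory and has a type, can be sketched as a small data structure. This is only an illustrative sketch in Python: `Attribute`, `EntityType` and `check_message` are hypothetical names introduced here, and representing a message as a dictionary is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Dict, List


@dataclass
class Attribute:
    name: str
    value_type: type           # one of the atomic types, e.g. str, int, float, bool, date
    optional: bool = False     # circle (optional) versus square (mandatory) on the diagram
    identifying: bool = False  # underlined on the diagram


@dataclass
class EntityType:
    name: str
    attributes: List[Attribute]


def check_message(entity_type: EntityType, message: Dict[str, Any]) -> List[str]:
    """Return a list of problems found when reading a message as an entity of this type."""
    problems = []
    for attr in entity_type.attributes:
        value = message.get(attr.name)
        if value is None:
            if not attr.optional:
                problems.append(f"missing mandatory attribute '{attr.name}'")
        elif not isinstance(value, attr.value_type):
            problems.append(
                f"attribute '{attr.name}' is not of type {attr.value_type.__name__}"
            )
    return problems


person = EntityType("person", [
    Attribute("name", str, identifying=True),
    Attribute("date of birth", date, optional=True),
])

print(check_message(person, {"name": "John Smith"}))
print(check_message(person, {"date of birth": date(1980, 1, 1)}))  # made-up date
```

The first message is accepted because the only missing attribute is optional; the second is reported as lacking the mandatory ‘name’.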

Figure 1
How a model might look with types on the diagram for character string, number and date.
Figure 2
The same model as figure 1 but now using the attribute notation. Relationships to the universal types string, number and date are represented as attributes. Optional attributes are depicted with a leading circle; these correspond with the cardinality zero-or-one relationships in the previous diagram whilst mandatory attributes — those shown with a leading square — replace relationships of cardinality exactly-one. Identifying attributes, i.e. those replacing identifying relationships, are distinguished by having their names underlined.

Example - Chemical Formulae

In chemistry the chemical formula of a compound is one of the ways of expressing information about the numbers of the different kinds of atoms that constitute a particular molecular compound. For example water, famously, has formula H2O. Consider the information given in (9) below. Each bullet describes a distinct chemical compound giving its molar mass, some of the names that it is known by and its chemical formula.

  • aspirin (also known as acetylsalicylic acid) has the formula C9H8O4; its molar mass is 180.16
  • common salt (also known as sodium chloride or halite) has the formula NaCl; its molar mass is 58.44
  • water (also known as oxidane) has the formula H2O; its molar mass is 18.01

Analysing this information we understand that each chemical compound may have many aliased names, and that its formula describes many occurring types of element. These considerations lead us to the entity model in figure 3 and also to the tabular display in table 1.

Figure 3
A model of chemical compound information.
name          molar mass
    aliased name
    symbol    number
aspirin       180.16
    acetylsalicylic acid
    C         9
    H         8
    O         4
common salt   58.44
    sodium chloride
water         18.01
    H         2
Table 1
Table of chemical compounds. The rows of this table are messages communicating entities:
  • the columns correspond to attributes,
  • the rows correspond to subject entities,
  • each cell presents the value of an attribute of a subject entity,
  • some rows have other rows nested within them; these are subordinate messages, representing subordinate entities in the sense defined by the composition structure in the model.
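The nesting of rows within rows can be mirrored directly in a data structure. The sketch below is an illustration in Python under stated assumptions: the class names `Compound` and `ElementCount` and the helper `formula` are hypothetical, and the element counts are read off the formulae given in the bullets above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ElementCount:
    # Subordinate entity: one occurring type of element within a compound.
    symbol: str
    number: int


@dataclass
class Compound:
    name: str                 # attribute
    molar_mass: float         # attribute
    aliased_names: List[str]  # subordinate entities, each with a single 'name' attribute
    elements: List[ElementCount]


def formula(compound: Compound) -> str:
    """Rebuild the chemical formula from the subordinate element rows."""
    return "".join(
        e.symbol + (str(e.number) if e.number > 1 else "")
        for e in compound.elements
    )


aspirin = Compound(
    "aspirin", 180.16, ["acetylsalicylic acid"],
    [ElementCount("C", 9), ElementCount("H", 8), ElementCount("O", 4)],
)
water = Compound(
    "water", 18.01, ["oxidane"],
    [ElementCount("H", 2), ElementCount("O", 1)],
)

print(formula(aspirin))
print(formula(water))
```

Here the formula is not stored as an attribute of the compound; it is recovered from the subordinate entities, which is one way of reading the composition structure of the model.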

Example - Chemical Elements

In any given situation, getting to the right blend of entity types, attributes and relationships may be an iterative process, as demonstrated in this next example, in which we explore two entries for chemical elements from a scientific dictionary [1]:

oxygen (Chem.). A nonmetallic element, symbol O, at. no. 8, r.a.m. 15.994, valency 2. It is a colourless, odourless gas which supports combustion and is essential for the respiration of most forms of life. M.p. -218℃, b.p. -183℃, density 1.42904 g/dm3 at s.t.p., formula O2. An unstable form is ozone, O3. Oxygen is the most abundant element, etc.
chlorine (Chem.). Element, symbol Cl, at. no. 17, r.a.m. 35.453, valencies 1-, 3+, 5+, 7+, m.p. -101℃, b.p. -34.6℃. The second halogen, chlorine is a greenish yellow gas, with an irritating smell etc.

From these entries, and with some expansion of abbreviations, I first surmise that each element has a name, a symbol, an atomic number, a relative atomic mass, one or more valencies, a melting point, a boiling point and, optionally maybe, a density. I am left unsure whether all elements have a formula, or whether they may have forms which in turn may have formulae, which is probably closer to the truth. I provisionally model this situation on an entity model diagram like so:

Reading the entry for sulphur, I find this preliminary model to be inadequate:

sulphur (Chem.). A nonmetallic element occurring in several allotropic forms. Symbol S, at. no. 16, r.a.m. 32.06, valencies 2, 4, 6. Rhombic (α-) sulphur is a lemon yellow powder; m.p. 112.8 ℃, rel. d. 2.07. Monoclinic (β-) sulphur has a deeper colour than the rhombic form; m.p. 119 ℃, rel. d. 1.96, b.p. 444.6 ℃. Chemically, sulphur resembles oxygen etc.
Reading this third entry, I learn that different forms of sulphur have different melting points and that my provisional model has mispositioned the ‘melting point’ attribute; it turns out that chemical elements, in and by themselves, cannot be attributed melting points. After a little further research I learn that these forms taken by chemical elements are, technically speaking, allotropes and that it is these that have melting points, boiling points and densities. I reach this model:

In tabular form the dictionary entries for oxygen, chlorine and sulphur can be structured like this [2]:

name        symbol  atomic no  r.a.m.
    name        m.p.    b.p.    density
oxygen      O       8          15.994
    dioxygen    -218    -183    .0014
    Ozone       -192.5  -119.5  .0032
chlorine    Cl      17         35.453
    Dichlorine  -101    -34.6   ??
sulphur     S       16         32.06
    Rhombic     112.8   ???     2.07
    Monoclinic  119     444.6   1.96
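The revised positioning of the attributes can be sketched as follows. This is a hedged illustration in Python rather than a rendering of the diagram: the class and field names are assumptions, only the columns of the table above are modelled (valencies are left out), and the values are copied from the sulphur rows, with `None` standing for the unknown entry.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Allotrope:
    name: str
    melting_point: Optional[float]  # degrees Celsius; None where the table shows an unknown
    boiling_point: Optional[float]
    density: Optional[float]


@dataclass
class Element:
    name: str
    symbol: str
    atomic_number: int
    relative_atomic_mass: float
    # Melting point, boiling point and density belong to the allotropes,
    # not to the element itself.
    allotropes: List[Allotrope] = field(default_factory=list)


sulphur = Element("sulphur", "S", 16, 32.06, [
    Allotrope("Rhombic", 112.8, None, 2.07),
    Allotrope("Monoclinic", 119.0, 444.6, 1.96),
])

for allotrope in sulphur.allotropes:
    print(allotrope.name, allotrope.melting_point)
```

With the attributes held on the allotrope, one element can carry several melting points without breaking the rule that an attribute has a single value per entity.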

Example - Attributes in a Model of a Personal Computer

In the example in figure 4 based on the operational state of a personal computer all files and folders are shown as having ‘name’ and ‘date modified’ attributes. In addition files, but not folders, have ‘size’ and ‘content’ attributes.

Figure 4
Model of a Personal Computer.

In this example I have based the attributes of the process entity type on the data that I typically see in the table of processes that I can view through Task Manager, reached, for example, through Ctrl-Alt-Delete on my current Windows system. I have omitted the name of the executable program because this is the name of a referenced something or other (the program); instead, I have represented this by a reference relationship — this being the core concept — rather than by an attribute which is a derivative of this core concept.
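The distinction between a reference relationship and an attribute derived from it can be sketched in code. The following Python is illustrative only: the `pid` field, the class names and the sample values are assumptions and are not taken from figure 4.

```python
from dataclasses import dataclass


@dataclass
class Program:
    name: str  # the name is an attribute of the program, not of the process


@dataclass
class Process:
    pid: int          # hypothetical attribute, used here only to tell processes apart
    program: Program  # reference relationship: the core concept

    @property
    def program_name(self) -> str:
        # Derived from the reference relationship rather than stored on the process.
        return self.program.name


# Made-up sample data for illustration only.
editor = Program("notepad.exe")
proc = Process(pid=1234, program=editor)
print(proc.program_name)
```

Because the name is reached through the reference, a change to the program's name is seen by every process that refers to it.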


Attributes are represented on entity model diagrams in place of relationships with the types character, string, number etc. They are represented as mandatory or optional. If they are identifying then they are shown underlined. Entities are communicated, in tabular or other forms, by communicating the values of their attributes. Models should distinguish between core attributes and those derivative of the core; we describe this further in the next section.

[1] Chambers Dictionary of Science and Technology, 1974, ISBN 0 550 13202 3
[2] The dictionary, in an appendix, has such a table, albeit one less detailed than the one shown here.