www.entitymodelling.org - entity modelling introduced from first principles - relational database design theory and practice - dependent type theory
To use entity modelling in the description and construction of information systems, alongside the meta-concepts of entity type and relationship there is a third and absolutely vital meta-concept: that of attribute. In the literature an attribute is variously defined as
Whatever you make of these definitions, it is clear that examples are needed; our first examples are given below and others follow in subsequent sections. Shlaer and Mellor4 illustrate these concepts with examples in which entities correspond to the rows of example tables and attributes correspond to columns of the table; other authors do the same, and one at least treats column of table and attribute as synonymous terms5. We will also rely on examples in tabular form but, slightly unusually, we shall have examples in which some rows have other rows nested within them; in our examples the nested rows represent dependent entities. They are a visible representation of compositional structure within an entity model, i.e. of composition relationships between entities.
You can consider what I now say as a parenthetical remark about the foundations of what we are engaged in. Believe me, too, that it is worth working from foundations, because without them mathematics, philosophy, statistics, computer science and linguistics are muddied and fragmented6.
Put aside attributes for a moment. Pretend that we only sculpt with entity types and relationships. Consider that entity types are only as useful as our ability to communicate their instances. When we model a type of entity in an entity model we are specifying how entities of that type can be communicated by communicating their relationships with other entities. These other entities can only be communicated by communicating their relationships with others, and so on. This has to end somewhere, or else there is an infinite regress. Therefore, for practical application in computer science, there have to be some types of entity which a priori we have a means of communicating, so that instances of these givens will be the atoms in all communications of complex entities. In practice, as used in computer science, but not in theory, the atoms are certain universal types that are not domain specific, such as the types character, character string, number (either integer or floating point), date and boolean, and these usually occur many times in relation to each domain-specific entity type, as shown in the example in figure 31. Because they are universal types it profits us little to show them on the entity model diagram as we have on figure 31; for this reason, and also because relationships with universal types are sufficiently different to other relationships (as we shall see later), they are not usually recognised as relationships at all; instead the word attribute is used in place of the word relationship.
For practical purposes, then, an attribute is not a kind of relationship (though really it is!); it is a named property of a type of entity. It can be optional or mandatory, and it can have a type, which is one of character, string (meaning character string), integer, float or boolean. The attributes are shown itemised in the boxes, nested or otherwise, representing the entity types on which they occur. See figure 32 for an illustration.
Relationships to the universal types string, number and date are represented as attributes. Optional attributes are depicted with a leading circle; these correspond with the cardinality zero-or-one relationships in the previous diagram, whilst mandatory attributes, those shown with a leading square, replace relationships of cardinality exactly-one. Identifying attributes, i.e. those replacing identifying relationships, are distinguished by having their names underlined.
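As a rough sketch of the attribute meta-concept just described, the following Python records an attribute's name, its universal type, whether it is optional or mandatory, and whether it is identifying. The class names, the text markers `(o)` and `[#]` standing in for the circle and square, the underscores standing in for underlining, and the `person` example are all assumptions made for illustration; none of them come from the figures.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class UniversalType(Enum):
    """Universal (not domain-specific) types mentioned in the text."""
    CHARACTER = "character"
    STRING = "string"
    INTEGER = "integer"
    FLOAT = "float"
    BOOLEAN = "boolean"
    DATE = "date"


@dataclass(frozen=True)
class Attribute:
    name: str
    type: UniversalType
    optional: bool = False     # False means mandatory (exactly-one)
    identifying: bool = False  # True where the diagram would underline the name

    def marker(self) -> str:
        # Assumed plain-text stand-ins for the leading circle and square.
        return "(o)" if self.optional else "[#]"


@dataclass
class EntityType:
    name: str
    attributes: List[Attribute] = field(default_factory=list)

    def describe(self) -> List[str]:
        """Itemise the attributes roughly as they appear inside an entity type box."""
        lines = []
        for a in self.attributes:
            label = f"_{a.name}_" if a.identifying else a.name
            lines.append(f"{a.marker()} {label} : {a.type.value}")
        return lines


# Hypothetical entity type, used only to exercise the sketch.
person = EntityType("person", [
    Attribute("name", UniversalType.STRING, identifying=True),
    Attribute("date of birth", UniversalType.DATE, optional=True),
])

for line in person.describe():
    print(line)
```

Treating optionality and identification as plain flags on the attribute mirrors the way the diagram notation decorates an attribute name rather than drawing a separate relationship line.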
In the following example based on the personal computer all files and folders are shown as having ‘name’ and ‘date modified’ attributes. In addition files, but not folders, have ‘size’ and ‘content’ attributes:
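One way to sketch this example in code is with Python dataclasses, where each attribute becomes a typed field. The field types (a byte count for size, raw bytes for content), the `children` list expressing composition of a folder, and the sample values are assumptions made for illustration rather than details taken from the figure.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Union


@dataclass
class File:
    name: str
    date_modified: date
    size: int        # assumed to be a count of bytes
    content: bytes   # assumed to be raw bytes


@dataclass
class Folder:
    name: str
    date_modified: date
    # Assumed composition: a folder contains files and other folders.
    children: List[Union["Folder", File]] = field(default_factory=list)


# Hypothetical instances.
notes = File("notes.txt", date(2024, 1, 5), size=5, content=b"hello")
docs = Folder("docs", date(2024, 1, 5), children=[notes])

print(docs.name, docs.date_modified)
print(notes.name, notes.size)
```

Both classes carry the two shared attributes, while only `File` has `size` and `content`, which is the distinction the example draws.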
In chemistry the chemical formula of a compound is one of the ways of expressing information about the numbers of the different kinds of atoms that constitute a particular molecular compound. For example water, famously, has formula H2O; common salt, which is also known as sodium chloride or halite, has formula NaCl; and aspirin, which is also known as acetylsalicylic acid, has formula C9H8O4. We can express these formulae in tabular form like this:
name | formula
---|---
water | H2O
common salt | NaCl
aspirin | C9H8O4
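The nested rows of the formula column can be sketched as pairs of element symbol and number of atoms. The small parser below is an illustrative assumption, not part of the original text: it handles only simple formulae written as a run of element symbols with optional counts (no brackets, charges or hydrates), and the function name `formula_rows` is hypothetical.

```python
import re
from typing import List, Tuple

# An element symbol is one capital letter optionally followed by one lower-case
# letter; the count that follows it is optional and defaults to 1.
_SYMBOL_COUNT = re.compile(r"([A-Z][a-z]?)(\d*)")
_WHOLE_FORMULA = re.compile(r"(?:[A-Z][a-z]?\d*)+")


def formula_rows(formula: str) -> List[Tuple[str, int]]:
    """Split a simple formula into nested rows of (element symbol, atom count)."""
    if not _WHOLE_FORMULA.fullmatch(formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    return [(symbol, int(count) if count else 1)
            for symbol, count in _SYMBOL_COUNT.findall(formula)]


compounds = {"water": "H2O", "common salt": "NaCl", "aspirin": "C9H8O4"}
for name, formula in compounds.items():
    print(name, formula_rows(formula))
```

Each pair plays the part of a dependent entity: it has no meaning on its own, only as one constituent of the compound in whose row it is nested.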
In any given situation, getting to the right blend of entity types, attributes and relationships may be an iterative process, as demonstrated in this next example in which we explore two entries for chemical elements from a scientific dictionary7:
From these entries, and with some expansion of abbreviations, I first surmise that each element has a name, a symbol, an atomic number, a relative atomic mass, one or more valencies, a melting point, a boiling point, and, optionally maybe, a density. I am left unsure whether all elements have a formula, or whether they may have forms which may have formulae, which is probably closer to the truth. I provisionally model this situation on an entity model diagram like so:
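The same provisional reading can be sketched as a Python dataclass. The field names, the units (degrees Celsius is assumed) and the choice to treat formula as an optional attribute of the element are assumptions; the valencies, melting point and boiling point given for chlorine are approximate values filled in for illustration and are not taken from the dictionary entries.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    name: str
    symbol: str
    atomic_number: int
    relative_atomic_mass: float
    valencies: List[int]            # one or more
    melting_point: float            # assumed to be in degrees Celsius
    boiling_point: float            # assumed to be in degrees Celsius
    density: Optional[float] = None # optional, maybe
    formula: Optional[str] = None   # provisional: unsure this belongs here

    def __post_init__(self) -> None:
        # "One or more valencies" rules out an empty list.
        if not self.valencies:
            raise ValueError("an element needs at least one valency")


# Illustrative instance; values other than symbol, atomic number and
# relative atomic mass are approximate and supplied only as an example.
chlorine = Element(
    name="chlorine",
    symbol="Cl",
    atomic_number=17,
    relative_atomic_mass=35.453,
    valencies=[1, 3, 5, 7],
    melting_point=-101.5,
    boiling_point=-34.0,
)
print(chlorine)
```

Keeping `formula` directly on the element makes the uncertainty voiced above visible: if formulae really belong to forms of an element, this field would have to move to a separate dependent type.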
Reading the entry for sulphur, I find this preliminary model to be inadequate:
In tabular form the dictionary entries for oxygen, chlorine and sulphur can be structured like this8:
name | symbol | atomic no | r.a.m. | valency | allotrope
---|---|---|---|---|---
oxygen | O | 8 | 15.994 | |
chlorine | Cl | 17 | 35.453 | |
sulphur | S | 16 | 32.06 | |
Think of the rows of this table as messages communicating entities:
Entity models may have greater or lesser amounts of detail, depending on the purpose at hand, and the model of chemical elements that we have just presented isn't the whole story, either intellectually, so as to do justice to the physics, or practically, for the purposes of scientists detecting and analysing samples using mass spectroscopy. For them the fact that each chemical element has a number of isotopes comes into play: the relative abundances of the isotopes become significant and also, to a mass spectroscopist, the mass of the most abundant isotope. Once we add isotope and its relative isotopic mass attribute to the entity model, the relative atomic mass attribute can no longer be regarded as a core attribute. Instead of removing it from the diagram altogether I have shown it with a hollow marker and followed it with parentheses. This is a reminder that this attribute isn't core; it is an attribute which can be calculated from the core by following a rule. The diagram doesn't specify the rule, so that will need to be documented separately.
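To illustrate what a non-core attribute amounts to in code, here is a minimal sketch in which relative atomic mass is a computed property rather than a stored field. The rule used, an abundance-weighted mean of the relative isotopic masses, is the usual definition but is an assumption here, since the diagram leaves the rule unspecified. The class and field names are hypothetical, and the chlorine isotope figures are approximate values supplied for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Isotope:
    mass_number: int
    relative_isotopic_mass: float
    relative_abundance: float  # fraction of naturally occurring atoms


@dataclass
class Element:
    name: str
    symbol: str
    isotopes: List[Isotope]

    @property
    def relative_atomic_mass(self) -> float:
        """Derived, not stored: abundance-weighted mean of isotopic masses (assumed rule)."""
        total = sum(i.relative_abundance for i in self.isotopes)
        weighted = sum(i.relative_isotopic_mass * i.relative_abundance
                       for i in self.isotopes)
        return weighted / total

    @property
    def most_abundant_isotope(self) -> Isotope:
        return max(self.isotopes, key=lambda i: i.relative_abundance)


# Approximate figures for the two stable isotopes of chlorine.
chlorine = Element("chlorine", "Cl", [
    Isotope(35, 34.96885, 0.7577),
    Isotope(37, 36.96590, 0.2423),
])

print(round(chlorine.relative_atomic_mass, 2))
print(chlorine.most_abundant_isotope.mass_number)
```

Because the value is recomputed from the isotopes each time, it cannot drift out of step with them, which is the practical reason for marking such an attribute as derived instead of storing it alongside the core.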
In summary we use the following notation for attributes: