Tutorial Part Two

Entity Modelling Tutorial Part Two - Representing Data

In Part One we showed how entity models describe both the types of things (entity types) and the relationships between different types of things within a chosen perspective, a perspective which is then considered the whole, or the absolute, from a logical point of view. In this chapter we focus on how entities may be communicated. In doing so we explain how entity models are used for data specification, i.e. how they may prescribe or, a posteriori, document the structure of data.

To use entity modelling in this way in the description and construction of information systems, we require, alongside the meta-concepts of entity type and relationship, a third and vital meta-concept: that of attribute. In the literature an attribute is variously defined as

any detail that serves to qualify, identify, classify, quantify or express an entity¹,

the smallest discrete component of the system information that is meaningful²,

the abstraction of a single characteristic possessed by all the entities that were, themselves, abstracted as an [entity type]³.

Whatever you make of these definitions, it is clear that examples are needed; our first examples are given below and others follow in subsequent sections. Shlaer and Mellor⁴ illustrate these same concepts with examples in which entities correspond to the rows of example tables and attributes correspond to the columns; other authors do likewise, and one at least treats column of a table and attribute as synonymous terms⁵. We will also rely on examples in tabular form but with one important difference: we shall have examples in which some rows have other rows nested within them. In our examples the nested rows represent dependent entities; they are a visible representation of compositional structure within an entity model, i.e. of composition relationships between entities.
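To make the idea of nested rows concrete, here is a minimal sketch in Python. The entity types (order, order line), the attribute names and the values are all hypothetical, invented for illustration only; they are not taken from the tutorial's own examples. Each dictionary stands for a row, each key for a column (an attribute), and a list held under a key stands for rows nested within the parent row.

```python
# Hypothetical example data: "order" rows with dependent "order line" rows nested inside.
orders = [
    {
        "order_no": 1001,          # attribute of the order
        "customer": "A. Smith",    # attribute of the order
        "lines": [                 # nested rows: dependent entities of this order
            {"line_no": 1, "product": "widget", "quantity": 3},
            {"line_no": 2, "product": "gadget", "quantity": 1},
        ],
    },
    {"order_no": 1002, "customer": "B. Jones", "lines": []},
]


def render(rows, indent=0):
    """Return text lines for the rows, indenting nested rows under their parent."""
    out = []
    for row in rows:
        # Plain values are the row's own attributes.
        cells = [f"{k}={v}" for k, v in row.items() if not isinstance(v, list)]
        out.append(" " * indent + ", ".join(cells))
        # List values hold dependent rows, rendered one level deeper.
        for value in row.values():
            if isinstance(value, list):
                out.extend(render(value, indent + 4))
    return out


print("\n".join(render(orders)))
```

The indentation in the printed output plays the role that nesting plays in the tabular examples: a line belongs to the order under which it appears.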

Now, when we call data to mind, we think of names, quantities, monetary values, addresses, dates, temperatures, geographical coordinates, and so on. Such items of data convey information only within specific contexts and when attributed to the subjects at hand. A temperature, a colour, a price, a height, a distance: all these tell us nothing unless they be the temperature, the colour, the price, the height or the distance of some thing. We can paraphrase in the language of entity modelling and say that they tell us nothing unless they be attributed to an entity. This then is our starting point. I ask you to start with the view that data only conveys information when it is embedded in messages built systematically following some set of rules, just as vocabulary is only meaningful within the context of text that is grammatical and free from category mistakes and other howlers.

Primarily we focus on data as the constituent parts of messages rather than thinking of it as content within a database. If we do this then data specification is a more general term than data modelling, and a methodology for data specification, i.e. for the specification of message structures, is of more general utility than one for the specification of database structure, i.e. for data modelling, for the former subsumes the latter. Data specification is the act of specifying rules by which data will be combined and communicated to convey information about subject entities. The same entities, and essentially the same data, may take many different forms: they may be stored in a database, communicated over a network, enriched by a program according to a set of rules, or displayed to a user, say on a web page. Each form that they take, when analysed, will likely have a different message structure, but each will map one to the other and therefore one to all.

Just as we conceive of an abstract linguistic structure common to speech and writing, for our purposes here we require an abstract concept of a message system; what we lose in ease of explanation we gain in generality. To achieve this level of abstraction, we take it as incidental, i.e. as a given, how we communicate universals; among these are the terminal instances, including numbers, dates and strings, and the identities of entity types and attributes. We also take it as given that one set of messages may be embedded or otherwise presented within the context of another.

We require that a message communicates the identifying features of the subject entity and the value of each of its attributes (how such values are communicated we have said is a given), and that it communicates all the relationships of the subject entity with other entities. Optionally, and recursively, it may communicate one or more of the subject entity's parts, i.e. entities reached through composition relationships. Finally we require that all data in a message communicates something, i.e. that there is no redundant data in the overall message set that describes a state of affairs.
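The requirements just listed can be sketched as a data structure. What follows is only one possible reading, written in Python; the class name `Message`, its field names and the example values are assumptions made for illustration and are not notation used by this tutorial.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Message:
    """Hypothetical shape of a message about one subject entity."""
    entity_type: str                         # identity of the entity type
    identifier: Dict[str, Any]               # identifying features of the subject entity
    attributes: Dict[str, Any] = field(default_factory=dict)   # remaining attribute values
    references: Dict[str, Dict[str, Any]] = field(default_factory=dict)  # relationships to other entities
    parts: List["Message"] = field(default_factory=list)       # optional dependent entities, recursively


def count_entities(message: Message) -> int:
    """Count the subject entity plus every part reached through containment."""
    return 1 + sum(count_entities(part) for part in message.parts)


# Illustrative values only.
order = Message(
    entity_type="order",
    identifier={"order_no": 1001},
    attributes={"order_date": "2024-01-15"},
    references={"placed_by": {"customer_id": 42}},
    parts=[
        Message("order line", {"line_no": 1}, {"quantity": 3}),
        Message("order line", {"line_no": 2}, {"quantity": 1}),
    ],
)

print(count_entities(order))  # 3
```

The `parts` field is optional and recursive, matching the requirement that a message may, but need not, carry the parts of its subject entity.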

This tutorial proceeds as follows. First we introduce the fundamentals of the notation and give some examples:

attributes

then following this we consider the distinction between core attributes and others that can be expressed in terms of the core and which we call:

derived attributes.
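As a small illustration of the distinction, the Python sketch below treats two attributes as core and computes a third from them. The entity type `OrderLine` and its attribute names are hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class OrderLine:
    """Hypothetical entity type with two core attributes."""
    quantity: int           # core attribute
    unit_price_pence: int   # core attribute

    @property
    def total_pence(self) -> int:
        """Derived attribute: expressed entirely in terms of the core attributes."""
        return self.quantity * self.unit_price_pence


line = OrderLine(quantity=3, unit_price_pence=250)
print(line.total_pence)  # 750
```

Because the derived value can always be recomputed from the core attributes, a message that carries the core values need not carry the derived one as well.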

Relationships are represented within messages either by message containment or by explicit message content in the form of

reference attributes or by a combination of the two.
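The two ways of representing a relationship can be put side by side. In this Python sketch the order and order line entities, and the attribute `order_no` used as the reference attribute, are hypothetical examples rather than part of the tutorial's notation.

```python
# Containment: the relationship is implied by where the line sits in the message.
order_message = {
    "order_no": 1001,
    "lines": [{"line_no": 1, "product": "widget"}],
}

# Reference attribute: the relationship is stated by explicit content (order_no).
line_message = {"order_no": 1001, "line_no": 1, "product": "widget"}


def parent_order_no(line, container=None):
    """Find the owning order from the container if given, else from the reference attribute."""
    if container is not None:
        return container["order_no"]
    return line["order_no"]


contained_line = order_message["lines"][0]
print(parent_order_no(contained_line, container=order_message))  # 1001
print(parent_order_no(line_message))                             # 1001
```

Either message answers the question of which order the line belongs to; they differ only in whether the answer comes from position or from content.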

Next we describe:

what is meant by relational? In brief, message systems that do not rely on message containment are called relational. Note that it is usual to think of relational and hierarchical as contrasting terms, and we use the terms so here; conceptually at least, however, message systems which are relational are a special case of those that are hierarchical.

Models that prescribe linear message representations which rely, in whole or in part, on containment are called hierarchical. In fact message structure may be

relational, hierarchical or multi-dimensional.
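The sketch below, in Python, shows how a hierarchical representation and a relational one can carry the same data and be mapped one to the other. It covers only these two forms and says nothing about the multi-dimensional case. The function names `flatten` and `nest`, and the order data, are assumptions made for illustration.

```python
def flatten(orders):
    """Hierarchical to relational: replace containment with a reference attribute."""
    order_rows, line_rows = [], []
    for order in orders:
        order_rows.append({k: v for k, v in order.items() if k != "lines"})
        for line in order["lines"]:
            # The order_no copied onto each line is the reference attribute.
            line_rows.append({"order_no": order["order_no"], **line})
    return order_rows, line_rows


def nest(order_rows, line_rows):
    """Relational to hierarchical: rebuild containment from the reference attribute."""
    nested = []
    for order in order_rows:
        lines = [
            {k: v for k, v in line.items() if k != "order_no"}
            for line in line_rows
            if line["order_no"] == order["order_no"]
        ]
        nested.append({**order, "lines": lines})
    return nested


# Illustrative values only.
hierarchical = [
    {"order_no": 1001, "customer": "A. Smith",
     "lines": [{"line_no": 1, "product": "widget"},
               {"line_no": 2, "product": "gadget"}]},
    {"order_no": 1002, "customer": "B. Jones", "lines": []},
]

order_rows, line_rows = flatten(hierarchical)
assert nest(order_rows, line_rows) == hierarchical  # the round trip loses nothing
```

In the flattened form no row contains another, so by the definition given earlier it is relational; the reference attribute alone carries the relationship.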

Historically, as used for data specification, an ER model will often be said to be logical or physical or to constitute a data model; I will use the terms somewhat differently, as explained in the final section:

logical and physical.

This final section finishes with an example from Chen and, for now at least, brings us to the end of this tutorial.

¹ Barker, Richard. Case Method: Entity Relationship Modelling. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1990.

² Goodland, Mike and Slater, Caroline. SSADM Version 4: A Practical Approach. McGraw-Hill Publishing Co., 1995.

³ From Shlaer, Sally and Mellor, Stephen J. Object-Oriented Systems Analysis: Modeling the World in Data. Yourdon Press, Upper Saddle River, NJ, USA, 1988, but with my term ‘entity type’ in place of their word ‘object’.

⁴ Shlaer, Sally and Mellor, Stephen J. Object-Oriented Systems Analysis: Modeling the World in Data. Yourdon Press, Upper Saddle River, NJ, USA, 1988.

⁵ Date, C.J. An Introduction to Database Systems. Addison-Wesley Systems Programming Series, Addison-Wesley Publishing Company Inc., 1995.