
Semi-structured (XML): late 1990’s to the present
In each case, we discuss the data model and its associated query language using a neutral
notation, sparing the reader the idiosyncratic details of the various proposals. We also
use a uniform collection of terms, again to limit the confusion that might otherwise
occur.
Throughout much of the paper, we will use the standard example of suppliers and parts,
from [CODD70], which we write for now in relational form in Figure 1.
Supplier (sno, sname, scity, sstate)
Part (pno, pname, psize, pcolor)
Supply (sno, pno, qty, price)
A Relational Schema
Figure 1
Here we have Supplier information, Part information and the Supply relationship to
indicate the terms under which a supplier can supply a part.
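As an illustration only, the schema of Figure 1 can be sketched as three Python record types. The field types are assumptions, since Figure 1 names the fields but not their types, and the sample rows and the helper parts_supplied_by are hypothetical.

```python
from typing import NamedTuple


class Supplier(NamedTuple):
    sno: int
    sname: str
    scity: str
    sstate: str


class Part(NamedTuple):
    pno: int
    pname: str
    psize: int
    pcolor: str


class Supply(NamedTuple):
    sno: int      # refers to Supplier.sno
    pno: int      # refers to Part.pno
    qty: int
    price: float


# Hypothetical sample rows, not taken from the paper.
suppliers = [Supplier(1, "Acme", "Springfield", "IL")]
parts = [Part(10, "bolt", 5, "grey"), Part(11, "nut", 3, "black")]
supplies = [Supply(1, 10, 100, 0.25)]


def parts_supplied_by(sno: int) -> list[str]:
    """Names of the parts that supplier `sno` can supply."""
    pnos = {s.pno for s in supplies if s.sno == sno}
    return [p.pname for p in parts if p.pno in pnos]
```

Note that part 11 can exist here even though no Supply row mentions it, because each kind of information lives in its own collection.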
II IMS Era
IMS was released around 1968, and initially had a hierarchical data model. It understood
the notion of a record type, which is a collection of named fields with their associated
data types. Each instance of a record type is forced to obey the data description
indicated in the definition of the record type. Furthermore, some subset of the named
fields must uniquely specify a record instance, i.e. they are required to be a key. Lastly,
the record types must be arranged in a tree, such that each record type (other than the
root) has a unique parent record type. An IMS data base is a collection of instances of
record types, such that each instance, other than root instances, has a single parent of the
correct record type.
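The rules above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not IMS itself: the names RecordType, Record, HierarchicalDB and key_path are hypothetical, field types are modeled as Python types, and key uniqueness is checked along the path from the root, which is a simplification chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecordType:
    name: str
    fields: dict                            # field name -> Python type
    key: tuple                              # field names that form the key
    parent: Optional["RecordType"] = None   # None marks the root record type


@dataclass
class Record:
    rtype: RecordType
    values: dict
    parent: Optional["Record"] = None


def key_path(rec: Record) -> tuple:
    """Keys of the record and all its ancestors, root first."""
    own = (rec.rtype.name, tuple(rec.values[f] for f in rec.rtype.key))
    above = key_path(rec.parent) if rec.parent is not None else ()
    return above + (own,)


class HierarchicalDB:
    """Toy store that enforces the rules described above on insert."""

    def __init__(self):
        self.keys_seen = set()

    def insert(self, rec: Record) -> None:
        rt = rec.rtype
        # Each instance must obey the data description of its record type.
        if set(rec.values) != set(rt.fields):
            raise ValueError(f"{rt.name}: wrong set of fields")
        for fname, ftype in rt.fields.items():
            if not isinstance(rec.values[fname], ftype):
                raise ValueError(f"{rt.name}.{fname}: wrong data type")
        # Non-root instances need a single parent of the correct record type.
        if rt.parent is None:
            if rec.parent is not None:
                raise ValueError(f"{rt.name}: root instances have no parent")
        elif rec.parent is None or rec.parent.rtype is not rt.parent:
            raise ValueError(f"{rt.name}: parent must be a {rt.parent.name}")
        # The key fields must identify the instance uniquely.
        path = key_path(rec)
        if path in self.keys_seen:
            raise ValueError(f"{rt.name}: duplicate key {path[-1][1]}")
        self.keys_seen.add(path)


# Hypothetical two-level hierarchy: Supplier is the root, Part is its child.
SUPPLIER = RecordType("Supplier", {"sno": int, "sname": str}, ("sno",))
PART = RecordType("Part", {"pno": int, "pname": str}, ("pno",), parent=SUPPLIER)

db = HierarchicalDB()
acme = Record(SUPPLIER, {"sno": 1, "sname": "Acme"})
db.insert(acme)
db.insert(Record(PART, {"pno": 10, "pname": "bolt"}, parent=acme))
```

In this sketch, inserting a Part record without a Supplier parent raises a ValueError, which is the tree requirement in miniature.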
This requirement of tree-structured data presents a challenge for our sample data, because
we are forced to structure it in one of the two ways indicated in Figure 2. These
representations share two common undesirable properties:
1) Information is repeated. In the first schema, Part information is repeated for
each Supplier who supplies the part. In the second schema, Supplier information
is repeated for each Part that the supplier supplies. Repeated information is
undesirable because it opens the door to inconsistent data. For example, a repeated
data element could be changed in some, but not all, of the places it appears,
leading to an inconsistent data base.
2) Existence depends on parents. In the first schema, it is impossible for there to be
a part that is not currently supplied by anybody. In the second schema, it is
impossible to have a supplier that does not currently supply anything. There is
no support for these “corner cases” in a strict hierarchy.
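Both problems can be seen in a small sketch of the first schema, in which Part records are nested under the Supplier that supplies them. The nested-dictionary layout, the helper functions and all sample values are assumptions made for illustration; they do not reflect how IMS stores data.

```python
# Supplier-rooted hierarchy: every supplier holds its own copy of each Part
# record it supplies. All values are hypothetical.
db = {
    1: {"sname": "Acme",
        "parts": {10: {"pname": "bolt", "pcolor": "grey", "qty": 100}}},
    2: {"sname": "Best",
        "parts": {10: {"pname": "bolt", "pcolor": "grey", "qty": 40},
                  12: {"pname": "washer", "pcolor": "black", "qty": 7}}},
}


def colors_of(pno):
    """Every color recorded for part `pno` across all suppliers."""
    return {s["parts"][pno]["pcolor"]
            for s in db.values() if pno in s["parts"]}


def known_parts():
    """Part numbers that appear anywhere in the hierarchy."""
    return {pno for s in db.values() for pno in s["parts"]}


# 1) Repetition: changing one copy of part 10 leaves the other copy stale,
#    so the data base now records two different colors for the same part.
db[1]["parts"][10]["pcolor"] = "red"
inconsistent = len(colors_of(10)) > 1

# 2) Existence dependency: part 12 lives only under supplier 2, so removing
#    that supplier also removes everything known about part 12.
del db[2]
part_12_survives = 12 in known_parts()
```

After these statements run, inconsistent is True and part_12_survives is False. The symmetric problems arise in the second schema, with the roles of Supplier and Part exchanged.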