and semantic (or consistency) constraints relating to the stored database
are actually an extension of the database definition. A semantic constraint
defines the acceptable value domain for an attribute or a consistency
relationship between several attribute values. For example, the net
pay for a period should equal the gross pay less deductions for income
tax withheld, pension, insurance premiums, etc. In the example of net
pay, the database designer may define the net pay as a derived data
item to be calculated whenever requested. In this case, the expression
becomes the derivation rule.
Alternatively, the net pay could be explicitly defined, stored and maintained
in the database. Now the expression becomes a validation rule used to
check the consistency of the values in the database. Due to the dependency,
whenever any gross pay or deduction data is entered or modified, the
net pay must be modified simultaneously. The designer clearly faces
a tradeoff in choosing to explicitly store the data item or to derive
its value when necessary.
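The two alternatives can be sketched in Python as follows. This is only an illustration: the record layout (a dictionary with the keys gross_pay, tax_withheld, pension, insurance and net_pay) and the function names are assumptions made for the example. The same expression serves as the derivation rule in one case and as the validation rule in the other.

```python
from decimal import Decimal


def expected_net_pay(record: dict) -> Decimal:
    """The shared expression: gross pay less all deductions."""
    deductions = record["tax_withheld"] + record["pension"] + record["insurance"]
    return record["gross_pay"] - deductions


def derive_net_pay(record: dict) -> Decimal:
    """Alternative 1: net pay is not stored; it is calculated when requested."""
    return expected_net_pay(record)


def net_pay_is_consistent(record: dict) -> bool:
    """Alternative 2: net pay is stored; the expression validates it."""
    return record["net_pay"] == expected_net_pay(record)
```

With the stored alternative, changing gross_pay without also changing net_pay makes net_pay_is_consistent return False, which is the dependency the designer has to weigh.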
Nevertheless, with either alternative, the user must specify the semantic
information as part of the data definition process. If these semantics
are not defined to the system such that it knows what the data should
look like, the users are collectively responsible for ensuring that
the values in the database remain consistent. Some argue that validation
criteria and semantic constraints should not be a part of the database
definition. The difficulty with such an argument is in determining where
definitional information ends and validation information begins.
Validation involves comparing stored data to some expression of what
the stored data should look like. In fact, all database definition information
provides a basis for validation. To define an item as numeric integer
means that a value containing any alphabetic or special characters must
be rejected as being invalid for that item. Other parts of the database
definition information provide semantic rules for screening out unacceptable
values or operations in the database: enumeration of values or ranges
on a value set, limitations on the number of instances of an item or
repeating group, declaring a data item to be mandatory in a record or
unique across entities in a class, or exclusive or dependent characteristics
of a relationship.
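As a rough sketch of how definition information can drive validation, the Python fragment below checks a record against a small item definition covering type, range, enumerated values and mandatory items, and checks uniqueness across a set of records. The definition format, the item names and the function names are all assumptions made for illustration, not the notation of any particular system.

```python
# Hypothetical item definitions for an employee record.
EMPLOYEE_DEFINITION = {
    "employee_id": {"type": int, "mandatory": True},
    "age": {"type": int, "range": (16, 99)},
    "status": {"type": str, "allowed": {"active", "retired"}},
}


def validate_record(record: dict, definition: dict) -> list:
    """Return a list of messages, one per violation of the definition."""
    errors = []
    for name, rules in definition.items():
        value = record.get(name)
        if value is None:
            if rules.get("mandatory", False):
                errors.append(f"{name}: mandatory item is missing")
            continue
        if not isinstance(value, rules["type"]):
            errors.append(f"{name}: value {value!r} has the wrong type")
            continue
        if "range" in rules:
            low, high = rules["range"]
            if not low <= value <= high:
                errors.append(f"{name}: {value} is outside {low}..{high}")
        if "allowed" in rules and value not in rules["allowed"]:
            errors.append(f"{name}: {value!r} is not an enumerated value")
    return errors


def find_duplicates(records: list, item: str) -> set:
    """Return the values of `item` that occur in more than one record."""
    seen, duplicates = set(), set()
    for record in records:
        value = record.get(item)
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return duplicates
```

A value such as "12A" offered for employee_id is rejected simply because the item was defined as an integer; no separate validation criterion had to be written.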
The viewpoint here is that validation is a process, not a set of criteria.
It is a process which compares data to its definition. The data may
be stored in the database or in update transactions. The validation
criteria may be stored as part of the database definition or the update
transaction definition, or they may be embedded in the validation program
or in the transaction processing program. Ideally, they should be part
of the stored database definition so that they can be enforced at all
avenues of access which update the stored data. The more complete and
comprehensive the definition, the more effective the validation can be.
The process of validation requires the specification of three pieces
of information:
- Validation criteria or semantic constraints.
- Condition under which the database is to be tested against the criteria.
- Action the system is to take in response to a detected violation.
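One way to picture these three pieces of information is as the three fields of a rule object. The Python sketch below is an assumption-laden illustration: the class name, the event labels "on_update" and "on_commit", and the logging action are hypothetical and chosen only to make the example concrete.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ValidationRule:
    criterion: Callable[[dict], bool]  # what valid data looks like
    condition: str                     # event on which to test (hypothetical labels)
    action: Callable[[dict], None]     # response to a detected violation


def apply_rules(rules: List[ValidationRule], record: dict, event: str) -> int:
    """Test `record` against each rule whose condition matches `event`.

    Returns the number of violations found; each one triggers the rule's action.
    """
    violations = 0
    for rule in rules:
        if rule.condition == event and not rule.criterion(record):
            rule.action(record)
            violations += 1
    return violations


violation_log: List[str] = []

net_pay_rule = ValidationRule(
    criterion=lambda r: r["net_pay"] == r["gross_pay"] - r["deductions"],
    condition="on_update",
    action=lambda r: violation_log.append(
        "net pay inconsistent with gross pay and deductions"
    ),
)
```

Here the action merely records a message; rejecting the update or notifying a user would be other possible responses.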
If the database were static, there would be no need for the second
piece of information; the database would be checked continuously and
would always satisfy the stated validation criteria. The database does
change, however, as update processes act upon it. If an update process
consists of multiple
steps, the database could temporarily pass through an invalid state.
This would happen, for example, between the posting of a debit and its
corresponding credit in an accounting transaction.
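The debit and credit case can be sketched as a toy ledger in Python. The class and its conventions (debits as positive amounts, credits as negative amounts, amounts in whole cents) are assumptions made for the illustration. The balance criterion is tested once, after all steps of the transaction have been applied, rather than after each step.

```python
class Ledger:
    """Toy double-entry ledger whose balance rule is tested only at commit."""

    def __init__(self):
        # Committed (account, amount) pairs; debits > 0, credits < 0.
        self.entries = []

    @staticmethod
    def is_balanced(entries) -> bool:
        return sum(amount for _, amount in entries) == 0

    def post(self, steps) -> None:
        """Apply all steps as one transaction and validate once, at the end."""
        pending = list(self.entries)
        for account, amount in steps:
            pending.append((account, amount))
            # No test here: after the debit alone the ledger is unbalanced,
            # but that state is temporary and never becomes visible.
        if not self.is_balanced(pending):
            raise ValueError("transaction rejected: debits and credits differ")
        self.entries = pending
```

Had the criterion been tested after every step, the valid transaction of a debit followed by its credit would have been rejected at the first step.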
Specifying when to apply validation criteria can avoid testing a database
during a temporarily invalid state.
Although the structural information
may be sufficient for a user to comprehend and reference the database,
it does not provide sufficient information to enable the system to store
and subsequently access the stored data. Whereas the structural definition
serves to build up a data structure, the storage structure information
defines how the system is to break down the structure, map it onto storage
media and devices and subsequently access it.
The characteristics of the secondary storage devices must be defined
or assumed by the system. This includes such information as the physical
block size, the devices and volumes used to store the data and how to
partition the database to fit onto the volumes. The dominant characteristics
of storage devices are that they consist of a linear sequence of physical
blocks, each consisting of a linear sequence of character spaces.
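This linear view of storage can be illustrated with a little arithmetic in Python. The block size of 512 characters and the function names are assumptions for the example only; an actual system would take the block size from the device definition.

```python
BLOCK_SIZE = 512  # assumed physical block size, in characters


def locate(char_offset: int, block_size: int = BLOCK_SIZE) -> tuple:
    """Map a linear character offset to (block number, offset within block)."""
    if char_offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(char_offset, block_size)


def blocks_spanned(start: int, length: int, block_size: int = BLOCK_SIZE) -> range:
    """Return the block numbers touched by `length` characters from `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be non-negative and length positive")
    first = start // block_size
    last = (start + length - 1) // block_size
    return range(first, last + 1)
```

For instance, a 100-character item beginning at offset 500 straddles a block boundary, so reading it requires two physical blocks.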