How to maintain database quality in an organization?

Once the existence of the database is protected, the next task is to maintain the quality of the stored data by:

- Ensuring that the data always conforms to its definition.
- Validating both the stored data and the input data.
- Controlling the execution of update processes: ensuring proper authorization, controlling concurrent updates, and synchronizing updates to multiple copies.

Database quality can be threatened by erroneous input or improper update actions. Even if good quality control procedures exist for input data, undetected errors can propagate, gradually degrading the quality of the stored data. Every DBMS retains some information regarding the structure and format of the stored data. The system uses this information to properly interpret and process the stored data.

Users also rely on this information to interpret the data and to know what to expect from the system. It may seem obvious, but a system should ensure that the stored data always conforms to its definition, that is, the definition as understood by the users. It is a grievous mistake for a system to let users declare certain data characteristics, such as alpha or numeric field types, and then fail to ensure that every stored value conforms to the declared type. The result is either a loss of user confidence in the system or a grudging tolerance of its shortcomings.
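As a minimal sketch of what enforcing a declared type could look like, the following Python fragment checks each stored value against a definition. The field names, the `DEFINITION` table, and both helper functions are hypothetical, and the "alpha" and "numeric" tests are simplified to Python's `str.isalpha` and `str.isdigit`.

```python
# Hypothetical field definitions: field name -> declared type.
DEFINITION = {"name": "alpha", "quantity": "numeric"}


def conforms(value: str, declared_type: str) -> bool:
    """Return True if a stored string value matches its declared type."""
    if declared_type == "alpha":
        return value.isalpha()   # letters only, and not empty
    if declared_type == "numeric":
        return value.isdigit()   # digits only, and not empty
    raise ValueError(f"unknown declared type: {declared_type}")


def nonconforming_fields(record: dict, definition: dict) -> list:
    """List the fields that are missing or that violate their declared type."""
    return [
        field
        for field, declared_type in definition.items()
        if field not in record or not conforms(record[field], declared_type)
    ]
```

A system that runs a check of this kind on every stored record, and not only on new input, can report exactly which fields have drifted from the definition.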

Whether a system stores a sketchy database definition or a comprehensive one is immaterial here: either way, the stored data must conform to it. A skeleton definition makes conformance easy for the system to check, but it leaves users responsible for testing any additional conditions the data must satisfy to be valid. A comprehensive definition means more work for the system but also higher quality data, which can instill greater user confidence in the database and its management system.

Data validation means comparing data to an expression of what the data should look like. For stored data, the definition supplies the validity conditions. It can also include explicit validation criteria beyond the usual size and type declarations. A more comprehensive definition of the stored data provides a better basis for quality control. In addition to testing stored data for conformance to its definition, validating input data before it is used to update the database can increase the quality of the database.
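The idea of a definition that carries explicit criteria in addition to size and type can be sketched as follows. This is an illustration under stated assumptions, not a description of any particular DBMS: `FieldDefinition`, `validate`, and the convention of measuring size as the length of the value's text form are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FieldDefinition:
    """Hypothetical field definition: type, size, and an optional explicit rule."""
    python_type: type
    max_size: Optional[int] = None                    # maximum length of the text form
    rule: Optional[Callable[[object], bool]] = None   # explicit validation criterion


def validate(value, definition: FieldDefinition) -> list:
    """Return the names of the checks that the value fails (empty list if valid)."""
    if not isinstance(value, definition.python_type):
        return ["type"]  # size and rule checks are meaningless for a wrong type
    errors = []
    if definition.max_size is not None and len(str(value)) > definition.max_size:
        errors.append("size")
    if definition.rule is not None and not definition.rule(value):
        errors.append("rule")
    return errors
```

For example, a field defined as `FieldDefinition(int, max_size=3, rule=lambda v: 0 <= v <= 130)` accepts 42 but rejects 999, which passes the size and type checks and fails only the explicit rule.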

With a reduced database definition capability, input transaction validation becomes relatively more important for database integrity. It is generally easier and more efficient to validate input transactions than to continuously monitor the database against a comprehensive database definition. This may account for the greater emphasis placed on transaction validation in practice. Nevertheless, it would be wrong to conclude that input validation can be a substitute for monitoring against the database definition. The database must still conform to its definition.
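The relationship between the two layers can be sketched with Python's built-in `sqlite3` module. The `item` table, its `CHECK` constraint, and the helper functions are assumptions chosen for illustration; the point is only that input validation rejects bad transactions early, while the definition held by the DBMS still guards the stored data if that validation is bypassed.

```python
import sqlite3


def open_inventory() -> sqlite3.Connection:
    """Create an in-memory database whose definition includes a validity rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item ("
        " name TEXT NOT NULL,"
        " quantity INTEGER NOT NULL"
        " CHECK (typeof(quantity) = 'integer' AND quantity >= 0)"
        ")"
    )
    return conn


def valid_input(name, quantity) -> bool:
    """Input transaction validation, performed before the database is touched."""
    return (
        isinstance(name, str) and name != ""
        and isinstance(quantity, int) and quantity >= 0
    )


def add_item(conn: sqlite3.Connection, name, quantity) -> bool:
    """Apply the update only if it passes both layers of checking."""
    if not valid_input(name, quantity):
        return False
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO item (name, quantity) VALUES (?, ?)",
                (name, quantity),
            )
    except sqlite3.IntegrityError:
        return False  # rejected by the database definition itself
    return True
```

An update that skips `add_item` and issues the `INSERT` directly with a negative quantity is still refused, because the constraint is part of the stored definition.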

Processes which change the database can disrupt the information system by destroying the quality of the database. Threats may result from multiple processes attempting to update the same data concurrently, a runaway update process, an incompletely debugged program, or an update initiated by an unauthorized user. These threats suggest the need to control the development, cataloguing, initiation and execution of update processes.

Different levels of update may demand different levels of control. Merely adding data to a database is generally less disruptive than changing existing data. Tighter controls may be needed on processes that delete data, particularly whole records or files. Not everyone in an organization should be permitted to update the database freely. Some responsible authority must tell the system who is permitted to initiate which update operations, and the system must check every requested update action to ensure that it is properly authorized.
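A minimal sketch of such a check, assuming a hypothetical grants table that maps each user to the update operations they may initiate, might look like this. The user names, operation names, and the dictionary standing in for the database are all illustrative.

```python
# Hypothetical grants recorded by the responsible authority:
# user -> update operations that user may initiate.
GRANTS = {
    "clerk": {"insert"},
    "supervisor": {"insert", "modify"},
    "dba": {"insert", "modify", "delete"},
}


class UnauthorizedUpdate(Exception):
    """Raised when a user requests an update operation they were not granted."""


def authorize(user: str, operation: str, grants: dict = GRANTS) -> None:
    """Check one requested update action against the grants table."""
    if operation not in grants.get(user, set()):
        raise UnauthorizedUpdate(f"{user} may not {operation}")


def execute_update(user, operation, database: dict, key, value=None) -> None:
    """Authorize first, then perform the update on a dict standing in for the database."""
    authorize(user, operation)
    if operation == "insert":
        if key in database:
            raise KeyError(f"{key} already exists")
        database[key] = value
    elif operation == "modify":
        if key not in database:
            raise KeyError(key)
        database[key] = value
    elif operation == "delete":
        del database[key]
```

Because `authorize` runs before any change is made, a refused request leaves the data untouched, and deletion can be reserved for the most tightly controlled role.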

The independent and uncontrolled execution of concurrent update processes can threaten the quality of the database. The obvious remedy, allowing a process to lock out other update processes while it works, can itself lead to deadlock, so every multi-user DBMS must have some way of dealing with potential deadlock.

Update synchronization is required when data is stored redundantly, in multiple copies. Besides the obvious cost of additional storage space, the major cost of data redundancy lies in synchronizing updates. These costs must be weighed against the benefits: increased availability of data, faster response to requests for data, and better recovery because the redundant data serves as a backup.
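The deadlock risk from locking can be illustrated with one common remedy, acquiring locks in a fixed global order. This is only one of several possible approaches and is not attributed to any specific DBMS; the `Record` class, its `record_id` and `value` fields, and the `move` function are hypothetical.

```python
import threading


class Record:
    """Hypothetical record guarded by its own lock."""

    def __init__(self, record_id: int, value: int):
        self.record_id = record_id
        self.value = value
        self.lock = threading.Lock()


def move(source: Record, target: Record, amount: int) -> None:
    """Update two records together, always locking the lower record_id first."""
    if source.record_id == target.record_id:
        raise ValueError("source and target must be different records")
    # A fixed acquisition order means two processes can never each hold
    # one lock while waiting for the other.
    first, second = sorted((source, target), key=lambda r: r.record_id)
    with first.lock, second.lock:
        source.value -= amount
        target.value += amount
```

Without the ordering step, one process running `move(a, b, ...)` and another running `move(b, a, ...)` could each take its first lock and then wait indefinitely for the other.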
