Data Model Tracing, Reporting: Relational Database

In relational databases, data is stored in tables built on a particular model and network of relationships. Especially in systems that are not yet fully mature, data model changes occur frequently, and they make it difficult to keep data and data models under control in terms of integrity, consistency, and granularity. This calls for control mechanisms that work on the basis of changes made to the data model, together with a catalog mechanism.

The control mechanisms mentioned above should not block development processes, yet they should prevent the data model from turning into “garbage.” In this respect, the data model (table names, the existence of related tables and their compatibility with each other, the consistency of related tables in terms of columns, etc.) should be taken into account during planning.

In this article, besides the construction of a data model catalog in a relational database (here, Oracle DB is being used), a mechanism that monitors and reports the changes related to this data model and content will be discussed.

Prerequisites

When it comes to standard rules and their controls, certain rules must also be followed while developing the environment in question. These rules should be written down and followed carefully by the whole development team whenever a data model change is made. Below are the rules that apply to the structure described in this article; they may be extended to cover other database object types.

Rules

In this article, the following controls and rules will be discussed as an example:

  1. Compliance of data model changes with agreed rules
  2. Whether there are tables without descriptions as metadata
  3. Whether there are columns without descriptions as metadata
  4. Whether there are tables with columns that are not added to the related audit log table
  5. Whether there are tables with columns that are not added to the audit log trigger

As previously mentioned, this list should be taken as a reference point to be improved according to the architectural perspective and the structure of the data and data model.

Workflow and Methodology

Before moving on to the implementation details, a brief walkthrough of the technical process described in this article will be helpful.

The process starts with Oracle Scheduler triggering the job. Further details are covered in the next sections; in short, the following block is executed here:

BEGIN
   schema_name.p_data_model_reporter();
END;

First, the above-mentioned job fetches the data model changes related to table creation, dropping, or altering. For instance, a change that adds a column to a table is captured here for the data model architect’s review against the predefined development standards for the data model structure mentioned in the previous section.

Then, the whole data model is programmatically checked against the rules, which mostly serve to protect the data model catalog’s consistency and reliability.

Lastly, the data collected throughout this process is formatted as a report and sent to the development team for refactorings, further inspections, etc.

Here is an illustration of the flow:

Flow illustration

On the other hand, this process is not extra work added to the daily development cycle; rather, it should be considered part of that cycle. In this respect, the flow from the beginning of the development cycle to the end can be described as follows:

Development → Check → Analyze → Discuss

Here, the “Development” and “Check” phases may be classified as standard development work and are performed as a matter of course. The “Analyze” and “Discuss” phases, however, involve discussion, brainstorming, and other human interaction; it is in these phases that the desired value is added to the workspace.

Implementation Details

The rules given above are controlled through a database procedure, whose content is also available on GitHub. Here, instead of going through the entire procedure line by line, the operational logic and general flow of the rules above will be emphasized. For each rule, output matching the rule’s result is sent by e-mail to the data model architect for review.
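As an illustration, the overall shape of such a procedure might look like the sketch below. The helper names (f_collect_ddl_changes, p_send_report_mail, etc.) are assumptions made for this outline, not the actual names in the linked repository:

CREATE OR REPLACE PROCEDURE schema_name.p_data_model_reporter
IS
    l_report   CLOB;
BEGIN
    -- Rule 1: yesterday's DDL changes (differential logic)
    l_report := f_collect_ddl_changes ();

    -- Rules 2-5: cumulative checks over the whole data model
    l_report := l_report || f_tables_without_comments ();
    l_report := l_report || f_columns_without_comments ();
    l_report := l_report || f_tables_missing_audit_columns ();
    l_report := l_report || f_triggers_missing_audit_columns ();

    -- format the findings and mail them to the data model architect
    p_send_report_mail (l_report);
END;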

Compliance of Data Model Changes With Agreed Rules

The admin.ddl_history_log table is used for this control. The 'ALTER', 'DROP', and 'CREATE' operations performed on the tables of the schema in question during the previous day are fetched with differential logic, through a query like the one below:

  SELECT TO_CHAR (action_date, 'dd.mm.yyyy') action_date,
         action_osuser,
         action_username,
         object_name,
         ddl_sql
    FROM admin.ddl_history_log
   WHERE     object_type = 'TABLE'
         AND object_owner = :schema_name
         AND ddl IN ('ALTER', 'DROP', 'CREATE')
         AND action_date BETWEEN TRUNC (SYSDATE - 1) AND TRUNC (SYSDATE)
ORDER BY action_osuser;

Whether There Are Tables Without Descriptions as Metadata

The sys.user_tab_comments view is used for this control. Cumulative logic applies here. A list of records that need to be corrected is retrieved with a query as below:

SELECT utcom.table_name
  FROM sys.user_tab_comments  utcom
       INNER JOIN sys.user_objects uobj ON utcom.table_name = uobj.object_name
 WHERE uobj.object_type = 'TABLE' AND utcom.comments IS NULL;

Whether There Are Columns Without Descriptions as Metadata

The sys.user_col_comments view is used for this control. Cumulative logic applies here. A list of records that need to be corrected is retrieved with a query as below:

SELECT ucc.table_name || '.' || ucc.column_name     column_name
  FROM sys.user_col_comments  ucc
       INNER JOIN sys.user_objects uobj ON ucc.table_name = uobj.object_name
 WHERE uobj.object_type = 'TABLE' AND ucc.comments IS NULL;

Whether There Are Tables With Columns Which Are Not Added To Related Audit Log Table

The sys.user_objects view is used for this control. Cumulative logic applies here. First of all, for each table, it is checked whether there is another table with the suffix “_log” (see the rules in the “Prerequisites” section). For each table that meets this criterion, column parity is checked between the table and the related audit log table. Tables that do not comply are reported.
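Under these assumptions, the column parity check for a single table can be sketched with a MINUS query; the bind variable and the “_LOG” suffix follow the naming rules above:

SELECT column_name
  FROM sys.user_tab_columns
 WHERE table_name = :table_name
MINUS
SELECT column_name
  FROM sys.user_tab_columns
 WHERE table_name = :table_name || '_LOG';

Any rows returned are columns that exist in the base table but are missing from its audit log table.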

Whether There Are Tables With Columns Which Are Not Added to the Audit Log Trigger

The sys.user_objects view is used for this control. Cumulative logic applies here. First of all, for each table, it is checked whether there is a database trigger with the prefix “trg_” (see the rules in the Prerequisites section). For each table that meets this criterion, it is checked whether coding for audit logging is done for all columns in the table within the conjugate trigger code. Tables that do not comply are reported.
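One way to sketch this check for a single table is to scan the trigger source in sys.user_source for each column name. This is a simplified assumption of the actual check; a plain LIKE match can yield false positives when one column name is a substring of another:

SELECT utc.column_name
  FROM sys.user_tab_columns utc
 WHERE utc.table_name = :table_name
   AND NOT EXISTS
          (SELECT 1
             FROM sys.user_source usrc
            WHERE usrc.name = 'TRG_' || :table_name
              AND usrc.type = 'TRIGGER'
              AND UPPER (usrc.text) LIKE '%' || utc.column_name || '%');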

Scheduling

Controls are run by a scheduled Oracle database job, triggered daily at a certain time, through the procedure detailed here. The job definition is made via Oracle’s built-in DBMS_JOB package. An example definition script is as follows:

DECLARE
    l_job_id   BINARY_INTEGER;
BEGIN
    DBMS_JOB.submit (job         => l_job_id,
                     what        => 'BEGIN
                                        schema_name.p_data_model_reporter();
                                     END;',
                     next_date   => SYSDATE,
                     interval    => 'TRUNC (SYSDATE) + 1');   -- run once a day

    COMMIT;   -- the DBMS_JOB definition takes effect only after commit

    DBMS_OUTPUT.put_line (l_job_id);
END;

The created job can be accessed with the following query.

SELECT * FROM sys.dba_jobs WHERE job = :l_job_id;
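Since DBMS_JOB has been deprecated in favor of DBMS_SCHEDULER in recent Oracle releases, an equivalent daily job could also be defined as follows. The job name here is an assumption chosen for the example:

BEGIN
    DBMS_SCHEDULER.create_job (
        job_name        => 'DATA_MODEL_REPORTER_JOB',
        job_type        => 'PLSQL_BLOCK',
        job_action      => 'BEGIN schema_name.p_data_model_reporter(); END;',
        start_date      => SYSTIMESTAMP,
        repeat_interval => 'FREQ=DAILY;BYHOUR=6',   -- every day at 06:00
        enabled         => TRUE);
END;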

Summary and Further Work

In this article, besides the construction and safeguarding of a data model catalog in a relational database, a mechanism that monitors and reports the changes related to this data model and catalog content was discussed. The controls evaluated in this context should not be limited to the data model itself; the consistency between the data model and the data stored in it may also be taken into consideration (matching data types against the stored data, data model usage effectiveness, etc.).

