PostgreSQL database verifier.
The script to check the database structure for errors or non-recommended practices.
The script consists of a set of checks that access the system catalog tables and do not require access to data in user tables.
The current version is applicable to PostgreSQL 12 and later. Tested in versions PostgreSQL 15.5.
db_verifier.sql- script, checks and displays a list of errors/warnings/recommendations with commentstests- directory with DDL/DML scripts that are used to debugdb_verifier.sqland demonstrate errors
| code | parent_code | name | level | default state | description |
|---|---|---|---|---|---|
| no1001 | no unique key | error | enable | Relation has no unique key. | |
| no1002 | no1001 | no primary key constraint | error | enable | Relation has no primary key constraint. |
| fk1001 | fk uses mismatched types | error | enable | Foreign key uses columns with mismatched types. | |
| fk1002 | fk uses nullable columns | warning | disable | Foreign key uses nullable columns. | |
| fk1007 | not involved in foreign keys | notice | disable | Relation is not involved in foreign keys. | |
| c1001 | constraint not validated | warning | enable | Constraint was not validated for all data. | |
| i1001 | similar indexes | warning | enable | Indexes are very similar. | |
| i1002 | index has bad signs | error | enable | Index has bad signs. | |
| i1003 | similar indexes unique and not unique | warning | enable | Unique and not unique indexes are very similar. | |
| i1005 | similar indexes (roughly) | notice | disable | Indexes are roughly similar. | |
| i1010 | b-tree index for array column | notice | enable | B-tree index for array column. | |
| s1010 | less 5% unused sequence values | critical | enable | The sequence has less than 5% unused values left. | |
| s1011 | s1010 | less 10% unused sequence values | error | enable | The sequence has less than 10% unused values left. |
| s1012 | s1011 | less 20% unused sequence values | warning | enable | The sequence has less than 20% unused values left. |
An example of using a script to check database metadata using a docker container with PostgreSQL 15.
Let's dump the database schema in sql format using pg_dump, specifying the necessary parameters for the connection.
pg_dump \
--host=database_host \
--port=database_port \
--username=user_name \
--dbname=database_name \
--schema-only \
--format=plain \
--file=database_schema.sqlLet's launch a container with PostgreSQL 15, port 5444 of the local interface will be forwarded to the container.
docker container run \
-p 127.0.0.1:5444:5432 \
--name db_verifier \
--env POSTGRES_USER=user_name \
--env POSTGRES_PASSWORD=user_password \
--env POSTGRES_DB=db_verifier \
--detach postgres:15-alpineLet's connect to the container on port 5444 of the local interface using the psql console client.
psql \
--host=localhost \
--port=5444 \
--username=user_name \
--dbname=db_verifierLet's set up data output in the psql and execute the db_verifier.sql.
db_verifier=# \pset format wrapped
Output format is wrapped.
db_verifier=# \pset columns 0
Target width is unset.
db_verifier=# \i db_verifier.sql
object_id | object_name | object_type | check_code | check_level | check_name | check_result_json
-----------+----------------------+-------------+------------+-------------+--------------------------+------------------------------------------
16456 | fk1001_2_fk_fk1001_2 | constraint | fk1001 | error | fk uses mismatched types | {"object_id" : "16456", "object_name" : .
| | | | | |."fk1001_2_fk_fk1001_2", "object_type" : .
| | | | | |."constraint", "relation_name" : "public..
| | | | | |.fk1001_2_fk", "relation_att_names" : ["f.
| | | | | |.k1001_2_id"], "foreign_relation_name" : .
| | | | | |."public.fk1001_2", "foreign_relation_att.
| | | | | |._names" : ["id"], "check" : {"check_code.
| | | | | |.":"fk1001","parent_check_code":null,"che.
| | | | | |.ck_name":"fk uses mismatched types","che.
| | | | | |.ck_level":"error","check_source_name":"s.
| | | | | |.ystem catalog","description_language_cod.
| | | | | |.e":null,"description_value":"Foreign key.
| | | | | |. uses columns with mismatched types."}}Stop and remove the container.
docker stop db_verifier
docker container remove db_verifierChanging the message localization setting to en, attribute conf_language_code.
sed -i "/AS conf_language_code,/c\'en' AS conf_language_code," db_verifier.sqlExplicitly disabling the i1001 check (similar indexes), the enable_check_i1001 attribute.
sed -i "s/AS enable_check_i1001/AND false AS enable_check_i1001/" db_verifier.sql-- before
true AS enable_check_i1001 -- [warning] similar indexes
-- after
true AND false AS enable_check_i1001 -- [warning] similar indexesExplicitly enabling the fk1007 check (not involved in foreign keys), attribute enable_check_fk1007.
sed -i "s/AS enable_check_fk1007/OR true AS enable_check_fk1007/" db_verifier.sql-- before
false AS enable_check_fk1007, -- [notice] not involved in foreign keys
-- after
false OR true AS enable_check_fk1007, -- [notice] not involved in foreign keysFiltering scan results is necessary to exclude false positives or to implement exclusion functionality
known errors (baseline, error suppression).
To do this, you can add a WHERE condition to the script at the stage of filtering the test results, the point for setting such
conditions are specified in the comment line >>> WHERE.
Example of conditions for filtering results (suppressing some errors).
cat examples/where.sql
WHERE
NOT (check_code = 'fk1007' AND object_name = 'public.schema_migrations')sed -i "/>>> WHERE/ r examples/where.sql" db_verifier.sqlAn implementation option for obtaining an aggregated score.
Let's associate each result line with a numeric value based on check_level, example in the file
examples/cumulative_score.sql.
cat examples/cumulative_score.sql
SELECT
COALESCE(SUM(cumulative_score_value), 0) AS cumulative_score
FROM (
VALUES
('critical', 55),
('error', 25),
('warning', 12),
('notice', 3)
) AS t(check_level, cumulative_score_value)
INNER JOIN (
-- >>> db_verifier
) AS r ON t.check_level = r.check_level
;Let's combine the scripts, the result in examples/cumulative_score.sql.
sed -i "/^;$/d" db_verifier.sql
sed -i "/>>> db_verifier/ r db_verifier.sql" ./examples/cumulative_score.sql| column name | description |
|---|---|
| object_id | id (oid) of the object in the corresponding system table |
| object_name | name of the object, in some cases with a schema |
| object_type | type of object being checked (relation, constraint, index, sequence) |
| check_code | check code (see table above) |
| check_level | level (see table above) |
| check_name | check name (see table above) |
| check_result_json | detailed test results in json format |
- [RU]
README.ru.md
There is a Python wrapper implemented for the SQL checks in this project. It allows running the DB verifier as a standalone command or as a part of the automated testing process.
Until it is not published to PyPi, please build from sources and install as a wheel package.
The following instructions will be replaced with simply: pip install db_verifier after publishing.
The following instructions are tested on Ubuntu 20.04.
Please adapt to your development environment accordingly.
The project is built with Poetry (https://python-poetry.org/docs/pyproject/). To install Poetry use the instructions on its site: https://python-poetry.org/docs/#installing-with-the-official-installer
This is a possible one-line installation using their official installation script:
curl -sSL https://install.python-poetry.org | python3 -Initializing and building the project:
cd db_verifier
poetry init
poetry buildAs a result it will create a distribution package in db_verifier/dist dir.
You can install DB Verifier python wrapper into your Python virtual environment or any other way you prefer.
Installing from PyPi (not published yet):
pip install db_verifierInstalling from pre-built wheel package:
pip install db_verifier/dist/db_verifier-0.1.0-py3-none-any.whlAfter the package is installed you can use db_verifier command.
PGUSER=user PGPASSWORD=pass db_verifier --connection=postgresql://my_host:5432/my_dbBy default, it prints the report in human-readable text format to stdout and exits with the following OS error codes:
- 0: no errors detected (can be configured to fail on warnings also);
- 1: errors detected;
- 2: connection problem or any other error unrelated to the problems in the DB structure.
Verify DB and print report in CSV to stdout:
PGUSER=user PGPASSWORD=pass db_verifier --connection=postgresql://my_host:5432/my_db --format=csvVerify DB and print report in JSON-lines stdout (each line is a JSON object):
PGUSER=user PGPASSWORD=pass db_verifier --connection=postgresql://my_host:5432/my_db --format=jsonHere is an example of how DB Verifier can be integrated into a Django automated testing pipeline.
Django has a built-in ability to generate a test database based on the configured models and migrations and use it for the tests.
DB Verifier can be included as one of the tests.
For example in a Django project with pytest tests the following works good:
import pytest
from db_verifier import verify_and_print_txt
from django.db import connection
@pytest.mark.django_db(transaction=True)
def test_db_structure():
verify_and_print_txt(connection, fail_on_warnings=True)The test is marked with "django_db".
This means an empty test DB will be initialized based on the
Django ORM configuration.
Next "verify_and_print_txt" is used on the Django's connection (psycopg Connection object).
This function runs the SQL checks and prints error to stdout in human-readable form similar to how linters and
other common utilities do.
In the end it raises an exception if there were errors/warnings found, and it will be a failed test in the logs.
This way it can be included into a CI process of a typical Django project.
It can be used the similar way to integrate with other types of Python projects.
See above - building the project with Poetry.
The project was developed with the following scheme:
- dependencies are to be started in docker;
- python code is started on the host machine in a virtualenv managed by Poetry.
To initialize the project:
git clone ...
cd db_verifier_python_wrapper
poetry initCopy .env.default to .env and change the dev config if needed.\
To activate virtualenv:
poetry shellTo run the tests:
Start dependencies in docker in one terminal session:
poetry shell
make start_depsRun the tests in IDE or in the 2nd terminal session:
poetry shell
make run_testsAuto-format code:
poetry shell
make black
Check code-style:
poetry shell
make check_lint