The DB Schema Export Tool exports SQL Server or PostgreSQL database objects as schema files. Exported objects include tables, views, stored procedures, functions, and synonyms, as well as database properties, including database roles and logins.
Uses integrated authentication or a named user to connect to a Microsoft SQL Server database.
Uses a named user, password, and port to connect to a PostgreSQL database.
The latest version of the application is available for one month on the AppVeyor CI server
In addition to creating schema files for tables, views, stored procedures, functions, and synonyms, the program can also export table data to create SQL files with an Insert statement for each row.
The DB Schema Export Tool has both a command line interface and a GUI mode. Starting the program without any parameters will show the GUI.
The command line interface is useful for automatically exporting schema files on a regular basis. Example command (wrapped here for readability):
```
DB_Schema_Export_Tool.exe
  C:\Cached_DBSchema
  /Server:Proteinseqs
  /DBList:Manager_Control,Protein_Sequences
  /Sync:"F:\Projects\Database_Schema\DMS"
  /Git /Commit
  /L /LogDir:Logs
  /Data:ProteinSeqs_Data_Tables.txt
```
Full command line syntax:
```
DB_Schema_Export_Tool.exe
  SchemaFileDirectory /Server:ServerName
  /DB:Database /DBList:CommaSeparatedDatabaseName
  [/DBUser:username] [/DBPass:password]
  [/PgUser:username] [/PgPass:password] [/PgPort:5432]
  [/DirectoryPrefix:PrefixText] [/NoSubdirectory] [/CreateDBDirectories]
  [/DataTables:TableDataToExport.txt] [/Map:ColumnMapping.txt]
  [/DataSort:PgDumpTableDataSortOrder.txt]
  [/TableFilterList] [/SchemaSkipList]
  [/ColumnFilter:FileName] [/DateFilter:FileName]
  [/NameFilter:FilterSpec]
  [/DataExportOrder:FileName]
  [/Schema:SchemaName] [/ExistingDDL:SchemaFileName]
  [/NoAutoData] [/ExportAllData] [/MaxRows:1000] [/NoData]
  [/DisableStatementLogging:true] [/MinLogDuration:5000]
  [/DeleteExtraRows] [/ForceTruncate:TableList]
  [/DisableTriggers]
  [/ScriptLoad] [/ScriptUser:username] [/ScriptDB:Database]
  [/ScriptHost:HostName] [/ScriptPort:Port]
  [/SnakeCase] [/PgDump] [/KeepPgDumpFile]
  [/PgInsert] [/PgInsertChunkSize:50000]
  [/ServerInfo] [/NoSchema]
  [/Sync:TargetDirectoryPath] [/Git] [/Svn] [/Hg] [/Commit]
  [/L] [/LogFile:BaseName] [/LogDir:LogDirectoryPath]
  [/Preview] [/Stats] [/Trace]
  [/ParamFile:ParamFileName.conf] [/CreateParamFile]
```
SchemaFileDirectory is the path to the directory where the schema files will be saved (aka the output directory)
- Optionally, use the named parameter `/OutputDir:DirectoryPath`
To process a single database, use /Server and /DB
- The server is assumed to be Microsoft SQL Server
- However, if `/PgUser` is provided (or `PgUser` is defined in the parameter file), the server is treated as a PostgreSQL server
Use /DBList to process several databases (separate names with commas)
Use /DBUser and /DBPass to specify a username and password for connecting to a SQL Server instance
- If not defined, will use Windows Integrated Authentication
Use /PgUser to define the username to use to connect to a PostgreSQL database
Use /PgPass to define the password for the PostgreSQL user
- As described below, you can create a file named `pgpass.conf` at `%APPDATA%\postgresql` to track passwords for PostgreSQL users
  - The format is `server:port:database:username:password`
  - `pg_dump.exe` will obtain the password from this file if it exists
  - On Linux, the file is stored at `~/.pgpass`
Use /PgPort to specify the port to use for PostgreSQL
- Defaults to 5432
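For example, a hypothetical command for exporting schema from a PostgreSQL database using a named user (the server, database, and user names shown here are illustrative, matching the pgpass.conf example later in this document):
```
DB_Schema_Export_Tool.exe C:\Cached_DBSchema /Server:prismweb3 /DB:dms /PgUser:dmsreader /PgPass:dmspasswordhere /PgPort:5432
```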
By default, a subdirectory named DBSchema__DatabaseName will be created below SchemaFileDirectory
- Customize this prefix text using `/DirectoryPrefix`
Use /NoSubdirectory to disable auto creating a subdirectory for the database being exported
- Note: subdirectories will always be created if you use `/DBList` and specify more than one database
Use /CreateDBDirectories:False to disable creating a subdirectory for the schema files for each database
Use /DataTables or /Data to define a text file with table names (one name per line) for which the data
should be exported. In addition to table names defined in /Data, there are default tables
which will have their data exported; disable the defaults using /NoAutoData
- Also supports a multi-column, tab-delimited format
  - Put `<skip>` in the TargetTableName column to indicate that the table should not be included when using `/ExportAllData`
- The file can also track renamed views
- File format:
| SourceTableName | TargetSchemaName | TargetTableName | PgInsert | KeyColumn(s) |
|---|---|---|---|---|
| T_Analysis_State_Name | public | t_analysis_job_state | true | job_state_id |
| T_DatasetRatingName | public | t_dataset_rating_name | true | dataset_rating_id |
| T_Log_Entries | public | t_log_entries | false | |
| T_Job_Events | cap | t_job_Events | false | |
| T_Job_State_Name | cap | t_job_state_name | true | job |
| T_Users | public | t_users | true | user_id |
| x_T_MgrState | public | `<skip>` | | |
| V_Cell_Culture | public | v_biomaterial | | |
| V_Cell_Culture_Date | public | v_biomaterial_date | | |
Tables with PgInsert set to true will have data insert commands formatted as PostgreSQL compatible INSERT INTO statements, using the `ON CONFLICT (key_column) DO UPDATE SET` syntax
- This is only applicable when exporting data from SQL Server
- Use the `KeyColumn(s)` column to specify the primary key (or pair of unique keys) used for determining a conflict
  - Any columns that are not key columns will be updated when a conflict is found
- If `PgInsert` is true, but the `KeyColumn(s)` column is empty, the program will look for an identity column on the source table
- Example generated SQL:
```sql
INSERT INTO mc.t_mgr_types (mt_type_id, mt_type_name, mt_active)
OVERRIDING SYSTEM VALUE
VALUES
  (1, 'Analysis', 1),
  (2, 'Archive', 0),
  (3, 'Archive Verify', 0),
  (4, 'Capture', 1),
  (5, 'Space', 1),
  (6, 'Data Import', 1)
ON CONFLICT (mt_type_id)
DO UPDATE SET
  mt_type_name = EXCLUDED.mt_type_name,
  mt_active = EXCLUDED.mt_active;

-- Set the sequence's current value to the maximum current ID
SELECT setval('mc.t_mgr_types_mt_type_id_seq', (SELECT MAX(mt_type_id) FROM mc.t_mgr_types));

-- View the current ID for the sequence
SELECT currval('mc.t_mgr_types_mt_type_id_seq');
```

Use /Map or /ColumnMap to define a tab-delimited text file mapping source column names to target column names, by table
- Use keyword `<skip>` in the `TargetColumnName` column to indicate that a source column should not be included in the output file
- If `/SnakeCase` is used to auto-change column names, mappings in this file will override the snake_case auto-conversion
- File format:
| SourceTableName | SourceColumnName | TargetColumnName |
|---|---|---|
| T_Analysis_Job | AJ_jobID | job |
| T_Analysis_Job | AJ_start | start |
| T_Analysis_Job | AJ_finish | finish |
| t_users | name_with_prn | `<skip>` |
Use /DataSort (or /PgDumpDataSort or /PgDumpTableDataSortOrder) to define a tab-delimited text file that defines the columns to use for sorting data exported using PgDump
- The export file will be re-written with data sorted by the specified column numbers (1-based)
- Table names should include the schema, e.g. "public.t_event_target" or "sw.t_job_state_name"
- Sort_Columns should list the column number (or comma-separated list of column numbers) to sort the data on
- "1" means to sort using the first column
- "2, 3" means to sort using the second column, then the third column
- Values in Sort_Numeric should be True, False, or an empty string
- True means to assume the data in the first sort column is numeric (integer or double)
- False is ignored; if any sort column has numbers for every row, it will be sorted as numeric
- File format:
| Table_Name | Sort_Columns | Sort_Numeric |
|---|---|---|
| cap.t_automatic_jobs | 1, 2 | |
| cap.t_scripts | 2 | |
| cap.t_scripts_history | 3, 1 | |
| cap.t_step_tools | 2, 1 | |
| cap.t_task_state_name | 1 | True |
| public.t_analysis_job_state | 1 | True |
| public.t_default_psm_job_parameters | 2, 3, 4, 5, 6 | |
| public.t_user_operations_permissions | 1, 2 | True |
| sw.t_job_state_name | 1 | True |
| timetable.task | 2, 1 | True |
Use /TableFilterList or /TableNameFilter to specify a table name (or comma separated list of names) to restrict table export operations
- This is useful for exporting the data from just a single table (or a few tables)
- This parameter does not support reading names from a file; it only supports actual table names
- If a file was defined via the `/DataTables` parameter, the list of tables loaded from that file will be filtered by this list (matching both SourceTableName and TargetTableName)
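For example, a hypothetical command that exports data for just one table (server, database, and file names are illustrative, reusing those from the example near the top of this document; /NoSchema skips the schema files so that only the table data is exported):
```
DB_Schema_Export_Tool.exe C:\Cached_DBSchema /Server:Proteinseqs /DB:Manager_Control /Data:ProteinSeqs_Data_Tables.txt /TableFilterList:T_Log_Entries /NoSchema
```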
Use /SchemaSkipList to specify a schema name (or comma separated list of names) of schemas to skip
- This is useful if using partitioned tables
Use /ColumnFilter or /TableDataColumnFilter to define a tab-delimited text file that defines columns to skip when exporting data from tables
- The program auto-skips computed columns and timestamp columns when exporting data as PostgreSQL compatible INSERT INTO statements
- File format:
| Table | Column | Comment |
|---|---|---|
| T_Cached_Dataset_Links | DS_RowVersion | In PostgreSQL, tracked via data type xid; comes from t_dataset.xmin |
| T_Cached_Dataset_Links | SPath_RowVersion | In PostgreSQL, tracked via data type xid; comes from t_storage_path.xmin |
| T_EUS_Proposals | Numeric_ID | Computed column (redundant here since the program auto-skips computed columns) |
Use /DateFilter or /TableDataDateFilter to define a tab-delimited text file that defines date filters to use when exporting data from tables
- The data file will include the start date in the name, for example: `mc.t_log_entries_Data_Since_2020-01-01.sql`
- File format:
| SourceTableName | DateColumnName | MinimumDate |
|---|---|---|
| T_Event_Log | Entered | 2020-01-01 |
| T_Log_Entries | posting_time | 2020-01-01 |
| T_ParamValue | last_affected | 2020-01-01 |
| T_ParamValue_OldManagers | last_affected | 2016-01-07 |
Use /NameFilter to define a filter to apply to exported tables, views, procedures, etc.
- Will only export objects that contain this text in the name
- This is only applicable when exporting objects from SQL Server
- Use `^` to indicate the object names must start with the text, for example `^Auto` to match names that start with Auto
  - This only works if using a parameter file, since the Windows command line parser removes `^` (see the example parameter file snippet below)
- Use `$` to indicate the object names must end with the text, for example `Report$` to match names that end with Report
- Other RegEx qualifiers are supported; matches are case insensitive
- Multiple filters can be provided by separating with commas
  - At the command line, surround comma separated lists with double quotes
  - Will not split on commas if the object name filter has square brackets
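Because the Windows command line strips `^`, a starts-with filter is easiest to define in a parameter file. A minimal sketch (assuming the parameter file key matches the switch name; use `/CreateParamFile` to confirm the exact key names):
```
# Only export objects whose names start with Auto or end with Report
NameFilter=^Auto,Report$
```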
Use /DataExportOrder to define a text file with table names (one name per line) defining the order that data should be exported from tables
- Should have table names only, without schema
- Tables not listed in this file will have their data exported alphabetically by table name
- If the first line is Table_Name, will assume that it is a header line (and will thus ignore it)
- If `/ScriptLoad` is used to create a shell script for loading data (see below), the data will be loaded in the order specified by the `DataExportOrder` file, followed by tables not listed in the file
  - This is useful for assuring that data is loaded into the database in an order that guarantees foreign keys are valid
- For example, given a multi-campus university that tracks graduate students and professors using four tables:
```mermaid
graph TD;
    T_Campus-->T_Department;
    T_Department-->T_Professor;
    T_Department-->T_Student;
```
the DataExportOrder file should have the following:
```
T_Campus
T_Department
T_Professor
T_Student
```
- Loading data in this order assures that when the department data is loaded, the Campus_ID for each department will refer to an existing row in the campus table
- Next, when the professor and student data is loaded, the Department_ID for each person will refer to an existing row in the department table
Use /Schema or /DefaultSchema to define the default schema name to use when exporting data from tables
- Entries in the `/DataTables` file will override this default schema
Use /ExistingDDL or /ExistingSchema to define a text file that should be parsed to rename columns, based on data loaded from the /ColumnMap file
- Will create a new text file with CREATE TABLE blocks that include the new column names
- Will create a new text file with CREATE TABLE blocks that include the new column names
- Will skip columns with `<skip>`
- Will skip tables (and views, procedures, etc.) that are defined in the `/DataTables` file but have `<skip>` in the TargetTableName column
- Does not rename tables
Use /ExportAllData or /ExportAllTables to export data from every table in the database
- Will skip tables (and views) that are defined in the `/DataTables` file but have `<skip>` in the TargetTableName column
- Ignored if `/NoTableData` is true
Use /MaxRows to define the maximum number of data rows to export
- Defaults to 1000
- Use 0 to export all rows
- If you use `/ExportAllData` but do not specify `/MaxRows`, the program will auto set `/MaxRows` to 0
Use /NoTableData or /NoData to prevent any table data from being exported
- This parameter is useful when processing an existing DDL file with `/ExistingDDL`
By default, the table data load script will update the PostgreSQL system settings to set log_min_duration_statement to -1, which will disable logging statements that execute longer than a given threshold
- If your system has `log_min_duration_statement` defined, and if the insert queries take longer than the threshold, the entire insert query (including all of the data values) will be written to the log file
  - If you are importing a large amount of data, this could lead to very large PostgreSQL log files
- By using `/DisableStatementLogging:True` (the default), you avoid the possibility of long-running `INSERT INTO` queries bloating the log files
  - The database user must be a superuser to update the `log_min_duration_statement` system setting, since that setting affects all current and future connections in all databases on the server
- After importing all of the data, the script will set `log_min_duration_statement` to 5000 (meaning to log statements that run longer than 5 seconds)
  - This can be customized using `/MinLogDuration` (or `MinLogDurationAfterLoad=` in a parameter file)
- Use `/DisableStatementLogging:False` to disable adjusting the value of `log_min_duration_statement`
Use /MinLogDuration to customize the value to set log_min_duration_statement to once all of the data has been imported
- The default is `/MinLogDuration:5000`
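Conceptually, the statement-logging adjustment that brackets the data load resembles the following PostgreSQL commands (a sketch only; the load script's exact statements may differ):
```sql
-- Before the data load: disable duration-based statement logging (requires superuser)
ALTER SYSTEM SET log_min_duration_statement = -1;
SELECT pg_reload_conf();

-- After the data load: log statements that run longer than 5 seconds (5000 milliseconds)
ALTER SYSTEM SET log_min_duration_statement = 5000;
SELECT pg_reload_conf();
```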
Use /DeleteExtraRows to create files with commands to delete extra rows in database tables (provided the table has a primary key)
- If `/ScriptLoad` is enabled (or the parameter file has `ScriptLoad=True`), the shell script will use these files to delete extra rows prior to importing new data
- If `/DataExportOrder` was used to specify the export order for table data, the shell script will delete extra rows from tables in the reverse order from that defined in the `DataExportOrder` file
  - The logic for this is that the purpose of the `DataExportOrder` file is to assure that data in parent tables is loaded before data in child tables; deleting rows in the reverse order avoids foreign key violations
- If a table has a single-column primary key, the delete queries will be of the form:
DELETE FROM "t_user_status" target
WHERE status BETWEEN 'Active' AND 'Obsolete' AND
NOT EXISTS ( SELECT S.status
FROM "t_user_status" S
WHERE target.status = S.status AND
S.status in (
'Active','Inactive','Obsolete'
)
);- If a table has a multi-column based primary key, the delete queries will be of the form:
```sql
DELETE FROM t_internal_std_composition target
WHERE target.component_id BETWEEN 10 AND 72 AND
      NOT EXISTS ( SELECT S.component_id,
                          S.mix_id
                   FROM t_internal_std_composition S
                   WHERE target.component_id = S.component_id AND
                         target.mix_id = S.mix_id AND
                         (
                           S.component_id = 10 AND S.mix_id = 1 OR
                           S.component_id = 10 AND S.mix_id = 3 OR
                           S.component_id = 11 AND S.mix_id = 1 OR
                           S.component_id = 11 AND S.mix_id = 3 OR
                           S.component_id = 13 AND S.mix_id = 1 OR
                           S.component_id = 13 AND S.mix_id = 3 OR
                           S.component_id = 21 AND S.mix_id = 5 OR
                           S.component_id = 22 AND S.mix_id = 5
                         )
      );
```

Use /ForceTruncate to specify a table name (or comma separated list of names) that has a primary key, but for which you would rather truncate the table data instead of using either of the DELETE FROM queries shown above
- This is useful for large tables where the
DELETE FROMqueries run very slowly - Names in this list can match either source table names or target table names
- Ignored if
/DeleteExtraRowsis false
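When a table is listed with /ForceTruncate, the slow delete queries are replaced with a truncation; conceptually, the generated command resembles the following (table name illustrative; the exact statement written by the tool may differ):
```sql
TRUNCATE TABLE t_log_entries;
```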
Use /DisableTriggers (or /IncludeDisableTriggerCommands) to include trigger disable / enable commands in data export files
- Ignored if `PgInsert` is true, since `session_replication_role` is set to "replica" prior to inserting/deleting data
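For reference, the two mechanisms look roughly like the following PostgreSQL commands (a sketch only; the table name is illustrative and the exact statements written to the data files may differ):
```sql
-- With /DisableTriggers: bracket the data statements with per-table trigger commands
ALTER TABLE t_log_entries DISABLE TRIGGER ALL;
-- ... INSERT / DELETE statements for t_log_entries ...
ALTER TABLE t_log_entries ENABLE TRIGGER ALL;

-- With PgInsert: ordinary triggers (including foreign key enforcement) are skipped for the session
SET session_replication_role = replica;
-- ... INSERT / DELETE statements ...
SET session_replication_role = DEFAULT;
```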
Use /ScriptLoad or /Script to create a bash script file for loading exported table data into a PostgreSQL database
Use /ScriptUser to define the username to use when calling psql in the bash script
- If not defined, will use the username being used to run the DB Schema Export Tool
Use /ScriptDB to define the target database name to use when calling psql in the bash script
- The default is `/ScriptDB:dms`
Use /ScriptHost to define the host name to use when calling psql in the bash script
- The default is `/ScriptHost:localhost`
Use /ScriptPort to define the port number to use when calling psql in the bash script
- The default is `/ScriptPort:5432`
- The port number will only be included in the bash script if it is not the default
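Within the generated bash script, each data file is loaded with a psql call along these lines (a sketch; the username and file name are illustrative, and the script's exact options may differ):
```bash
psql -h localhost -d dms -U username -f mc.t_log_entries_Data.sql
```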
Use /SnakeCase to auto change column names from Upper_Case and UpperCase to lower_case when exporting data from tables
- Also used for table names when exporting data from tables
- Entries in the `/DataTables` and `/ColumnMap` files will override auto-generated snake_case names
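For example (illustrative conversions; verify the generated names and use the /DataTables or /ColumnMap files to override any that need adjusting):
```
T_Event_Log    -> t_event_log
Last_Affected  -> last_affected
PostingTime    -> posting_time
```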
Use /PgDump or /PgDumpData to specify that exported data should use COPY commands instead of INSERT INTO statements
- With SQL Server databases, table data will be exported using pg_dump compatible COPY commands when `/PgDump` is used
- With PostgreSQL data, table data will be exported using the pg_dump application
- With both SQL Server data and PostgreSQL data, if `/PgDump` is not provided, data is exported with `INSERT INTO` statements
- With SQL Server data, if `/PgDump` is provided but the `/DataTables` file has `true` in the `PgInsert` column, `INSERT INTO` statements will be used for that table
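For reference, COPY-based data export (the pg_dump text format, with tab-delimited values) looks like the following, in contrast to the INSERT INTO example shown earlier; the table and values here are illustrative:
```
COPY mc.t_mgr_types (mt_type_id, mt_type_name, mt_active) FROM stdin;
1	Analysis	1
2	Archive	0
\.
```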
Use /KeepPgDumpFile to not delete the `_AllObjects_.sql` file created for each database
- By default, this is false, meaning the `_AllObjects_.sql` file will be deleted (since it is no longer needed once a separate .sql file has been created for each object)
  - If any unhandled scripting commands are encountered, the `_AllObjects_.sql` file will not be deleted, even if KeepPgDumpFile is false
- In addition, when this is true, the original PgDump table data files are stored in directory `Replaced_PgDump_Files`
The /PgInsert switch applies when exporting data from SQL Server database tables
- While reading the TableDataToExport.txt file, the PgInsert column defaults to true when `/PgInsert` is provided, meaning table data will be exported via INSERT INTO statements, as mentioned above
- Ignored for PostgreSQL data
Use /PgInsertChunkSize to specify the number of values to insert at a time when PgInsert is true for a table
- Only applicable when exporting data from SQL Server
Use /ServerInfo to export server settings, logins, and SQL Server Agent jobs
Use /NoSchema to skip exporting schema (tables, views, functions, etc.)
Use /Sync to copy new/changed files from the output directory to an alternative directory
- This is useful for preventing file timestamps from being updated every time the schema is exported
Use /Git to auto-update any new or changed files using Git
Use /Svn to auto-update any new or changed files using Subversion
Use /Hg to auto-update any new or changed files using Mercurial
Use /Commit to commit any updates to the repository
Use /L to log messages to a file
Use /LogFile to specify the base name for the log file
- Log files will be named using the base name, followed by the current date
Use /LogDir or /LogDirectory to specify the directory to save the log file in
- By default, the log file is created in the current working directory
Use /Preview to count the number of database objects that would be exported
Use /Stats to show (but not log) export stats
Use /Trace to show additional debug messages at the console
The processing options can be specified in a parameter file using /ParamFile:Options.conf or /Conf:Options.conf or /P:Options.conf
- Define options using the format `ArgumentName=Value`
- Lines starting with `#` or `;` will be treated as comments
- Additional arguments on the command line can supplement or override the arguments in the parameter file
Use /CreateParamFile to create an example parameter file
- By default, the example parameter file content is shown at the console
- To create a file named Options.conf, use `/CreateParamFile:Options.conf`
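For reference, a small parameter file might look like the following (illustrative values; the key names are assumed to match the switch names documented above, so use /CreateParamFile to see the authoritative list):
```
# Example DB Schema Export Tool parameter file
; Both '#' and ';' start comment lines
PgUser=dmsreader
ScriptLoad=True
MinLogDurationAfterLoad=5000
```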
This program leverages the SQL Server Management Objects (SMO) Framework
- All of the required DLLs should be included in the DBSchemaExportTool .zip file
In order to use the /Git, /Svn, or /Hg switches, you need the following software installed
and the executables present at a specific location:
Git: `C:\Program Files\Git\bin\git.exe`
- Installed with 64-bit Git for Windows, available at https://git-scm.com/download/win

Subversion: `C:\Program Files\TortoiseSVN\bin\svn.exe`
- Installed with 64-bit Tortoise SVN, available at https://tortoisesvn.net/downloads.html

Mercurial: `C:\Program Files\TortoiseHg\hg.exe`
- Installed with 64-bit Tortoise Hg, available at https://tortoisehg.bitbucket.io/download/index.html
Schema export from PostgreSQL databases utilizes the pg_dump utility
- `pg_dump.exe` is available as part of the PostgreSQL for Windows installer from the EDB website
  - Run the installer
  - De-select installing PostgreSQL Server
- Copy `pg_dump.exe` plus several DLLs from `C:\Program Files\PostgreSQL\16\bin` to the directory with the DB_Schema_Export_Tool executable
  - See batch file `Copy_pg_dump_files.bat`
File pgpass.conf at %APPDATA%\postgresql can be used to store the password for the PostgreSQL user
- On Linux, the file is stored at `~/.pgpass`
```
mkdir %APPDATA%\postgresql
echo prismweb3:5432:dms:dmsreader:dmspasswordhere > %APPDATA%\postgresql\pgpass.conf
```
Example call to pg_dump:
```
pg_dump.exe -h prismweb3 -p 5432 -U dmsreader -d dms --schema-only --format=p --file=_AllObjects_.sql
```
Note that on Windows the .sql file created by pg_dump.exe includes extra carriage returns, resulting in CR CR LF instead of CR LF
- DB_Schema_Export_Tool.exe removes the extra carriage returns while parsing the SQL dump file
Written by Matthew Monroe for the Department of Energy (PNNL, Richland, WA)
E-mail: matthew.monroe@pnnl.gov or proteomics@pnnl.gov
Website: https://github.com/PNNL-Comp-Mass-Spec/ or https://www.pnnl.gov/integrative-omics
Source code: https://github.com/PNNL-Comp-Mass-Spec/DB-Schema-Export-Tool
The DB Schema Export Tool is licensed under the 2-Clause BSD License; you may not use this program except in compliance with the License. You may obtain a copy of the License at https://opensource.org/licenses/BSD-2-Clause
Copyright 2019 Battelle Memorial Institute