validator-cli is a command-line interface to validator. It enables you to:
- control your data (against a data schema)
Usage: validator-cli.py control [OPTIONS] INPUTFILE SCHEMAFILE
inputfile : input file path or directory (depends if you specified
--directory)
schemafile : data schema file path, with field definitions, types, patterns and enum lists
--directory : type --directory if you want to control an entire directory
Arguments:
INPUTFILE [required]
SCHEMAFILE [required]
Options:
--directory / --no-directory [default: False]
--help Show this message and exit.
- transform your data (against a mapping file)
Usage: validator-cli.py transform [OPTIONS] INPUTFILE MAPPINGFILE
inputfile : input file path or directory (depends if you specified
--directory)
mappingfile : data mapping file path. The data mapping file specifies
source field and target field names.
--directory : type --directory if you want to control an entire directory
Arguments:
INPUTFILE [required]
MAPPINGFILE [required]
Options:
--directory / --no-directory [default: False]
--outputdata TEXT
--help Show this message and exit.
The data schema file used by validator is a simplified form of the Frictionless Data Table Schema.
Create a CSV file (named schema.csv, for example).
You will find a CSV example file here.
Here is an example of its content:
| name | type | pattern | enum |
|---|---|---|---|
| id_site | integer | | |
| name | character | | |
| weight | number | | |
| date | date | | |
| ok | boolean | | |
| values | character | | ["a", "b", "c"] |
| city_code | character | ^([013-9]\d\|2[AB1-9])\d{3}$ | |
| siret | character | ^\d{14}$ | |
Valid types are:
- string
- integer
- number
- date
- datetime
- duration
- boolean
Fill in the pattern column if your values must match a regular expression.
Fill in the enum column if your values must belong to a list of values.
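To make the roles of the pattern and enum columns concrete, here is a minimal Python sketch of how a single value could be checked against one schema row. It is an illustration only, not validator's actual implementation: the function name `check_value` is hypothetical, and it assumes that the enum cell holds a JSON list (as in the example above) and that a pattern must match the whole value.

```python
import json
import re


def check_value(value, pattern="", enum=""):
    """Return True if value satisfies the optional pattern and enum constraints."""
    # An empty pattern cell means "no regular expression constraint".
    if pattern and re.fullmatch(pattern, value) is None:
        return False
    # Assumption: the enum cell contains a JSON list such as ["a", "b", "c"].
    if enum and value not in json.loads(enum):
        return False
    return True


print(check_value("12345678901234", pattern=r"^\d{14}$"))  # siret-like value
print(check_value("b", enum='["a", "b", "c"]'))
print(check_value("z", enum='["a", "b", "c"]'))
```

A row with both cells empty accepts any value, which mirrors the rows of the example schema that only declare a type.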
Let's suppose you have a data file named data.csv and a data schema named schema.csv.
Control a single file against your data schema:
python validator-cli.py control data.csv schema.csv
- NOT FOUND means that the column is missing from either the data file or the data schema file.
- NOT VALID means that the pattern or the list of values is not respected.
Here is an output:
You can also control an entire directory of files:
python validator-cli.py control --directory my_dir schema.csv
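If you would rather launch the check from a Python script than from a shell, one option is to build the command line and hand it to subprocess. The sketch below only uses the arguments documented in the usage message above. The helper names `build_control_command` and `run_control` are hypothetical, and the sketch assumes that validator-cli.py sits in the current working directory.

```python
import subprocess
import sys


def build_control_command(inputfile, schemafile, directory=False):
    """Build the argument list for `validator-cli.py control`."""
    command = [sys.executable, "validator-cli.py", "control"]
    if directory:
        # --directory tells the CLI that inputfile is a directory of files.
        command.append("--directory")
    command += [inputfile, schemafile]
    return command


def run_control(inputfile, schemafile, directory=False):
    """Run the control command and return its exit code.

    Assumption: validator-cli.py is in the current working directory.
    """
    command = build_control_command(inputfile, schemafile, directory)
    return subprocess.run(command).returncode


print(build_control_command("my_dir", "schema.csv", directory=True)[1:])
# ['validator-cli.py', 'control', '--directory', 'my_dir', 'schema.csv']
```

Passing the arguments as a list avoids shell quoting problems when a file or directory name contains spaces.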
You can use validator to transform your data to a particular schema.
ℹ️ Note that transforming your data only renames the columns; it does not modify the contents of your data cells.
The mapping file specifies the source fields and the target fields for the renaming of the data.
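As an illustration of what renaming only the columns means, the sketch below applies a source-to-target mapping to the header of a CSV file and copies every data cell unchanged. It is not the code validator runs: the function name `rename_columns` and the sample column names are hypothetical, the mapping is given as a plain dictionary rather than read from a mapping file, and columns absent from the mapping are assumed to keep their name.

```python
import csv
import io


def rename_columns(csv_text, mapping):
    """Rename header fields according to mapping; data rows are copied unchanged."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    header, data = rows[0], rows[1:]
    # Assumption: a column without a mapping entry keeps its original name.
    new_header = [mapping.get(name, name) for name in header]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(new_header)
    writer.writerows(data)
    return out.getvalue()


# Hypothetical source data and mapping, for illustration only.
source = "identifier,town\n1,Paris\n2,Lyon\n"
print(rename_columns(source, {"identifier": "id_site", "town": "name"}))
```

Only the first line of the output differs from the input, which is the behaviour described in the note above.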
You can create the mapping file with the validator GUI assistant.
In the animation above, the mapping file is created and named data-mapping.csv.
ℹ️ Note that with the GUI Assistant, the data is also transformed at the same time.
It has the following 2-column source-to-destination structure:
You can transform your data using the GUI assistant, but you may wish to transform your data programmatically.
The transform command is meant for this.
Now that you have created a data mapping file with the GUI assistant, you can use it to transform data in a script with transform.
The following line transforms data.csv into data-mapped.csv, using the source and target field specifications contained in mapping.csv:
python validator-cli.py transform data.csv mapping.csv
Here is an output :
You can also transform the files contained in a directory with --directory:
python validator-cli.py transform --directory my_dir mapping.csv
⚠️ Only data with the right structure will be transformed. Data with the wrong structure will be ignored and reported in the console.



