Brickproof is a testing application and library for integrating unit tests of Databricks code into CI/CD workflows. Brickproof is primarily used as a CLI tool, but a Python library underpins the CLI.
Brickproof works by remote interaction with your Databricks workspace. It remotely clones your repo into Databricks, uploads a pytest runner file, and remotely creates and runs a testing job. The result of your testing job, including the familiar pytest report, is returned for your CI/CD pipeline to make use of.
pip install brickproof
To initialize a brickproof project, run:
brickproof init
This will create a new brickproof.toml file, which you will edit for your own use case. Running the init command multiple times will not overwrite your brickproof.toml file.
To configure connection credentials to your Databricks workspace, run:
brickproof configure
This command will prompt you to enter your Databricks workspace URL, personal access token, and profile name. These will be written out to a .bprc file in your local directory.
To edit the values in your brickproof.toml using brickproof, try the following:
brickproof edit-config -v section.field=new_value
where section is the TOML file's section header (repo or job) and field is the field name (name, git_provider, etc.). Multiple values can be chained together like so:
brickproof edit-config -v repo.git_provider=gitHub job.dependencies='[requests,tomlkit,pydantic]'
This command can be helpful for editing your brickproof.toml file during CI/CD workflows.
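To make the section.field=new_value syntax concrete, here is a minimal Python sketch of how such arguments can be split into a section, a field, and a raw value. It is an illustration only, not brickproof's actual parsing code; the helper name parse_assignment is hypothetical.

```python
def parse_assignment(arg: str) -> tuple[str, str, str]:
    """Split 'section.field=value' into (section, field, value).

    Hypothetical helper for illustration; brickproof's own parsing may differ.
    """
    key, sep, value = arg.partition("=")
    if not sep:
        raise ValueError(f"expected section.field=value, got {arg!r}")
    section, dot, field = key.partition(".")
    if not dot or not section or not field:
        raise ValueError(f"expected section.field before '=', got {key!r}")
    return section, field, value


# The shell removes the single quotes around the list before the CLI sees it.
args = [
    "repo.git_provider=gitHub",
    "job.dependencies=[requests,tomlkit,pydantic]",
]
parsed = [parse_assignment(a) for a in args]
for section, field, value in parsed:
    print(f"[{section}] {field} -> {value}")
```

Only the first "=" and the first "." act as separators here, so values that themselves contain those characters (such as URLs) pass through unchanged.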
To run a brickproof testing event, run:
brickproof run
To run with a specific configured profile, run:
brickproof run --p <MY_PROFILE>
To get the version of your brickproof instance, run:
brickproof version
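If your CI/CD step is driven from Python rather than a shell script, the run command can be wrapped as in the sketch below. The function names are hypothetical, and the sketch assumes (without confirmation from this README) that brickproof signals a failed test run through a non-zero exit code.

```python
import subprocess
from typing import Optional


def build_run_command(profile: Optional[str] = None) -> list[str]:
    """Build the argument list for `brickproof run`, optionally with a profile."""
    cmd = ["brickproof", "run"]
    if profile:
        # `--p` is the profile flag shown in this README.
        cmd += ["--p", profile]
    return cmd


def run_tests(profile: Optional[str] = None) -> int:
    """Invoke brickproof and return its exit code.

    Assumption: a non-zero exit code means the testing job failed.
    """
    completed = subprocess.run(build_run_command(profile), check=False)
    return completed.returncode


print(build_run_command("secondary"))
```

A pipeline script could then call run_tests() and pass the returned code to sys.exit() so the CI step fails when the run does.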
Here is an example brickproof.toml file.
[repo]
name = "brickproof"
workspace_path = "/Workspace/Users/jordan.m.young0@gmail.com"
git_provider = "gitHub"
git_repo = "https://github.com/Jordan-M-Young/brickproof.git"
branch = "main"
ignore = ["./tests/test_orchestrator.py","./tests/test_databricks.py"]
[job]
job_name = "Brickproof-Test"
task_key = "Unittests"
dependencies = ["requests","tomlkit"]
runner = "default"
This is the main configuration file you'll control your testing setup with.
There are two main sections, [repo] and [job]. These sections contain configuration variables for your code base and
testing job specifications, respectively. Below we'll go into more detail on the variables.
[repo] fields:
- name: the name of the directory your git repo will be cloned into on Databricks
- workspace_path: the base workspace directory where brickproof files will live during the job
- git_provider: the place your repo lives. Valid values include: gitHub, bitbucketCloud, gitLab, azureDevOpsServices, gitHubEnterprise, bitbucketServer, gitLabEnterpriseEdition and awsCodeCommit
- git_repo: the URL of the repo you're cloning / testing
- branch: the branch you're testing
- ignore: a list of files or directories you want brickproof to ignore when testing

[job] fields:
- job_name: the name you wish your testing job to be associated with in Databricks
- task_key: the key you want the testing job's single task to be associated with in Databricks
- dependencies: Python libraries you want installed on the cluster for your job
- runner: determines what runner you'd like to use for your test. For brickproof's well-tested runner, use default; for a custom runner, use /path/to/my/runner.py
A blank brickproof.toml file can be generated by running the init command.
The other configuration file needed to run brickproof testing jobs is the .bprc file. An example is shown below:
[default]
workspace=https://my-databricks-workspace.cloud.databricks.com
token=dummytoken
[secondary]
workspace=https://my-databricks-workspace2.cloud.databricks.com
token=dummytoken2
There are three parts to each section of the .bprc file:
- profile: the portion in brackets ([default] or [secondary] in the example)
- workspace: the URL of the Databricks workspace where you'd like to run the testing job
- token: a valid personal access token (PAT) generated in your Databricks workspace
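The example above follows INI-style syntax, so a profile can be inspected with Python's standard configparser module. This is a sketch under that assumption and does not show how brickproof itself reads the file; the function name load_profile is hypothetical.

```python
import configparser

EXAMPLE_RC = """\
[default]
workspace=https://my-databricks-workspace.cloud.databricks.com
token=dummytoken

[secondary]
workspace=https://my-databricks-workspace2.cloud.databricks.com
token=dummytoken2
"""


def load_profile(text: str, profile: str = "default") -> dict:
    """Return the workspace URL and token for one profile of an INI-style file."""
    # interpolation=None keeps '%' characters in tokens from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section(profile):
        raise KeyError(f"profile {profile!r} not found")
    section = parser[profile]
    return {"workspace": section["workspace"], "token": section["token"]}


secondary = load_profile(EXAMPLE_RC, "secondary")
print(secondary["workspace"])
```

Because the file holds an access token, it is worth keeping it out of version control.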
Currently Brickproof supports the following:
- Testing Frameworks:
  - Pytest
- CI/CD:
  - GitHub Actions
This is pretty low coverage for the possible combinations of testing/CICD frameworks available. If you want to see something added please open up an issue or check out the Contributing section and open a PR.
See the Contributing doc for more details!
To set up, run the following:
./scripts/dev_setup.sh
Brickproof remotely runs git repo management and job orchestration in Databricks. Using a custom runner, brickproof creates and runs a job that runs pytest on a copy of the repo/branch you specify in your brickproof.toml.
Currently, brickproof works seamlessly with Databricks Community Edition. Brickproof uses itself (and Databricks Community Edition) to conduct the unit testing portion of our CI/CD pipelines. However, we haven't tested on paid versions of Databricks.
Contributions are welcome! Check out the Contributing doc or create an issue.
Brickproof currently supports only pytest unit testing.
Brickproof is not only a CLI. While the CLI is important and useful, there is an underlying library that makes the CLI functionality possible. Feel free to use it for your needs.