Jordan-M-Young/brickproof

Brickproof

Brickproof is a testing application and library for integrating unit tests of Databricks code into CI/CD workflows. It is primarily used as a CLI tool, but a Python library underpins the CLI.

Brickproof works by interacting remotely with your Databricks workspace. It clones your repo into Databricks, uploads a pytest runner file, and creates and runs a testing job. The result of the testing job, including the familiar pytest report, is returned for your CI/CD pipeline to use.

Installation

pip install brickproof

Use

Initialize Project

To initialize a brickproof project, run:

brickproof init

This creates a new brickproof.toml file, which you then edit for your own use case. Running the init command again will not overwrite an existing brickproof.toml file.

Configure Databricks Connection

To configure connection credentials to your Databricks workspace, run:

brickproof configure

This command prompts you for your Databricks workspace URL, personal access token, and profile name. These are written to a .bprc file in your local directory.

Edit Config

To edit values in your brickproof.toml from the command line, use:

brickproof edit-config -v section.field=new_value

where section is the TOML file's section header (repo or job) and field is the field name (name, git_provider, etc.). Multiple values can be chained together like so:

brickproof edit-config -v repo.git_provider=gitHub job.dependencies='[requests,tomlkit,pydantic]'

This command can be helpful for editing your brickproof.toml file during CI/CD workflows.
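The section.field=new_value syntax above can be illustrated with a small Python sketch. This is not brickproof's implementation: parse_assignment and apply_assignments are hypothetical helpers, and values are kept as raw strings because the source does not say how brickproof converts a value such as '[requests,tomlkit,pydantic]' into a TOML list.

```python
def parse_assignment(assignment: str) -> tuple[str, str, str]:
    """Split 'section.field=value' into (section, field, raw_value)."""
    key, sep, raw_value = assignment.partition("=")
    if not sep:
        raise ValueError(f"expected section.field=value, got {assignment!r}")
    section, dot, field = key.partition(".")
    if not dot or not section or not field:
        raise ValueError(f"expected section.field before '=', got {key!r}")
    return section, field, raw_value


def apply_assignments(config: dict, assignments: list[str]) -> dict:
    """Return a copy of config with each assignment applied.

    Assumption: values stay raw strings; brickproof's real type handling
    is not documented in this README.
    """
    updated = {section: dict(fields) for section, fields in config.items()}
    for assignment in assignments:
        section, field, raw_value = parse_assignment(assignment)
        updated.setdefault(section, {})[field] = raw_value
    return updated


# In-memory stand-in for a parsed brickproof.toml.
config = {"repo": {"git_provider": "gitLab"}, "job": {"runner": "default"}}
updated = apply_assignments(
    config,
    ["repo.git_provider=gitHub", "job.dependencies=[requests,tomlkit,pydantic]"],
)
print(updated["repo"]["git_provider"])
```

Splitting on the first "=" and the first "." keeps any later "=" or "." characters inside the value, which matters for URLs and file paths.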

Run Brickproof

To run a brickproof testing event, run:

brickproof run

To run with a specific configured profile, run:

brickproof run --p <MY_PROFILE>
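If a pipeline step is written in Python rather than shell, the two run commands above could be wrapped as follows. This is a minimal sketch: build_run_command and run_brickproof are hypothetical helpers, and the README does not confirm that brickproof reports test failures through its exit code, so treat the return value as an assumption to verify.

```python
import subprocess
from typing import Optional


def build_run_command(profile: Optional[str] = None) -> list[str]:
    """Build the argument list for `brickproof run`, optionally with a profile."""
    command = ["brickproof", "run"]
    if profile is not None:
        # Flag spelling copied from the README: `brickproof run --p <MY_PROFILE>`.
        command += ["--p", profile]
    return command


def run_brickproof(profile: Optional[str] = None) -> int:
    """Run brickproof and return its exit code.

    Assumption: a non-zero exit code indicates a failed run; the README
    does not document exit-code behavior.
    """
    completed = subprocess.run(build_run_command(profile), check=False)
    return completed.returncode


print(build_run_command("secondary"))
```

Calling run_brickproof() requires brickproof to be installed and on the PATH.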

Version

To get the version of your brickproof instance, run:

brickproof version

Config

Brickproof.toml

Here is an example brickproof.toml file.

[repo]
name = "brickproof"
workspace_path = "/Workspace/Users/jordan.m.young0@gmail.com"
git_provider = "gitHub"
git_repo = "https://github.com/Jordan-M-Young/brickproof.git"
branch = "main"
ignore = ["./tests/test_orchestrator.py","./tests/test_databricks.py"]

[job]
job_name = "Brickproof-Test"
task_key = "Unittests"
dependencies = ["requests","tomlkit"]
runner = "default"

This is the main configuration file you'll use to control your testing setup. It has two main sections: [repo] and [job]. These hold the configuration variables for your code base and for your testing job, respectively. The variables are described in more detail below.

repo

  • name : the name of the directory your git repo will be cloned into on Databricks
  • workspace_path : the base workspace directory where brickproof files will live during the job
  • git_provider : the service that hosts your repo. Valid values are: gitHub, bitbucketCloud, gitLab, azureDevOpsServices, gitHubEnterprise, bitbucketServer, gitLabEnterpriseEdition and awsCodeCommit
  • git_repo : the URL of the repo you're cloning and testing
  • branch : the branch you're testing
  • ignore : a list of files or directories you want brickproof to ignore when testing

job

  • job_name : the name your testing job will have in Databricks
  • task_key : the key assigned to the testing job's single task in Databricks
  • dependencies : Python libraries you want installed on the cluster for your job
  • runner : the runner to use for your tests. Use default for brickproof's well-tested built-in runner, or /path/to/my/runner.py for a custom runner

A blank brickproof.toml file can be generated by running the init command.

.bprc

The other configuration file needed to run brickproof testing jobs is the .bprc file written by the configure command. An example is shown below:

[default]
workspace=https://my-databricks-workspace.cloud.databricks.com
token=dummytoken

[secondary]
workspace=https://my-databricks-workspace2.cloud.databricks.com
token=dummytoken2

There are three parts to each section of the .bprc file.

  • profile: the name in brackets ([default] or [secondary] in the example)
  • workspace: the URL of the Databricks workspace where you'd like to run the testing job
  • token: a valid personal access token (PAT) generated in your Databricks workspace
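The credentials file shown above looks like an INI file, so it can be read with Python's standard configparser module. This is a sketch under that assumption: the README does not say which parser brickproof uses, and load_profile is a hypothetical helper.

```python
import configparser


def load_profile(rc_text: str, profile: str = "default") -> dict[str, str]:
    """Return the workspace URL and token for one profile."""
    # interpolation=None so a '%' inside a token is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(rc_text)
    if not parser.has_section(profile):
        raise KeyError(f"profile {profile!r} not found")
    section = parser[profile]
    return {"workspace": section["workspace"], "token": section["token"]}


EXAMPLE_RC = """
[default]
workspace=https://my-databricks-workspace.cloud.databricks.com
token=dummytoken

[secondary]
workspace=https://my-databricks-workspace2.cloud.databricks.com
token=dummytoken2
"""

print(load_profile(EXAMPLE_RC, "secondary")["workspace"])
```

Note that configparser reserves the upper-case name DEFAULT for its special defaults section, while a lower-case [default] section is treated as an ordinary profile.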

Support

Brickproof currently supports the following:

  • Testing Frameworks:
    • Pytest
  • CI/CD:
    • GitHub Actions

This is pretty low coverage of the possible combinations of testing and CI/CD frameworks. If you want to see something added, please open an issue, or check out the Contributing section and open a PR.

Contributing

See the Contributing doc for more details!

Setup

To set up a development environment, run the following:

./scripts/dev_setup.sh

FAQ

How does Brickproof work?

Brickproof remotely handles git repo management and job orchestration in Databricks. Using a runner file, it creates and runs a job that runs pytest on a copy of the repo and branch you specify in your brickproof.toml.
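As an illustration of the repo management step, the sketch below builds the request a client could send to the Databricks Repos REST API (POST /api/2.0/repos) from the [repo] settings. Two assumptions are not confirmed by this README: that brickproof uses this endpoint, and that the clone path is workspace_path joined with name. build_repo_create_request is a hypothetical helper, and nothing is sent over the network.

```python
def build_repo_create_request(repo_config: dict) -> dict:
    """Describe a Repos API create call for the given [repo] settings.

    Assumption: the repo is cloned to <workspace_path>/<name>.
    """
    base_path = repo_config["workspace_path"].rstrip("/")
    return {
        "method": "POST",
        "endpoint": "/api/2.0/repos",
        "json": {
            "url": repo_config["git_repo"],
            "provider": repo_config["git_provider"],
            "path": f"{base_path}/{repo_config['name']}",
        },
    }


# Values taken from the example brickproof.toml in this README.
repo_config = {
    "name": "brickproof",
    "workspace_path": "/Workspace/Users/jordan.m.young0@gmail.com",
    "git_provider": "gitHub",
    "git_repo": "https://github.com/Jordan-M-Young/brickproof.git",
    "branch": "main",
}

request = build_repo_create_request(repo_config)
print(request["json"]["path"])
```

The returned dictionary could be handed to any HTTP client along with the workspace URL and a bearer token from your profile.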

Does Brickproof work?

Yes. Brickproof currently works with Databricks Community Edition, and the project uses Brickproof itself (on Databricks Community Edition) to run the unit testing portion of its own CI/CD pipelines. It has not yet been tested on paid versions of Databricks.

Can I help?

Definitely! Check out the contributing document or create an issue.

Is X supported by brickproof?

Brickproof only supports pytest unit testing currently.

Is brickproof only a cli?

No. While the CLI is the main way to use brickproof, there is an underlying library that makes the CLI functionality possible. Feel free to use it for your own needs.
