diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 5122320..084e2f7 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,24 +1,208 @@ -# Contributing +# Contributing to olap-sql -> **Work in progress.** A full contribution guide will be added in a follow-up PR. +Thank you for your interest in contributing! This guide covers everything you need to get up and running, understand the project, and submit quality changes. -Thank you for your interest in contributing to olap-sql! +--- -## Quick start +## Table of Contents -1. Fork the repository and create a feature branch from `main`. -2. Make your changes with tests where applicable. -3. Run `go test ./...` to ensure all tests pass. -4. Open a pull request against `main` and describe your changes. +- [Development environment](#development-environment) +- [Project layout](#project-layout) +- [Running tests](#running-tests) +- [Making changes](#making-changes) +- [Pull request checklist](#pull-request-checklist) +- [Commit message style](#commit-message-style) +- [Code style](#code-style) +- [Adding a new database backend](#adding-a-new-database-backend) +- [Reporting bugs and requesting features](#reporting-bugs-and-requesting-features) -## Guidelines +--- -- Follow standard Go conventions (`gofmt`, clear package names, documented exports). -- Keep commits focused and write meaningful commit messages. -- For larger changes, open an issue first to discuss the approach. +## Development environment -## Coming soon +### Prerequisites -- Detailed development environment setup -- Test strategy and coverage expectations -- Release process +| Tool | Minimum version | Notes | +|------|----------------|-------| +| Go | 1.22 | Project uses range-over-integer (Go 1.22) and built-in `min`/`max` (Go 1.21). | +| Git | any recent | | +| SQLite (optional) | — | Only needed to run the SQLite-backed integration tests locally. 
| +| ClickHouse / MySQL / PostgreSQL (optional) | — | Only needed for the respective backend integration tests. | + +### Clone and build + +```bash +git clone https://github.com/AWaterColorPen/olap-sql.git +cd olap-sql +go mod tidy +go build ./... +``` + +No additional setup steps are required for the core library. The project has no generated code and no `Makefile` targets. + +--- + +## Project layout + +``` +. +├── api/ +│ ├── models/ # TOML schema structs (Dictionary data model) +│ └── types/ # Public query/result/filter types used by callers +├── docs/ # User-facing documentation (Markdown) +│ └── superpowers/ # Internal design specs and iteration plans +├── test/ # Integration test fixtures (TOML configs, SQL seeds) +├── client.go # Database client abstraction and GORM wiring +├── configuration.go # Configuration and DBOption types +├── database.go # Low-level query execution helpers (RunSync, RunChan) +├── dependency_graph.go # Metric dependency resolution (METRIC_DIVIDE, etc.) +├── dictionary.go # Dictionary: loads and caches the TOML schema +├── dictionary_adapter.go # Adapter layer (TOML file → in-memory model) +├── dictionary_column.go # Column resolution helpers +├── dictionary_splitter.go # Multi-source JOIN splitting +├── dictionary_translator.go # Query → Clause translation (core logic) +├── manager.go # Public Manager API +└── run.go # Result assembly helpers +``` + +The **core translation pipeline** lives in `dictionary_translator.go` → `clause.go` (in `api/types`). If you want to understand how a `Query` becomes SQL, start there and follow the `Translate` → `Statement` call chain. + +--- + +## Running tests + +### Unit and SQLite integration tests (no external DB required) + +```bash +go test ./... +``` + +All tests that use SQLite run with an in-memory database; no setup is needed. + +### Running a specific test + +```bash +go test -run TestManager ./... +``` + +### Running tests with verbose output + +```bash +go test -v ./... 
+``` + +### Integration tests with other backends + +Tests that target ClickHouse, MySQL, or PostgreSQL are skipped automatically when the corresponding DSN environment variable is not set. To run them, set the relevant variable before running `go test`: + +```bash +# ClickHouse example +CLICKHOUSE_DSN="clickhouse://localhost:9000/default" go test ./... +``` + +Check the test file comments in `manager_test.go` for the exact env var names per backend. + +--- + +## Making changes + +1. **Fork** the repository and create a feature branch from `main`: + ```bash + git checkout -b feature/your-change-description + ``` +2. Make your changes. Keep each PR **focused on a single concern** — one PR per feature, refactor, or bug fix. +3. Add or update **tests** for your change. New functionality without tests will not be merged. +4. Run the test suite and confirm it passes: + ```bash + go test ./... + ``` +5. Run `gofmt` and `go vet`: + ```bash + gofmt -w . + go vet ./... + ``` +6. Open a pull request against `main`. + +### When to open an issue first + +For **large or design-changing** contributions (new backends, query language extensions, API changes), please open an issue to discuss the approach before writing code. This avoids wasted effort if the direction doesn't align with the project. + +For **small, self-contained** changes (typo fixes, documentation improvements, minor bug fixes), a PR without a prior issue is fine. + +--- + +## Pull request checklist + +Before marking a PR ready for review, confirm the following: + +- [ ] All existing tests pass (`go test ./...`). +- [ ] New tests cover the added or changed behaviour. +- [ ] `gofmt` has been run; no formatting changes in the diff. +- [ ] `go vet` reports no issues. +- [ ] Public types and functions have godoc comments. +- [ ] The PR description explains *what* changed and *why*. +- [ ] For breaking changes: a `CHANGELOG.md` entry is included. 
+ +--- + +## Commit message style + +Follow the [Conventional Commits](https://www.conventionalcommits.org/) convention: + +``` +<type>(<scope>): <description> + +[optional body] +``` + +Common types: + +| Type | Use for | +|------------|--------------------------------------------------| +| `feat` | New features | +| `fix` | Bug fixes | +| `refactor` | Code changes that neither add features nor fix bugs | +| `docs` | Documentation only | +| `test` | Adding or fixing tests | +| `chore` | Dependency updates, build changes, CI config | + +**Examples:** + +``` +feat(filter): add FILTER_OPERATOR_HAS for ClickHouse array columns +fix(translator): handle nil TimeInterval without panicking +docs: add API reference page +chore: upgrade gorm to v1.25 +``` + +--- + +## Code style + +- Follow standard Go conventions as enforced by `gofmt` and `go vet`. +- Prefer **explicit error handling** over panics. +- Use the `any` alias instead of `interface{}`. +- Prefer stdlib (`slices`, `maps`) over third-party utility packages for simple helpers. +- Keep package names short and lowercase; avoid underscores in package names. +- Exported symbols must have godoc comments; comments on unexported helpers are encouraged but not required. + +--- + +## Adding a new database backend + +olap-sql currently supports ClickHouse, MySQL, PostgreSQL, and SQLite. If you want to add a new backend: + +1. Add a new `DBType` constant in `api/types/db_type.go`. +2. Implement the GORM driver initialisation in `client.go` (see the existing `newGormDB` switch statement). +3. If the new database uses dialect-specific SQL functions (e.g. array operations, date truncation), add handling in `dictionary_translator.go` where existing dialect branches exist. +4. Add integration test coverage under `test/` with a representative fixture. +5. Update `README.md` to list the new backend under **Requirements**. + +Please open an issue first to discuss the new backend, so that test infrastructure and CI can be set up correctly. 
+ +--- + +## Reporting bugs and requesting features + +- **Bugs:** Open a GitHub issue with a minimal reproducible example — include your TOML schema, the `Query` you built, and the SQL or error you got vs. what you expected. +- **Features:** Open a GitHub issue describing the use case and your proposed API. For query language extensions, include the SQL you'd like to generate. diff --git a/README.md b/README.md index cfcaaa1..e9abeac 100644 --- a/README.md +++ b/README.md @@ -199,6 +199,7 @@ fmt.Println(sql) | [Query](./docs/query.md) | Define metrics, dimensions, filters, orders, and limits | | [Result](./docs/result.md) | Parse and work with query results | | [Examples](./docs/examples.md) | Common usage scenarios (ClickHouse joins, time filters, concurrency) | +| [API Reference](./docs/api.md) | Full public API — Manager, Query, Filter, Result | | [Architecture](./docs/architecture.md) | Internal design for contributors | | [Contributing](./CONTRIBUTING.md) | How to contribute to olap-sql | diff --git a/docs/api.md b/docs/api.md new file mode 100644 index 0000000..32540e4 --- /dev/null +++ b/docs/api.md @@ -0,0 +1,419 @@ +# API Reference + +This page documents the public API of olap-sql. Each section covers a type or function with its parameters, return values, and typical usage. 
+ +--- + +## Table of Contents + +- [NewManager](#newmanager) +- [Manager](#manager) + - [RunSync](#runsync) + - [RunChan](#runchan) + - [BuildSQL](#buildsql) + - [BuildTransaction](#buildtransaction) + - [SetLogger](#setlogger) + - [GetClients](#getclients) + - [GetDictionary](#getdictionary) +- [Configuration](#configuration) +- [Query](#query) + - [TimeInterval](#timeinterval) + - [Filter](#filter) + - [FilterOperatorType](#filteroperatortype) + - [ValueType](#valuetype) + - [OrderBy](#orderby) + - [OrderDirectionType](#orderdirectiontype) + - [Limit](#limit) +- [Result](#result) + +--- + +## NewManager + +```go +func NewManager(cfg *Configuration) (*Manager, error) +``` + +Creates and initialises a `Manager` from the provided `Configuration`. + +| Parameter | Type | Description | +|-----------|------------------|------------------------------------------------| +| `cfg` | `*Configuration` | Holds client DSNs and a dictionary option. | + +**Returns** a ready-to-use `*Manager`, or an error if any client DSN is invalid or the dictionary file cannot be parsed. + +Set **at least one** of `ClientsOption` or `DictionaryOption`. If both are nil, `NewManager` still returns a manager, but that manager cannot execute queries. + +**Example:** + +```go +cfg := &olapsql.Configuration{ + ClientsOption: olapsql.ClientsOption{ + "clickhouse": { + DSN: "clickhouse://localhost:9000/default", + Type: types.DBTypeClickHouse, + }, + }, + DictionaryOption: &olapsql.Option{ + AdapterOption: olapsql.AdapterOption{Dsn: "olap-sql.toml"}, + }, +} +manager, err := olapsql.NewManager(cfg) +if err != nil { + log.Fatal(err) +} +``` + +--- + +## Manager + +`Manager` is the main entry point. It holds a set of registered database clients and an OLAP dictionary. + +### RunSync + +```go +func (m *Manager) RunSync(query *types.Query) (*types.Result, error) +``` + +Executes the query **synchronously** and returns the full result. 
+ +| Parameter | Type | Description | +|-----------|---------------|--------------------------------| +| `query` | `*types.Query`| The OLAP query to execute. | + +**Returns** a `*types.Result` containing column names (`Dimensions`) and row data (`Source`), or an error. + +Use this for typical queries where the result set fits in memory. + +--- + +### RunChan + +```go +func (m *Manager) RunChan(query *types.Query) (*types.Result, error) +``` + +Executes the query and **streams rows** internally over a channel before assembling the result. + +| Parameter | Type | Description | +|-----------|---------------|--------------------------------| +| `query` | `*types.Query`| The OLAP query to execute. | + +**Returns** a `*types.Result` (same structure as `RunSync`), or an error. + +Prefer `RunChan` for large result sets to avoid peak memory pressure. + +--- + +### BuildSQL + +```go +func (m *Manager) BuildSQL(query *types.Query) (string, error) +``` + +Translates the query into its **SQL string** without executing it. + +| Parameter | Type | Description | +|-----------|---------------|--------------------------------| +| `query` | `*types.Query`| The OLAP query to translate. | + +**Returns** the generated SQL string, or an error if translation fails. + +Useful for debugging, audit logging, or displaying the query to end users. + +**Example:** + +```go +sql, err := manager.BuildSQL(query) +if err != nil { + log.Fatal(err) +} +fmt.Println("Generated SQL:", sql) +``` + +--- + +### BuildTransaction + +```go +func (m *Manager) BuildTransaction(query *types.Query) (*gorm.DB, error) +``` + +Translates the query into a `*gorm.DB` ready to execute. + +| Parameter | Type | Description | +|-----------|---------------|--------------------------------| +| `query` | `*types.Query`| The OLAP query to translate. | + +**Returns** a configured `*gorm.DB`, or an error. 
+ +Use this when you need direct access to the GORM object — for example to attach custom hooks, inspect the SQL via `ToSQL`, or integrate with an existing GORM session. + +--- + +### SetLogger + +```go +func (m *Manager) SetLogger(log logger.Interface) +``` + +Attaches a custom GORM logger to all registered database clients. + +| Parameter | Type | Description | +|-----------|--------------------|-----------------------------------------| +| `log` | `logger.Interface` | A GORM-compatible logger implementation.| + +Call this after `NewManager` to enable query logging, debug output, or custom log routing. + +--- + +### GetClients + +```go +func (m *Manager) GetClients() (Clients, error) +``` + +Returns the registered database clients. Returns an error if the manager has no `ClientsOption`. + +--- + +### GetDictionary + +```go +func (m *Manager) GetDictionary() (*Dictionary, error) +``` + +Returns the OLAP dictionary. Returns an error if the manager has no `DictionaryOption`. + +--- + +## Configuration + +```go +type Configuration struct { + ClientsOption ClientsOption + DictionaryOption *Option +} +``` + +| Field | Type | Description | +|--------------------|------------------|--------------------------------------------------------------------| +| `ClientsOption` | `ClientsOption` | Map of database name → `*DBOption`. Each entry registers one DB. | +| `DictionaryOption` | `*Option` | Points to the TOML schema file via `AdapterOption.Dsn`. | + +### ClientsOption / DBOption + +```go +type ClientsOption map[string]*DBOption + +type DBOption struct { + DSN string // e.g. "clickhouse://localhost:9000/default" + Type types.DBType // e.g. 
types.DBTypeClickHouse +} +``` + +Supported `DBType` values: + +| Constant | Database | +|---------------------------|-------------| +| `types.DBTypeClickHouse` | ClickHouse | +| `types.DBTypeMySQL` | MySQL | +| `types.DBTypePostgreSQL` | PostgreSQL | +| `types.DBTypeSQLite` | SQLite | + +--- + +## Query + +```go +type Query struct { + DataSetName string `json:"data_set_name"` + TimeInterval *TimeInterval `json:"time_interval"` + Metrics []string `json:"metrics"` + Dimensions []string `json:"dimensions"` + Filters []*Filter `json:"filters"` + Orders []*OrderBy `json:"orders"` + Limit *Limit `json:"limit"` + Sql string `json:"Sql"` +} +``` + +| Field | Type | Required | Description | +|----------------|------------------|----------|-----------------------------------------------------------------| +| `DataSetName` | `string` | ✅ | Must match a `sets[].name` entry in your TOML schema. | +| `TimeInterval` | `*TimeInterval` | ❌ | Shorthand for a start/end filter on a time dimension. | +| `Metrics` | `[]string` | ❌ | Metric names defined in the TOML schema. | +| `Dimensions` | `[]string` | ❌ | Dimension names defined in the TOML schema. | +| `Filters` | `[]*Filter` | ❌ | Arbitrary filter conditions (supports nesting via AND/OR). | +| `Orders` | `[]*OrderBy` | ❌ | Sort order for the result set. | +| `Limit` | `*Limit` | ❌ | Pagination — row limit and offset. | +| `Sql` | `string` | ❌ | Reserved field; set by the library during translation. | + +### TimeInterval + +```go +type TimeInterval struct { + Name string `json:"name"` + Start string `json:"start"` + End string `json:"end"` +} +``` + +Convenience wrapper that expands to two `Filter` entries (`>= Start` and `< End`) on the named dimension. + +| Field | Description | +|---------|-----------------------------------------------------| +| `Name` | Dimension name to filter on (e.g. `"date"`). | +| `Start` | Inclusive lower bound (e.g. `"2021-05-06"`). | +| `End` | **Exclusive** upper bound (e.g. `"2021-05-08"`). 
| + +**Example — equivalent SQL fragment:** + +```sql +WHERE date >= '2021-05-06' AND date < '2021-05-08' +``` + +--- + +### Filter + +```go +type Filter struct { + OperatorType FilterOperatorType `json:"operator_type"` + ValueType ValueType `json:"value_type"` + Table string `json:"table"` + Name string `json:"name"` + FieldProperty FieldProperty `json:"field_property"` + Value []any `json:"value"` + Children []*Filter `json:"children"` +} +``` + +| Field | Description | +|----------------|-----------------------------------------------------------------------------| +| `OperatorType` | Comparison operator (see `FilterOperatorType`). | +| `ValueType` | How to quote values in SQL (see `ValueType`). | +| `Name` | Dimension or metric name to filter on. | +| `Value` | One or more comparison values (slice for `IN`/`NOT IN`, single for others). | +| `Children` | Nested filters; only used with `FILTER_OPERATOR_AND` / `FILTER_OPERATOR_OR`.| + +#### FilterOperatorType + +| Constant | SQL equivalent | +|----------------------------------|------------------------------| +| `FILTER_OPERATOR_EQUALS` | `field = value` | +| `FILTER_OPERATOR_IN` | `field IN (v1, v2, ...)` | +| `FILTER_OPERATOR_NOT_IN` | `field NOT IN (...)` | +| `FILTER_OPERATOR_LESS_EQUALS` | `field <= value` | +| `FILTER_OPERATOR_LESS` | `field < value` | +| `FILTER_OPERATOR_GREATER_EQUALS` | `field >= value` | +| `FILTER_OPERATOR_GREATER` | `field > value` | +| `FILTER_OPERATOR_LIKE` | `field LIKE value` | +| `FILTER_OPERATOR_HAS` | `has(field, value)` (CK) | +| `FILTER_OPERATOR_EXTENSION` | raw SQL expression | +| `FILTER_OPERATOR_AND` | `( child1 AND child2 ... )` | +| `FILTER_OPERATOR_OR` | `( child1 OR child2 ... 
)` | + +#### ValueType + +| Constant | SQL quoting | +|-----------------|--------------------------| +| `VALUE_STRING` | `'value'` (quoted) | +| `VALUE_INTEGER` | `value` (unquoted) | +| `VALUE_FLOAT` | `value` (unquoted) | +| `VALUE_UNKNOWN` | auto-detect from Go type | + +**Nested filter example (AND + OR):** + +```go +filter := &types.Filter{ + OperatorType: types.FilterOperatorTypeAnd, + Children: []*types.Filter{ + { + OperatorType: types.FilterOperatorTypeEquals, + Name: "project", + ValueType: types.ValueTypeString, + Value: []any{"en"}, + }, + { + OperatorType: types.FilterOperatorTypeGreater, + Name: "hits", + ValueType: types.ValueTypeInteger, + Value: []any{1000}, + }, + }, +} +``` + +Generated SQL: +```sql +( project = 'en' AND hits > 1000 ) +``` + +--- + +### OrderBy + +```go +type OrderBy struct { + Name string `json:"name"` + Direction OrderDirectionType `json:"direction"` +} +``` + +| Field | Description | +|-------------|-------------------------------------------------------| +| `Name` | Metric or dimension name to sort by. | +| `Direction` | `ORDER_DIRECTION_ASCENDING` or `ORDER_DIRECTION_DESCENDING`. | + +#### OrderDirectionType + +| Constant | SQL | +|-------------------------------|--------------| +| `ORDER_DIRECTION_ASCENDING` | `name ASC` | +| `ORDER_DIRECTION_DESCENDING` | `name DESC` | + +--- + +### Limit + +```go +type Limit struct { + Limit uint64 `json:"limit"` + Offset uint64 `json:"offset"` +} +``` + +| Field | Description | +|----------|-------------------------------------------------| +| `Limit` | Maximum number of rows to return. | +| `Offset` | Number of rows to skip (for pagination). 
| + +--- + +## Result + +```go +type Result struct { + Dimensions []string `json:"dimensions"` + Source []map[string]any `json:"source"` +} +``` + +| Field | Description | +|--------------|----------------------------------------------------------------------------------------------------| +| `Dimensions` | Ordered list of column names — dimensions first, then metrics (mirrors the query field order). | +| `Source` | Slice of row maps. Each map is `column_name → value`. Value type matches the database driver type.| + +**Example response:** + +```json +{ + "dimensions": ["date", "hits", "hits_avg"], + "source": [ + {"date": "2021-05-06T00:00:00Z", "hits": 147, "hits_avg": 49}, + {"date": "2021-05-07T00:00:00Z", "hits": 7178, "hits_avg": 897.25} + ] +} +```
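
Because each row in `Source` is a map, iterating over it directly yields columns in random order. A minimal sketch of reading rows in a stable order by walking `Dimensions` is shown below. The `Result` type here is a local mirror of the struct documented above so that the snippet runs on its own, and `FormatRows` is a hypothetical helper, not part of the olap-sql API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Result is a local stand-in mirroring the documented shape of types.Result.
type Result struct {
	Dimensions []string         `json:"dimensions"`
	Source     []map[string]any `json:"source"`
}

// FormatRows (hypothetical) renders each row as a tab-separated line.
// It uses Dimensions to fix the column order, since Go map iteration
// order is not deterministic. Missing columns are rendered as "<nil>".
func FormatRows(r *Result) []string {
	lines := make([]string, 0, len(r.Source))
	for _, row := range r.Source {
		cells := make([]string, 0, len(r.Dimensions))
		for _, col := range r.Dimensions {
			cells = append(cells, fmt.Sprint(row[col]))
		}
		lines = append(lines, strings.Join(cells, "\t"))
	}
	return lines
}

// sample is the example response from the Result section.
const sample = `{
  "dimensions": ["date", "hits", "hits_avg"],
  "source": [
    {"date": "2021-05-06T00:00:00Z", "hits": 147, "hits_avg": 49},
    {"date": "2021-05-07T00:00:00Z", "hits": 7178, "hits_avg": 897.25}
  ]
}`

func main() {
	var r Result
	if err := json.Unmarshal([]byte(sample), &r); err != nil {
		panic(err)
	}
	fmt.Println(strings.Join(r.Dimensions, "\t"))
	for _, line := range FormatRows(&r) {
		fmt.Println(line)
	}
}
```

Note that when a result is decoded from JSON as in this sketch, numeric values arrive as `float64`; values returned directly by `RunSync` follow the database driver's types instead.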