One-file, pure-stdlib Python tool that converts instrumented LibreOffice .ods sheets into SQL for SQLite/Postgres/MySQL.
This dependency-free CLI reads native LibreOffice Calc .ods files (via zipfile + xml.etree) and emits SQL you can pipe straight into your database. Instrument your sheet with a tiny first-column markup and get reproducible CREATE TABLE, INSERT, and index statements—perfect for quick ETL, data audits, and sharing spreadsheet data as a real database.
- What: One-file, pure-stdlib CLI that turns instrumented LibreOffice `.ods` spreadsheets into SQL (SQLite/Postgres/MySQL). Emits SQL on stdout; diagnostics on stderr.
- How: Clone this repo and run it directly, or just copy the single file `src/ods2sql.py` into your project and execute it with Python.
- Example:

```sh
# If you copied the script next to your data
./ods2sql.py document.ods | sqlite3 document.sqlite
```

- Pure standard library (no external deps)
- Multiple sheets & multiple tables per sheet
- Column selection via a “columns row”; SQL types via a “types row” (defaults to TEXT)
- Dialects: SQLite (default), Postgres, MySQL (affects quoting & booleans)
- Fast batched INSERTs; `--schema-only`, `--data-only`, `--list`, `--table`
- Indexes by default (per-column), plus composite indexes and PRIMARY KEY support
- Emits SQL on stdout; diagnostics on stderr (pipe-safe)
DuckDB: The SQLite dialect output works well with DuckDB in practice.
You can run the script directly or install it as a CLI.

- Local script: clone and run `./src/ods2sql.py --help`
- Package build: `python -m build`, then `pip install dist/ods2sql_python-*.whl`
- Editable dev install: `pip install -e .`
Column A is reserved for control keywords; data starts at column B. The sheet must begin with:
```
A1 = 'sqltable'    B1 = <table_name>
```
Later control rows accept synonyms:
- Row 2 (columns row): A2 = one of 'sqlcolumn', 'columns', 'column', 'fields' → non-empty B2.. define column names
- Row 3 (types row): A3 = one of 'sqltype', 'types', 'type' → non-empty B3.. define SQL types (blank → TEXT)
Any row whose first cell is 'comment' is ignored.
Data rows have an empty first cell (in column A). Fully empty data rows are skipped.
Multiple tables per sheet are supported; each new 'sqltable' row starts a new block.
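For illustration, the control-column dispatch described above can be sketched in plain Python. This is a hypothetical simplification (`parse_blocks` is not the tool's actual API, and the real parser in `src/ods2sql.py` handles more edge cases), but it follows the keyword rules stated here:

```python
# Sketch of the first-column markup rules; not the real implementation.
def parse_blocks(rows):
    """rows: list of lists of cell strings; rows[i][0] is column A."""
    blocks = []
    current = None  # spec: sheet must begin with an 'sqltable' row
    for row in rows:
        key = (row[0] or "").strip().lower()
        rest = row[1:]
        if key == "sqltable":
            current = {"name": rest[0], "columns": [], "types": [], "data": []}
            blocks.append(current)  # each 'sqltable' row starts a new block
        elif key in ("sqlcolumn", "columns", "column", "fields"):
            current["columns"] = [c for c in rest if c]
        elif key in ("sqltype", "types", "type"):
            current["types"] = [t or "TEXT" for t in rest]  # blank -> TEXT
        elif key == "comment":
            continue  # 'comment' rows are ignored
        elif key == "" and any(rest):
            current["data"].append(rest)  # fully empty rows are skipped
    return blocks
```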
```sh
./ods2sql.py data.ods | sqlite3 data.sqlite
./ods2sql.py data.ods --dialect postgres --batch 1000 > load.sql
```

When installed as a package, use the `ods2sql` command.
By default, the tool creates a non-unique index for every column (handy for browsing/search in SQLite). You can customize:
- `--no-indices` — don’t create any indexes
- `--index-columns "c1,c2"` — only index those columns (instead of all)
- `--index "c1+c2"` — define a composite index; flag may repeat
- `--primary-key "c1[,c2,...]"` — set a PRIMARY KEY in `CREATE TABLE`; PK columns are not redundantly indexed
- `--dialect {sqlite,postgres,mysql}` — quoting & boolean style (default: sqlite)
- `--if-not-exists` — use IF NOT EXISTS in CREATE TABLE/INDEX where supported
- `--no-drop` — don’t emit DROP TABLE IF EXISTS
- `--schema-only` / `--data-only` — only DDL or only INSERTs
- `--batch N` — rows per INSERT (0/1 means one row per statement)
- `--table NAME` — only export specific table(s); flag may repeat
- `--list` — list detected tables/columns to stderr and exit
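To illustrate how `--batch` and `--dialect` combine, here is a hedged sketch of batched INSERT rendering with per-dialect identifier quoting. The helper names (`batched_inserts`, `quote_ident`, `sql_literal`) are hypothetical, not the tool's actual API, and the shipped generator is more careful about escaping and value types:

```python
# Illustrative sketch only; not the real generator in src/ods2sql.py.
QUOTE = {"sqlite": '"', "postgres": '"', "mysql": "`"}

def quote_ident(name, dialect):
    # SQLite/Postgres use double quotes, MySQL uses backticks;
    # escape the quote character by doubling it.
    q = QUOTE[dialect]
    return q + name.replace(q, q + q) + q

def sql_literal(value):
    if value is None:
        return "NULL"
    return "'" + str(value).replace("'", "''") + "'"

def batched_inserts(table, columns, rows, dialect="sqlite", batch=500):
    """Yield INSERT statements with up to `batch` rows each."""
    step = max(batch, 1)  # 0/1 means one row per statement
    cols = ", ".join(quote_ident(c, dialect) for c in columns)
    for i in range(0, len(rows), step):
        values = ",\n".join(
            "(" + ", ".join(sql_literal(v) for v in row) + ")"
            for row in rows[i : i + step]
        )
        yield f"INSERT INTO {quote_ident(table, dialect)} ({cols}) VALUES\n{values};"
```

Batching keeps the statement count low (fast bulk loads) while `batch 1` keeps statements line-oriented for easier diffing.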
- The parser collapses repeated empty rows in ODS to avoid expanding millions of blank lines at the end of a sheet.
- Identifiers are quoted and escaped per dialect; you can use schema-qualified names (e.g., `schema.table`).
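The empty-row collapsing works because ODS stores runs of identical rows once, with a `table:number-rows-repeated` attribute (the attribute name comes from the ODF spec). A minimal sketch of capping that expansion, using a stand-in row object rather than real `xml.etree` elements (the actual parser's strategy may differ):

```python
# Sketch: expand repeated rows but cap the repeat count, so a trailing
# "empty row repeated 1,048,576 times" doesn't explode into memory.
REPEAT_ATTR = "{urn:oasis:names:tc:opendocument:xmlns:table:1.0}number-rows-repeated"

def expand_rows(row_elements, cap=1000):
    rows = []
    for elem in row_elements:
        # xml.etree Elements expose attributes via .get(); default is 1.
        repeat = int(elem.get(REPEAT_ATTR, "1"))
        rows.extend([elem] * min(repeat, cap))
    return rows
```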
Architectural choices are captured in docs/adr/. See the index here:
- docs/adr/README.md
- Identifier length limits: generated index names are truncated to fit common limits (Postgres 63, MySQL 64, SQLite treated as 64 for portability). A short hash suffix is added when truncation occurs.
- MySQL index constraints: indexes on `TEXT`/`BLOB` columns require a prefix length in MySQL. This tool skips such indexes and prints a warning to stderr to avoid invalid SQL. Define composite indexes that avoid `TEXT`/`BLOB` columns, or add indexes manually with a prefix length if needed.
- Empty strings: empty Python strings (`""`) are emitted as SQL `NULL` values unconditionally.
- TEXT formatting: when a column is declared `TEXT`, the tool prefers the cell's formatted display over the raw typed value when sensible. For example, a percentage cell with value `0.12` is emitted as `'12%'` instead of `'0.12'`.
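The identifier-length rule above could look roughly like this (a hypothetical `index_name` helper; the tool's actual naming scheme and hash length may differ):

```python
import hashlib

# Assumed per-dialect identifier limits, as stated in the notes above.
LIMITS = {"postgres": 63, "mysql": 64, "sqlite": 64}

def index_name(table, columns, dialect="sqlite"):
    name = f"idx_{table}_" + "_".join(columns)
    limit = LIMITS[dialect]
    if len(name) <= limit:
        return name
    # Truncate and append a short hash so distinct long names stay distinct.
    digest = hashlib.sha1(name.encode()).hexdigest()[:8]
    return name[: limit - 9] + "_" + digest
```

The hash suffix matters because naive truncation could make two indexes on different long column lists collide on the same name.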
This project’s source code was generated with extensive AI assistance and then human‑curated. See ADR-0010 for the formal policy. All architectural intent, testing strategy, and release decisions are documented and reviewed by a human maintainer ("Software Curator"). AI output is treated as a draft; only curated, test‑verified changes are merged. Copyright and authorship remain with the human contributors.
See CONTRIBUTING.md and CODE_OF_CONDUCT.md. Run CI checks locally:
```sh
ruff check .
mypy --strict src tests scripts
pytest -q
```

To work on this repo locally (humans and AI agents alike), please create and activate a Python virtual environment and install development dependencies from `requirements-dev.txt`:
```sh
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements-dev.txt
```

Once activated, you can run linters and tests as shown above. Remember to re-activate the environment in new shells: `source .venv/bin/activate`.
This repo ships a `.pre-commit-config.yaml` to enforce fast hygiene and curation policy checks locally:
Hooks on commit:
- Trailing whitespace / EOF fixers
- Ruff lint + format (`--fix`)
- Import smoke (`import ods2sql`) guard
- ADR index sync check (fails if a new ADR file isn’t indexed)
- Debug `print(` blocker for `src/ods2sql.py`
Hook on push:
- Fast sanity tests subset (`scripts/fast_tests.sh`)
Enable:
```sh
pip install -r requirements-dev.txt
pre-commit install
pre-commit install --hook-type pre-push
```

Run all hooks manually:

```sh
pre-commit run --all-files
```