Merged
93 changes: 92 additions & 1 deletion python/CHANGELOG.md
@@ -3,8 +3,99 @@ All notable changes to this project will be documented in this file.

This project adheres to [Semantic Versioning](http://semver.org/).

## [v0.14.0] - April 28, 2026

https://github.com/sift-stack/sift/commit/e1305363e95cafb4c980fdd493f69e72660a52fb
### What's New

#### Data Import API in SiftClient
The `sift_client` module now exposes a data import API supporting CSV, Parquet, TDMS, and HDF5. With this addition, all features previously available only in `sift_py` are now available in `sift_client`, which remains the recommended interface for new development. `sift_py` (deprecated since [v0.10.0](#v0100---january-30-2026)) continues to work and ship in this release.

Migrating from `sift_py`: the per-format upload services (`CsvUploadService`, `ParquetUploadService`, `Hdf5UploadService`, `TdmsUploadService`) collapse into a single `client.data_import.import_from_path` method. `sift_py` offered auto-detection only for CSV (via `simple_upload`); other formats required more setup. `sift_client` unifies all four with auto-detection built into `import_from_path` itself: the `config` argument is optional, so the common call takes just a file path and target asset. Call `client.data_import.detect_config(...)` first if you want to inspect or patch the config before importing. `import_from_path` returns a `Job` you can optionally wait on.
```python
# sift_py (deprecated)
from sift_py.data_import.csv import CsvUploadService
from sift_py.rest import SiftRestConfig

rest_conf: SiftRestConfig = {"uri": sift_uri, "apikey": apikey}
csv_service = CsvUploadService(rest_conf)
import_service = csv_service.simple_upload(asset_name="my_asset", path="data.csv")
import_service.wait_until_complete()

# sift_client
from sift_client import SiftClient

client = SiftClient(api_key=apikey, grpc_url=grpc_url, rest_url=rest_url)
job = client.data_import.import_from_path("data.csv", asset="my_asset")
job.wait_until_complete()
```

Format-by-format support:
- **CSV**: auto-detected from `.csv`. Supports an optional JSON metadata row (row 1 or row 2) for specifying channel names, units, data types, and the time column format.
- **Parquet**: requires an explicit `data_type` (`PARQUET_FLATDATASET` or `PARQUET_SINGLE_CHANNEL_PER_ROW`) since `.parquet` alone doesn't disambiguate the layout. Detection only reads the file footer, so it stays fast on large files.
- **HDF5**: new in this release. Auto-detected from `.h5` / `.hdf5`. Detection works out-of-the-box for files with a compound-dataset layout (first field = time) or a shared root-level `time` dataset; other layouts may need a hand-built `config`.
- **TDMS**: new in this release. Auto-detected from `.tdms` and recognizes the common timing conventions (group-level `xchannel`, first-channel `TimeStamp`, or per-channel waveform properties). `TdmsImportConfig` controls handling of untimed channels (`fallback_method`), complex values (`complex_component`), scaled vs. raw data, and waveform start-time overrides.
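The dispatch described above can be pictured as a small sketch (illustrative only, not the library's internals — the helper and table below are assumptions):

```python
from pathlib import Path

# Illustrative sketch of extension-based auto-detection dispatch (assumed,
# not sift_client's actual code). Note that for Parquet the extension alone
# doesn't disambiguate the layout, so an explicit data_type is still needed.
_EXTENSION_FORMATS = {
    ".csv": "csv",
    ".parquet": "parquet",
    ".h5": "hdf5",
    ".hdf5": "hdf5",
    ".tdms": "tdms",
}

def detect_format(path: str) -> str:
    """Map a file path to an import format by extension, case-insensitively."""
    ext = Path(path).suffix.lower()
    if ext not in _EXTENSION_FORMATS:
        raise ValueError(f"cannot auto-detect import format for {path!r}")
    return _EXTENSION_FORMATS[ext]
```

The real `detect_config` goes further than this (e.g. scanning CSV header rows or the Parquet footer), but the extension check is the entry point that decides which per-format detection runs.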

#### Parquet as an Export Output Format
`client.data_export.export(...)` now accepts `ExportOutputFormat.PARQUET` alongside the existing CSV and Sun/WinPlot options. Unlike the `sift_py` `DataService` + `DataFrame.to_parquet()` pattern (async-only, buffers everything in memory, name-strings only), the new export API runs as a server-side job, works sync or async, accepts `Asset`/`Channel` objects or IDs, and scales to large exports.

```python
# sift_py (deprecated): no dedicated export API, so query in-memory and write yourself
import pandas as pd
from sift_py.data.query import ChannelQuery, DataQuery
from sift_py.data.service import DataService
from sift_py.grpc.transport import use_sift_async_channel

async with use_sift_async_channel({"uri": sift_uri, "apikey": apikey}) as channel:
    result = await DataService(channel).execute(DataQuery(
        asset_name="my_asset",
        start_time=start,
        end_time=stop,
        channels=[ChannelQuery(channel_name="my_channel")],
    ))
pd.DataFrame(result.all_channels()[0].columns()).to_parquet("out.parquet")

# sift_client
from sift_client import SiftClient
from sift_client.sift_types.export import ExportOutputFormat

client = SiftClient(api_key=apikey, grpc_url=grpc_url, rest_url=rest_url)
job = client.data_export.export(
    assets=["my_asset"],      # accepts Asset objects or IDs
    channels=["my_channel"],  # accepts Channel objects or IDs
    start_time=start,
    stop_time=stop,
    output_format=ExportOutputFormat.PARQUET,
)
files = job.wait_and_download(output_dir="./exports")
```

`wait_and_download` polls the job to completion, downloads the resulting archive, and returns the list of downloaded file paths. `output_dir` is optional: if omitted, a temporary directory is created and used. By default the downloaded zip is extracted (and the archive deleted) so you get the Parquet files directly; pass `extract=False` to keep the zip instead, which is useful if you want to hand the whole bundle off to another system without unpacking it client-side.
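The extract-by-default behavior amounts to a local unpack step, sketched below (an illustration of the behavior described above, not the library's code — `extract_archive` is a hypothetical helper):

```python
import tempfile
import zipfile
from pathlib import Path
from typing import List, Optional

# Hypothetical helper illustrating the extract=True behavior: unpack the
# downloaded zip into output_dir (or a fresh temporary directory when none
# is given), delete the archive, and return the extracted file paths.
def extract_archive(archive: Path, output_dir: Optional[Path] = None) -> List[Path]:
    if output_dir is None:
        output_dir = Path(tempfile.mkdtemp())
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
        zf.extractall(output_dir)
    archive.unlink()  # caller keeps only the extracted files
    return [output_dir / name for name in names]
```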

#### Test Result Logging
Test result create and update events can now be optionally written to a `.jsonl` log file during a test run, then replayed against the Sift API later via the new `import-test-result-log` CLI command (installed with the package).
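The log/replay round-trip reduces to plain JSONL handling, sketched here (the event field names are assumptions for illustration, not the package's actual schema):

```python
import json
from pathlib import Path

# Illustrative JSONL event log (field names are assumptions, not the
# package's schema): append one JSON object per line during the test run,
# then read the events back in order for replay against the API.
def log_event(log_path: Path, event: dict) -> None:
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")

def replay_events(log_path: Path) -> list:
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

The appeal of JSONL for this use case is that a crash mid-run still leaves every earlier event intact and parseable, so a partial run can be replayed later.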

#### Progress Indicators
Adds progress indicators for job polling and file downloads for better visibility during long-running operations.

#### Stop Bundling `google.api`, `protoc_gen_openapiv2`, and `buf.validate`
Previously the package shipped its own bundled copies of the generated Python bindings for `google.api`, `protoc_gen_openapiv2`, and `buf.validate`. With this release:
- `google.api` and `protoc_gen_openapiv2` are now pulled in via the `googleapis-common-protos` and `protoc-gen-openapiv2` runtime dependencies, so pip installs the upstream-maintained versions.
- `buf.validate` has been removed entirely. The protovalidate field annotations were stripped from the affected `.proto` files, so the generated `_pb2.py` files no longer reference `buf.validate`.

### Bugfixes
- Add `py.typed` to the generated proto directory so type checkers pick up protobuf types correctly.

### Full Changelog
- [Add data import API to sift_client](https://github.com/sift-stack/sift/pull/515)
- [Add HDF5 client-side detect_config](https://github.com/sift-stack/sift/pull/536)
- [Add TDMS client-side detect_config](https://github.com/sift-stack/sift/pull/538)
- [Add optional logging of test result create and update events](https://github.com/sift-stack/sift/pull/529)
- [Add progress indicators for job polling and file downloads](https://github.com/sift-stack/sift/pull/517)
- [Refactor files using run_in_executor](https://github.com/sift-stack/sift/pull/518)
- [Add Parquet as an export output format](https://github.com/sift-stack/sift/pull/510)
- [Add py.typed file to proto dir](https://github.com/sift-stack/sift/pull/524)
- [Remove vendored google.api and protoc_gen_openapiv2 modules from buf](https://github.com/sift-stack/sift/pull/543)
- [Remove protovalidate from client-side protos](https://github.com/sift-stack/sift/pull/545)

## [v0.13.0] - March 24, 2026
### What's New
8 changes: 3 additions & 5 deletions python/lib/sift_client/resources/sync_stubs/__init__.pyi
@@ -653,9 +653,8 @@ class DataImportAPI:
is inferred from the file extension when ``data_type`` is not
provided.

-CSV, Parquet, and HDF5 files are supported for auto-detection.
-For other formats (TDMS), create the config manually
-using ``TdmsImportConfig``.
+CSV, Parquet, HDF5, and TDMS files are supported for
+auto-detection.

For CSV files, the server scans the first two rows for an optional
JSON metadata row. Row 1 is checked first; row 2 is checked only
@@ -733,8 +732,7 @@ class DataImportAPI:
completion before proceeding.

When ``config`` is omitted the file format is auto-detected via
-``detect_config`` (CSV, Parquet, and HDF5). For other formats
-(TDMS), ``config`` must be provided.
+``detect_config`` (CSV, Parquet, HDF5, and TDMS).
When ``asset`` is provided it overrides the config value;
otherwise the config's ``asset_name`` is used.
If neither ``run`` nor ``run_name`` is provided (and none is
2 changes: 1 addition & 1 deletion python/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "sift_stack_py"
version = "0.13.0"
version = "0.14.0"
description = "Python client library for the Sift API"
requires-python = ">=3.8"
readme = { file = "README.md", content-type = "text/markdown" }