Spatially sort output features

As per [GeoParquet best practices](https://github.com/opengeospatial/geoparquet/pull/254/files), it would be good to spatially sort the output features, for two reasons:

* Typically, clients restrict their parquet queries to a region of interest. Without spatial sorting, most parquet chunks contain data from locations all over the planet. Because of this lack of spatial correlation, clients have to decompress almost all chunks in the parquet file, no matter their query bounding box, which is expensive. If the data were spatially sorted, queries would be (much) faster because they’d only have to decode the few parquet chunks that actually intersect the queried bounding box.
* Spatial sorting will also likely reduce the output file size. Nearby features tend to share tags like street and city names, so features within a single parquet chunk would more often have tag values in common, which should compress better.

Currently, the `GeoParquetWriter` class seems to emit features in the same order as they happen to be passed from libosmium. Consider extending the implementation of `GeoParquetWriter` to calculate the center lat/lon of each feature’s bounding box. Then, find the position of that point along a space-filling Hilbert curve, and use this number as a sort key for an external sort. There are several Python libraries for Hilbert curves, and likewise for external sorting.
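
To illustrate the sort key, here is a minimal pure-Python sketch. It is not taken from the layercake code: the function names `hilbert_index` and `hilbert_key`, the `order` parameter, and the choice of a 2^16 × 2^16 grid over the whole lon/lat range are all assumptions made for illustration. It uses the classic iterative xy-to-distance conversion for the Hilbert curve and ignores edge cases such as bounding boxes that cross the antimeridian.

```python
def hilbert_index(x: int, y: int, order: int = 16) -> int:
    """Distance of grid cell (x, y) along a Hilbert curve over a 2^order x 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def hilbert_key(min_lon: float, min_lat: float,
                max_lon: float, max_lat: float,
                order: int = 16) -> int:
    """Hypothetical sort key: Hilbert index of the bounding-box center."""
    lon = (min_lon + max_lon) / 2.0
    lat = (min_lat + max_lat) / 2.0
    n = 1 << order
    # Map lon/lat to integer grid cells, clamped to the valid range.
    x = min(n - 1, max(0, int((lon + 180.0) / 360.0 * n)))
    y = min(n - 1, max(0, int((lat + 90.0) / 180.0 * n)))
    return hilbert_index(x, y, order)
```

With `order=16` the key fits in 32 bits, so it could be stored as a plain integer next to each feature until the sort is done.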

To check the difference, try `gt sort hilbert` from [GeoParquet tools](https://github.com/cholmes/geoparquet-tools?tab=readme-ov-file#sort). Perhaps you could simply call this tool in a post-processing step, before uploading the layercake output. But it seems a little heavy to bundle DuckDB; doing this yourself from Python seems easy enough.
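
For the do-it-yourself route, the external sort could look roughly like the sketch below, using only the standard library. The record shape `(key, payload)`, the function name `external_sort`, and the `chunk_size` default are assumptions for illustration; the payload could be whatever the writer needs to emit a feature later (for example a serialized row). Sorted runs are spilled to temporary files and then merged lazily with `heapq.merge`.

```python
import heapq
import pickle
import tempfile
from itertools import islice


def _write_run(records):
    """Spill one sorted chunk to an anonymous temporary file."""
    f = tempfile.TemporaryFile()
    for rec in records:
        pickle.dump(rec, f)
    f.seek(0)
    return f


def _read_run(f):
    """Stream the records of one run back from disk."""
    try:
        while True:
            yield pickle.load(f)
    except EOFError:
        pass
    finally:
        f.close()


def external_sort(records, chunk_size=100_000):
    """Yield (key, payload) records ordered by key, holding at most
    chunk_size records in memory during the chunking phase."""
    it = iter(records)
    runs = []
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            break
        chunk.sort(key=lambda r: r[0])
        runs.append(_write_run(chunk))
    # Merge all sorted runs; only one record per run is in memory at a time.
    return heapq.merge(*(_read_run(f) for f in runs), key=lambda r: r[0])
```

Each run keeps one file handle open during the merge, so for very large inputs `chunk_size` would need to be large enough to stay under the operating system's open-file limit.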
