This repository was archived by the owner on Jul 11, 2025. It is now read-only.
Improved site and region assessment processes #72
Merged
ConnectedSystems merged 9 commits into main on Jun 10, 2025
For efficient spatial querying
File is needed to use as a template when writing out COGs. TODO: Fix up hard-coded filename suffix.
Based on an arbitrary threshold of estimated raster size, use a regular in-memory matrix (< 700MB) or an Extendable Sparse Matrix (>= 700MB), which reduces memory use at the cost of higher COG write times.
Rather than relying on index-based search of rasters or coordinate-based filtering, apply Sort-Tile-Recursive Trees (based on R-tree indexing) for quick and efficient filtering.
Signed-off-by: Peter Baker <peter.baker122@csiro.au>
We've already stored the dim values anyway
Appease the great linter!
PeterBaker0 (Collaborator) approved these changes on Jun 10, 2025 and left a comment:
Tested - working, with very nice performance improvements, especially for the suitability assessment.
There were some missing variables/errors, which I have fixed. Tested E2E for regional and suitability assessments with new data. We will need to do a data update on EFS.
Avoids relying on any rasters for assessment (although it uses the raster of "valid" pixels to simplify writing results out).
Leverages STR trees (Sort-Tile-Recursive trees) for quick and efficient spatial indexing to reduce search costs compared to indexing rasters or searching dataframes directly.
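To illustrate the idea behind STR packing, here is a minimal pure-Python sketch. It is not the implementation this PR uses; the names `Node`, `build_str_tree` and `query` are hypothetical, and boxes are assumed to be `(xmin, ymin, xmax, ymax)` tuples. Entries are sorted by x-centre, cut into vertical strips, sorted by y-centre within each strip, and packed into nodes of fixed capacity. The same pass is repeated on the resulting nodes until one root remains.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    box: tuple       # (xmin, ymin, xmax, ymax) covering everything below this node
    children: list   # (box, item_index) pairs in a leaf, child Nodes otherwise
    leaf: bool


def union(boxes):
    """Smallest box enclosing all given boxes."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def intersects(a, b):
    """True if boxes a and b overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def _pack(entries, capacity, leaf):
    """One Sort-Tile-Recursive pass over (box, payload) entries."""
    n_nodes = math.ceil(len(entries) / capacity)
    n_strips = math.ceil(math.sqrt(n_nodes))
    strip_size = n_strips * capacity
    # Sort by x-centre (xmin + xmax gives the same ordering), then tile into strips.
    by_x = sorted(entries, key=lambda e: e[0][0] + e[0][2])
    nodes = []
    for i in range(0, len(by_x), strip_size):
        # Within each strip, sort by y-centre and pack runs of `capacity` entries.
        strip = sorted(by_x[i:i + strip_size], key=lambda e: e[0][1] + e[0][3])
        for j in range(0, len(strip), capacity):
            group = strip[j:j + capacity]
            children = group if leaf else [child for _, child in group]
            nodes.append(Node(union([box for box, _ in group]), children, leaf))
    return nodes


def build_str_tree(boxes, capacity=4):
    """Bulk-load a tree over `boxes`; returns the root Node."""
    if not boxes or capacity < 2:
        raise ValueError("need at least one box and capacity >= 2")
    level = _pack([(b, i) for i, b in enumerate(boxes)], capacity, leaf=True)
    while len(level) > 1:
        level = _pack([(n.box, n) for n in level], capacity, leaf=False)
    return level[0]


def query(node, window):
    """Indices of all boxes intersecting `window`, skipping subtrees that cannot match."""
    if not intersects(node.box, window):
        return []
    if node.leaf:
        return [i for box, i in node.children if intersects(box, window)]
    hits = []
    for child in node.children:
        hits.extend(query(child, window))
    return hits


# Usage: index pixel centres as degenerate boxes, then filter by a search window.
boxes = [(x, y, x, y) for x in range(10) for y in range(10)]
tree = build_str_tree(boxes)
candidates = query(tree, (2, 3, 4, 5))
```

The saving comes from `query` discarding whole subtrees whose bounding box misses the window, rather than testing every pixel or dataframe row.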
Also uses Extendable Sparse Matrices to store raster data if the raster size is estimated to be >= 700MB (limit is arbitrary and should be tuned/tunable).
Spatial areas that resolve to be < 700MB use regular matrices and result in faster COG write times (~30% from previous timings).
Sparse matrices reduce memory use by orders of magnitude, but incur additional overhead when writing data out to file.
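The size-based switch described above could look roughly like the following sketch. Everything here is an assumption for illustration: the names `estimated_bytes`, `DenseGrid`, `SparseGrid` and `make_grid` are hypothetical, the 700MB limit is interpreted as 700 MiB, and the dictionary-backed `SparseGrid` is only a stand-in for the Extendable Sparse Matrix type the PR actually uses. The estimate models the raster's element size, not Python's own memory overhead.

```python
# Assumed interpretation of the PR's arbitrary 700MB limit (700 MiB here).
DENSE_LIMIT_BYTES = 700 * 1024**2


def estimated_bytes(n_rows, n_cols, bytes_per_cell=4):
    """Rough size of a fully populated raster with the given element width."""
    return n_rows * n_cols * bytes_per_cell


class DenseGrid:
    """Every cell is allocated up front: more memory, cheap sequential reads."""

    def __init__(self, n_rows, n_cols, fill=0.0):
        self.shape = (n_rows, n_cols)
        self._rows = [[fill] * n_cols for _ in range(n_rows)]

    def __setitem__(self, key, value):
        r, c = key
        self._rows[r][c] = value

    def __getitem__(self, key):
        r, c = key
        return self._rows[r][c]


class SparseGrid:
    """Only assigned cells are stored; unassigned cells read back as `fill`."""

    def __init__(self, n_rows, n_cols, fill=0.0):
        self.shape = (n_rows, n_cols)
        self._fill = fill
        self._cells = {}

    def __setitem__(self, key, value):
        self._cells[key] = value

    def __getitem__(self, key):
        return self._cells.get(key, self._fill)


def make_grid(n_rows, n_cols, bytes_per_cell=4, limit=DENSE_LIMIT_BYTES):
    """Pick the dense layout below the limit and the sparse one at or above it."""
    if estimated_bytes(n_rows, n_cols, bytes_per_cell) < limit:
        return DenseGrid(n_rows, n_cols)
    return SparseGrid(n_rows, n_cols)


small = make_grid(100, 100)        # well under the limit: dense
large = make_grid(20_000, 20_000)  # about 1.6 GB estimated: sparse
large[5, 7] = 1.0
```

Passing `limit` as a parameter is one way to make the threshold tunable, as the description suggests it should be.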
There's also currently a hardcoded filename suffix (`$(region_name)_valid_slopes.tif`) - I didn't know where I should put it. This file is used as a template so that the same extents, spatial coordinates, and CRS are used when creating COGs.
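One possible way to lift the suffix out of the code is to make it a parameter with the current value as its default. The sketch below is only a suggestion; `template_path`, `data_dir` and the region name are hypothetical and do not come from the repository.

```python
from pathlib import Path


def template_path(data_dir, region_name, suffix="_valid_slopes.tif"):
    """Build the path to the raster used as the COG template for a region."""
    return Path(data_dir) / f"{region_name}{suffix}"


# Hypothetical directory and region name, for illustration only.
path = template_path("data", "RegionA")
```

The default keeps existing behaviour, while callers (or a config file) can override the suffix if the naming convention changes.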
These changes are untested - I don't have the energy tonight to work out how to run ReefGuide methods locally and there's no working example in the readme any more 😢
Note: The most recent processed data includes `lons`/`lats` columns in the lookup table, so they do not need to be recreated on data load. I've added a simple check to generate these columns if needed for now.
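The fallback check mentioned in the note might be sketched as follows. This assumes the lookup table is a plain mapping of column name to list and that coordinates can be derived from a hypothetical `centroids` column of `(lon, lat)` pairs; the real table layout and the function name `ensure_lon_lat` are not taken from the PR.

```python
def ensure_lon_lat(table):
    """Add `lons`/`lats` columns only when the loaded data does not already have them."""
    if "lons" in table and "lats" in table:
        return table  # newer processed data: nothing to do
    # Hypothetical source column holding (lon, lat) pairs.
    table["lons"] = [lon for lon, _ in table["centroids"]]
    table["lats"] = [lat for _, lat in table["centroids"]]
    return table


old_data = {"centroids": [(146.8, -19.2), (147.1, -19.5)]}
ensure_lon_lat(old_data)
```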