populate_metadata.py: add batches to write_to_omero#4754
joshmoore merged 10 commits into ome:metadata52
For extremely large screens (idr0016), both adding and deleting map annotations lead to either PG errors or Ice.MessageSizeMax exceptions. Now both are done in batches of 1000.
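The batching described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `batches`, `write_in_batches`, and the `save` callable are hypothetical names standing in for whatever performs the actual server round trip (a bulk save or a delete request).

```python
def batches(items, size=1000):
    """Yield consecutive slices of ``items`` holding at most ``size`` elements."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def write_in_batches(items, save, size=1000):
    """Call ``save`` once per batch and return the number of calls made.

    ``save`` is a hypothetical stand-in for the server call; keeping each
    call to at most ``size`` objects bounds the size of a single message.
    """
    calls = 0
    for batch in batches(items, size):
        save(batch)
        calls += 1
    return calls
```

With 2500 objects and the default batch size, `save` would be called three times, with 1000, 1000, and 500 objects respectively.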
    populate.add_argument("--batch",
                          type=long,
                          default=1000,
                          help="Number of objects to save at once")
How about "save at once" -> "process at once", since it's also used for deletes?
    @mark.parametrize("fixture", METADATA_FIXTURES, ids=METADATA_IDS)
    def testPopulateMetadata(self, fixture):
    @mark.parametrize("batch_size", (None, 10, 1000))
RFE for sometime in the future: from past experience writing this sort of thing in omero-features, it's useful to have your input test data be at least 2 * batch size but not exactly divisible by it.
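One way to read this suggestion, as a sketch only: pick input sizes around the batch boundaries, including one that is more than twice the batch size and leaves a remainder, then check that the expected batch lengths add up. Both helper names below are hypothetical and do not come from the test suite.

```python
def awkward_sizes(batch_size):
    """Input sizes that exercise the edges of a batching loop.

    The last entry is at least 2 * batch_size and not divisible by it,
    so it forces two full batches plus a short final one.
    """
    return [0, 1, batch_size - 1, batch_size, batch_size + 1,
            2 * batch_size + 1]


def batch_lengths(total, batch_size):
    """Lengths of the batches expected for ``total`` items."""
    full, rest = divmod(total, batch_size)
    return [batch_size] * full + ([rest] if rest else [])
```

For a batch size of 10, the largest size is 21, which should split into batches of 10, 10, and 1.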
Note: projections are also causing issues so batching is being added to invocations of projection as well.
6a524ed to 3e350bf
The lists of map annotations and file annotations contain duplicates, which cause secondary deletes to fail.
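A minimal sketch of one way to drop such duplicates before issuing deletes, assuming the annotations are identified by plain ids. How this PR actually removes them is not shown here, and `unique_ids` is a hypothetical name.

```python
def unique_ids(ids):
    """Return ``ids`` with duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for obj_id in ids:
        if obj_id not in seen:
            seen.add(obj_id)
            result.append(obj_id)
    return result
```

Keeping the original order means each id still appears where it first occurred, so only one delete is attempted per object.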
8c57c0a to e4f753d
Tested using OMERO.server-5.2.3-170-db884ba-ice35-b45, which includes this PR, using idr0005/screenA and idr0009/screenA, both of which had previously failed in map annotation deletion. In both cases map annotations can now be deleted and created successfully in batches of 1000. Also, idr0009/screenA has the annotation.csv file gzipped, and it was successfully unzipped and loaded.
Thanks, @eleanorwilliams. Merging this as happy before it becomes a mega-PR.
--rebased-to #5220
What this PR does
The new `QueryContext` implementations in `populate_metadata.py` were not batching in their `write_to_omero` methods. For very large screens (e.g. idr0016):
- `BulkToMapAnnotationContext` tends to hit MESSAGESIZEMAX limits and
- `DeleteMapAnnotationContext` hits `Caused by: java.io.IOException: Tried to send an out-of-range integer as a 2-byte value: 109734`

Now both are done in batches of 1000.
Note: projections are also causing issues so batching is being added to invocations of `projection` as well.
Testing this PR
test/integration/metadata/test_populate.py.
Related reading
cc: @eleanorwilliams