Skip to content

Merge populate_metadata PRs#5232

Merged
joshmoore merged 65 commits intoome:metadata53from
atarkowska:merge_populate
Apr 7, 2017
Merged

Merge populate_metadata PRs#5232
joshmoore merged 65 commits intoome:metadata53from
atarkowska:merge_populate

Conversation

@atarkowska
Copy link
Copy Markdown
Member

@atarkowska atarkowska commented Apr 4, 2017

What this PR does

merge #5218 #5220 #5222 #5223 #5224 #5226 #5227 #5229 (no conflicts)

TODO:

#5074 depends on #5215

emilroz and others added 30 commits April 4, 2017 11:08
Most methods for dataset loading and parsing were left unimplement.
Now a `Dataset:`-style object can be passed to populate_metadata.py
and images will be looked up by name. Note: there's a small bug with
name lookup that will be corrected separately.
The assumptions for well/imaging naming in a plate or screen
differ from those from image naming in a dataset since there's
no unique way to reference an image in a dataset like there is
well "A1" for example. This commit loosens some of those rules
to allow image columns and image name columns to work together
in the case of datasets. The assumption is that for population
the ID of the image in a dataset won't be known. Instead names
of images will be used as a unique identifier. Currently only
a warning is issued if the name is not unique.
In general, populate_metadata.py looks to be in line for a
refactoring. The number of if-clauses as well as the unhandled
cases (like no catch-all for unknown targets in delete) is
making this ever harder to work with.

All tests passing.
In order to allow Projects to smartly handle multiple images
with the same name (though not in the same dataset), the internals
of ValueResolver have been hidden within a ValueWrapper class.
ValueResolver chooses once which ValueWrapper to use internally
after which the various if/then blocks based on target object are
no longer necessary (needs further refactoring). There *are* still
if/then blocks basked on column-type. These could use some cleaning
but will likely remain to be necessary for multiple-dispatch style
handling.
For extremely large screens (idr0016), both adding
map annotations as well as deleting them lead to either
PG errors or Ice.MessageSizeMax exceptions. Now both
are done in batches of 1000.
manics and others added 22 commits April 4, 2017 16:36
This should be enabled once deletion is fixed
It just seems to add unnecessary complexity
Also add ns and id to CanonicalMapAnnotation.__str__
When a well is missing from a plate, a warning is printed. The
same now happens when an image is missing from a dataset. Likely,
a `--strict` argument should be added which will force the existence
of all objects.
This makes it easier to deal with URLs which require escaping.
It makes more sense to test small batch sizes since the integration tests use very few annotations.
The alternative is to keep batch size 1000, but this unnecessarily increases the time to run the tests.
BulkToMapAnnotation for large screens (+100K wells)
was being killed by the OOMKiller. Strategies include:

 * unload linked objects
 * drop `andReturn` where possible
 * return only sizes when possible
 * write actively in loops
Storing all of the plate data for idr0016 led to
death-by-OOM-killer during *initialization*. Now,
Ice objects are not being stored but wrapper very
thin Data objects.
@atarkowska atarkowska changed the title Merge populate Merge populate_metadata PRs Apr 4, 2017
@atarkowska
Copy link
Copy Markdown
Member Author

atarkowska commented Apr 6, 2017

@joshmoore @manics this PR could be potentially rebased to develop, to make mapr compatible

@joshmoore
Copy link
Copy Markdown
Member

Merging so that @aleksandra-tarkowska can continue the rebasing. The sum of this PR and those that follow will need to be reviewed for potential inclusion in 5.3.x.

@joshmoore joshmoore merged commit f5e261f into ome:metadata53 Apr 7, 2017
@joshmoore joshmoore deleted the merge_populate branch April 7, 2017 12:38
@atarkowska
Copy link
Copy Markdown
Member Author

--rebased-to #5241

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants