Show image with pagination #4511
|
Tested with https://cowfish.openmicroscopy.org/merge/webclient/ in Safari, Firefox and Chrome: when logged in as the user, when not logged in, and when logged in as another user. All behave as expected. |
| p = omero.sys.ParametersI() | ||
| so = deepcopy(conn.SERVICE_OPTS) | ||
| so.setOmeroGroup(groupId) | ||
| if (datasetId is not None): |
Particular reason for using parentheses here?
Too much JavaScript programming! I'll fix...
| # Selecting a 'well' is really for selecting well_sample paths | ||
| # if a well is specified on its own, we return all the well_sample paths | ||
| # that match | ||
| def get_image_ids(datasetId=None, groupId=-1, ownerId=None): |
This should be a top level function, be documented and be integration tested individually.
|
Functionally, I really like this as it is essential for larger datasets. Especially with the default page size of 200. We are regularly working with datasets which contain 1000+ images, with the 2x or 3x multiplier for most digital pathology use cases. What would be fabulous to know are the performance implications of the various additional queries. Maybe before and after timings on 100, 1000, 5000 and 10000 images? |
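One way to collect the before and after timings asked for here is a small harness like the sketch below. It is only an illustration: `fake_get_image_ids` is a hypothetical stand-in that builds a list locally, and a real measurement would replace it with the actual call against an OMERO server.

```python
import time


def best_time_ms(fn, *args, repeats=5):
    """Run fn(*args) `repeats` times and return the fastest run in ms."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        elapsed = (time.perf_counter() - start) * 1000.0
        best = min(best, elapsed)
    return best


def fake_get_image_ids(count):
    """Hypothetical stand-in for the server query; just builds a list."""
    return list(range(count))


# Image counts taken from the question above.
timings = {}
for n in (100, 1000, 5000, 10000):
    timings[n] = best_time_ms(fake_get_image_ids, n)
    print("{:>6} images: {:.3f} ms".format(n, timings[n]))
```

Taking the fastest of several runs reduces noise from other activity on the machine; running the same harness on the branch and on the base gives the two columns to compare.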
|
Some performance metrics (on my local laptop):
|
|
Great information to have! So it is only non-linear with respect to image count below 5000 images, which is probably just the floor of executing the query in the first place. Can you post your test case in a Gist? It would also be good to test whether it gets worse with the total number of images in the system/group, or just with the number of images that are in the dataset or orphaned. I guess we have to ask ourselves whether we're willing to accept a 100ms+ overhead per 5000 images to select the page correctly, and whether we can optimise this away somehow. |
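As a back-of-envelope illustration of the "100ms+ per 5000 images" figure, the sketch below extrapolates linearly. The function name and the image counts are made up for the example, and the linear model is an assumption: the fixed floor cost noted above means it will underestimate small datasets.

```python
def estimated_overhead_ms(image_count, ms_per_block=100.0, block_size=5000):
    """Linear extrapolation: ms_per_block for every block_size images."""
    return image_count / block_size * ms_per_block


# Hypothetical image counts, for illustration only.
for n in (5000, 10000, 50000):
    print(n, estimated_overhead_ms(n))
```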
|
My testing workflow for the above is described at https://gist.github.com/will-moore/33dc5d021a428b161bae
I think it's clearly better to accept a 100ms delay per 5000 images than not to be able to find the image at all. Obviously it would be nice to optimise this, but I can't think of a way to do this with iQuery. |
|
NB: I tried a few things to speed up https://github.com/openmicroscopy/openmicroscopy/pull/4511/files#diff-9704a8389cb2e3a7dd3f753858c9a1baR432 but didn't find anything faster. If there's something specific I should look at, let me know. |
|
Thanks @joshmoore. @chris-allan anything else to do for this PR? |
|
Yes, I would expect the total number of objects to have an effect on the performance. You're performing two subqueries in an inclusion scenario. You can play with the queries themselves at the PostgreSQL level using Agreed that it's better for this functionality to work and accept the 100ms+ delay. It would however be prudent to be mindful of that delay in the context of databases with large numbers of images. |
|
While I do think further performance investigation of a form similar to the above should be done, the PR should probably be merged as is. Information on performance impacts can be added to this PR retroactively. |
|
@joshmoore Any way I can connect to an IDR server from my machine, or something else with similar image counts? See #4511 (comment). Non-production server would be best since we don't know the impact of these queries. |
|
@will-moore : you should be able to make use of https://trello.com/c/rVVBCZ4r/245-sysgro-copy-for-cambridge -- please report any OOMs or similar on that card. |
|
Demo user on orca-3, there are 139597 orphaned images.
I didn't try to run any scripts to move images into a large Dataset. |
|
@joshmoore Since @chris-allan is happy that this |
|
Given that sort of image count, that level of performance degradation seems reasonable to me. |
This fixes navigation to an image with the URL webclient/?show=image-<id> when that image is not on the first page of a Dataset.
To test:
Also added a web integration test, which should pass!
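
The core idea can be sketched as follows. This is a minimal illustration, not the code in this PR: `page_for_image` is a hypothetical helper, and it assumes the full ordered list of image IDs for the container is already available and that the page size is the default of 200 mentioned in the discussion.

```python
def page_for_image(image_ids, image_id, page_size=200):
    """Return the 1-based page number that contains image_id.

    image_ids is the full, ordered list of IDs as displayed in the
    container. Returns None if the image is not in the list.
    """
    try:
        index = image_ids.index(image_id)
    except ValueError:
        return None
    return index // page_size + 1


ids = list(range(1000, 2000))      # hypothetical Dataset of 1000 images
print(page_for_image(ids, 1000))   # first image in the Dataset
print(page_for_image(ids, 1450))   # image at index 450
```

Once the page is known, the tree can load that page directly instead of always starting from the first one.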