Skip to content

improve graph operation performance#4398

Merged
jburel merged 1 commit intoome:developfrom
mtbc:adjust-graph-type-query
Feb 24, 2016
Merged

improve graph operation performance#4398
jburel merged 1 commit intoome:developfrom
mtbc:adjust-graph-type-query

Conversation

@mtbc
Copy link
Copy Markdown
Member

@mtbc mtbc commented Jan 6, 2016

With thanks to @joshmoore, adjust how graph traversal uses Hibernate to determine the exact class of model objects. Testing is by CI.

--no-rebase

(Supersedes #4378 because I don't have a reopen button.)

@jburel
Copy link
Copy Markdown
Member

jburel commented Jan 19, 2016

@mtbc: did that PR get tested?

@mtbc
Copy link
Copy Markdown
Member Author

mtbc commented Jan 20, 2016

@jburel: @joshmoore will try it out via #4400 once it's a better time to challenge the IDR server.

@jburel
Copy link
Copy Markdown
Member

jburel commented Feb 10, 2016

@joshmoore did you have a chance to try it out?

@jburel
Copy link
Copy Markdown
Member

jburel commented Feb 17, 2016

Still to be considered for 5.2.2?

@mtbc
Copy link
Copy Markdown
Member Author

mtbc commented Feb 17, 2016

@joshmoore: ping? (Want to try it out in some non-IDR way?)

@joshmoore
Copy link
Copy Markdown
Member

So far I've only been able to do some dry-run testing, which was ok, but I couldn't reproduce the massively slow situation that I had previously (it's since been cleaned up). I'm happy to give this a try, but a safer approach may be to have someone sign off on it in terms of correctness (i.e. are all graph operations in the last time correct and not slower).

@mtbc
Copy link
Copy Markdown
Member Author

mtbc commented Feb 17, 2016

Fine with me if you want to try something like, exclude this, run integration tests, include it and run them again, to compare timing.

@mtbc
Copy link
Copy Markdown
Member Author

mtbc commented Feb 17, 2016

Also happy to try and help with scripts like, comment on every one of the plate wells, or whatever, if that helps with constructing test data.

@joshmoore
Copy link
Copy Markdown
Member

Note: likely also to be tested by @jballanc and @chris-allan for widening DB support.

@mtbc
Copy link
Copy Markdown
Member Author

mtbc commented Feb 22, 2016

Without this PR,

 processor: delete ome.model.fs.FilesetEntry[1]
 processor: delete ome.model.fs.FilesetJobLink[1,2,3,4,5]
 processor: delete ome.model.display.Thumbnail[1]
 processor: delete ome.model.jobs.JobOriginalFileLink[33]
 processor: delete ome.model.core.Channel[1]
 processor: delete ome.model.display.ChannelBinding[1]
 processor: delete ome.model.annotations.ImageAnnotationLink[1,2]
 processor: delete ome.model.stats.StatsInfo[1]
 processor: delete ome.model.jobs.MetadataImportJob[18]
 processor: delete ome.model.jobs.PixelDataJob[19]
 processor: delete ome.model.annotations.CommentAnnotation[1]
 processor: delete ome.model.annotations.LongAnnotation[2]
 processor: delete ome.model.core.OriginalFile[41]
 processor: delete ome.model.core.OriginalFile[40]
 processor: delete ome.model.core.LogicalChannel[1]
 processor: delete ome.model.display.RenderingDef[1]
 processor: delete ome.model.jobs.IndexingJob[21]
 processor: delete ome.model.jobs.ThumbnailGenerationJob[20]
 processor: delete ome.model.jobs.UploadJob[17]
 processor: delete ome.model.core.Pixels[1]
 processor: delete ome.model.display.QuantumDef[1]
 processor: delete ome.model.core.Image[1]
 processor: delete ome.model.fs.Fileset[1]

With this PR,

 processor: delete ome.model.fs.FilesetEntry[2]
 processor: delete ome.model.fs.FilesetJobLink[6,7,8,9,10]
 processor: delete ome.model.display.Thumbnail[2]
 processor: delete ome.model.jobs.JobOriginalFileLink[34]
 processor: delete ome.model.core.Channel[2]
 processor: delete ome.model.display.ChannelBinding[2]
 processor: delete ome.model.annotations.ImageAnnotationLink[3,4]
 processor: delete ome.model.stats.StatsInfo[2]
 processor: delete ome.model.jobs.MetadataImportJob[23]
 processor: delete ome.model.jobs.PixelDataJob[24]
 processor: delete ome.model.annotations.CommentAnnotation[3]
 processor: delete ome.model.annotations.LongAnnotation[4]
 processor: delete ome.model.core.OriginalFile[44]
 processor: delete ome.model.core.OriginalFile[43]
 processor: delete ome.model.core.LogicalChannel[2]
 processor: delete ome.model.display.RenderingDef[2]
 processor: delete ome.model.jobs.IndexingJob[26]
 processor: delete ome.model.jobs.UploadJob[22]
 processor: delete ome.model.jobs.ThumbnailGenerationJob[25]
 processor: delete ome.model.core.Pixels[2]
 processor: delete ome.model.display.QuantumDef[2]
 processor: delete ome.model.core.Image[2]
 processor: delete ome.model.fs.Fileset[2]

@chris-allan: These log lines report the class and IDs about to be passed to the method that @jballanc had a breakpoint set in.

@chris-allan
Copy link
Copy Markdown
Member

So if I'm reading this right, the only difference in order is that with this PR the deletion of ome.model.jobs.UploadJob is before ome.model.jobs.ThumbnailGenerationJob? Coincidence? Differences in lexical sorting?

The reason why we have multiple instances of ome.model.core.OriginalFile deletion is because we are not performing an overarching reduce step, correct?

@mtbc
Copy link
Copy Markdown
Member Author

mtbc commented Feb 22, 2016

Difference in order: there is some initial hashing by both name and ID before the class names are used as keys in a hash table. My guess would be that a change in order caused by ID being in the mix for the first loop caused the instances of those two classes to be added to the same bucket in HashMultimap in a different order. Certainly the code isn't meaning to relatively order those two; they do end up in the same "batch" before we start going class-by-class.

The multiple original file deletions is funny: we batch them in reverse order of path length to avoid the risk of trying to delete a directory before its contents. Update: Ah, yes, the import log is in the directory above the actual image file I'd imported.

@pwalczysko
Copy link
Copy Markdown
Member

Tested plate with 16 wells and > 100 ROIs per well on first 5 wells, > 10 Tags, > 10 Attachments for each of the 16 wells, Deleted within 2.182 secs.
Also tested 96 well plate, deleted 5.353 secs.
Bruker dataset within 2 -3 sec (many accompanying files).
Lei 56-strong MIF was the longest with 7.697 secs.
Also tried Moves of all the above. All went fine.

@jburel
Copy link
Copy Markdown
Member

jburel commented Feb 24, 2016

Thanks all merging.

jburel added a commit that referenced this pull request Feb 24, 2016
improve graph operation performance
@jburel jburel merged commit 49d1a2b into ome:develop Feb 24, 2016
@mtbc mtbc deleted the adjust-graph-type-query branch February 24, 2016 10:55
@jburel jburel mentioned this pull request Feb 25, 2016
@jburel jburel added this to the 5.2.2 milestone Feb 29, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants