Handling OMERO and Bio-Formats Java dependencies #1772

@chris-allan

Our team has been plowing through various integration work with CellProfiler, Bio-Formats and OMERO. This involves using the existing CellProfiler-OMERO integration in two contexts, both for ourselves and on behalf of our users: the classical, user-focused graphical user interface, and headless runs launched from OMERO.scripts.

While running pipelines in anger and on large datasets (plates in the neighborhood of 384 well, 10 field, 2048x2048, 5 channels as an example), headless or not, we have come across some bugs in OmeroReader that need to be fixed. This class comes from OMERO's Blitz.jar and is currently bundled into the prokaryote mega JAR. As CellProfiler communicates with OMERO through a CellProfiler (Python) -- javabridge --> Bio-Formats (Java) --> OMERO (Java) conduit, we need ways to affect and confidently state to CellProfiler which JARs we want used when CellProfiler extracts pixel data from OMERO. We've played a lot with this over the past few months.

This is where we start running into issues with influencing the CellProfiler CLASSPATH. Recently we saw #1733 go in, which understandably attempts to manage as many of CellProfiler's Java dependencies as possible under one roof via PyPI. Completely sensible and much appreciated! Unfortunately, that has the downstream effect of putting these dependencies into a single walled garden that we cannot affect, save by blowing it away entirely and explicitly defining, on our own CLASSPATH, all the JARs we want to use. This is compounded by the fact that JARs on the CLASSPATH in the current user's environment are appended to the CLASSPATH that CellProfiler ends up using. Because Java CLASSPATH resolution returns the first match, we are unable to replace one or more classes, or even entire JARs, whose classes are present in the mega JAR.
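To make the first-match behaviour concrete, here is a minimal Python sketch that mimics it. It is only an illustration, not how the JVM or CellProfiler is implemented: `first_match` is a hypothetical helper, and it assumes every CLASSPATH entry is a JAR file (directories and wildcards are ignored).

```python
import os
import zipfile


def first_match(classpath, class_name):
    """Return the first JAR on `classpath` that contains `class_name`.

    `classpath` is an ordered list of JAR paths and `class_name` is a fully
    qualified Java class name. This hypothetical helper only imitates the
    "first match wins" lookup order; it does not load any classes.
    """
    entry = class_name.replace(".", "/") + ".class"
    for jar in classpath:
        if not os.path.isfile(jar):
            continue  # skip missing entries rather than failing
        with zipfile.ZipFile(jar) as zf:
            if entry in zf.namelist():
                return jar
    return None
```

If a patched JAR containing a fixed class is listed after the mega JAR, `first_match` returns the mega JAR and the patched copy is never consulted. Listing the patched JAR first reverses the outcome.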

Furthermore, the versions and transitive dependencies that are pulled in are fixed. Using OMERO as an example, these are pinned to OMERO 5.1.3-ice35-b52 at the time of this writing. OMERO 5.1.4 was released in September 2015, 5.2.0 in November and 5.2.1 in December. It is essential to the function of the CellProfiler-OMERO integration that, at a minimum, the major versions of these client JARs match those of the server. We are not suggesting that it is the job of the CellProfiler project to come up with a solution to manage these version differences. Again, we just need a way to state robustly, properly, and with the CellProfiler project's knowledge which versions we want used.
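As a rough illustration of the version constraint, the check below compares a client JAR version string against a server version. It is a sketch under a stated assumption: that "major version" means the first two numeric components (so 5.1.x and 5.2.x are treated as incompatible). The function names are hypothetical and are not part of OMERO or CellProfiler.

```python
import re


def release_series(version):
    """Extract (major, minor) from a version string such as '5.1.3-ice35-b52'.

    Assumption: the first two numeric components identify the release series.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % (version,))
    return int(match.group(1)), int(match.group(2))


def same_series(client_version, server_version):
    """Return True when client and server belong to the same release series."""
    return release_series(client_version) == release_series(server_version)
```

Under that assumption, the pinned `5.1.3-ice35-b52` client would be accepted against a 5.1.4 server and rejected against a 5.2.0 or 5.2.1 server.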

Our team is quite prepared to scratch our own itch and contribute to resolving these issues, but the number of moving parts here is quite substantial, especially in light of CellProfiler's direct dependency on Bio-Formats. As far as the CellProfiler codebase is concerned, we're feeling a bit out of our depth. Before we embark on a pull request, a few questions:

  • Are we completely misrepresenting the situation? Is there something essential that we have overlooked?
  • Should we bring the user's environmental CLASSPATH to the front of the line via changes to cpjvm.py?
  • What are your thoughts about OMERO and Bio-Formats version management in the context of prokaryote.jar? Should "advanced" users just delete this and manage the versions of these dependencies ourselves via our own CLASSPATH environment variables?
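For the second question, the ordering change we have in mind could look roughly like the sketch below. It is not the current contents of cpjvm.py: `build_class_path`, its parameters and the example paths are all hypothetical, and it only shows user-supplied CLASSPATH entries being placed ahead of the bundled JARs.

```python
import os


def build_class_path(bundled_jars, user_classpath=None, user_first=True):
    """Combine bundled JARs with the user's CLASSPATH into one string.

    `bundled_jars` is a list of JAR paths shipped with the application.
    `user_classpath` defaults to the CLASSPATH environment variable.
    With `user_first=True` the user's entries come first, so under
    first-match resolution they shadow classes in the bundled JARs.
    """
    if user_classpath is None:
        user_classpath = os.environ.get("CLASSPATH", "")
    user_entries = [e for e in user_classpath.split(os.pathsep) if e]
    bundled = list(bundled_jars)
    ordered = user_entries + bundled if user_first else bundled + user_entries

    # Drop duplicates while keeping the first occurrence of each entry.
    seen = set()
    result = []
    for entry in ordered:
        if entry not in seen:
            seen.add(entry)
            result.append(entry)
    return os.pathsep.join(result)
```

With `user_first=False` the function reproduces the append behaviour described above, which makes it easy to compare the two orderings side by side.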

Thanks!

/cc @emilroz, @joshmoore, @jburel, @sbesson, @bramalingam
