Skip to content

Load multiple images separately or as a polarization hypercube#106

Merged
confluence merged 25 commits intodevfrom
icchen/load_stokes_hypercube
Oct 24, 2023
Merged

Load multiple images separately or as a polarization hypercube#106
confluence merged 25 commits intodevfrom
icchen/load_stokes_hypercube

Conversation

@I-Chenn
Copy link
Copy Markdown
Contributor

@I-Chenn I-Chenn commented Apr 19, 2023

This PR is to resolve #90.

A new function, load_stokes_hypercube, is added into session.py to load (either open or append, with the usage of append argument) a hypercube with the selected stokes images.

Nevertheless, the arguments of the function are not validated yet since we don't have validator for a list (e.g. list of strings or numbers) yet. Will discuss with @confluence and see what kind of validator should be added.

The preliminary result is shown below:
https://drive.google.com/file/d/1sR8xE3NFnfYRSku6JMMXfBOp-eyyyBLU/view?usp=share_link

Comment thread carta/session.py Outdated

Parameters
----------
output_direcory : {0}
Copy link
Copy Markdown
Contributor

@kswang1029 kswang1029 Apr 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this output_directory parameter? A Stokes hypercube is a virtual image which is constructed from two, three, or four on-disk images.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be just the current working directory under the scene.

Copy link
Copy Markdown
Contributor Author

@I-Chenn I-Chenn Apr 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is because the parameter is required by openConcatFile and appendConcatFile (see the directory of the image shown below):
image
I'm not actually sure why this parameter is required here, since the directories of the input images are given by IStokesFile (see this line). And if I set this output_directory to other directory (different from the directory with stokes images), the hypercube is still able to be loaded out. Then when I try to Save Image in the File menu, the default directory is the one I just set. Thus I recognize it as the output directory.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one fundamental design problem here: the user facing API does not necessary follow the implementation of the frontend. We only expose parameters that really require user input from users. Others should just happen under the scene. You will need to think as a user when designing the API.

Comment thread carta/session.py Outdated
The direcory to output the hypercube, either relative to the session's current directory or an absolute path relative to the CARTA backend's root directory.
directories : {1}
The list of paths to the images, either relative to the session's current directory or an absolute path relative to the CARTA backend's root directory.
files : {2}
Copy link
Copy Markdown
Contributor

@kswang1029 kswang1029 Apr 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be a list of path? If the paths in the path list do not contain /, then we load the images from the current directory. If there is /, we treat all the paths as absolute path to load images. This would simply the API and improve the UX I believe.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean we combine the directory and the file name together, and then do something like:
截圖 2023-04-19 下午9 13 39
to separate them?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I think I got your point. I will make the modification tomorrow morning.

Comment thread carta/session.py Outdated
"""
return Image.new(self, path, hdu, True, complex, expression)

def load_stokes_hypercube(self, output_direcory, directories, files, polarization_type, hdu="", append=False):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a typo in multiple places: direcory -> directory

Comment thread carta/session.py Outdated
Whether the hypercube should be appended. Default is ``False``.
"""
stokes_files = []
for i in range(len(directories)):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future reference, in Python for is a "foreach", so you shouldn't ever use for i in range(len(things)). If you only need the object values, you can do for thing in things, and if you also need the index you can do for i, thing in enumerate(things).

In this case, I think that instead of taking multiple separate lists in as parameters, and combining them like this inside the function, we should accept one list of already constructed objects, one per image (this avoids a situation where the lists are different sizes or the user mixes up the element order). These objects could be dicts, but I think it would be more user-friendly to create a custom object, called e.g. StokesImage (which will be serialized to the same dict). That would allow some of the fields to have defaults, and would also make validation easier. We should probably make a separate top-level submodule for future objects like this, maybe called structs.py.

I was wrong -- we already do have a validator for iterables (IterableOf). What we don't have is a validator which just checks type (which we would need for validating the elements of the list). I will add that. I will also modify the serializer code so that it's easier to specify how our custom objects are serialized. In the meantime, you can assume that your parameter is going to be a list of dictionaries -- I'll make some more specific suggestions for creating the object.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will finish testing #107 and merge it tomorrow. When this is merged, you should be able to use IterableOf(InstanceOf(StokesImage)) for that parameter.

I still need to think about how best to handle paths -- the session tracks the wrapper's current directory independently of the GUI file browser's current directory (for complicated reasons; mostly that whenever you open a file the GUI's current directory will be updated to that file's directory, so if we made paths relative to that it would lead to very unexpected behaviour). So we calculate relative paths relative to our current directory, which the session stores. That means that we can easily calculate these paths in session and image functions, but if we want the user to be able to provide a relative path in an object outside the function, we need to do that conversion later in the function. I think I have an idea for doing this cleanly; I will write up a more detailed suggestion tomorrow.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a suggested implementation (partial and untested, and depends on the validation PR). Comments are for explanation purposes only:

###### In constants.py, before Polarization (dict keys = strings matching Polarization enum names; values = ints matching the *protobuf* enum numbering):

PROTO_POLARIZATION = {
    "I":  1,
    "Q": 2,
    # (...)
    "PANGLE": 17,
}

# Then we add an init to the Polarization enum to add this index as an additional property (like in PaletteColor):

    def __init__(self, value):
        self.proto_index = PROTO_POLARIZATION[self.name]

###### In a new structs.py:

# (imports here...)

class StokesImage:

    # It should be fine to use this decorator here
    # We validate these fields when this object is constructed, so that later we only have to validate the object type
    @validate(Constant(Polarization), String(), String())
    def __init__(self, stokes, path, hdu=""):
        self.stokes = stokes
        # Here we only split the path for now; we will make it absolute later
        self.directory, self.file_name = posixpath.split(path)
        self.hdu = hdu
        
    def json(self):
        # This is returning the object which will actually be serialized by the serializer (in this case a dictionary).
        # Note the lookup of the protobuf index through the additional property on the enum:
        return {"directory": self.directory, "file": self.file_name, "hdu": self.hdu, "polarizationType": self.stokes.proto_index}
        
###### In session.py, in your new function

    # Now we can have one list as the main parameter
    # Does it make sense to require at least two images?
    @validate(IterableOf(InstanceOf(StokesImage), min_size=2), Boolean())
    def load_stokes_hypercube(self, stokes_images, append=False):
        # Now that we're in the session, we can normalize the paths:
        for image in stokes_images:
            image.directory = self.resolve_file_path(image.directory)
        # If it really doesn't matter what the output directory is, we could use a macro
        # to get the starting directory of the file browser
        # which is used in several frontend actions that use addFrame to open generated images
        output_directory = Macro("fileBrowserStore", "startingDirectory")
        # If I'm reading the frontend code correctly, the single hdu parameter is the output hdu, and we can just use the default
        output_hdu = ""

        # Now you can get the command as before
        command = "appendConcatFile" if append else "openConcatFile"
        self.call_action(command, stokes_images, output_directory, output_hdu)

I'm assuming that you need the protobuf polarization enum numbering in the objects that you're passing to the action (this should be tested with the polarizations with indices that differ (xx, xy, etc.)).

So the API for the user would look something like this:

session.load_stokes_hypercube([
    StokesImage(Polarization.I, "path/to/imageI.fits"),
    StokesImage(Polarization.Q, "path/to/imageQ.fits"),
    StokesImage(Polarization.U, "path/to/imageU.fits"),
    StokesImage(Polarization.V, "path/to/imageV.fits"),
])

All the optional parameters are hidden, there's one parameter for each image path, and the same polarization enums are used here as in the existing functions (the mapping is done internally).

If we like this approach, we can perhaps do something similar for the function which opens multiple images (but with a different wrapper object).

@I-Chenn
Copy link
Copy Markdown
Contributor Author

I-Chenn commented Apr 21, 2023

@confluence, c4bc4f9 is a temporary commit before the merge of #107. Most your requested modification is made in this commit. I will make a test after the merge of #107.

Two items to check:

  1. Did I put PROTO_POLARIZATION at the right position?
  2. As for the validation of the stokes argument in the initial function of StokesImage, should it be PROTO_POLARIZATION instead of Polarization?

@confluence
Copy link
Copy Markdown
Collaborator

I'm going to have a look at your changes now. I have merged #107.

Comment thread carta/constants.py Outdated
Comment on lines +143 to +147
"Ptotal": 13,
"Plinear": 14,
"PFtotal": 15,
"PFlinear": 16,
"PANGLE": 17,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These all need to be in all caps -- they have to match the names in the Polarization enum here, which are in all caps because of Python naming styles.

Comment thread carta/constants.py Outdated
Comment on lines +151 to +152
def __init__(self, value):
self.proto_index = PROTO_POLARIZATION[self.name]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in the wrong place -- it's the __init__ function for the Polarization class, so it needs to be inside the class definition (and indented).

This will be called automatically for each value in the enum -- I have tested that it works for the plain class attribute style of specifying the options (YX = -8 and so on), so you shouldn't have to change the rest of Polarization.

Comment thread carta/structs.py Outdated
import posixpath

from .validation import validate, String, Constant
from .constants import Polarization, PROTO_POLARIZATION
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't import PROTO_POLARIZATION here.

Comment thread carta/structs.py Outdated

class StokesImage:

@validate(Constant(PROTO_POLARIZATION), String(), String())
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't use PROTO_POLARIZATION here -- it's not a validator class; it's just a dict. You use Polarization -- the user passes in the same enum object that they would use for polarizations elsewhere.

You get the protobuf index instead of the frontend int value by using the .proto_index attribute that we added to each of these enum values above.

Comment thread carta/session.py Outdated
"""
return Image.new(self, path, hdu, True, complex, expression)

# @validate(IterableOf(InstanceOf(StokesImage), min_size=2), Boolean())
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this should be correct.

Comment thread carta/session.py Outdated
Parameters
----------
stokes_images : {0}
The list of the paths to the images, either relative to the session's current directory or an absolute path relative to the CARTA backend's root directory.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should just be The list of images. -- the type should be filled in automatically, and the StokesImage class should have its own documentation which explains its parameters.

Comment thread carta/structs.py Outdated
@@ -0,0 +1,16 @@
import posixpath
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a top-level docstring here. For now I suggest """This module provides a collection of helper objects for grouping related values together."""

Comment thread carta/structs.py Outdated
from .constants import Polarization, PROTO_POLARIZATION


class StokesImage:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class needs a docstring. Have a look at util.Macro to see how to write it. For the main description I'd put An object which groups information about an image file to be used as a component in a Stokes hypercube.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@confluence, I added some documentation to the StokesImage class.
Nevertheless, when describing the stokes parameter, I quoted the carta.constants.Polarization object, but the stokes index of carta.constants.Polarization is not exactly the same with PROTO_POLARZATION and of course I can't specify the parameter like carta.constants.PROTO_POLARIZATION. What should I do to specify the correct index for stokes parameter?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user doesn't need to know anything about PROTO_POLARIZATION. The user passes in a Polarization enum as a parameter, not a number which matches the protobuf numbering. You are doing the mapping inside the function by extracting the index attribute from the enum object.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Longer explanation: in fact, PROTO_POLARIZATION is just used internally to set those index attributes in the Enum constructor, and is never used again.

The reason that it's not in the constructor's internal scope is 1) that would be slightly inefficient -- it would be recreated every time the constructor was called, and 2) we or the user might want to reuse this mapping for something, so it's useful to expose it. The reason it's not in the Enum's class scope is that it would pollute the Enum's namespace, and then we would have to do something else to exclude the name of the dict from the Enum values.

That's why I ended up putting those two theme palette mappings in the module's scope, and suggested the same for this mapping.

@I-Chenn
Copy link
Copy Markdown
Contributor Author

I-Chenn commented Apr 24, 2023

@confluence, I have made the requested modifications in the last few commits.

Comment thread carta/structs.py Outdated
Comment on lines +14 to +18
stokes : :obj:`carta.constants.Polarization`
The Stokes type to specify. Must be a member of :obj:`carta.constants.Polarization`
path : str
The path to the image file.
hdu : str
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These descriptions should be consistent with what would be filled in by the validation decorator (it won't be filled in here because the decorator is on the __init__ method and the docstring is on the whole class).

So it should be

a member of :obj:`carta.constants.Polarization`

and string for the other two.

Have you been able to build the docs locally?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 27031ca!
Yes, I am able to build the docs locally~

Comment thread carta/structs.py Outdated
from .constants import Polarization, PROTO_POLARIZATION


class StokesImage:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user doesn't need to know anything about PROTO_POLARIZATION. The user passes in a Polarization enum as a parameter, not a number which matches the protobuf numbering. You are doing the mapping inside the function by extracting the index attribute from the enum object.

@I-Chenn I-Chenn requested a review from confluence April 25, 2023 13:26
@I-Chenn I-Chenn marked this pull request as ready for review April 25, 2023 13:26
@confluence confluence self-assigned this May 10, 2023
@kswang1029 kswang1029 requested a review from acdo2002 May 11, 2023 12:46
Comment thread carta/session.py Outdated

Parameters
----------
stokes_images : {0}
Copy link
Copy Markdown
Contributor

@acdo2002 acdo2002 Jun 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here should add a description of StokesImage: is array in which each element is an instance of StokesImage class under carta.structs, otherwise the user may confuse what should input in here

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type is automatically filled in by the validation decorator when the documentation is generated. You should be able to see the full text if you generate the docs locally.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok! I saw it.
Screen Shot 2023-06-05 at 4 20 22 PM
But do you think still good to put the description here? Users may have a hint to find out the input structure.
Or you prefer to leave all input information in the /Users/ming-yi/carta-python/docs/build/html/index.html ?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. There should be a link here to the documentation for StokesImage; I will see if I can edit the instance validator class to output that if the class is part of the wrapper.

Copy link
Copy Markdown
Collaborator

@confluence confluence Jun 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #121. @I-Chenn, please merge dev into this PR. @acdo2002 once this change is merged, the class name in this description will be a link to the documentation for StokesImage, and that's where the structure of StokesImage should be described.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #121. @I-Chenn, please merge dev into this PR. @acdo2002 once this change is merged, the class name in this description will be a link to the documentation for StokesImage, and that's where the structure of StokesImage should be described.

Done in 707fd1e.

Copy link
Copy Markdown
Contributor

@acdo2002 acdo2002 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have confirmed the documentation now is linking the StokesImage structure description.
The testing example is here

@confluence confluence dismissed their stale review August 4, 2023 22:53

I have taken over this PR, so I am removing myself as a reviewer.

@confluence confluence marked this pull request as ready for review August 18, 2023 10:05
@confluence
Copy link
Copy Markdown
Collaborator

I have updated this to use the frontend function for guessing polarizations (this depends on frontend PR #2232). I have also added a check that the deduced polarizations have no duplicates, since the GUI won't let you assign duplicate polarizations.

Copy link
Copy Markdown
Contributor

@kswang1029 kswang1029 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried the following two cases for open_hypercube:

image_file_8 = '/Users/kswang/carta_image_pool/set_StokesCube/IRCp10216_sci.spw0.cube.I.dropdeg.manual.pbcor.fits'
image_file_9 = '/Users/kswang/carta_image_pool/set_StokesCube/IRCp10216_sci.spw0.cube.Q.dropdeg.manual.pbcor.fits'
image_file_10 = '/Users/kswang/carta_image_pool/set_StokesCube/IRCp10216_sci.spw0.cube.U.dropdeg.manual.pbcor.fits'
image_file_11 = '/Users/kswang/carta_image_pool/set_StokesCube/IRCp10216_sci.spw0.cube.V.dropdeg.manual.pbcor.fits'

image_file_8~11 are images without the a CTYPE of STOKES. This would results in errors with a long header dict plus TypeError: Cannot read properties of undefined (reading 'toUpperCase')

image_file_12 = '/Users/kswang/carta_image_pool/set_StokesCube/IRCp10216_sci.spw0.cube.I.manual.pbcor.fits'
image_file_13 = '/Users/kswang/carta_image_pool/set_StokesCube/IRCp10216_sci.spw0.cube.Q.manual.pbcor.fits'
image_file_14 = '/Users/kswang/carta_image_pool/set_StokesCube/IRCp10216_sci.spw0.cube.U.manual.pbcor.fits'
image_file_15 = '/Users/kswang/carta_image_pool/set_StokesCube/IRCp10216_sci.spw0.cube.V.manual.pbcor.fits'

image_file_12~15 are images with explicit CTYPE as STOKES and this works.

Comment thread carta/session.py Outdated
Comment thread carta/session.py Outdated

Parameters
----------
output_direcory : {0}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be just the current working directory under the scene.

Comment thread carta/session.py Outdated

Parameters
----------
output_direcory : {0}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one fundamental design problem here: the user facing API does not necessary follow the implementation of the frontend. We only expose parameters that really require user input from users. Others should just happen under the scene. You will need to think as a user when designing the API.

@confluence
Copy link
Copy Markdown
Collaborator

image_file_8~11 are images without the a CTYPE of STOKES. This would results in errors with a long header dict plus TypeError: Cannot read properties of undefined (reading 'toUpperCase')

I think that this is something which needs to be fixed in the frontend. The toUpperCase is a new addition, and may not have been tested with these files.

@kswang1029
Copy link
Copy Markdown
Contributor

image_file_8~11 are images without the a CTYPE of STOKES. This would results in errors with a long header dict plus TypeError: Cannot read properties of undefined (reading 'toUpperCase')

I think that this is something which needs to be fixed in the frontend. The toUpperCase is a new addition, and may not have been tested with these files.

but with yuhsuan/2222_stokes_guess the GUI works fine in these cases (including lower case I q u v)

@confluence
Copy link
Copy Markdown
Collaborator

but with yuhsuan/2222_stokes_guess the GUI works fine in these cases (including lower case I q u v)

That's odd. I will investigate further.

@confluence
Copy link
Copy Markdown
Collaborator

confluence commented Aug 22, 2023

I think I see what's going on -- the header I'm sending to the frontend function contains entries with no value key. I don't know why this isn't happening in the GUI -- the header should be exactly the same. Perhaps there are empty default values which are being removed by the serialization round trip. I think that the round trip can be avoided entirely; I will make a suggestion in the frontend PR.

@confluence
Copy link
Copy Markdown
Collaborator

@kswang1029 I have adjusted the code to match the latest frontend branch, and that test case now works. This approach should be much simpler and less error-prone (the wrapper now doesn't touch the header at all; everything happens on the frontend side).

@kswang1029
Copy link
Copy Markdown
Contributor

kswang1029 commented Aug 25, 2023

@confluence @YuHsuan-Hwang I retested this with updated frontend commit and carta-python commit and found an issue
If we do the following

img0 = session.open_image(image_file_3)

the image is loaded but we see image tile streaming hangs forever (so we keep seeing the green cloud at the top-right corner and not seeing the tile counts deceasing). As a result we see empty image in the viewer. Could you see if you can also reproduce the issue?

UPDATE: I guess I see why, the frontend branch is behind dev which now includes the sync_id improvement. The dev backend does have so the protobuf commits are mis-matched. @YuHsuan-Hwang could you update dev into CARTAvis/carta-frontend#2232 please?

@YuHsuan-Hwang
Copy link
Copy Markdown

@kswang1029 I have updated the branch.

@kswang1029 kswang1029 self-requested a review August 25, 2023 02:14
Copy link
Copy Markdown
Contributor

@kswang1029 kswang1029 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all good now 👍

Copy link
Copy Markdown
Collaborator

@confluence confluence left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ready to merge.

@confluence confluence merged commit 91705c2 into dev Oct 24, 2023
@confluence confluence deleted the icchen/load_stokes_hypercube branch October 24, 2023 13:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

support loading a Stokes hypercube

5 participants