Skip to content

Restructure python data IO/actions#1080

Closed
wilsonmr wants to merge 2 commits into
masterfrom
restruct-pydat
Closed

Restructure python data IO/actions#1080
wilsonmr wants to merge 2 commits into
masterfrom
restruct-pydat

Conversation

@wilsonmr
Copy link
Copy Markdown
Contributor

@wilsonmr wilsonmr commented Feb 3, 2021

in #1052 I added some providers for the loading of commondata/covmats

I think that the flat structure will make this harder to digest as we add more things. My idea would be to also add the python make replica stuff here as well as the corresponding actions. This is just a suggestion, might not be the best way to do things.

Obviously a sticking point is that the convolution is not contained here, but the action which leverages it is. I personally would be in favour of moving convolution here as well as the fkparser. But perhaps this isn't the best way to organise things.

My endgame idea was that everything which leads up to (and includes) the tuple of data and theory aka results is contained here. Eventually it would mean that we could slim down validphys.covmats and validphys.results to contain things which are outputted to reports.

Whilst I was at it, I added some stuff to the docs on recent python data object developments.

@Zaharid
Copy link
Copy Markdown
Contributor

Zaharid commented Feb 3, 2021

Not convinced about the subfolder. Seems to me a lot of validphys fits under

results_providers.py
Bridges between underlying functions concerned with:
 - loading theory predictions and data
 - constructing covariance matrices
 - generating pseudodata
and actions which can be accessed by other actions/providers.

@wilsonmr
Copy link
Copy Markdown
Contributor Author

wilsonmr commented Feb 3, 2021

Well perhaps I worded that too loosely but I disagree.

I think there's a clear subset of providers which are consumed by other providers or actions (similar to fixtures in pytest). In terms of identifying what these are I would say a decent proportion of them would be used by both n3fit or validphys. I suppose in that sense I could cut the theory predictions part from this and put the t0 predictions in the results_providers.covmat module. It's clear (to me anyway) that something like abs_chi2_data does not belong here since it only consumes one of these actions and is only useful to actions which are plotting or tabulating chi2.

I find the subdir structure of this particular selection of modules easier to navigate than them either being contained in one mega file a la validphys.results or scattered around randomly in the flat directory structure.

@wilsonmr
Copy link
Copy Markdown
Contributor Author

wilsonmr commented Feb 4, 2021

Obviously apart from the docs this doesn't add anything new it's just copy and paste to gather certain things up.

Does anybody else have an opinion of putting these actions together here or leaving them where they are? @RosalynLP @siranipour @voisey ? Happy to accept this is a bad suggestion 😆

@RosalynLP
Copy link
Copy Markdown
Contributor

@wilsonmr I actually really like this - I think it is a lot clearer to me to have all these actions collected together, and prevents tedious scrolling and searching through multiple files to look for the relevant actions for playing around with the data.

Copy link
Copy Markdown
Contributor

@siranipour siranipour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is probably not a bad shout. But for sake of completeness, I think we should add fkparser and convolution before merging. Also, the name results_provider implies that it's just going to provide for results.py. Is this necessarily true? I can see e.g n3fit using quite a few of this providers. Maybe something like fit_utils?

@wilsonmr
Copy link
Copy Markdown
Contributor Author

wilsonmr commented Feb 9, 2021

The name coule probably be improved, results_providers specifically was referring to everything in here would go into the result or dataset_inputs_results tuple of data/theory which means:

  • data centrals
  • covmats
  • theory preds

Only the former 2 really get used in a fit, whereas the latter is more the result of the fit, which I guess is why it was called result in the first place.

I am starting to wonder if we really need a subdir or if we could simply have:

  • covmats_utils (covmat loading/construction funcs EDIT: anything which isn't an action)
  • covmats (covmat providers, as is with utils moved out)
  • commondataparser (cd loading as is)
  • commondata (pretty much what the results_providers.py was)

Then convolution can remain as is. I think then it would be obvious enough where things are. Thoughts?

@wilsonmr wilsonmr marked this pull request as draft February 17, 2021 10:54
@wilsonmr
Copy link
Copy Markdown
Contributor Author

Closed in favour of #1098

@wilsonmr wilsonmr closed this Feb 25, 2021
@scarrazza scarrazza deleted the restruct-pydat branch August 27, 2021 21:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants