Skip to content

Python interface improvements#74

Merged
cschwan merged 40 commits into
masterfrom
py-improved
Oct 27, 2021
Merged

Python interface improvements#74
cschwan merged 40 commits into
masterfrom
py-improved

Conversation

@alecandido
Copy link
Copy Markdown
Collaborator

@alecandido alecandido commented Oct 2, 2021

While working on stuffs that make use of python interface, new needs arose. Thus, there is room to add some improvements.

Following a list of possible ones:

@alecandido alecandido changed the base branch from pyo3-simplify to master October 4, 2021 14:30
@alecandido
Copy link
Copy Markdown
Collaborator Author

As discussed with @felixhekhorn we'll reserve Grid.__get_item__() and Grid.__set_item__() and Grid.__iter__() for managing subgrids, since Grid is a subgrids container in the first place (and a metadata container only afterwards).

@alecandido
Copy link
Copy Markdown
Collaborator Author

Most of the functions of CLI consists mainly in printing tables, something we don't need for the python interface (since we can handle objects, and it is comfortable to do it), nevertheless we are wondering if it's worth to bind pineappl_cli.helpers.

@cschwan
Copy link
Copy Markdown
Contributor

cschwan commented Oct 4, 2021

Most of the functions of CLI consists mainly in printing tables, something we don't need for the python interface (since we can handle objects, and it is comfortable to do it), nevertheless we are wondering if it's worth to bind pineappl_cli.helpers.

If you find that helpful I'd rather suggest to import the functionality into PineAPPL and remove the corresponding helper (and write a wrapper to the new function in PineAPPL). I'm definitely going to do that for the convolute function, which takes care of charge conjugation and different initial states; this also overlaps with #66. Let me know if you need more.

@alecandido
Copy link
Copy Markdown
Collaborator Author

alecandido commented Oct 4, 2021

Most of the functions of CLI consists mainly in printing tables, something we don't need for the python interface (since we can handle objects, and it is comfortable to do it), nevertheless we are wondering if it's worth to bind pineappl_cli.helpers.

If you find that helpful I'd rather suggest to import the functionality into PineAPPL and remove the corresponding helper (and write a wrapper to the new function in PineAPPL). I'm definitely going to do that for the convolute function, which takes care of charge conjugation and different initial states; this also overlaps with #66. Let me know if you need more.

I would say just convolute() (and consequently SCALES_VECTOR). I can do here or we can do separately, that's fine.

You can move yourself if you wish, or I can move it, but: where should I place? and how should I call it? (my guess would be to put it in grid.rs as a separate functions, i.e. not inside the impl block, but I don't have any good name other than convolute).

@cschwan
Copy link
Copy Markdown
Contributor

cschwan commented Oct 4, 2021

Most of the functions of CLI consists mainly in printing tables, something we don't need for the python interface (since we can handle objects, and it is comfortable to do it), nevertheless we are wondering if it's worth to bind pineappl_cli.helpers.

If you find that helpful I'd rather suggest to import the functionality into PineAPPL and remove the corresponding helper (and write a wrapper to the new function in PineAPPL). I'm definitely going to do that for the convolute function, which takes care of charge conjugation and different initial states; this also overlaps with #66. Let me know if you need more.

I would say just convolute() (and consequently SCALES_VECTOR). I can do here or we can do separately, that's fine.

You can move yourself if you wish, or I can move it, but: where should I place? and how should I call it? (my guess would be to put it in grid.rs as a separate functions, i.e. not inside the impl block, but I don't have any good name other than convolute).

My current idea is that instead of passing three functions (xfx1, xfx2 and alphas) you pass a single PdfCache object (for which I have yet to write the struct). You can then easily cache PDFs, even among several convolute calls, which will speed up a few of the subcommands.

@alecandido
Copy link
Copy Markdown
Collaborator Author

Added syntactic sugar to get and set subgrids, but the get function is not really working well.

The following code:

import pineappl
g = pineappl.grid.Grid.read("dyaa.pineappl")
s = g[0, 0, 0]
g[0, 1, 0] = s

(with dyaa.pineappl the grid generated from the example) raises with the following error

AttributeError                            Traceback (most recent call last)
<ipython-input-5-346819bef8ca> in <module>
----> 1 g[0, 1, 0] = s

~/Projects/N3PDF/pineappl/pineappl_py/env/lib/python3.9/site-packages/pineappl/grid.py in __setitem__(self, key, subgrid)
    140             raise ValueError("A tuple with `(order, bin, lumi)` is required as key.")
    141
--> 142         self.set_subgrid(*key, subgrid)
    143
    144     def set_remapper(self, remapper):

~/Projects/N3PDF/pineappl/pineappl_py/env/lib/python3.9/site-packages/pineappl/grid.py in set_subgrid(self, order, bin_, lumi, subgrid)
    122                 subgrid content
    123         """
--> 124         self.raw.set_subgrid(order, bin_, lumi, subgrid.into())
    125
    126     def __setitem__(self, key, subgrid):

AttributeError: 'builtins.PySubgridEnum' object has no attribute 'into'

Essentially the error is due to the subgrid not having been recast to any specific subgrid type when retrieved, so the subgrid.into() call is unneeded. I believe it should be recast to a meaningful type to be able to do operations, but since I don't know which one should be maybe that's a hint I'm not thinking to the correct solution.

@cschwan
Copy link
Copy Markdown
Contributor

cschwan commented Oct 4, 2021

As discussed with @felixhekhorn we'll reserve Grid.__get_item__() and Grid.__set_item__() and Grid.__iter__() for managing subgrids, since Grid is a subgrids container in the first place (and a metadata container only afterwards).

Do __get_item__ and others significantly improve the usability of Grid? I'm a bit sceptical here; I'd actually like to discourage the use of subgrids, because everything should be done using convolute, convolute_subgrid and convolute_eko. This doesn't always work (for instance in yadism), but that should be the aim.

@codecov
Copy link
Copy Markdown

codecov Bot commented Oct 4, 2021

Codecov Report

Merging #74 (714b559) into master (c737cba) will increase coverage by 10.04%.
The diff coverage is 88.46%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master      #74       +/-   ##
===========================================
+ Coverage   67.65%   77.69%   +10.04%     
===========================================
  Files          13       14        +1     
  Lines        2226     2049      -177     
===========================================
+ Hits         1506     1592       +86     
+ Misses        720      457      -263     
Impacted Files Coverage Δ
pineappl/src/fk_table.rs 38.96% <ø> (+2.37%) ⬆️
pineappl/src/lumi.rs 70.87% <ø> (+63.18%) ⬆️
pineappl/tests/drell_yan_lo.rs 100.00% <ø> (ø)
pineappl/src/grid.rs 72.57% <88.46%> (+11.72%) ⬆️
pineappl/src/subgrid.rs 96.87% <0.00%> (-0.46%) ⬇️
pineappl_cli/src/main.rs 0.00% <0.00%> (ø)
pineappl/src/bin.rs 88.19% <0.00%> (+0.54%) ⬆️
pineappl/src/lagrange_subgrid.rs 87.13% <0.00%> (+2.52%) ⬆️
pineappl/src/empty_subgrid.rs 100.00% <0.00%> (+5.88%) ⬆️
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c737cba...714b559. Read the comment docs.

@alecandido
Copy link
Copy Markdown
Collaborator Author

Do __get_item__ and others significantly improve the usability of Grid? I'm a bit sceptical here; I'd actually like to discourage the use of subgrids, because everything should be done using convolute, convolute_subgrid and convolute_eko. This doesn't always work (for instance in yadism), but that should be the aim.

Ok, this might be a point. Nevertheless, I wonder if giving write access without read access is a good idea (about __setitem__ instead is just for building abstractions, in python it is useful to build on existing abstractions, and if a Grid is a subgrid containers you expect to be able to access them, set them and iterate over them, doing it with standard functions helps the user to rely on known abstractions).

In case I can just revert and remove Grid.subgrid().

@alecandido
Copy link
Copy Markdown
Collaborator Author

alecandido commented Oct 4, 2021

Instead I'm running in more serious troubles with Grid.merge().

We decided to keep Grid non-clonable, and I agree it is a good idea. But for Grid.merge() we need to pass a second grid, and it looks like it doesn't like:

error[E0277]: the trait bound `PyGrid: Clone` is not satisfied
   --> src/grid.rs:345:40
    |
345 |     pub fn merge(&mut self, mut other: Self) {
    |                                        ^^^^ the trait `Clone` is not implemented for `PyGrid`
    |
    = note: required because of the requirements on the impl of `pyo3::FromPyObject<'_>` for `PyGrid`

Any suggestion?

P.S.: PyGrid is not clonable because it contains a Grid, that as said above it's not clonable

@cschwan
Copy link
Copy Markdown
Contributor

cschwan commented Oct 4, 2021

Instead I'm running in more serious troubles with Grid.merge().

We decided to keep Grid non-clonable, and I agree it is a good idea. But for Grid.merge() we need to pass a second grid, and it looks like it doesn't like:

error[E0277]: the trait bound `PyGrid: Clone` is not satisfied
   --> src/grid.rs:345:40
    |
345 |     pub fn merge(&mut self, mut other: Self) {
    |                                        ^^^^ the trait `Clone` is not implemented for `PyGrid`
    |
    = note: required because of the requirements on the impl of `pyo3::FromPyObject<'_>` for `PyGrid`

Any suggestion?

P.S.: PyGrid is not clonable because it contains a Grid, that as said above it's not clonable

That's a tough one, let me think about it.

@felixhekhorn
Copy link
Copy Markdown
Contributor

Mmm, I think the only solution (if we want to keep Grid non-clonable) is to provide a function merge_from_file(Grid, path) and then read the Grid in Rust ...

@cschwan
Copy link
Copy Markdown
Contributor

cschwan commented Oct 18, 2021

Mmm, I think the only solution (if we want to keep Grid non-clonable) is to provide a function merge_from_file(Grid, path) and then read the Grid in Rust ...

Yes, I agree this is how we should do it!

@alecandido
Copy link
Copy Markdown
Collaborator Author

Mmm, I think the only solution (if we want to keep Grid non-clonable) is to provide a function merge_from_file(Grid, path) and then read the Grid in Rust ...

Yes, I agree this is how we should do it!

Then I'm going to implement it immediately

This commit also removes the use of `anyhow`, for which instead
`thiserror` is used. Furthermore, it adds the method `Grid::write_lz4`,
which writes a compressed grid. This method is used in the CLI whenever
a file is written that has an extension 'lz4'.
@cschwan
Copy link
Copy Markdown
Contributor

cschwan commented Oct 26, 2021

The last item is implemented in commit dfdcdcc for the Rust API and in the CLI. Whenever you run pineappl with a subcommand that writes a file, giving the file the extension .lz4 also let's the CLI compress it.

@alecandido
Copy link
Copy Markdown
Collaborator Author

alecandido commented Oct 26, 2021

Re #74 (comment)

@cschwan tested with pineko together with @felixhekhorn, it required a very tiny fix (b1bba7e) but it's now working.

@alecandido
Copy link
Copy Markdown
Collaborator Author

Sorry to have waited for so long, but now this PR should address even point 4 with a4bf1f2, it was very simple.

Closes #80

@alecandido alecandido linked an issue Oct 27, 2021 that may be closed by this pull request
@alecandido alecandido marked this pull request as ready for review October 27, 2021 09:35
@cschwan
Copy link
Copy Markdown
Contributor

cschwan commented Oct 27, 2021

Commit 714b559 should let you select a subset of orders in a Grid when generating an FkTable with Grid::convolute_eko.

@cschwan
Copy link
Copy Markdown
Contributor

cschwan commented Oct 27, 2021

@alecandido @felixhekhorn can we merge this?

@alecandido
Copy link
Copy Markdown
Collaborator Author

I reviewed just now, and the only thing I could spot is really the migration of convolute_subgrid, but there is a TODO and we can do on master.

Then let's merge, I would say.

@cschwan cschwan merged commit 714b559 into master Oct 27, 2021
@cschwan cschwan deleted the py-improved branch October 27, 2021 13:54
@cschwan
Copy link
Copy Markdown
Contributor

cschwan commented Oct 27, 2021

@alecandido @felixhekhorn do you need another release for the Python interface to be uploaded? If so, I'm going to release it as 0.5.0-beta.2.

@alecandido
Copy link
Copy Markdown
Collaborator Author

alecandido commented Oct 27, 2021

Yes, that would be nice. Then I'm going to update dependencies in runcards, pineko. and yadism.

@felixhekhorn
Copy link
Copy Markdown
Contributor

Not necessarily, but a release would make things easier and since the last we added several new features ...
by the way, are you're sure about the naming? shouldn't it be 0.5.0b2? see https://pypi.org/project/poetry/#history

@alecandido
Copy link
Copy Markdown
Collaborator Author

I believe there is no official way of naming alpha, beta, and release_candidate (for sure it's not specified in semver). Just be consistent with one scheme.

@cschwan
Copy link
Copy Markdown
Contributor

cschwan commented Oct 27, 2021

Yes, as far as semver is concerned both are OK (I think).

@alecandido
Copy link
Copy Markdown
Collaborator Author

alecandido commented Oct 27, 2021

Ok, I just checked and the answer is definitely yes:

A pre-release version MAY be denoted by appending a hyphen and a series of dot separated identifiers immediately following the patch version. Identifiers MUST comprise only ASCII alphanumerics and hyphens [0-9A-Za-z-]. Identifiers MUST NOT be empty. Numeric identifiers MUST NOT include leading zeroes. Pre-release versions have a lower precedence than the associated normal version. A pre-release version indicates that the version is unstable and might not satisfy the intended compatibility requirements as denoted by its associated normal version. Examples: 1.0.0-alpha, 1.0.0-alpha.1, 1.0.0-0.3.7, 1.0.0-x.7.z.92, 1.0.0-x-y-z.–.

https://semver.org/#spec-item-9

or even more: @cschwan is more correct then poetry, that is not using the dash (but even the official CPython is omitting the dash...)

@cschwan
Copy link
Copy Markdown
Contributor

cschwan commented Oct 27, 2021

@alecandido I released v0.5.0-beta.2, but building the documentation fails and I can't reproduce the error: https://readthedocs.org/projects/pineappl/builds/15107369/

@alecandido
Copy link
Copy Markdown
Collaborator Author

Sorry not to have replied before, I'm glad that you already solved the issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Grid::write_lz4 method

3 participants