Group datasets by experiment PLOTTING label #427
@@ -103,6 +103,37 @@ def check_cuts_considered(use_cuts):

```python
        raise CheckError(f"Cuts must be computed for this action, but they are set to {use_cuts.value}")


@make_argcheck
def check_dataset_cuts_match_theorycovmat(dataset, fitthcovmat):
    if fitthcovmat:
        ds_index = fitthcovmat.load().index.get_level_values(1)
```
**Contributor:** Ideally we would keep very expensive things outside the checks, and instead have a way of loading metadata on one side for verification purposes and full big objects on the other side. Anyway, that is just a comment.

**Contributor:** E.g. parquet files have something like https://arrow.apache.org/docs/python/generated/pyarrow.parquet.read_metadata.html

**Author:** I guess ideally the actual cuts would be stored in the metadata, so that not only could the metadata be used here, but the check would also be better than just comparing the number of points.

**Contributor:** Another option is knowing the namespace that produced the cuts, as per #224. But probably we should have both. At some point I thought about keeping the point index always relative to the full dataset (e.g. in experiments_index), but I think I decided against it because nnfit doesn't do that, and it could be confusing. We should probably revisit that.

**Author:** Yeah, sure. I can see pros and cons for this - could be worth looking at, though.
```python
        ncovmat = (ds_index == dataset.name).sum()

        cuts = dataset.cuts
        if cuts:
            ndata = len(dataset.cuts.load())
        else:
            ndata = dataset.commondata.ndata
        check(ndata == ncovmat)


@make_argcheck
def check_experiment_cuts_match_theorycovmat(experiment, fitthcovmat):
    for dataset in experiment.datasets:
        if fitthcovmat:
            ds_index = fitthcovmat.load().index.get_level_values(1)
            ncovmat = (ds_index == dataset.name).sum()

            cuts = dataset.cuts
            if cuts:
                ndata = len(dataset.cuts.load())
            else:
                ndata = dataset.commondata.ndata
            check(ndata == ncovmat)


@make_argcheck
def check_have_two_pdfs(pdfs):
    check(len(pdfs) == 2, 'Expecting exactly two pdfs.')
```
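The counting step used in both checks (take level 1 of the covmat index, compare it to the dataset name, and sum the matches) can be seen in isolation with a toy pandas MultiIndex. The experiment and dataset names below are invented for the example, and it assumes, as the diff does, that the dataset name sits in level 1 of the index.

```python
import pandas as pd

# Toy stand-in for the theory covmat index: level 0 is the experiment,
# level 1 the dataset name, level 2 the point index within the dataset.
index = pd.MultiIndex.from_tuples(
    [
        ("DIS", "NMC", 0),
        ("DIS", "NMC", 1),
        ("DIS", "NMC", 2),
        ("DIS", "SLAC", 0),
    ]
)

ds_index = index.get_level_values(1)
# Number of covmat rows belonging to the "NMC" dataset:
ncovmat = (ds_index == "NMC").sum()
print(ncovmat)  # 3
```

This count is then compared against the number of data points that survive the cuts, which is why the check only catches mismatched sizes rather than mismatched cuts, as the comment thread above notes.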
**Contributor:** We definitely need to make it so that this information is stored somewhere and one does not need to duplicate it all the time.