Skip to content

New Common Datasets#1813

Merged
scarlehoff merged 105 commits into
masterfrom
new_commondata_collected
Feb 16, 2024
Merged

New Common Datasets#1813
scarlehoff merged 105 commits into
masterfrom
new_commondata_collected

Conversation

@scarlehoff
Copy link
Copy Markdown
Member

@scarlehoff scarlehoff commented Oct 9, 2023

This branch/PR will collect all datasets which are fully implemented in the new commondata format.

When you finish a dataset (or group of datasets) please create a PR pointing to this branch. Then the PR will be tested by either @enocera or myself, ensuring that not only the data and uncertainties are correct but also that the validphys plotting works, all necessary variables defined, etc.

Before opening a PR please ensure that:

  1. For all new datasets which have an "old" version, you've added a mapping to buildmaster/dataset_names.yml with NEW_NAME: OLD_NAME.
  2. The plotting, kinematics and theory sections are filled with all the necessary information.
  3. The nnpdf31_process matches whatever was the nnpdf31_process in the old version.
  4. If the new implementation fixes bugs or changes the dataset, include a legacy (or similar) variant in order to reproduce the old results.

(I've update this list with more points that might be important to remember!)


In principle you should be able to use the new commondata reader: #1678 to check by yourself that everything is working.

The definition of the new commondata is documented in #1708. Please let me know if anything is missing.

In practical terms, you don't need to branch out of this branch. When you are done a think that a dataset is finished, you can either open a PR only with that dataset of with a collection of them and I will deal with the merge.
Please don't change files outside of the datasets that you are implementing unless absolutely necessary.

@Radonirinaunimi
Copy link
Copy Markdown
Member

@scarlehoff, during our meeting (w/ @cschwan and @peterkrack) this afternoon we were wondering what would happen if the different variants have different number of datapoints? Especially given that ndata is a first-class key in the metadata.

@cschwan
Copy link
Copy Markdown
Contributor

cschwan commented Oct 10, 2023

Remember this happened for the infamous CDF Z measurement, where they removed the last point.

@scarlehoff
Copy link
Copy Markdown
Member Author

If I remember correctly it should not possible to have variants with different number of datapoints. The variants are for different treatment of the uncertainties and that kind of thing. If the data is different then it is a different dataset.

@scarlehoff
Copy link
Copy Markdown
Member Author

scarlehoff commented Oct 10, 2023

Remember this happened for the infamous CDF Z measurement, where they removed the last point.

That would trigger either an update with a new commit and a comment in the version_comment or a separate dataset.

@Radonirinaunimi
Copy link
Copy Markdown
Member

Ok, agreed!

comane and others added 2 commits February 12, 2024 10:38
Co-authored-by: Juan M. Cruz-Martinez <juacrumar@lairen.eu>
@scarlehoff scarlehoff force-pushed the new_commondata_collected branch from 902e9bb to cf55efb Compare February 12, 2024 09:38
@scarlehoff
Copy link
Copy Markdown
Member Author

I'm going to merge this to master.

I'll leave the scripts I've used in the buildmaster folder for future reference, they might be useful.

Copy link
Copy Markdown
Member

@Radonirinaunimi Radonirinaunimi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be indeed good to have this merged!

@scarlehoff scarlehoff merged commit 4a5ea18 into master Feb 16, 2024
@scarlehoff scarlehoff deleted the new_commondata_collected branch February 16, 2024 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants