Added filter cuts on top of new Main #95

Open
FrancescoMerlotti wants to merge 8 commits into main from added_filter_cuts_2

Conversation

@FrancescoMerlotti
Collaborator

Added validphys.config.parse_added_filter_rules to parse the added_filter_rules in the vp-setupfit runcard, and updated the validphys.config.produce_rules function.
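For readers skimming the diff, here is a minimal sketch of the shape of the new parser. The two fields on `AddedFilterRule` are hypothetical placeholders for illustration; the real dataclass lives in validphys.filters and carries the full rule specification.

```python
from dataclasses import dataclass

# Hypothetical two-field stand-in for validphys.filters.AddedFilterRule;
# the real dataclass has more fields.
@dataclass(frozen=True)
class AddedFilterRule:
    dataset: str
    rule: str

def parse_added_filter_rules(rules):
    """Build a tuple of AddedFilterRule from the runcard entries,
    or return None when no added_filter_rules key is given."""
    if not rules:
        return None
    return tuple(AddedFilterRule(**rule) for rule in rules)
```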

Member

@comane comane left a comment

Hi @FrancescoMerlotti, a couple of comments

Comment on lines +1364 to +1365
from validphys.filters import AddedFilterRule
return tuple(AddedFilterRule(**rule) for rule in rules) if rules else None
Member

Is there a circular import error if you remove it from within the function?
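For reference, a generic, self-contained illustration of why an import placed inside the function body sidesteps an import cycle that a module-level import would hit. The module names here are toy stand-ins built at runtime, not the real validphys layout:

```python
import sys
import types

def make_module(name, source):
    """Register a module under `name` and execute its source, so the two
    toy modules below can import each other like files on disk."""
    mod = types.ModuleType(name)
    sys.modules[name] = mod
    exec(source, mod.__dict__)
    return mod

# 'config_demo' defers its import of 'filters_demo' into the function body,
# so the cycle between the two modules only resolves at call time.
make_module("config_demo", """
def produce_rules(rules):
    from filters_demo import AddedFilterRule  # deferred import breaks the cycle
    return tuple(AddedFilterRule(r) for r in rules)
""")

make_module("filters_demo", """
import config_demo  # back-reference that creates the import cycle

class AddedFilterRule:
    def __init__(self, rule):
        self.rule = rule
""")

import config_demo
rules = config_demo.produce_rules(["p_T < 150"])
```

With the import at module level inside `config_demo`, the same cycle could fail at import time depending on which module is loaded first, which is the usual reason for keeping such an import local.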

Comment on lines +408 to +418

# test for empty datasets
while len(level0_commondata_wc) != len(level1_commondata_instances_wc):
to_append = []
lv1_names = [lv1_cd.setname for lv1_cd in level1_commondata_instances_wc]
for lv0_cd in level0_commondata_wc:
if lv0_cd.setname not in lv1_names:
to_append.append(lv0_cd)
for cd in to_append:
level1_commondata_instances_wc.append(cd)

Member

why is this being touched?
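If the block is kept, the same intent can be expressed in a single pass without the while loop; a sketch, using a hypothetical minimal stand-in for the commondata objects (only `.setname` is used here):

```python
# Hypothetical stand-in for the commondata objects in the quoted block.
class CommonData:
    def __init__(self, setname):
        self.setname = setname

def append_missing(level0, level1):
    """Append to level1 every level-0 dataset whose setname it lacks,
    so both lists cover the same datasets (single pass, no while loop)."""
    lv1_names = {cd.setname for cd in level1}
    level1.extend(cd for cd in level0 if cd.setname not in lv1_names)
    return level1
```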

@ElieHammou
Collaborator

Hi Francesco,
Thanks for working on this. I have just launched a fit with this PR to benchmark the result against something I have already done with #86.

@ElieHammou
Collaborator

I have found a bug @FrancescoMerlotti,
In this PR n3fit adds a copy of datacuts_theory_closuretest_fitting_groups_covmat.csv to each replica folder, which blows up the size of the fit folder (from about 300 MB to 20 GB...). Hopefully it should not be too hard to fix.

[Screenshot, 2025-05-23 13:59:20]

@ElieHammou
Collaborator

On the physics side, I have run a contaminated closure test with both branches, and the cuts do not seem to be applied on the new branch.

Here is a luminosity plot which shows it:
https://vp.nnpdf.science/gPWXviGvT-u4R5CrmU05Jw==

@ElieHammou
Collaborator

After investigation, I find that the high-energy points have been cut in the DATA files but not in the FILTER ones.

[Screenshot, 2025-05-23 14:13:59]

In a closure test the fit is performed on the filter files, is that correct?

@FrancescoMerlotti
Collaborator Author

The filter file is full of 0s where the cuts are applied, hence it should be alright. I am trying to understand where datacuts_theory_closuretest_fitting_groups_covmat.csv is outputted.

@ElieHammou
Collaborator

You are right about the filter file. Now I am very confused as to how my new-branch fit replicates the no-cut fit...
The luminosity plot clearly shows that the fit is seeing constraints above 2.5 TeV, I think.

@FrancescoMerlotti
Collaborator Author

I am a bit confused as well.
I think I solved the issue with the export of the covmat: it is a table decorator in covmats.py that produces the output.
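For context on that file, here is a toy illustration (not the actual reportengine implementation) of how a table decorator that dumps its return value to CSV on every call can leave one copy of the table in each output folder it is invoked for:

```python
import csv
import functools
import pathlib
import tempfile

def table(func):
    """Toy stand-in for a reportengine-style @table decorator: besides
    returning the rows, it also writes them to a CSV named after the
    wrapped function, once per output directory it is called with."""
    @functools.wraps(func)
    def wrapper(outdir, *args, **kwargs):
        rows = func(*args, **kwargs)
        path = pathlib.Path(outdir) / f"{func.__name__}.csv"
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows(rows)
        return rows
    return wrapper

@table
def groups_covmat():
    # Tiny stand-in table; the real covmat is far larger.
    return [["x", "y"], ["1.0", "0.1"]]

# Calling it once per (replica) directory writes one CSV per directory.
outdir = tempfile.mkdtemp()
rows = groups_covmat(outdir)
written = (pathlib.Path(outdir) / "groups_covmat.csv").exists()
```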

@FrancescoMerlotti
Collaborator Author

I suggest that you try with a more stringent cut, something like 180 GeV

@FrancescoMerlotti
Collaborator Author

I am currently running a fit without any datapoint above Q = 150 GeV

@ElieHammou
Collaborator

By the way, does the fit still error out if all the datapoints of an observable are cut? It is definitely the case on the previous branch.

@FrancescoMerlotti
Collaborator Author

Now it should be OK; for example, with a 150 GeV cut all the data in the HLLHC datasets are cut.

@FrancescoMerlotti
Collaborator Author

I ran the kinematic coverage on two fits generated with this and the old branch, and the cuts seem to be applied in the same way.
https://vp.nnpdf.science/OoA-nvBbQJ2-5Lh45Eeh9g==

@ElieHammou
Collaborator

Hi @FrancescoMerlotti,
Sorry I left this review for a bit. Were you able to benchmark the fits with 150 GeV cuts on the old branch and this new one?

@ElieHammou
Collaborator

Hi @FrancescoMerlotti,
I have just tried again to implement a cut with this branch and I find that it works very well :)
https://vp.nnpdf.science/e7v15jX4T5OVCklNKAZnrQ==

I am not sure what happened last time, I must have messed up something in my fit or failed to pull the latest code updates, my bad...

I am happy to merge it now.

@FrancescoMerlotti
Collaborator Author

> Hi @FrancescoMerlotti, I have just tried again to implement a cut with this branch and I find that it works very well :) https://vp.nnpdf.science/e7v15jX4T5OVCklNKAZnrQ==
>
> I am not sure what happened last time, I must have messed up something in my fit or failed to pull the latest code updates, my bad...
>
> I am happy to merge it now.

Glad to hear this, lovely! We can merge after we finish the runs, or do you want to do that before?

3 participants