Skip to content

make pseudodata_table correctly deal with multiple replicas#2034

Merged
RoyStegeman merged 5 commits into
masterfrom
rs-quickfix
Apr 11, 2024
Merged

make pseudodata_table correctly deal with multiple replicas#2034
RoyStegeman merged 5 commits into
masterfrom
rs-quickfix

Conversation

@RoyStegeman
Copy link
Copy Markdown
Member

For the thcovmat alphas stuff it makes a big difference whether I use the central data or the average over data replicas. While looking into it I want to use this function with multiple replicas.

replicas_nnseed_fitting_data_dict = collect("replica_nnseed_fitting_data_dict", ("replicas",))
groups_replicas_indexed_make_replica = collect(
"indexed_make_replica", ("group_dataset_inputs_by_experiment", "replicas")
"indexed_make_replica", ("replicas", "group_dataset_inputs_by_experiment")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the change of order necessary?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thin it makes what we do in pseudodata_table more readable. There we group the entries to groups_replicas_indexed_make_replica corresponding to a given replica. I think that's easier to understand the way it's done now than if we had to take e.g. index 0 and then skip a number of indexes equal to the number of groups to get the second input corresponding to the same replica

Comment thread validphys2/src/validphys/n3fit_data.py Outdated
Comment thread validphys2/src/validphys/n3fit_data.py Outdated
df = [
pd.concat(groups_replicas_indexed_make_replica[i : i + groups_per_replica])
for i in range(0, len(groups_replicas_indexed_make_replica), groups_per_replica)
]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't you achieve this with a reshape or permutation of groups_replicas_indexed_make_replica

Copy link
Copy Markdown
Member Author

@RoyStegeman RoyStegeman Apr 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because groups_replicas_indexed_make_replica is a list of indexed_make_replica containing a number of dataframes equal to number_of_replicas x number_of_datagroups. It's not a really clean input.

Here I group the list items (all different data groups) that correspond to the same replica into a single dataframe for each replica

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just complicating your life here, but would it be possible to do something along the lines of

np.array(groups_replicas_indexed_make_replica).reshape(replicas, groups_per_replica) ?

(or the other way around)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because it's a list of dataframes and I do need to retain the information on the labels

@RoyStegeman RoyStegeman merged commit 0207a00 into master Apr 11, 2024
@RoyStegeman RoyStegeman deleted the rs-quickfix branch April 11, 2024 14:16
@scarlehoff scarlehoff mentioned this pull request May 20, 2024
33 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants