NN layers refactor #1888
Conversation
@scarlehoff @RoyStegeman Does the plan described above look good to you? And just in case it has changed since the last time I asked: do we still want to maintain the dense_per_flavour layer?
Even if we may never end up using it, I'm afraid we should keep supporting it.
Yes. It is the burden of having published the code! But indeed,
we have promised backwards compatibility; that doesn't mean that every improvement has to affect the entire code.
Ok, no problem, working now! The issue I had with the per_flavour layer came from this old TODO about the basis_size coming from the last entry of the nodes list, while it should come from the runcard; I was overwriting it. Also, I don't think it makes sense to allow combining these layers with dropout, and indeed it wasn't possible before, so I just raise an error in that case.
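The guard described above could look roughly like the following sketch; the names (`check_layer_options`, `dropout_rate`) are hypothetical stand-ins, not the actual n3fit API:

```python
# Hypothetical sketch of the guard described above; check_layer_options,
# layer_type and dropout_rate are illustrative names, not the real n3fit API.
def check_layer_options(layer_type, dropout_rate):
    """Reject option combinations that were never supported."""
    if layer_type == "dense_per_flavour" and dropout_rate > 0:
        # dropout was never implemented for the per_flavour layers,
        # so fail loudly instead of silently ignoring the option
        raise ValueError("dropout is not compatible with dense_per_flavour layers")
    return True
```

Failing early like this keeps the unsupported combination from silently producing a fit in which the dropout option is ignored.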
scarlehoff left a comment:
lgtm, they are all suggestions for style / loops / naming
```python
inits = [
    initializer_generator(initializer_name, replica_seed, i_layer)
    for replica_seed in replica_seeds
]
layers = [
    base_layer_selector(
        layer_type,
        kernel_initializer=init,
        units=nodes_out,
        activation=activation,
        input_shape=(nodes_in,),
        **custom_args,
    )
    for init in inits
]
```
Suggested change:

```diff
-inits = [
-    initializer_generator(initializer_name, replica_seed, i_layer)
-    for replica_seed in replica_seeds
-]
-layers = [
-    base_layer_selector(
-        layer_type,
-        kernel_initializer=init,
-        units=nodes_out,
-        activation=activation,
-        input_shape=(nodes_in,),
-        **custom_args,
-    )
-    for init in inits
-]
+for replica_seed in replica_seeds:
+    init = initializer_generator(replica_seed, i_layer)
+    layers = base_layer_selector(
+        layer_type,
+        kernel_initializer=init,
+        units=nodes_out,
+        activation=activation,
+        input_shape=(nodes_in,),
+        **custom_args,
+    )
```
I think it is better for readability like this (I think you don't need inits later, right? Otherwise of course keep it).
I've also removed the initializer name.
Agreed, I think I was trying to anticipate how it will look with multi dense layers, but it doesn't matter.
Actually, your revision would need a layers = [] and a layers.append(layer), which I think makes it ugly again; how about what I have now? If you prefer the regular loop with appending, I'll change it again.
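For illustration, the two styles under discussion can be compared with a toy stand-in; `make_layer` is hypothetical and just mimics creating one layer per replica seed:

```python
# Toy comparison of the two styles; make_layer is a stand-in for
# base_layer_selector and is purely illustrative.
def make_layer(seed):
    # each "layer" just adds its seed, enough to tell the replicas apart
    return lambda x: x + seed

replica_seeds = [1, 2, 3]

# Style 1: list comprehension (no explicit append)
layers = [make_layer(seed) for seed in replica_seeds]

# Style 2: regular loop, which needs the empty list and the append
layers_alt = []
for seed in replica_seeds:
    layers_alt.append(make_layer(seed))

# Both produce the same per-replica layers
assert [layer(0) for layer in layers] == [layer(0) for layer in layers_alt]
```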
```python
# ... then apply them to the input to create the models
xs = [layer(x) for layer in list_of_pdf_layers[0]]
for layers in list_of_pdf_layers[1:]:
    if type(layers) is list:
```
You mean why the if statement is needed? I added a comment; it's because dropout is shared between layers. I could also remove the if statement and replace dropout_layer with [dropout_layer for _ in range(num_replicas)] or something.
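A minimal sketch of that branch, with plain Python callables standing in for the Keras layers (all names here are illustrative, not the real n3fit objects):

```python
# Minimal sketch of why the if statement is needed: per-replica layers come
# as a list (one layer per replica), while shared layers such as dropout are
# a single object applied to every replica. All names are illustrative.
num_replicas = 3
xs = [0.0, 1.0, 2.0]  # stand-ins for the per-replica tensors

per_replica_layers = [lambda x: 2 * x for _ in range(num_replicas)]
shared_dropout = lambda x: x  # one object shared by all replicas

for layers in [per_replica_layers, shared_dropout]:
    if isinstance(layers, list):
        # one layer per replica: pair them up
        xs = [layer(x) for layer, x in zip(layers, xs)]
    else:
        # a single shared layer (e.g. dropout): apply it to each replica
        xs = [layers(x) for x in xs]
```

The alternative mentioned above, wrapping the shared layer as `[dropout_layer for _ in range(num_replicas)]`, would indeed remove the branch, at the cost of pretending the shared layer is per-replica.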
Greetings from your nice fit 🤖 !
Check the report carefully, and please buy me a ☕, or better, a GPU 😉!
|
Please don't merge this yet!
|
Thanks, I won't merge yet! When is the next tag expected?
|
Hopefully once #1901 is merged |
And when the papers are out... |
The "..." seems to indicate that that will take a while? ;P Maybe it's worth creating a general waiting-for-next-tag branch or something, so that we can keep master as is while not blocking further development?
…nerate_dense and generate_dense_per_flavour
Co-authored-by: Juan M. Cruz-Martinez <juacrumar@lairen.eu>
Force-pushed from 0475196 to 903c75b
This PR does two things that both should leave everything identical:
1. Pull together the 3 functions that were responsible for generating the neural network layers:
- `generate_dense_network`
- `generate_dense_per_flavour_network`
- `generate_nn`

Only the last one remains; the first two had a lot of overlap.
I have also pulled the loop over replicas out of `pdf_NN_layer_generator` into `generate_nn`. This is everything up to and including this commit.
This PR may be easier to follow commit by commit.
2. Reverse the order of the loops over replicas and layers
This is the actual point: currently we do, for all replicas, for all layers, create the layer.
To accommodate the upcoming multi-replica layers, where one layer contains all replicas, the order needed to change.
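Schematically, with made-up layer specs and seeds rather than the real builder code, the reordering looks like:

```python
# Schematic of the loop reordering; the specs and seeds are made up.
replica_seeds = [0, 1]
layer_specs = ["layer_a", "layer_b", "layer_c"]

# Before: for all replicas, for all layers -> one full network per replica
per_replica = [[(spec, seed) for spec in layer_specs] for seed in replica_seeds]

# After: for all layers, for all replicas -> each "layer slot" now holds every
# replica, which is what a future multi-replica layer can collapse into one
per_layer = [[(spec, seed) for seed in replica_seeds] for spec in layer_specs]

# The same layers are created either way, just grouped differently
assert sorted(sum(per_replica, [])) == sorted(sum(per_layer, []))
```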
Status
I think there are 3 relevant choices to test for: the layer type, dropout, and the number of replicas.
The dense_per_flavour layers aren't compatible with multiple replicas or with dropout (they could be, but in the first case a check fails, and in the second case it just wasn't implemented). The dropout and replicas choices should be independent, however, so we have:
default layer:
For each of these I took a simple runcard, ran it for 100 epochs, and compared the last 3 digits of the chi2 between this branch and master (well, replica_axis_first, which is ready to be merged). (Xs mean they pass this test.)
TODO:
- Remove `generate_dense_network` and `generate_dense_per_flavour_network` once the `MultiDense` layer itself is implemented.