
n3fit merge-ready PR#499

Merged
Zaharid merged 15 commits into master from n3fit-ProductionVersion
Jul 19, 2019

Conversation

@scarlehoff
Member

@scarlehoff scarlehoff commented Jul 3, 2019

As requested during today's meeting. This PR supersedes #461 containing all the new code from n3fit in the form of 3 very elegant commits.

Note: the documentation won't compile automatically (because obviously this is not merged with doc_folders), but it should work by just adding the following line to the makefile

 sphinx-apidoc -o ./source/modules/n3fit ../../n3fit/n3fit

(writing it here mainly for my own convenience)

The how to for n3fit can still be read in the text of PR #461.

If this PR works there will be a conda package in the server called n3pdf. If it doesn't work... then I'll be sad because the PR won't be so elegant anymore!

In the next PRs...

There are several things that are added in this PR because they work but I don't think they are production-ready, this corresponds to Statistics.py and the hyperopt routines.

So after this one is merged there will be two more PRs:

Statistics

For lack of a better name. This will break what is now in the Statistics module into several different files, probably something along the lines of: stopping, animations, positivity checks.

Hyperopt

The hyperoptimization capabilities work perfectly fine, but the analysis routines are a bit of a mess since they were written as part of the analysis and were constantly changing. Now that we have a final version a lot of refactoring is needed to have them in some nice form.

Tests

A nice regression test one or two minutes long would help us know when things change (although I am not sure we would want to count a difference in a fit regression test as a failure).

@scarlehoff scarlehoff requested a review from Zaharid July 3, 2019 15:57
@scarlehoff
Member Author

scarlehoff commented Jul 3, 2019

Ok, stupid question @Zaharid: I copied the way VP is set up in the cmake recipe

option(VP_DEV "validphys in developer mode" ON)
option(N3_DEV "n3fit in developer mode" ON)

but shouldn't the default be OFF? Naively I would think we would want the default not to be pip install -e?

EDIT: Indeed, the default is off, I missed the VP_DEV=off flag.

@Zaharid
Contributor

Zaharid commented Jul 4, 2019

(Had replied by email, but don't see it here as a comment):

The reasoning was that you are supposed to use the conda package for deployment. But if you bother to install from source, then you probably want to modify the source.

In any case, this discussion is making me think that perhaps we really want one python package rather than several. That is a net win for a user of nnfit because it would automatically set up the validphys dependency, and it is ok for validphys as long as the extra dependencies don't cause too many problems.

@Zaharid
Contributor

Zaharid commented Jul 5, 2019

In fact, how difficult would it be to move everything into the existing python package? It would be a net win for the infrastructure because you wouldn't have to worry about these kinds of issues, and a win for users of n3fit because it is one less thing to set up, but it would imply that tensorflow and keras become a dependency of everything.

@scarlehoff
Member Author

The only thing you need to do is to add the n3fit directory to the validphys setup.py, right? Everything else should be automatic, the nnpdf conda package is just a subset of the n3pdf package.

@scarlehoff
Member Author

Do you want me to do it, or is this harmonization something you prefer to do yourself?

@Zaharid
Contributor

Zaharid commented Jul 5, 2019 via email

@scarlehoff
Member Author

scarlehoff commented Jul 5, 2019

The two solutions I can see mean either putting the n3fit directory together with validphys (bad idea, since we would also be mixing in evolven3fit, the runcards and the hyperopt plotting) or having a very clunky setup.py (bad idea because it defeats the purpose).

So I think it is better to have (for now) two separate packages. The conda package can still be just one.

note: if there is some setuptools syntax that allows for having two separate packages in just one beautiful setup file I have failed to find it.

The only thing you need to do [...] Everything else should be automatic [...]

I was young and inexperienced...

@scarlehoff scarlehoff mentioned this pull request Jul 5, 2019
@wilsonmr
Contributor

wilsonmr commented Jul 8, 2019

Firstly, having looked through and used the code, I think it's very nice and in general I find it easier to modify than the c++ code (obviously).

One question I had was with respect to a comment @Zaharid had made about merging this code but having a PR which could be checked in more detail - what are the practical implications of that? Will this PR be merged and then checked after, or did I miss something?

I wrote a really long comment before with one observation for perhaps some slightly longer term development of this code, which I think would improve the readability, cleanliness and future extension. I've tried to summarise it more briefly here:

TL;DR

This code could possibly be improved by leveraging reportengine, in particular the ConfigParser, to read the runcard right at the start (instead of a bunch of different functions), building these resources in a similar way to validphys and passing them to providers in a similar way. Of course this could change with a potential rewrite of reportengine.

Example

the parameters key is a dictionary defined in the runcard with the following (I just left the keys which control the PDF layer specification):

    nodes_per_layer: [35, 25, 8]
    activation_per_layer: ['tanh', 'tanh', 'linear']
    initializer: 'glorot_normal'
    layer_type: 'dense'
    dropout: 0.0

For starters, there could actually be more control on a layer-by-layer basis:

layers:
 - name: layer1
   layer_type: dense
   activation: tanh
   nodes: 10
   dropout: 0.0
   initializer: 'glorot_normal'

If any of those keys are incorrect then the fit would not run (this is less important since the parsing of the runcard in general happens before the actual fitting part - although it could generate cleaner errors). More importantly, there isn't an unresolved dictionary floating around which appears both in the fit function and gets passed to ModelTrainer etc. Also, one could still define 'global' variables because of the namespaces of reportengine:

dropout: 0.0
initializer: 'glorot_normal'
layer_type: dense
layers:
 - name: layer1
   activation: tanh
   nodes: 10

Then I think it could be more transparent how certain resources are built in this fitting framework, and they could also be passed around in a slightly cleaner way (maybe this is biased by the fact that I just know my way around validphys a lot better than the n3fit code).
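The suggestion above could be sketched roughly as follows. This is a hedged illustration, not n3fit code: `LayerSpec`, `parse_layers` and the validity sets are invented names, and a real implementation would presumably subclass reportengine's `Config` and raise its `ConfigError` rather than `ValueError`.

```python
# Sketch: validate the per-layer specification up front, ConfigParser
# style, instead of passing an unresolved dictionary around.
# All names here are hypothetical.
from dataclasses import dataclass

VALID_LAYER_TYPES = {"dense", "dense_per_flavour"}
VALID_ACTIVATIONS = {"tanh", "sigmoid", "linear"}

@dataclass
class LayerSpec:
    name: str
    layer_type: str
    activation: str
    nodes: int
    dropout: float = 0.0
    initializer: str = "glorot_normal"

def parse_layers(raw_layers, defaults=None):
    """Resolve each layer entry against 'global' defaults and fail
    early, with a clear message, if a key is invalid."""
    defaults = defaults or {}
    specs = []
    for entry in raw_layers:
        merged = {**defaults, **entry}
        if merged.get("layer_type") not in VALID_LAYER_TYPES:
            raise ValueError(
                f"Unknown layer_type in {merged.get('name')}: "
                f"{merged.get('layer_type')}")
        if merged.get("activation") not in VALID_ACTIVATIONS:
            raise ValueError(
                f"Unknown activation in {merged.get('name')}: "
                f"{merged.get('activation')}")
        specs.append(LayerSpec(**merged))
    return specs
```

With the namespaced runcard above, `parse_layers([{"name": "layer1", "activation": "tanh", "nodes": 10}], defaults={"layer_type": "dense", "dropout": 0.0})` would return validated `LayerSpec` objects before any fitting starts.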

@scarlehoff
Member Author

scarlehoff commented Jul 8, 2019

One question I had was with respect to a comment @Zaharid had made about merging this code but having a PR which could be checked in more detail - what are the practical implications of that? Will this PR be merged and then checked after, or did I miss something?

As far as I understand it was for organizational purposes. The history tree of the previous PR became a bit of a mess after I did too many rebases.

This code could possibly be improved by leveraging reportengine, in particular the ConfigParser, to read the runcard right at the start (instead of a bunch of different functions), building these resources in a similar way to validphys and passing them to providers in a similar way. Of course this could change with a potential rewrite of reportengine.

I think it is a good idea and it would be an interesting enhancement to have. In particular I really like this syntax:

layers:
  - name: layer1
    layer_type: dense
    activation: tanh
    nodes: 10
    dropout: 0.0
    initializer: 'glorot_normal'

What worries me (and I don't have much experience with validphys, so take this with a grain of salt) is that I am not sure how this will play along with the hyperparameter scan.
The reason why ModelTrainer is such a monster is basically to allow the scan of hyperparameters to occur. In other words, those validphys providers would need to parse the runcard within ModelTrainer.

This last part is highly non-trivial to do in a nice way (for instance, having exactly the same functions in ModelTrainer but making them into calls to more complicated functions somewhere else would not be nice imho).

I think there are two different things here that can be eventually added to the code:

  • Rationalizing the syntax of the runcard: that can be done for sure. It's not trivial in that the syntax for the hyperoptimization has to follow, but to a certain level it is just refactoring: it can always be done.
  • Reading the NN-part of the runcard from reportengine: not entirely sure if it can be done in a sensible way without loss of generality.

@Zaharid
Contributor

Zaharid commented Jul 8, 2019

@wilsonmr I guess in my opinion this PR needs to be reviewed and exercised in some depth, especially with regards to the overall structure. However at this point I can't do a line by line review in a reasonable amount of time (and I don't see a solution involving an actionable amount of effort), so we should settle for merging as soon as the big picture is settled.

A few comments.

  • Conceptually reportengine should be good at doing various sorts of scans. Roughly:

    1. Define lists of namespaces.
    2. Collect fitting results over the list.
    3. Use the result to compute overall statistics.

    Of course in actual fact reportengine doesn't do that much, which means that there are various other ways to organize things. And it may be that some library has its own opinion on how to set up the pipeline, which is fine.

  • I associate "production ready" with high backward compatibility guarantees for the runcard. I think that having a way of rerunning a fit that is stable upon code changes is crucially important (and, by the way, it is one of the few places where we have actually done a pretty good job so far). So I'd say having an idea of how the runcards will look now and in the future should be a priority. Also, like @wilsonmr says, good error messages are important for quality of life purposes. Note that if you are lucky validphys will even tell you the offending line of the runcard on error (and that should get much better with reportengine 1.0).

  • Some more docstrings (especially on top of the modules) wouldn't hurt. For example it is not clear to me what Stat_Info does (and seems that the #TODO comment that it should be refactored is apt).

  • Would it be too much to ask to standardize on PEP8 naming conventions? `lowercasemodulename`, `variable_or_function_name`, `CONSTANT`, `ClassName`: https://www.python.org/dev/peps/pep-0008/#naming-conventions

This version looks at the sum instead of at the norm
"""

class MinMaxWeight(MinMaxNorm):
Contributor

@Zaharid Zaharid Jul 8, 2019

This looks like it could be a top level class rather than some nested method. This creates a new class every time you call it, which is not what you want. If you like, you can override __init__ to take min_value and max_value positionally.
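The suggested fix might look like the following sketch. `MinMaxNorm` here is a minimal stand-in for `keras.constraints.MinMaxNorm` (whose real constructor takes `min_value`, `max_value`, `rate` and `axis` as keywords), included only so the top-level-class pattern is runnable without keras installed.

```python
# Stand-in for keras.constraints.MinMaxNorm, with the same keyword
# signature, so this sketch is self-contained.
class MinMaxNorm:
    def __init__(self, min_value=0.0, max_value=1.0, rate=1.0, axis=0):
        self.min_value = min_value
        self.max_value = max_value
        self.rate = rate
        self.axis = axis

class MinMaxWeight(MinMaxNorm):
    """Top-level version of the nested class: defined once at module
    level, with __init__ overridden so min and max can be passed
    positionally. One instance can then be reused for every weight of
    the same layer instead of creating a new class per call."""
    def __init__(self, min_value, max_value, **kwargs):
        super().__init__(min_value=min_value, max_value=max_value, **kwargs)
```

Usage would then be `constraint = MinMaxWeight(0.0, 1.0)`, reusing the same instance wherever the same bounds apply.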

Member Author

I'll have a look. But I think Keras might be wanting a new class every time.

Contributor

I would expect and hope not. That would be a gross antipattern.

Member Author

I am using it in a way different to what it was intended for so it would be my fault.

But you are right, looking again at how constraints work I think I can even get away with using the same instance of the constraint for all weights of the same layer...

@scarlehoff
Member Author

* Conceptually reportengine should be good at doing various sorts of scans. [...] that some library has its own opinion on how to set up the pipeline, which is fine.

Yes. I mean, you would give up some of the control to hyperopt, or would have to implement some of the hyperopt stuff in each of the reportengine providers.

I think it could eventually be done, but I think it is non-trivial (or highly messy, which to me are equivalent), so I would put it as low priority for now.

So I'd say having an idea on how the runcards will look like now and in the future should be a priority.

I really have no opinion on this. I like the syntax @wilsonmr proposed, but it would require some work to make sure everything works fine. Not difficult, but not just said and done.

That said, a vp provider which turns the nice syntax into the ugly dictionary would be the path of least resistance, and it would mean already having the structure there if in the future the vp providers need to be the ones creating the network.

* Some more docstrings  (especially on top of the modules) wouldn't hurt. For example it is not clear to me what Stat_Info does (and seems that the `#TODO` comment that it should be refactored is apt).

Everything in Statistics.py will go away. My plan is to have another PR in the future with the animations, positivity and stopping well separated.
It's the same for the Hyperopt files. Their documentation and refactoring will come in a different PR.

I can create those PRs before the merging of this one but I need a few days. I will be freer after next week so I can do it then.

* Would it be too much to ask to standardize on PEP8 naming conventions?

Most of it does. There are several places where I am copying other names (for instance, when the function is just a wrapper to some Keras class). I'll have another go at pylint (because last time I got bored), but most of the ones left are motivated.

try:
    opt_tuple = self.optimizers[optimizer_name]
except KeyError:
    raise Exception(f"[MetaModel.compile] Optimizer not found {optimizer_name}")
Contributor

raise Exception is generally discouraged, because you catch everything along with it. In fact, leaving the original KeyError alone would be a bit better because it gives a clearer traceback and fewer opportunities to catch it unintentionally.

In general python people (including core python, see https://www.python.org/dev/peps/pep-3151/ and https://www.python.org/dev/peps/pep-0479/) have learned that fine-grained exceptions are a good thing.

One option is to define something like class ModelNotFound(KeyError): pass and then raise that. Also, when you raise an exception with additional information, it is good practice to do:

try:
    ...
except SomeGenericException as e:
    raise CustomError(f'{e}') from e

because it gives clearer and more explicit tracebacks (whereas not having from e looks like a bug in the exception handler).

That said, checking for the model existence is one of these things that validphys should do in the early stages cc @wilsonmr
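Putting the pieces of this advice together, a runnable sketch of the suggested pattern; the `OPTIMIZERS` table and `select_optimizer` are illustrative stand-ins, not the actual MetaModel code:

```python
# Fine-grained exception for the optimizer lookup, chaining the
# original KeyError so the traceback stays explicit. The OPTIMIZERS
# dict and its contents are invented for illustration.
class ModelNotFound(KeyError):
    pass

OPTIMIZERS = {
    "RMSprop": ("RMSprop", {"lr": 0.01}),
    "Adam": ("Adam", {}),
}

def select_optimizer(optimizer_name):
    try:
        return OPTIMIZERS[optimizer_name]
    except KeyError as e:
        # 'from e' preserves the original chain in the traceback.
        raise ModelNotFound(
            f"Optimizer not found: {optimizer_name}. "
            f"Available: {list(OPTIMIZERS)}"
        ) from e
```

Callers can then catch `ModelNotFound` specifically without accidentally swallowing unrelated `KeyError`s.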

Contributor

As a funny piece of trivia related to why generic exceptions are a terrible idea, explain this piece of code

In [1]: class X:
   ...:     @property
   ...:     def prop(self):
   ...:         return self.patata
   ...:

In [2]: hasattr(X, 'prop')
Out[2]: True

In [3]: hasattr(X(), 'prop')
Out[3]: False

It took me several hours to find a bug related to this once. I even proposed that python should be changed!
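For the record, the resolution of the puzzle: hasattr calls getattr and returns False on any AttributeError, so the AttributeError raised inside the property body (the missing patata attribute) is silently swallowed. On the class itself the property object is returned without running the getter, hence True. A sketch:

```python
# hasattr(obj, name) is essentially getattr wrapped in
# "except AttributeError". Accessing X.prop on the class returns the
# property object itself (the getter never runs); accessing it on an
# instance runs the getter, which raises AttributeError for the
# missing 'patata' attribute, and hasattr reports that as
# "no such attribute".
import inspect

class X:
    @property
    def prop(self):
        return self.patata  # AttributeError: no attribute 'patata'

assert hasattr(X, "prop")        # property object, getter not run
assert not hasattr(X(), "prop")  # getter raised AttributeError

# inspect.getattr_static looks the attribute up without triggering
# the descriptor protocol, so it sees the property either way:
assert isinstance(inspect.getattr_static(X(), "prop"), property)
```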

Member Author

That said, checking for the model existence is one of these things that validphys should do in the early stages

I insist, this is cool but not trivial. And before even thinking about doing this there is a lot of "destroying C++" to do, which has a much higher priority.

I'm making changes to the code as you comment, but I'll wait until the review is finished to push, if that's ok. Then you can comment on those changes; that way it's a bit better organized.

Contributor

I think the custom error is clearly an easy thing which can be changed now, there are a few of these generic exceptions being raised and I think changing these is manageable now.

Of course it would be nice if the errors were raised at a stage where the runcard is being checked and the resources built, as I said; however I think @scarlehoff is right that things need to be done in priority order. In this case the fit will fail fairly early on (like in the first couple of minutes?), which can be improved, but I think it is fine as a jumping-off point for future iterations of n3fit.

Also, I think making these changes might be easier if they were done in smaller batches - moving the building of various resources and the integration with validphys tools is surely easier if done incrementally? At some point we need a skeleton fitting code which is doing something pretty sensible (which I think this is) and then have smaller pull requests to really clean it up.

ps: at a code call there was a list of priority items mentioned which I believe you all agreed on - I think the items of this list are mentioned in the closed PR and possibly in various issues, but perhaps it would be worth compiling them somewhere in priority order if they aren't already?

Member Author

I think the custom error is clearly an easy thing which can be changed now, there are a few of these generic exceptions being raised and I think changing these is manageable now.

Yes, I agree; the exceptions are something I am looking at. I was only thinking of the vp integration in the previous comment.

# Since the history object can have more than one loss (if epochs != 1) but it is always a list
# Save them as the mean

total = np.mean(hobj["loss"]) / self.ndata_tr
Contributor

I was wondering what the mean is taken over here - is it epochs? If so, how many epochs? Can we not just take the final value?

Member Author

It is always 1 epoch but it could potentially be more than one.
I decided on the mean because the only reason I found to increase the number of epochs was to avoid cases in which by chance we get a spuriously low loss.

In practice it rarely happens though.

try:
    layer_tuple = layers[layer_name]
except KeyError:
    raise Exception(
Contributor

Probably another place where we could change the exception to be a bit more helpful.

@Zaharid
Contributor

Zaharid commented Jul 9, 2019

Ok, I take the point that we should sort out anyway how the providers should look (with the data keywords and so on) and that there hasn't been a lot of progress lately. Agreed that it is probably a bad idea to wait; anyway, it will make reviewing the later improvements easier.

In the camp of things that can be changed now I would place renaming things like Statistics.py -> statistics.py, on the grounds that it is what PEP8 says but mostly that it makes me irrationally agitated.

Comment thread n3fit/src/n3fit/ModelTrainer.py Outdated
    self.log_info = log.info
else:
    self.log_info = print
self.log = log
Contributor

Do you have a strong reason not to use a global logger object here? A typical convention is to do

import logging
log = logging.getLogger(__name__)

and that allows you to control the logger from calling code in various ways without having to pass it around as a parameter. A minor annoyance is that by default log.info is disabled, but it isn't in a vp app. Another possible source of annoyance is that by default it prints to stderr, but that can be justified. On the plus side, vp prints nice green text.

Member Author

To allow for more flexibility, e.g., if you run on a cluster then you can send the messages to some socket on your home computer, or cool stuff like that which will never be implemented and is not that useful anyway. So...

Do you have a strong reason

No.

@scarlehoff
Member Author

In the camp of things that can be changed now I would place renaming things like Statistics.py -> statistics.py, on the grounds that it is what PEP8 says but mostly that it makes me irrationally agitated.

Maybe I misunderstood how PEP8 works, but I thought classes were supposed to be capitalized (that file is a bit special in that even the class name is different from the filename, though).

I know the a-file-per-class thing is not the most pythonic, but it is something I am really fond of. I can make all the files lower case, but I like being able to see from an ls which files are classes and which are libraries of functions/classes.

@Zaharid
Contributor

Zaharid commented Jul 9, 2019

Indeed, one typically doesn't do one file per class, but OTOH a class with some helper functions is not that atypical (and free functions for when you don't care about the internal state, as in the random or tarfile modules). In any case, module names tend to be lower case rather consistently. For example the standard library has enum.Enum, decimal.Decimal, fractions.Fraction, contextvars.ContextVar and also datetime.datetime for some reason.

@scarlehoff
Member Author

Yes, modules yes. Here you have layers.Observable, backends.MetaLayer or backends.operations.op_add

@Zaharid
Contributor

Zaharid commented Jul 9, 2019

Yes, modules yes. Here you have layers.Observable, backends.MetaLayer or backends.operations.op_add

Yeah, those are all fine.

@Zaharid
Contributor

Zaharid commented Jul 9, 2019 via email

Contributor

@Zaharid Zaharid left a comment

I think there is a considerable amount of complexity in ModelTrainer, and at least some of it is likely not strictly required, as it can easily be offloaded to calling code.

I'd aim at making it smaller, and at documenting clearly the state and especially the mutable state (which should be treated as radioactive material and probably separated off into different classes).

""" If a model_file is set the training model will try to get the weights from here """
self.model_file = model_file

def set_hyperopt(self, on, keys=None):
Contributor

The description of this could be clearer, and the way it works, really. It seems to me that on and keys do entirely different things, and so it should be two methods.

Comment thread n3fit/src/n3fit/ModelTrainer.py Outdated
else:
    self.no_validation = False

def set_model_file(self, model_file):
Contributor

The idiomatic way of doing this is with @property and the corresponding @model_file.setter.
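A sketch of what that idiom could look like here; the surrounding ModelTrainer state is elided, and the attribute names are assumptions based on the snippet above:

```python
# Sketch of the @property idiom for model_file, replacing the explicit
# set_model_file method. Only the relevant attribute is shown.
class ModelTrainer:
    def __init__(self, model_file=None):
        self._model_file = model_file

    @property
    def model_file(self):
        """If set, the training model will try to get the weights from here."""
        return self._model_file

    @model_file.setter
    def model_file(self, value):
        # A setter keeps the mutation explicit while preserving
        # attribute syntax; validation could also go here.
        self._model_file = value
```

Callers would then write `trainer.model_file = "weights.h5"` instead of `trainer.set_model_file("weights.h5")`.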

Contributor

But is there a use case for mutating this after initialization?

Member Author

Yes. When doing more than one replica sequentially you don't want to reinitialize the whole thing, but you might have different check-points for each of them.
(For doing more than one in parallel I still don't know how to do it, so the method is a bit of a placeholder at the moment.)

Agree on the comment about idiomaticity though.

# and so it should affect all models
tr_model.load_weights(self.model_file)

if self.print_summary:
Contributor

I would remove the print_summary option and use the appropriate log level instead.

Member Author

The summary is printed by Keras to stdout. Do you mean to use the log level as the if condition?
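Presumably something like the following sketch, where the flag is replaced by a check on the effective log level; `maybe_print_summary` is an invented helper name, and `model.summary()` is the Keras call that prints to stdout:

```python
import logging

log = logging.getLogger(__name__)

def maybe_print_summary(model):
    # Instead of a print_summary flag, gate the Keras stdout summary
    # on the effective log level; callers then control it through the
    # normal logging configuration rather than an extra argument.
    if log.isEnabledFor(logging.INFO):
        model.summary()
```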

Comment thread n3fit/src/n3fit/ModelTrainer.py Outdated
*called in this way because it accepts a dictionary of hyper-parameters which defines the Neural Network
"""
def __init__(
    self, exp_info, pos_info, flavinfo, nnseed, pass_status="ok", failed_status="fail", log=None, debug=False
Contributor

I am not sure I understand what pass_status and failed_status are for, or why they are user settable, but it seems to me that this class already does too many things and should use a boolean instead, leaving the formatting elsewhere.

Contributor

I think you can remove the log argument.

Contributor

Considering there is a sole use for self.debug in one function and it changes some complicated global state, I'd remove it altogether and let the calling code deal with it.

Member Author

I am not sure I understand what pass_status and failed_status are for, or why they are user settable, but it seems to me that this class already does too many things and should use a boolean instead, leaving the formatting elsewhere.

Because these are necessary for hyperopt. If you implement some other hyperoptimization library, your pass flag might be something else.

I think you can remove the log argument.

Oops, forgot about it.

Considering there is a sole use for self.debug in one function and it changes some complicated global state, I'd remove it altogether and let the calling code deal with it.

What do you mean by "the calling code"? I can call it something else, but I need an on-off flag there to decide whether to completely clean the Keras state or not.

):
"""
# Arguments:
- `exp_info`: list of dictionaries containing experiments
Contributor

This should describe all the arguments, such as nnseed.

Member Author

It does?
Or do you mean there is not enough description?

############################################################################
#                         Parametizable functions                          #
#                                                                          #
# The functions defined in this block accept a 'params' dictionary which   #
Contributor

Hmm, this is the kind of thing that reportengine solves nicely, allowing you to define functions in terms of the parameters you actually care about while dispatching them automatically. But fine, let's see about that later.

Member Author

Ok, this is a preference thing, but I would rather have everything that can be parametrized within the same class. The rationale behind this (and this ties in with your initial comment) is the following:

I want to create a ModelTrainer object which has all the information necessary to generate the Neural Network and the training, that is:

  • All the experimental data and fktables
  • All methods that generate pieces of the NN
  • All methods dedicated to the training

I can have them in different classes, that's fine, but that won't reduce the complexity; it will just spread it out, and at the end of the day the object I need to produce must contain everything that you see in ModelTrainer.py.

My goal with this was reducing to the minimum the overhead from one call of hyperopt to the next (I'm sure it can still be reduced though).

Note: there are several things in this class that can be done a bit better to improve readability. But since it is tied to the statistics thing, I'll do some work on that in that PR.

Comment thread n3fit/src/n3fit/msr.py
)


def compute_arclength(fitbasis_layer, verbose=False):
Contributor

I would use log.debug instead of verbose.

@@ -0,0 +1,44 @@
def set_initial_state(debug=False, seed=13):
Contributor

I think this file should have a docstring explaining what it does. It is not immediately obvious to me at all.

Member Author

I'll write a bit more in the docstring, but it just sets several initial states.

return 0


def clear_backend_state(debug = False):
Contributor

This really should take no arguments and be used as

if debug:
    clear_backend_state()

"""
if not debug:
    print("Clearing session")
from keras import backend as K
Contributor

Is the import done internally for any particular reason? If so, please add a comment.

Member Author

@scarlehoff scarlehoff Jul 11, 2019

Ok, there was a reason I cannot have it outside: in order to set the random seed I need to set the rn and numpy seeds before I ever import keras.

But the fact that it took me a while to remember points clearly to the necessity of a comment :P
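The ordering constraint described here can be demonstrated with a toy stand-in for a library that samples randomness at import time, as the keras/tensorflow backend initialization effectively does; `lazylib` and `import_lazylib` are invented for the sake of a runnable example:

```python
# Toy demonstration of the import-ordering issue: module-level code in
# a library runs on first import, so any randomness it draws is only
# reproducible if the seed was set *before* that import. 'lazylib' is
# a fake module built inline; a real case would be the keras backend.
import random
import types

LAZYLIB_SRC = (
    "import random\n"
    "STATE = random.random()  # sampled once, at import time\n"
)

def import_lazylib():
    """Simulate 'import lazylib' executing its module-level code."""
    mod = types.ModuleType("lazylib")
    exec(LAZYLIB_SRC, mod.__dict__)
    return mod

random.seed(13)                    # seed first...
state_a = import_lazylib().STATE   # ...then "import": state is fixed
random.seed(13)
state_b = import_lazylib().STATE
assert state_a == state_b          # same seed before import, same state
```

Seeding after the import (or importing at module level before the seeds are set) would leave `STATE` irreproducible, which is why the keras import stays inside the function here.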

Member Author

Turns out it is enough if I clear the session so all points regarding this file implemented.

self.training["pos_multiplier"] = pos_multiplier

def _generate_pdf(self, params):
def _generate_pdf(self, nodes_per_layer, activation_per_layer, initializer, layer_type, dropout):
Contributor

this is definitely a lot more readable!

Member Author

Did @Zaharid tell you to write that comment? :P

(just had a discussion with him about which version looked best)

Contributor

Haha no, I was just thinking it. I guess I accidentally proved a point? :P

@Zaharid Zaharid mentioned this pull request Jul 18, 2019
@Zaharid
Contributor

Zaharid commented Jul 19, 2019

Good. I am going to merge this. Let's make improvements (which will hopefully be many but much smaller in size) against master.

@Zaharid Zaharid merged commit 1b81f09 into master Jul 19, 2019
@scarlehoff scarlehoff deleted the n3fit-ProductionVersion branch May 6, 2020 09:40