Python log parser #1384
Conversation
|
Thanks for the PR! Just in case you are wondering: we are currently in CVPR crunch mode (deadline Nov 15), so kindly expect things to move faster after the deadline.
|
@drdan14 this looks nice! Thanks for the improved tool. For author/copyright date metadata we rely on commit messages instead of code comments, so how about dropping the author and date lines in the comment block? If you are truly worried about attribution it's fine, since this is a bonus tool, but note that in the core we don't have any author comments in the code (since version control tracks this perfectly). Could you drop 83ce49d and instead ignore PyCharm in your local configuration?
|
@shelhamer all done. I implemented your requested changes and squashed them.
Was previously using `Train net output #M: loss = X` lines, but there may not be exactly one of those (e.g., for GoogLeNet, which has multiple loss layers); I believe that `Iteration N, loss = X` is the aggregated loss. If there's only one loss layer, these two values will be equal and it won't matter. Otherwise, we presumably want to report the aggregated loss.
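A minimal sketch of how the aggregated-loss lines can be matched; the regex and function name here are illustrative assumptions, not the PR's exact code:

```python
import re

# Matches aggregated-loss lines such as "Iteration 100, loss = 0.25".
# This pattern is an illustrative assumption, not the PR's exact regex.
ITER_LOSS_RE = re.compile(r'Iteration (\d+), loss = ([\d.eE+-]+)')

def extract_losses(log_lines):
    """Return (iteration, loss) pairs for every aggregated-loss line,
    ignoring the per-layer "Train net output #M: loss = X" lines."""
    pairs = []
    for line in log_lines:
        match = ITER_LOSS_RE.search(line)
        if match:
            pairs.append((int(match.group(1)), float(match.group(2))))
    return pairs
```

Because the per-layer lines never contain the word `Iteration`, this picks up exactly one loss value per reported iteration even for networks with multiple loss layers.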
Output format is unchanged (except that `csv.DictWriter` ends up writing integer-valued fields as 0.0 instead of 0).
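The 0.0-vs-0 quirk is easy to reproduce: if the parser stores every numeric field as a Python float, `csv.DictWriter` simply writes `str(value)`, and there is no per-column formatting hook. A small demonstration:

```python
import csv
import io

# Every parsed value is assumed to be stored as a float, so an
# integer-valued field like 0 is serialized as "0.0".
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=['NumIters', 'loss'])
writer.writeheader()
writer.writerow({'NumIters': float(0), 'loss': 0.25})
print(buf.getvalue())  # header line, then "0.0,0.25"
```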
|
Thanks for the orderly PR and useful improvement. |
|
@drdan14 you could get the initial learning rate from the solver; just look for …
|
I tested with some logs and it failed where `parse_log.sh` worked, so I think it will need some review.
|
@sguada if the training is continued from a snapshot and …
|
@sguada, the current version of … What do you think?
|
Yeah, that's true. It will probably be better to read everything from the log then; maybe just store all the available information per … I don't think it's good to assume the name of the … For example, given a log containing …, I would expect an output for …
|
@sguada I think that's a reasonable change, but changing the output format may break other scripts down the line, such as |
|
@drdan14 Yeah, it is worth it; it is now common to have multiple losses and different accuracies, so the names of the columns should be read from the file. Example: … PS: I forgot that the csv files should include the time too, and that the delimiter should be allowed to be user-defined.
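The idea of reading column names from the log itself, rather than hard-coding them, can be sketched like this; the regex and helper name are illustrative assumptions:

```python
import re
from collections import OrderedDict

# Matches any "Test net output #k: <name> = <value>" line, whatever the
# output happens to be called in the network definition.
OUTPUT_RE = re.compile(r'Test net output #\d+: (\S+) = ([\d.eE+-]+)')

def collect_row(lines):
    """Gather one row of named outputs, preserving first-seen column order."""
    row = OrderedDict()
    for line in lines:
        match = OUTPUT_RE.search(line)
        if match:
            row[match.group(1)] = float(match.group(2))
    return row
```

Because the column names come straight from the log, a network with multiple loss and accuracy layers just yields more columns instead of breaking the parser.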
|
OK, that sounds reasonable. Look for an updated PR within a couple of days. |
Over the version introduced in BVLC#1384. Highlights:

* Interface change: column order is now determined by using a list of `OrderedDict` objects instead of `dict` objects, which obviates the need to pass around a tuple with the column orders.
* The outputs are now named according to their names in the network protobuffer; e.g., if your top is named `loss`, then the corresponding column header will also be `loss`; we no longer rename it to, e.g., `TrainingLoss` or `TestLoss`.
* Fixed the bug/feature of the first version, where the initial learning rate was always NaN.
* Added an optional parameter to specify the output table delimiter. It's still a comma by default.

You can use the Matlab code from [this gist](https://gist.github.com/drdan14/d8b45999c4a1cbf7ad85) to verify that your results are the same before and after the changes introduced in this pull request. That code assumes that your `top` names are `accuracy` and `loss`, but you can modify the code if that's not true.
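The list-of-`OrderedDict` interface plus the optional delimiter can be sketched roughly like so; the function name and signature are assumptions for illustration, not the PR's exact API:

```python
import csv
from collections import OrderedDict

def write_table(path, rows, delimiter=','):
    """Write a list of OrderedDict rows to a delimited file.

    The column order comes from the rows themselves (first row's key
    order), so no separate column-order tuple needs to be passed around.
    """
    with open(path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()),
                                delimiter=delimiter)
        writer.writeheader()
        writer.writerows(rows)
```

Here `rows[0].keys()` supplies the header order, which is exactly what switching from `dict` to `OrderedDict` buys, and `delimiter` covers the user-defined-delimiter request.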
While `tools/extra/parse_log.sh` is a beautiful piece of Bash engineering, it is rather convoluted and difficult to read and modify unless you're a real Bash scripting pro. So I'm introducing `parse_log.py`, a Python version of the log parser, which does basically the same thing but with a few functionality changes that I prefer over the default `parse_log.sh` implementation. And of course, you can use it as a library instead of writing the output tables to files if you want to run the code from within some other Python program (e.g., `plot_training_log.py`).

Usage is very similar to `parse_log.sh`; see it with `./parse_log.py -h`.

Differences between `parse_log.py` and `parse_log.sh`:

* Output tables can be read with, e.g., Matlab's `readtable` command.
* The iteration column is named `NumIters`, not `#Iters`, because some programs (e.g., Matlab's `readtable` again) do not like weird characters in the column headers.
* With `parse_log.sh`, the output files ended up in the current working directory.
* Reported times differ from `parse_log.sh`, where the time of the first row was typically near zero; however, the offsets are the same.
* There is no need to look ahead for the `lr = ...` line and then rewind, which isn't very aesthetically pleasing.
* Unlike `parse_log.sh`, `parse_log.py` doesn't assume that `Test net output #0` is accuracy and `Test net output #1` is loss, which turns out to depend on the network definition (e.g., it's wrong for the GoogLeNet network defined in #1367, GoogLeNet training in Caffe, which has multiple loss and accuracy layers); it explicitly looks for the `accuracy = ...` and `loss = ...` lines.
* Training loss is taken from lines like `Iteration N, loss = X` instead of lines like `Train net output #M: loss = X`, to match `parse_log.sh`.

For your amusement, here are the first few lines of the parsed training file, both from `parse_log.sh` and from `parse_log.py`:

`parse_log.sh`:

`parse_log.py`: