Improve reproducibility of preprocessing; add ALCF documentation #46
Merged
Canonical order: train, validate, test
to preprocess.py
Now, every criterion for excluding a shot from the input raw shot lists prints an "[omit]" string in its diagnostic when it is satisfied. This should make searching the piped output easier.
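As a rough illustration of the `[omit]` convention, the sketch below tags each exclusion diagnostic and then filters a captured log for those tags. Only the `[omit]` string comes from the commit message; the helper names `report_omit` and `omitted_lines`, the message layout, and the sample reasons are hypothetical.

```python
OMIT_TAG = "[omit]"  # tag named in the commit message


def report_omit(shot_number, reason):
    """Build a diagnostic line for an excluded shot (hypothetical format)."""
    return f"{OMIT_TAG} shot {shot_number}: {reason}"


def omitted_lines(log_lines):
    """Return only the diagnostics that mark an excluded shot."""
    return [line for line in log_lines if OMIT_TAG in line]


if __name__ == "__main__":
    # Made-up log lines, used only to exercise the filter
    log = [
        "processing shot 1001",
        report_omit(1002, "signal missing"),
        "processing shot 1003",
        report_omit(1004, "shot too short"),
    ]
    for line in omitted_lines(log):
        print(line)
```

On piped output, the same search can be done with a fixed-string match such as `grep -F "[omit]"`.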
as in tensorflow. Occurs on ALCF Theta
numpy 1.17.2
tensorboard 1.12.2 pypi_0 pypi
tensorflow 1.12.0 pypi_0 pypi
tensorflow-base 1.14.0 eigen_py36hf4a566f_0
tensorflow-estimator 1.14.0 py_0
/home/felker/FRNN_project/build/miniconda-3.6-4.5.4/miniconda3/4.5.4/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550:
FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
Currently, preprocessing dumps all machines/shotlists with the same signal group hash into the same folder. There were no collisions in file names only because D3D and JET shot numbers do not currently overlap. Unify the implementations of get_individual_shot_file() in utils/processing.py (fairly confident that the warning comment about globals being incompatible with multiprocessing is no longer valid). Use os.path.join() instead of manual += '/'. Need to test these changes.
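A minimal sketch of the idea, assuming the machine name is simply prepended to the shot number in the file name. The function name `shot_file_path`, its signature, and the `_` separator are hypothetical and are not the repository's `get_individual_shot_file()`; the only parts taken from the commit message are the machine-name prefix and the use of `os.path.join()`.

```python
import os


def shot_file_path(prepath, machine_name, shot_number, ext=".npz"):
    """Build <prepath>/<machine>_<shot><ext> (hypothetical layout).

    Prefixing with the machine name keeps files from different machines
    distinct even if their shot numbers overlap.
    """
    filename = f"{machine_name}_{shot_number}{ext}"
    # os.path.join inserts the separator, so no manual "+= '/'" is needed
    return os.path.join(prepath, filename)


if __name__ == "__main__":
    # Same shot number on two machines maps to two different files
    print(shot_file_path("processed_shots", "d3d", 12345))
    print(shot_file_path("processed_shots", "jet", 12345))
```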
- Consider wrapping import onnx, etc. in try/except to make this an
optional dependency that automatically runs if installed
- Specify Opset=10, for now
- Only add dropout parameters to RNN layer if CuDNNLSTM is not used
- ONNX conversion will not fail fatally if op is not supported. Need to
evaluate if CuDNNLSTM output is usable at all (or with non-GPU
inference), given the following warning that is emitted:
WARNING:tensorflow:From
/home/kfelker/.conda/envs/frnn/lib/python3.7/site-packages/keras2onnx/subgraph.py:156:
tensor_shape_from_node_def_name (from
tensorflow.python.framework.graph_util_impl) is deprecated and will be
removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.tensor_shape_from_node_def_name`
Cannot infer shape for TFNodes1/cu_dnnlstm_1/CudnnRNN:
TFNodes1/cu_dnnlstm_1/CudnnRNN:3
Tensorflow op [TFNodes1/cu_dnnlstm_1/CudnnRNN: CudnnRNN] is not
supported
Unsupported ops: Counter({'CudnnRNN': 1})
Cannot infer shape for TFNodes/cu_dnnlstm_2/CudnnRNN:
TFNodes/cu_dnnlstm_2/CudnnRNN:3
Tensorflow op [TFNodes/cu_dnnlstm_2/CudnnRNN: CudnnRNN] is not supported
Unsupported ops: Counter({'CudnnRNN': 1})
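The optional-dependency idea from the list above could be sketched as follows. This is an assumption about how such a guard might look, not the code in `builder.py`: the names `HAVE_ONNX`, `TARGET_OPSET`, and `maybe_write_onnx` are hypothetical, while `keras2onnx.convert_keras` and `onnx.save_model` are the packages' conversion and save calls.

```python
try:
    import onnx
    import keras2onnx
    HAVE_ONNX = True
except ImportError:
    # Optional dependency: model building should carry on without it
    HAVE_ONNX = False

TARGET_OPSET = 10  # opset named in the commit message


def maybe_write_onnx(model, path, opset=TARGET_OPSET):
    """Write a Keras `model` to `path` as ONNX if the packages are present.

    Returns True if a file was written, False if the export was skipped.
    """
    if not HAVE_ONNX:
        print("onnx/keras2onnx not installed; skipping ONNX export")
        return False
    onnx_model = keras2onnx.convert_keras(model, model.name,
                                          target_opset=opset)
    onnx.save_model(onnx_model, path)
    return True
```

As the log above shows, a successful return here would not guarantee a usable file when the model contains `CudnnRNN` ops, so the output would still need to be checked separately.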
Only meaningful difference in Conda YAML is the removal of the ppc64le IBM AI Conda channel
Not currently a valid field in the environment YAML file; follow conda/conda#8675. Until then, use conda config --set channel_priority strict
Intentionally add PEP 8 style error in order to test Travis CI email notifications on failed builds
Modified version of https://github.com/philipperemy/keras-tcn
Both Keras v2.3.0 and v2.3.1 on Traverse (and at least the latter on
TigerGPU) die with:
WARNING:tensorflow:From
/home/kfelker/.conda/envs/frnn/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630:
calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops)
with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Printing out pre_rnn model...
Traceback (most recent call last):
File "mpi_learn.py", line 111, in <module>
shot_list_test=shot_list_test)
File
"/home/kfelker/.conda/envs/frnn/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py",
line 1229, in __imul__
raise RuntimeError("Variable *= value not supported. Use "
RuntimeError: Variable *= value not supported. Use
`var.assign(var * value)` to modify the variable or `var = var *
value` to get a new Tensor object.
Incompatibility likely fixed in TF >= v2.0 and/or TF's internal Keras
tensorflow/tensorflow#27829
Re-check this after moving to TensorFlow's internal Keras in #43
Add conda-forge to channels above Anaconda Cloud defaults. Need to reevaluate these choices later on.
Use "sync", not "synch", for "synchronization" abbreviation
- Prefixed preprocess shot `.npz` files with machine name. Closes "Delineate processed shot .npz files in same signal group folder by machine name" #45.
- Preprocessing and normalization diagnostics: `[omit]` every time a shot is excluded from either procedure
- Remove redundant "data" token from shot list variable names in `conf_parser.py`
- Add ALCF documentation
- Use 2 space indentation in all YAML
- Delete single-GPU `runner.py`
- Add `tcn.py` file (c522d3b) missing from "Cleaned and squashed merge of @ge-dong fork" #50.
- Add ONNX writer using https://github.com/onnx/keras-onnx
- Start encapsulating `module`, `pip`, and `conda` dependencies in platform-dependent files in new directory `envs/`. Closes "Add platform-dependent Conda YAML and environments/module dependencies" #47.
- Add YAML linter
- Update ALCF Theta documentation
- Make ONNX writer optional; `builder.py` should not fail due to `import onnx` or `import keras2onnx`