Average pooling with padding - behavior on edges #6296
oscarriddle wants to merge 205 commits into BVLC:master from
Conversation
Copy of @oscarriddle's comment in another thread:
I will take a closer look at this soon. In the meantime, please review your code again - there are several tests that this change fails to pass.
Previous implementation caused FP overflow for x less than -90
… the gradient computation robust, much like the SoftmaxWithLoss layer (see: http://stackoverflow.com/a/34917052/1714410 for more information). (2) supporting loss along axis
The instructions say that MKL is free for students, but as of 8/2015, MKL is free for everyone with community licensing.
As recommended by @longjon, this will allow `caffe.io.array_to_datum` to handle, for example, numpy.float32 arrays. It might be worth noting that `datum.float_data` is stored as protobuf type 2, which is float32, as opposed to protobuf type 1, which is float64. It is a little unintuitive that caffe currently requires data to be passed in as float64 but then writes float32 to LMDB. To demonstrate this:
```python
datum = caffe.io.array_to_datum(np.array([[[0.9]]]))
caffe.io.datum_to_array(datum)
# array([[[ 0.9]]])
datum_str = datum.SerializeToString()
new_datum = caffe.proto.caffe_pb2.Datum()
new_datum.ParseFromString(datum_str)
caffe.io.datum_to_array(new_datum)
# array([[[ 0.89999998]]])
```
This behavior is somewhat hidden because `datum_to_array` returns type float64, even though the data doesn't actually have that resolution if it has been stored as protobuf text anywhere (for example in LMDB). Alternative solutions:
* Require and return float32, consistent with the protobuf representation.
* Change the protobuf to allow float32 or float64 and update surrounding code to support this.
With reference to commit f1a8470: this fix changes some EXPECT_EQ into EXPECT_FLOAT_EQ.
Imported from Debian Package caffe (1.0.0~rc3+20160715-g42cd785-2).
…ted caffe target. This is the first step towards a "modern" IMPORTED-targets-only CMake setup. The find_package modules still need to be rewritten and upstreamed in the form of config exports where possible.
Although Caffe itself does not use OpenMP, explicit linking to OpenMP is needed when one statically links to a BLAS library that uses OpenMP internally and does not provide proper CMake imported targets with proper dependencies (which nobody does so far).
A few layers make use of otherwise unused diffs to accumulate results, but unless those diffs are cleared in forward, they contaminate the gradients when these layers share a bottom and their backward pass is skipped.
Loading weights is moved from caffe.exe to the solver class, so the new "weights" solver parameter is used not only from the command line but also when Caffe is used as a library (including Python). Corrected formatting; fixed line length; more formatting corrections.
…points to a directory. See issue BVLC#6110, proposed improvement No. 2.
draw_net.py refactoring and optional LR visualization
* refactoring `get_layer_label`: rewrote the function body to make it more streamlined; does not affect inputs and outputs
* optionally visualize LR when drawing the network: adds an option to `python/draw_net.py` that allows visualizing information about the learning rate multiplier (if relevant) when drawing the network's graph.
…nto fix_ave_pool
Hi, I checked the Travis CI failure log. The compilation completes successfully, but the run fails at the AVE pooling layer test; it looks like the output values aren't equal to the expected values.
Failed log:
Besides that, I'm not quite sure what happened to the LRNLayer test. Thanks,
Hi, with this fix I successfully converted a TensorFlow-trained InceptionV3 model to a Caffe model and got exactly the same final inference prediction result as TensorFlow. Now I believe that a TensorFlow model can definitely be converted to Caffe with almost no accuracy sacrifice (loss around 0.0001%). In the meantime, the InceptionV3 network also runs on Caffe - I did it! Next, I will try to evaluate the inference performance difference between TensorFlow and Caffe. Thanks,
The LRN layer creates some layers internally, among them the PoolingLayer, so if we break one, the other fails too. Looks like you're right: the relevant tests would have to be redesigned as well - not only for the Pooling layer, but for the LRN layer too. At this point the question arises whether such a change would break existing models. Maybe this should come as an optional behavior, with a boolean switch defined in caffe.proto, to let users choose the layer's behavior at the edges? See #6282 for the same idea applied to a different problem. PS: this PR's commit history became quite weird: it looks like you're trying to merge back some 200 commits that are already in master.
OK, that makes sense. I'm not very familiar with pull requests; I ran git pull on this branch, so those existing commits somehow got merged into it. Maybe I should start over from a clean branch. That is indeed a good question; an optional toggle seems good for now. Let me have a try in my local repo, and let's talk again when everything goes smoothly. Thanks,
Closing, we'll continue in #6303.
Hi,
I'm currently working on converting a TensorFlow-trained InceptionV3 model to a Caffe model, because I want to test its inference performance under Caffe. I've almost got there; along the way I encountered several interesting issues, and here is one of them.
The average pooling layer calculates a different output than TensorFlow when padding is enabled.
Observation:
If padded elements fall within the current pooling kernel, Caffe computes the average over all inputs, including the padded values (which are zeros). Hence the result gets smaller and is severely affected by those invalid zeros.
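To make the effect concrete, here is a minimal, self-contained sketch (my own illustration, not Caffe code) of a 3x3 average at the top-left corner of an all-ones feature map with pad 1: dividing by the full kernel area yields about 0.44 instead of the expected 1.0.
```cpp
#include <cstdio>

// Illustration only: a 3x3 window at the top-left corner of an all-ones
// input with pad = 1 covers 4 valid pixels and 5 padded zeros.
int main() {
  const int valid_count = 4;             // pixels actually inside the feature map
  const int kernel_area = 9;             // 3x3 window, padded cells included
  const double sum = valid_count * 1.0;  // padded zeros add nothing to the sum

  std::printf("divide by kernel area : %f\n", sum / kernel_area);  // ~0.444444
  std::printf("divide by valid count : %f\n", sum / valid_count);  // 1.000000
  return 0;
}
```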
Root cause:
The root cause is that, regardless of whether padding falls within the current pooling window, the pool size stays fixed at the value implied by the kernel_size parameter, e.g. 9 for a 3x3 kernel.
Here is part of the debug output from each iteration of AVE pooling:
(Input_size 35x35, kernel_size 3x3, padding 1)
pooled_height_:35, pooled_width_:35 stride_h_:1, stride_w_:1 pad_h_:1, pad_h_:1 height_:35, width_:35 hstart:0, hend:2 wstart:0, wend:2 pool_size:9 <-Here-> top_data:0.730075
pooled_height_:35, pooled_width_:35 stride_h_:1, stride_w_:1 pad_h_:1, pad_h_:1 height_:35, width_:35 hstart:0, hend:2 wstart:0, wend:3 pool_size:9 <-Here-> top_data:0.913562
pooled_height_:35, pooled_width_:35 stride_h_:1, stride_w_:1 pad_h_:1, pad_h_:1 height_:35, width_:35 hstart:0, hend:2 wstart:1, wend:4 pool_size:9 <-Here-> top_data:0.615629
pooled_height_:35, pooled_width_:35 stride_h_:1, stride_w_:1 pad_h_:1, pad_h_:1 height_:35, width_:35 hstart:0, hend:2 wstart:2, wend:5 pool_size:9 top_data:0.370382
Note that hstart, hend, wstart, and wend are updated every iteration, but pool_size stays at 9.
Analysis:
Generally speaking, average pooling shouldn't take invalid padded values (like 0) into account, because they pollute the feature map and downgrade the significance of the edge elements.
So I dove into pooling_layer.cpp and found the code below that calculates the variable pool_size.
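Since the snippet itself is not reproduced above, here is a simplified sketch of what that loop looks like - an assumption based on the AVE branch of PoolingLayer::Forward_cpu, reduced to a single channel and with Caffe's member variables turned into function parameters, so the exact upstream code may differ in detail:
```cpp
#include <algorithm>
#include <vector>

// Simplified sketch of Caffe's AVE pooling forward pass (single channel).
// Key point: pool_size is computed from the window extent *before* the
// window is clipped to the valid input region, so padded cells are counted.
void ave_pool_forward(const std::vector<float>& bottom, std::vector<float>& top,
                      int height, int width, int pooled_height, int pooled_width,
                      int kernel_h, int kernel_w, int stride_h, int stride_w,
                      int pad_h, int pad_w) {
  using std::max;
  using std::min;
  for (int ph = 0; ph < pooled_height; ++ph) {
    for (int pw = 0; pw < pooled_width; ++pw) {
      int hstart = ph * stride_h - pad_h;
      int wstart = pw * stride_w - pad_w;
      int hend = min(hstart + kernel_h, height + pad_h);
      int wend = min(wstart + kernel_w, width + pad_w);
      // Divisor fixed here; it still includes the padded area.
      int pool_size = (hend - hstart) * (wend - wstart);
      hstart = max(hstart, 0);
      wstart = max(wstart, 0);
      hend = min(hend, height);
      wend = min(wend, width);
      float sum = 0.f;
      for (int h = hstart; h < hend; ++h) {
        for (int w = wstart; w < wend; ++w) {
          sum += bottom[h * width + w];
        }
      }
      top[ph * pooled_width + pw] = sum / pool_size;
    }
  }
}
```
With a 3x3 kernel, stride 1, and pad 1 on a 35x35 input, the extent measured before clipping is always 3x3, so pool_size is always 9, which matches the pool_size:9 seen in the log above.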
Obviously, the variable pool_size determines the number of valid elements in each iteration. I can see the designer's intent was to update pool_size as the kernel slides rather than fix it to a constant. However, it is initialized before hstart, hend, wstart, and wend are updated (clipped to the valid input range), so in the end pool_size never reflects the number of valid elements.
Solution:
My modification is quite simple, but I think it's very important.
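The exact patch is not shown here; as an illustration only (my sketch, not necessarily the code in this PR), the kind of change being described is to derive pool_size from the window after it has been clipped to the valid input region, relative to the loop sketched above:
```cpp
// Sketch of the fix, relative to the loop sketched above: clip the window
// to the valid input region first, then compute the divisor, so padded
// zeros are no longer counted.
hstart = max(hstart, 0);
wstart = max(wstart, 0);
hend = min(hend, height);
wend = min(wend, width);
int pool_size = (hend - hstart) * (wend - wstart);  // valid elements only
```
With this ordering, the corner example above would divide by 4 instead of 9, which matches the TensorFlow behavior described in this report.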
Hope to hear some comments from your side.
Thanks,
Xiaolun Cao