Add stable nrm2 for L2 normalization #12440
Conversation
|
The CI failed... @leezu, maybe my code is wrong; can you help me? |
|
Thanks for your contribution @TccccD |
|
@TccccD @leezu Please have a look at the CI build failure so we can proceed with the review. @sandeep-krishnamurthy Please change the label to pr-awaiting-testing |
|
@marcoabreu @lebeg @larroy Could you please help @TccccD with the test failures? |
|
I submitted it three times. The first run passed; the next two failed with unrelated errors (Failed with error: 'there is no package called 'shiny''). I don't know why. |
|
@mxnet-label-bot update [pr-awaiting-review] |
|
LGTM. @apeforest @samskalicky for further comments/review |
|
In TBlob::reshape, it gets a different shape size.
  });
  norm = F<mxnet::op::mshadow_op::square_root>(norm);
- out = data / broadcast<0>(norm, out.shape_);
+ out = data / mshadow::expr::broadcast<0>(norm, out.shape_);
|
You are already using namespace mshadow::expr, so the qualifier is not necessary here.
|
If I use
mxnet::op::ReduceAxesComputeImpl
it gives an error:
src/operator/./l2_normalization-inl.h:91:7: error: ‘ReduceAxesComputeImpl’ is not a member of ‘mxnet::op’
I think this is because l2_normalization is compiled before broadcast_reduce_op.h. So I added
#include "./tensor/broadcast_reduce_op.h"
and that requires me to write mshadow::expr:: again, otherwise name conflicts occur.
@apeforest
|
I see: in ./tensor/broadcast_reduce_op.h, "broadcast" is used as a namespace name. That is probably where the conflict comes from.
|
@apeforest can you please check out latest response by @TccccD ? Thanks! |
const std::vector<TShape> &in_shape) const override {
  return { ResourceRequest::kTempSpace };
}
|
|
|
Overall looks good. Can you add a test for the failing case that you mentioned in the issue?
|
The main problem is that we don't know the cause of the failure, so it is difficult to build a test case. @anirudh2290
|
Can you just use the same one you reported in issue #11938?
|
After carefully reading the code, I think this change is too hacky. Using ReduceAxesComputeImpl to work around the mxnet_op::Kernel call is an anti-pattern: it will miss many optimizations in the Kernel loop and cause more maintenance effort down the road.
I suggest you follow the Kernel and Map pattern in the existing code; you may refer to other operators for reference. Thanks for your contribution.
|
There are pros and cons to both approaches. If @TccccD has to implement the Kernel::Launch approach, he has to rely on mshadow reduce_with_axis, and that is not very performant compared to the |
|
@ZhennanQin Could you please help to review this PR? Thanks! |
|
@apeforest @ZhennanQin ping for review |
|
Code change looks good to me. Do you need any help to apply this change to CPU implementation at https://github.com/apache/incubator-mxnet/blob/master/src/operator/l2_normalization.cc ? |
|
@TccccD - Can you please take look at @ZhennanQin comment, I think we are closer to getting this merged. Thanks for your contributions. |
|
@mxnet-label-bot update [pr-awaiting-response] |
|
@TccccD Thanks for the contribution! Could you please take a look into final few comments by @ZhennanQin |
|
I think the CPU implementation may not be related to the error, but I can try to implement one. |
|
@mxnet-label-bot add [Operator, pr-work-in-progress] |
|
@TccccD Any updates on the requested changes? |
|
@TccccD Could you please give an update? Thanks |
|
@TccccD Ping again. |
|
Sorry, I am no longer engaged in this work and cannot take the time to finish and improve this PR, so I am closing it now. |
Description
This PR calls the 2-norm interface (#11573); it gives the same results as the previous method in three different modes and solves problem #11938.
But I think my modifications can still be optimized;
please help me to optimize them. Thanks!
@haojin2 @piiswrong @leezu @anirudh2290 @szha
I encountered some problems during rebase, so I opened a new PR. @leezu, the previous PR is #12287