[MXNET-411] Add ROI Align #10852
Conversation
|
Please add tests and documentation |
|
Hi, can you enable OpenMP in the CPU implementation of ROIAlign? That can achieve better performance. You can reference my PR #9958. |
|
Caffe2 has a C++ test. I think Caffe2's ROI Align op was written by Kaiming He. |
|
Hi. I wrote a ROIAlign forward/backward test; it may be useful. I also found that the CPU and GPU implementations differ, so I think it's better to add a CPU/GPU consistency test. |
|
@piiswrong @zhreshold Could you please review this PR, which adapts ROI Align from Caffe2. Thanks! |
| i += blockDim.x * gridDim.x) | ||
|
|
||
| // The number of cuda threads to use. 512 is used for backward compatibility | ||
| constexpr int ROI_CUDA_NUM_THREADS = 512; |
Using mshadow::cuda::kMaxThreadsPerBlock might provide better perf on newer GPUs?
Use mshadow::cuda::CheckLaunchParam to help check the launch limits
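To illustrate the launch-config discussion above: the usual pattern is to ceil-divide the element count by the threads-per-block choice (512 here, versus `mshadow::cuda::kMaxThreadsPerBlock`) and cap the grid; the grid-stride loop quoted earlier (`i += blockDim.x * gridDim.x`) then covers any elements beyond the capped grid. This is a minimal sketch; the names below are illustrative, not mshadow's actual API.

```cpp
#include <algorithm>
#include <cassert>

// Value used in this PR for backward compatibility.
constexpr int kRoiCudaNumThreads = 512;
// Classic limit on a 1-D grid dimension; a grid-stride loop handles overflow.
constexpr int kMaxGridNum = 65535;

// Ceil-divide the work size by threads-per-block, capped at the grid limit.
inline int CudaGetNumBlocks(int n, int num_threads = kRoiCudaNumThreads) {
  return std::min(kMaxGridNum, (n + num_threads - 1) / num_threads);
}
```

With this shape, `CheckLaunchParam`-style validation reduces to asserting the block and grid counts are within device limits before launching.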
| } | ||
|
|
||
| /* | ||
| template <typename T> |
| check_numeric_gradient(sym=test, location=[x1, x2], | ||
| grad_nodes={'data':'add', 'rois':'null'}, | ||
| numeric_eps=1e-4, rtol=1e-1, atol=1E-4) | ||
|
|
need a forward result check in addition to gradient check
Yeah, will add a forward result check soon 👍
|
@piiswrong @zhreshold I have added the unit tests. Please see the updates. Thanks! |
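The `check_numeric_gradient` call quoted above compares a finite-difference estimate against the op's analytic gradient within `rtol`/`atol`. A minimal sketch of that idea, with `f(x) = x*x` standing in for the operator (the helper names are illustrative, not MXNet's test API):

```cpp
#include <cassert>
#include <cmath>

// Central-difference estimate of df/dx at x, mirroring numeric_eps above.
inline double NumericGrad(double (*f)(double), double x, double eps = 1e-4) {
  return (f(x + eps) - f(x - eps)) / (2.0 * eps);
}

inline double Square(double x) { return x * x; }

// Tolerance check in the same shape as rtol/atol in the quoted test.
inline bool GradClose(double numeric, double analytic,
                      double rtol = 1e-1, double atol = 1e-4) {
  return std::fabs(numeric - analytic) <= atol + rtol * std::fabs(analytic);
}
```

A forward result check, by contrast, compares the op's output values directly against a reference (e.g. a NumPy reimplementation), which catches errors a gradient check alone can miss.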
|
|
||
| #define START_IND(a, b, c) static_cast<int>(floor(static_cast<float>(a * c) / b)) | ||
| #define END_IND(a, b, c) static_cast<int>(ceil(static_cast<float>((a + 1) * c) / b)) | ||
| #define START_IND(a, b, c) static_cast<int>(floor(static_cast<real>(a * c) / b)) |
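The `START_IND`/`END_IND` macros quoted above compute adaptive-pooling bin boundaries: output cell `a` of `b` cells over an input extent `c` covers `[floor(a*c/b), ceil((a+1)*c/b))`, so bins tile the input and may overlap by one element. Shown here as functions for clarity (a sketch, same arithmetic as the macros):

```cpp
#include <cassert>
#include <cmath>

// Start of adaptive bin a out of b bins over input extent c.
inline int StartInd(int a, int b, int c) {
  return static_cast<int>(std::floor(static_cast<float>(a * c) / b));
}

// One-past-the-end of adaptive bin a out of b bins over input extent c.
inline int EndInd(int a, int b, int c) {
  return static_cast<int>(std::ceil(static_cast<float>((a + 1) * c) / b));
}
```

For example, pooling an extent of 10 into 4 bins yields the ranges [0,3), [2,5), [5,8), [7,10): every input element is covered even though 10 is not divisible by 4.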
| T roi_start_h = offset_bottom_rois[1] * spatial_scale; | ||
| T roi_end_w = offset_bottom_rois[2] * spatial_scale; | ||
| T roi_end_h = offset_bottom_rois[3] * spatial_scale; | ||
| // T roi_start_w = round(offset_bottom_rois[0] * spatial_scale); |
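The commented-out `round(...)` line is the key difference from ROIPooling: ROI Align keeps the scaled ROI coordinates fractional and samples the feature map with bilinear interpolation instead of snapping to integer cells. A minimal sketch of that sampling over a row-major grid, following the usual Caffe2-style boundary handling:

```cpp
#include <cassert>

// Bilinearly interpolate a value at fractional position (y, x) from a
// height x width feature map stored row-major in `data`.
inline float BilinearInterpolate(const float* data, int height, int width,
                                 float y, float x) {
  if (y < -1.0f || y > height || x < -1.0f || x > width) return 0.f;
  if (y <= 0) y = 0;
  if (x <= 0) x = 0;
  int y_low = static_cast<int>(y);
  int x_low = static_cast<int>(x);
  int y_high, x_high;
  if (y_low >= height - 1) {
    y_high = y_low = height - 1;
    y = static_cast<float>(y_low);
  } else {
    y_high = y_low + 1;
  }
  if (x_low >= width - 1) {
    x_high = x_low = width - 1;
    x = static_cast<float>(x_low);
  } else {
    x_high = x_low + 1;
  }
  const float ly = y - y_low, lx = x - x_low;
  const float hy = 1.f - ly, hx = 1.f - lx;
  // Weighted sum of the four surrounding cells.
  return hy * hx * data[y_low * width + x_low] +
         hy * lx * data[y_low * width + x_high] +
         ly * hx * data[y_high * width + x_low] +
         ly * lx * data[y_high * width + x_high];
}
```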
| int rois_cols) { | ||
| DCHECK(rois_cols == 4 || rois_cols == 5); | ||
|
|
||
| for (int index = 0; index < nthreads; index++) { |
We have to use single threading in the backward pass, since there is no atomic add on CPU. We can assume no one would use the CPU to train the model :)
| // (n, c, ph, pw) is an element in the pooled output | ||
| // can be parallelized using omp | ||
| // #pragma omp parallel for num_threads(32) | ||
| for (int n = 0; n < n_rois; n++) { |
| const DType *bottom_rois = in_data[0].dptr<DType>(); | ||
| DType *grad_in = outputs[0].dptr<DType>(); | ||
|
|
||
| if (kAddTo == req[roialign::kData] || kWriteTo == req[roialign::kData]) { |
Return if NullOp before the switch?
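The suggestion above is the standard early-exit pattern: when the gradient request is `kNullOp` there is nothing to write, so bail out before dispatching. A sketch (the enum mirrors MXNet's `OpReqType` values; the helper is illustrative):

```cpp
#include <cassert>

// Mirrors MXNet's OpReqType: kNullOp means "do not write this output".
enum OpReq { kNullOp, kWriteTo, kWriteInplace, kAddTo };

// Return early on kNullOp instead of checking inside each switch case.
inline bool ShouldComputeGrad(OpReq req) {
  if (req == kNullOp) return false;
  return true;  // kWriteTo / kWriteInplace / kAddTo all need computation
}
```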
| DMLC_DECLARE_PARAMETER(ROIAlignParam) { | ||
| DMLC_DECLARE_FIELD(pooled_size) | ||
| .set_expect_ndim(2).enforce_nonzero() | ||
| .describe("fix pooled size: (h, w)"); |
These are the output ROI feature sizes. The name is compatible with ROIPooling.
| }; | ||
|
|
||
|
|
||
| struct ROIAlignGrad { |
No need for this struct. Use lambda directly at set_attr
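The point of the review comment: a named functor struct adds boilerplate when a lambda passed straight to the attribute setter does the same job. Sketched with a stand-in registry (`set_attr` below is not NNVM's real signature, just the shape of the idea):

```cpp
#include <cassert>
#include <functional>
#include <string>

// Stand-in for an op registry entry that accepts a gradient callback.
struct FakeOp {
  std::function<std::string(const std::string&)> grad_fn;
  FakeOp& set_attr(std::function<std::string(const std::string&)> fn) {
    grad_fn = std::move(fn);
    return *this;
  }
};

// Register the gradient with a lambda directly, no ROIAlignGrad struct.
inline FakeOp MakeOp() {
  FakeOp op;
  op.set_attr([](const std::string& n) { return "_backward_" + n; });
  return op;
}
```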
| // (n, c, ph, pw) is an element in the pooled output | ||
| // can be parallelized using omp | ||
| int n; | ||
| #pragma omp parallel for private(n) \ |
Thanks for adding OpenMP. Regarding NUM_OF_ROIS and CHANNELS, I found the latter is usually much larger than the former in ROIPooling, so I applied OpenMP on channels to achieve better performance. Can you help benchmark the performance with typical roi_align sizes?
Are you suggesting removing this omp?
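The trade-off under discussion: parallelizing the channel loop usually gives more parallel work than the ROI loop, since CHANNELS is typically much larger than the ROI count, and each thread owning one channel means no write races. A minimal sketch (shapes and names are illustrative; the pragma is a no-op when compiled without OpenMP, so the result is identical either way):

```cpp
#include <cassert>
#include <vector>

// Reduce each channel of a (channels x spatial) row-major buffer.
// Each iteration writes only sums[c], so the loop is race-free.
inline std::vector<float> ChannelSums(const std::vector<float>& data,
                                      int channels, int spatial) {
  std::vector<float> sums(channels, 0.f);
  #pragma omp parallel for
  for (int c = 0; c < channels; ++c) {
    for (int s = 0; s < spatial; ++s) sums[c] += data[c * spatial + s];
  }
  return sums;
}
```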
| &pre_calc); | ||
|
|
||
| int c; | ||
| #pragma omp parallel for private(c) \ |
Keeping this?
|
Removed OpenMP in the backward pass, since there is no atomic add on CPU. |
|
Gathering status: does anyone have unresolved issues? |
| DMLC_DECLARE_PARAMETER(ROIAlignParam) { | ||
| DMLC_DECLARE_FIELD(pooled_size) | ||
| .set_expect_ndim(2).enforce_nonzero() | ||
| .describe("ROI Align output roi featuremap height and width: (h, w)"); |
featuremap -> feature map
|
|
||
| NNVM_REGISTER_OP(_contrib_ROIAlign) | ||
| .describe(R"code( | ||
| ROI Align Layer |
Superfluous. Remove this line
|
I should have addressed most of the reviews. Please let me know if there are any further comments. Thanks! Related Gluon-CV PR dmlc/gluon-cv#140 |
|
Finally it passes the CI :) @piiswrong |
| width, | ||
| pooled_height, | ||
| pooled_width, | ||
| -1, |
sampling_rate missing in ROIAlignParam
Currently it uses an adaptive size. I will make it an option.
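What "adaptive size" means here: when the sampling ratio is non-positive, the number of bilinear samples per bin is derived from the bin size, following the Caffe2 convention; exposing it as a param lets callers pin it (e.g. to 2). A sketch, with illustrative names:

```cpp
#include <cassert>
#include <cmath>

// Samples per bin along one axis: use the explicit ratio when positive,
// otherwise adapt to the bin size ceil(roi_extent / pooled_extent).
inline int BinGridSize(int sampling_ratio, float roi_extent,
                       int pooled_extent) {
  return sampling_ratio > 0
             ? sampling_ratio
             : static_cast<int>(std::ceil(roi_extent / pooled_extent));
}
```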
* add roi align
* lint
* cpu gpu forward consistent
* roi align from caffe2
* rois and unit-test
* for cpplint
* use pointer instead of reference for lint
* fix
* add docs
* fix vector
* more unit test
* using mshadow
* omp
* omp on channels
* remove omp due to no cpu atomic add
* use lambda func for grads
* knullop return