This repository was archived by the owner on Nov 17, 2023. It is now read-only.
Merged
5 changes: 3 additions & 2 deletions src/operator/nn/dropout-inl.h
@@ -82,7 +82,8 @@ class DropoutOp {
static void BernoulliGenerate(common::random::RandGenerator<cpu, DType> gen,
int n, double p, int* r) {
typename RandGenerator<xpu, DType>::Impl genImpl(&gen, 1);
- const int seed = 17 + genImpl.rand() % 4096; // NOLINT(runtime/threadsafe_fn)
+ const int seed = 17 + abs(genImpl.rand() % 4096);
+ CHECK_GE(seed, 0);
const int nthr = engine::OpenMP::Get()->GetRecommendedOMPThreadCount();
#pragma omp parallel num_threads(nthr)
Contributor

I think it would be better to change this parallel approach to "#pragma omp parallel for", which would be more readable than an OMP parallel section.
Would you mind changing it?

Contributor Author

@pengzhao-intel Thanks for your feedback. I don't see why "#pragma omp parallel for" would be more readable in this case as there is no for loop involved. The job for each thread is defined based on its thread number. Introducing a for loop with "#pragma omp parallel for" would be redundant. Please correct me if I've misunderstood it.
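
To illustrate the point about each thread's job being derived from its thread number, here is a minimal sketch of that kind of partitioning. `ChunkFor` is a hypothetical helper, and the `avg_amount` formula is an assumption, since that line is collapsed in the diff; only the `my_amount` expression is taken from the hunk shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical helper (not part of the PR): returns {offset, amount} for
// thread `ithr` out of `nthr` when n items are split into contiguous chunks.
std::pair<int, int> ChunkFor(int ithr, int nthr, int n) {
  assert(nthr > 0 && ithr >= 0 && ithr < nthr && n >= 0);
  // Assumed: ceil(n / nthr). This definition is not visible in the diff.
  const int avg_amount = (n + nthr - 1) / nthr;
  const int my_offset = ithr * avg_amount;
  // Same expression as in the diff; it can be <= 0 for trailing threads.
  const int my_amount = std::min(my_offset + avg_amount, n) - my_offset;
  // The diff guards with `if (my_amount > 0)`; here we clamp to 0 instead.
  return {my_offset, std::max(my_amount, 0)};
}
```

For n = 10 and nthr = 4 this gives offsets 0, 3, 6, 9 with amounts 3, 3, 3, 1. No loop index is shared between threads, which is why a plain parallel region is enough.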

Contributor

Can you explain why adding the abs() call fixed the flakiness of the test?

Contributor Author

@apeforest vslNewStream expects a non-negative integer as seed by definition. genImpl.rand() generates negative numbers too, which sometimes leads to negative seed values being passed to vslNewStream. abs() helps keep that in check.
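
A small sketch of why the old expression could go negative, assuming `genImpl.rand()` returns a signed `int`: in C++ the result of `%` takes the sign of the dividend, so a negative input yields a negative remainder. `SeedOld` and `SeedNew` are hypothetical names for the two expressions in the diff, and the plain `assert` stands in for `CHECK_GE`.

```cpp
#include <cassert>
#include <cstdlib>

// Old expression: raw % 4096 lies in [-4095, 4095], so the sum can be negative.
int SeedOld(int raw) { return 17 + raw % 4096; }

// New expression: taking abs() of the remainder keeps the sum in [17, 4112].
int SeedNew(int raw) {
  const int seed = 17 + std::abs(raw % 4096);
  assert(seed >= 0);  // stand-in for CHECK_GE(seed, 0) in the diff
  return seed;
}
```

For example, `SeedOld(-4095)` is -4078 while `SeedNew(-4095)` is 4112. Applying `abs()` to the remainder rather than to the raw value also sidesteps the overflow that `abs(INT_MIN)` would cause.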

Contributor

Thanks for the explanation. In that case, I suggest adding an assert here to make sure the seed is non-negative.

Contributor Author

Added a check to make sure seed is non-negative.

{
@@ -92,7 +93,7 @@ class DropoutOp {
const int my_amount = std::min(my_offset + avg_amount, n) - my_offset;
if (my_amount > 0) {
VSLStreamStatePtr stream;
- vslNewStream(&stream, VSL_BRNG_MCG31, seed + my_offset);
+ vslNewStream(&stream, VSL_BRNG_MCG31, seed);
Contributor Author

@pengzhao-intel vslNewStream was previously passed (seed + my_offset), which differs between threads because my_offset depends on the thread number. I have removed my_offset, so whatever seed value is generated is used to create the stream for every thread.

Contributor

Why remove my_offset here while leaving it in the vslSkipAheadStream and viRngBernoulli calls? What's the rationale behind this? Thanks!

Contributor Author

@apeforest By removing my_offset from vslNewStream and making the seed the same for all threads, we ensure that the same stream is generated for each thread.
For each thread, my_offset acts as an index for vslSkipAheadStream and viRngBernoulli. It tells vslSkipAheadStream which section of the stream to use and viRngBernoulli which section of the vector to populate.
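
The same-seed-plus-skip-ahead idea can be sketched with the standard library alone. Everything here is a stand-in and an assumption, not the MKL code path: `std::minstd_rand` replaces `VSL_BRNG_MCG31`, `discard()` replaces `vslSkipAheadStream`, a simple threshold replaces the ICDF Bernoulli method, a serial loop replaces the OpenMP threads, and the `avg_amount` formula is assumed. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Exactly one engine draw per output, so discard(k) lines up with element k.
// Simple threshold test; NOT MKL's ICDF method.
inline int BernoulliDraw(std::minstd_rand& eng, double p) {
  const double range = static_cast<double>(eng.max() - eng.min()) + 1.0;
  const double u = static_cast<double>(eng() - eng.min()) / range;  // in [0, 1)
  return u < p ? 1 : 0;
}

// Reference: fill all n outputs from one stream.
std::vector<int> FillSequential(unsigned seed, int n, double p) {
  std::minstd_rand eng(seed);
  std::vector<int> r(n);
  for (int& x : r) x = BernoulliDraw(eng, p);
  return r;
}

// Chunked fill: every chunk creates its own engine, as each thread does in the diff.
// same_seed = true  mirrors the new code: seed, then skip ahead by my_offset.
// same_seed = false mirrors the old code: seed + my_offset, then skip ahead.
std::vector<int> FillChunked(unsigned seed, int n, double p, int nthr,
                             bool same_seed) {
  assert(nthr > 0);
  std::vector<int> r(n);
  const int avg_amount = (n + nthr - 1) / nthr;  // assumed chunk size
  for (int ithr = 0; ithr < nthr; ++ithr) {      // one iteration per "thread"
    const int my_offset = ithr * avg_amount;
    const int my_amount = std::min(my_offset + avg_amount, n) - my_offset;
    if (my_amount <= 0) continue;
    std::minstd_rand eng(same_seed ? seed : seed + my_offset);
    eng.discard(my_offset);  // analogue of vslSkipAheadStream(stream, my_offset)
    for (int i = 0; i < my_amount; ++i) {
      r[my_offset + i] = BernoulliDraw(eng, p);
    }
  }
  return r;
}
```

With `same_seed = true`, the chunks are consecutive, non-overlapping sections of a single stream, so `FillChunked` returns the same vector as `FillSequential` for any `nthr`. With `same_seed = false`, each chunk draws from a differently seeded stream, which is the behavior this change removes.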

vslSkipAheadStream(stream, my_offset);
viRngBernoulli(VSL_RNG_METHOD_BERNOULLI_ICDF, stream, my_amount, r + my_offset, p);
vslDeleteStream(&stream);
1 change: 0 additions & 1 deletion tests/python/unittest/test_operator.py
@@ -5808,7 +5808,6 @@ def test_stack():


@with_seed()
- @unittest.skip("Flaky test, tracked at https://github.com/apache/incubator-mxnet/issues/12314")
def test_dropout():
def zero_count(array, ratio):
zeros = 0