This repository was archived by the owner on Nov 17, 2023. It is now read-only.

CPU worker using more threads than expected/wanted #11891

@zhreshold

Description


During Gluon multi-worker training, we want each pre-fetch worker to use a single thread to avoid thread contention. However, I have not found a way to control this through environment variables.

Environment info (Required)

pip install mxnet-cu90==1.3.0b20180717

Package used (Python/R/Scala/Julia):
python

Minimum reproducible example

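import mxnet as mx

# Ten small CPU arrays that are updated repeatedly.
num_arr = 10
arrs = [mx.nd.zeros((10000,)) for _ in range(num_arr)]

for i in range(1000000):
    for arr in arrs:
        arr[:] = i                 # in-place write
        b = mx.nd.sigmoid(arr)     # element-wise op on the CPU
    mx.nd.waitall()                # block until all queued operations finish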

Steps to reproduce

I tried almost every environment variable I could find to get this example to run on a single thread.

MXNET_CPU_NNPACK_NTHREADS=1 MXNET_CPU_PRIORITY_NTHREADS=1 MXNET_CPU_WORKER_NTHREADS=1 OMP_NUM_THREADS=1 MXNET_GPU_COPY_NTHREADS=1 MXNET_GPU_WORKER_NTHREADS=1 MXNET_CPU_COPY_NTHREADS=1 MXNET_EXEC_BULK_EXEC_MAX_NODE_TRAIN=1 MXNET_CPU_TEMP_COPY=1  python mp.py 
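For anyone who wants to reproduce this without a long shell prefix, the same settings can be applied from inside Python. This is only a minimal sketch: the variable names are copied from the command above, while `SINGLE_THREAD_ENV` and `apply_single_thread_env` are hypothetical helper names. It assumes the variables are read when the library is imported, so they are set before the import; I have not verified that this behaves identically to setting them on the command line.

```python
import os

# Variable names and values copied from the command line in this report.
SINGLE_THREAD_ENV = {
    "MXNET_CPU_NNPACK_NTHREADS": "1",
    "MXNET_CPU_PRIORITY_NTHREADS": "1",
    "MXNET_CPU_WORKER_NTHREADS": "1",
    "OMP_NUM_THREADS": "1",
    "MXNET_GPU_COPY_NTHREADS": "1",
    "MXNET_GPU_WORKER_NTHREADS": "1",
    "MXNET_CPU_COPY_NTHREADS": "1",
    "MXNET_EXEC_BULK_EXEC_MAX_NODE_TRAIN": "1",
    "MXNET_CPU_TEMP_COPY": "1",
}


def apply_single_thread_env(env=None):
    """Write the settings into `env` (defaults to os.environ) and return it."""
    if env is None:
        env = os.environ
    env.update(SINGLE_THREAD_ENV)
    return env


apply_single_thread_env()

# Assumption: the import must come after the environment is prepared.
# import mxnet as mx
```

Passing a plain `dict` instead of `os.environ` makes it possible to build the environment for a subprocess without touching the current process.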

I am still getting around 160% CPU usage and a shadow CPU process. I noticed that OMP_NUM_THREADS=1 does take effect, because CPU usage is otherwise more than 1600%.
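To check whether the extra CPU usage comes from additional native threads, the thread count of the process can be inspected directly. The sketch below is an assumption-laden helper, not part of MXNet: `count_os_threads` is a hypothetical name, it reads the Linux-specific `/proc/self/status` file, and on other platforms it falls back to `threading.active_count()`, which only sees Python-level threads and would miss threads created by native libraries.

```python
import threading


def count_os_threads():
    """Return the number of OS threads in the current process.

    On Linux this parses the "Threads:" field of /proc/self/status.
    Elsewhere it falls back to counting Python threads only.
    """
    try:
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("Threads:"):
                    return int(line.split()[1])
    except OSError:
        pass  # /proc is not available on this platform
    return threading.active_count()


if __name__ == "__main__":
    print("threads in this process:", count_os_threads())
```

Calling it once before and once after `import mxnet` (and again inside the loop) should show at which point extra threads appear.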

Any pointers for this problem?
