Skip to content
This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Change class variables to thread local variables#10833

Merged
piiswrong merged 15 commits intoapache:masterfrom
anirudh2290:fix_global_ctx
May 11, 2018
Merged

Change class variables to thread local variables#10833
piiswrong merged 15 commits intoapache:masterfrom
anirudh2290:fix_global_ctx

Conversation

@anirudh2290
Copy link
Copy Markdown
Member

@anirudh2290 anirudh2290 commented May 7, 2018

Description

There are many places where class variables which end up modifying the state from multiple threads and causing issues like the one here: #9920 .
This PR is currently WIP because more investigation is required on test_multi_worker_forked_data_loader and test_laop_3(able to reproduce this test failure without my changes also). Also, more tests need to be added.

@piiswrong

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@anirudh2290 anirudh2290 requested a review from szha as a code owner May 7, 2018 02:22
Comment thread python/mxnet/context.py
gpu(1)
"""
# static class variable
default_ctx = None
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break a lot of existing code. Is this change intended for 2.0?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this was meant to be public API. current_context() is used for accessing this.

Copy link
Copy Markdown
Member Author

@anirudh2290 anirudh2290 May 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@piiswrong told me that default_ctx, devtype2str and devstr2type were not meant to be public API. This will also be renamed to _default_ctx. Users are expected to use current_context or default_context in test_utils. I agree though that this will break some existing code. I was of the opinion that the release after 1.2 is 2.0 but I am not sure.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are users who depend on this, we can make a backward compatible wrapper with classmethod and property:https://stackoverflow.com/questions/1697501/python-staticmethod-with-property

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_default_ctx = threading.local()
@property
@classmethod
def default_ctx():
   return _default_ctx.value

Comment thread python/mxnet/base.py Outdated
def __exit__(self, ptype, value, trace):
self.load_local_var()
else:
class ThreadLocalRegistry(object, metaclass=ThreadLocalRegistryMeta):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really complicated. why do you have to use inheritance? Isn't composition better?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inheritance was used since the thread local variables need to still be bound to the class rather than the object.

Comment thread python/mxnet/base.py Outdated
class ThreadLocalRegistry(object, metaclass=ThreadLocalRegistryMeta):
"""Structure that holds threadlocal variables and sets context manager"""

def __init__(self, name):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need a name?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name here is the name of the threadlocal variable for example default_ctx.
I will add an empty or None check here, to prevent from being misused.

Comment thread python/mxnet/attribute.py
kwargs
The attributes to set for all symbol creations in the scope.
"""
current = None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_current = threading.local()

Comment thread python/mxnet/attribute.py

def __enter__(self):
# pylint: disable=protected-access
self._old_scope = AttrScope.current
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self._old_scope = AttrScope.current.value

Comment thread python/mxnet/attribute.py

def __exit__(self, ptype, value, trace):
assert self._old_scope
AttrScope.current = self._old_scope
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AttrScope.current.value = self._old_scope

Comment thread python/mxnet/context.py
gpu(1)
"""
# static class variable
default_ctx = None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_default_ctx = threading.local()
@property
@classmethod
def default_ctx():
   return _default_ctx.value

Comment thread python/mxnet/context.py
def __enter__(self):
self._old_ctx = Context.default_ctx
Context.default_ctx = self
if not hasattr(Context._default_ctx, "value"):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about a situation like below:

import threading
import mxnet as mx

def test_context():
    ctx_list = []
    def f():
        mx.context.Context.default_ctx
        ctx_list.append(Context.default_ctx)
    thread = threading.Thread(target=f)
    thread.start()
    thread.join()

if __name__ == '__main__':
    test_context()

Without hasattr this doesn't work.

def create(prefix, params, hint):
"""Creates prefix and params for new `Block`."""
current = _BlockScope._current
current = getattr(_BlockScope._current, "value", None)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initialize _BlockScope._current.value to None globally and use _BlockScope._current.value directly here?

Comment thread python/mxnet/name.py
def __enter__(self):
self._old_manager = NameManager.current
NameManager.current = self
if not hasattr(NameManager._current, "value"):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Comment thread python/mxnet/name.py
warnings.warn("NameManager.current has been deprecated. "
"It is advised to use the `with` statement with NameManager.",
DeprecationWarning)
if not hasattr(NameManager._current, "value"):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Comment thread python/mxnet/ndarray/ndarray.py Outdated
ctx : Context, optional
An optional device context.
Defaults to the current default context (``mxnet.Context.default_ctx``).
Defaults to the current default context (``mxnet.Context._default_ctx.value``).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is public API. change to mxnet.current_context

Comment thread python/mxnet/attribute.py
# pylint: disable=protected-access
self._old_scope = AttrScope.current
attr = AttrScope.current._attr.copy()
if not hasattr(AttrScope._current, "value"):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

@piiswrong
Copy link
Copy Markdown
Contributor

Please remove the hasatter/getattr statements. Otherwise LGTM

@anirudh2290 anirudh2290 changed the title [WIP] Change class variables to thread local variables Change class variables to thread local variables May 11, 2018
@anirudh2290
Copy link
Copy Markdown
Member Author

anirudh2290 commented May 11, 2018

Recording some of the offline conversation: Initially the inheritance and metaclasses was for allowing code reuse of thread local code easily for multiple modules. This code was a bit complicated to understand so we moved the threading.local calls to module where it was needed instead of using inheritance and metaclasses for threadlocal. For adding backward compatibility, we had to make default_ctx and equivalents in other modules class properties. This required adding logic in base.py for the same. This included adding a metaclass and since metaclass syntax is not the same between python2 and python3 needed to use six. We cannot add six as a dependency, because that restricts users more, and thus we copied the metaclass specific code from six into base.py.

@piiswrong piiswrong merged commit 86ee3e1 into apache:master May 11, 2018
jinhuang415 pushed a commit to jinhuang415/incubator-mxnet that referenced this pull request May 29, 2018
* Change to simpler implementation

* Add property

* Remove pdb

* Add support for setter and getter

* fix issues

* Add warnings

* Add thread local unittest and tlocal race condition

* Fix pylint

* Use current_context instead of _default_ctx

* Use current_context

* Fix race condition

* Fix thread local test

* Change to current_context
rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018
* Change to simpler implementation

* Add property

* Remove pdb

* Add support for setter and getter

* fix issues

* Add warnings

* Add thread local unittest and tlocal race condition

* Fix pylint

* Use current_context instead of _default_ctx

* Use current_context

* Fix race condition

* Fix thread local test

* Change to current_context
zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
* Change to simpler implementation

* Add property

* Remove pdb

* Add support for setter and getter

* fix issues

* Add warnings

* Add thread local unittest and tlocal race condition

* Fix pylint

* Use current_context instead of _default_ctx

* Use current_context

* Fix race condition

* Fix thread local test

* Change to current_context
@cclauss
Copy link
Copy Markdown
Contributor

cclauss commented Jul 12, 2018

We cannot add six as a dependency

@szha Is this true? I find numerous instances where Python2-only builtins like basestring, long, reload, unicode, xrange are used. Do we need try/except blocks to define each of these or can we use six.moves?

@szha
Copy link
Copy Markdown
Member

szha commented Jul 12, 2018

@cclauss I'd probably describe this as "six is not necessary", as we already handle most of the import compatibility in base.py where we use the python version to decide the definition, and we've been careful about not using version-specific features in our code base (e.g. we have both py2 and py3 in CI). Thus, adding six would mostly just benefit code that's outside of our code base, such as the examples, in which case the dependency is not managed by mxnet and authors are free to choose.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants