Expose DirectML provider to python (conflicts resolved from #3359) by cameronmaske · Pull Request #4630 · microsoft/onnxruntime

cameronmaske · 2020-07-27T09:38:23Z

Description: Exposed the DirectML (DML) provider to Python.
This is based on #3359 but resolves the conflicts on that branch and includes DirectML.dll in setup.py.

Motivation and Context
Previously, the DML provider was not exposed to Python.
Fixes #3358

… against latest master

ghost · 2020-07-27T09:38:35Z

All CLA requirements met.

fdwr

👍

faxu · 2020-08-08T02:24:28Z

/azp run Linux CPU CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,MacOS CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,MacOS NoContribops CI Pipeline,Linux CPU x64 NoContribops CI Pipeline,Windows CPU CI Pipeline

faxu · 2020-08-08T02:24:40Z

/azp run orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-mac-ci-pipeline

azure-pipelines · 2020-08-08T02:24:57Z

Azure Pipelines successfully started running 3 pipeline(s).

azure-pipelines · 2020-08-08T02:25:07Z

Azure Pipelines successfully started running 9 pipeline(s).

faxu · 2020-08-12T01:52:49Z

@cameronmaske looks like there are some more conflicts to resolve on this.

hariharans29 · 2020-08-13T20:56:43Z

@cameronmaske @fdwr - does this PR look okay now? (I resolved the conflicts, but I want to make sure the result makes sense.)

hariharans29 · 2020-08-13T20:56:54Z

/azp run Linux CPU CI Pipeline,Linux CPU x64 NoContribops CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,MacOS CI Pipeline,MacOS NoContribops CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline

hariharans29 · 2020-08-13T20:57:00Z

/azp run orttraining-linux-ci-pipeline,orttraining-mac-ci-pipeline,orttraining-linux-gpu-ci-pipeline,centos7_cpu,Linux OpenVINO CI Pipeline

azure-pipelines · 2020-08-13T20:57:24Z

Azure Pipelines successfully started running 5 pipeline(s).

azure-pipelines · 2020-08-13T20:57:32Z

Azure Pipelines successfully started running 9 pipeline(s).

hariharans29 · 2020-08-14T00:06:55Z

The Windows GPU builds are failing. Is it because they can't locate DirectML.dll for some reason?

faxu · 2020-08-17T22:28:24Z

@cameronmaske Can you take a look at the failure issue and also resolve conflicts?

cameronmaske · 2020-08-19T08:48:41Z

@faxu I’m currently traveling but will try to resolve these issues when I return next week.

cameronmaske · 2020-08-24T09:08:09Z

~~Is there any way to see the full error output of why the build failed? Clicking Details just gives the error "Build Not Found"~~

cameronmaske · 2020-08-24T12:06:47Z

@faxu The issue should be resolved. For context, DirectML.dll was not being copied over to the /capi directory to be included with the Python wheel. I've updated the CMake files to do that now.
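A minimal sketch of how one might check for this kind of packaging gap, assuming only that the native library is expected to sit as a file inside the package's capi directory. The helper name `missing_native_libs` is hypothetical and is not part of onnxruntime or its build scripts.

```python
from pathlib import Path


def missing_native_libs(capi_dir, required=("DirectML.dll",)):
    """Return the names in `required` that are not files inside `capi_dir`.

    Hypothetical diagnostic helper: it only inspects the file system and
    does not load or validate the libraries themselves.
    """
    capi_dir = Path(capi_dir)
    return [name for name in required if not (capi_dir / name).is_file()]
```

With an installed wheel, the directory to inspect would typically be `Path(onnxruntime.__file__).parent / "capi"`; an empty result means every listed file is present.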

faxu · 2020-08-24T18:16:50Z

/azp run Linux CPU CI Pipeline,Linux CPU x64 NoContribops CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,MacOS CI Pipeline,MacOS NoContribops CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline

faxu · 2020-08-24T18:16:59Z

/azp run orttraining-linux-ci-pipeline,orttraining-mac-ci-pipeline,orttraining-linux-gpu-ci-pipeline,centos7_cpu,Linux OpenVINO CI Pipeline

azure-pipelines · 2020-08-24T18:17:26Z

Azure Pipelines successfully started running 5 pipeline(s).

azure-pipelines · 2020-08-24T18:17:28Z

Azure Pipelines successfully started running 9 pipeline(s).

faxu · 2020-08-25T02:07:43Z

@cameronmaske Still some issues, it seems...

cameronmaske · 2020-08-25T07:37:27Z

@faxu Reading over the test failures, I'd like some guidance from the maintainers about which route to take to address this, as the direction seems quite opinionated.

The error occurring is...

onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Memory pattern must be disabled before registering DMLExecutionProvider

The DML provider does not support enable_mem_pattern = True, which is the default behavior. I can think of two approaches to address this.

Either

Alter the tests to explicitly disable that option when running with the DMLExecutionProvider. Some special care will be needed when the session options are loaded from the model file.

or

Change the session options to default enable_mem_pattern to False if the DMLExecutionProvider is used. This may be harder to do (I'm very new to the codebase, so guidance is appreciated if this is the route to take).

There may be other approaches; I'd appreciate any thoughts or guidance.
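For illustration, the first approach (explicitly disabling the option on the caller's side) might look roughly like the sketch below. The helper names `dml_session_settings` and `create_session` are hypothetical. The provider string `"DmlExecutionProvider"` and the `providers=` keyword of `InferenceSession` are assumptions about current onnxruntime releases rather than something this thread confirms, so check `onnxruntime.get_available_providers()` on your build.

```python
def dml_session_settings(providers):
    """Return session-option overrides needed by the requested providers.

    Hypothetical helper: the only rule encoded here is the one from the
    error message, that memory patterns must be off when DML is used.
    """
    overrides = {}
    if "DmlExecutionProvider" in providers:  # assumed provider string
        overrides["enable_mem_pattern"] = False
    return overrides


def create_session(model_path, providers):
    """Build an InferenceSession with the overrides applied (sketch)."""
    # Imported lazily so the helper above can be used without onnxruntime.
    import onnxruntime as ort

    options = ort.SessionOptions()
    for name, value in dml_session_settings(providers).items():
        setattr(options, name, value)
    return ort.InferenceSession(model_path, sess_options=options, providers=providers)
```

A test would then call something like `create_session("model.onnx", ["DmlExecutionProvider", "CPUExecutionProvider"])`, where the model path is a placeholder. The drawback raised above still applies: every call site has to remember to go through such a helper.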

jywu-msft · 2020-08-31T02:34:01Z

> @faux Reading over the test failures, I'd like some guidance from the maintainers about which route to take to address this, as the direction seems quite opinionated.
>
> The error occurring is...
> onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Memory pattern must be disabled before registering DMLExecutionProvider
> The DML provider does not support enable_mem_pattern = True which is the default behavior. There are two approaches I can think of to address this.
>
> My thoughts are either,
>
> Alter the tests to explicitly disable that option when running with the DMLExecutionProvider. Some special care will need to be taken for when the session options are loaded from the model file.
>
> or
>
> Change the session options to default enable_mem_pattern to False if the DMLExecutionProvider is used. This may be harder to do (I'm very new to the codebase, so guidance is appreciated if this is the route to do).
>
> Maybe there are different approaches, appreciate any thoughts/guidance.

Between those two options, I would lean towards the latter. enable_mem_pattern is an optimization, so if it is incompatible with DMLExecutionProvider, it seems better to make it a no-op rather than place the onus on the user to know which optimizations are compatible with which provider.
Defaulting to the correct option also seems preferable to changing all the various test code to special-case DMLExecutionProvider, instantiate sess_options, and set enable_mem_pattern to false.
As for how to default enable_mem_pattern to false, I think we can either remove the error message (and change the setting there) or, perhaps better, set enable_mem_pattern to false in InferenceSession::ConstructorCommon sometime after FinalizeSessionOptions().
@pranavsharma, what do you think?
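To make the second approach concrete, here is a toy Python model of the idea: once the options are finalized, any option a registered provider cannot support is silently switched off instead of raising an error. This is not onnxruntime's C++ implementation; `ToySessionOptions`, `FORCED_OPTIONS`, and `finalize_options` are all hypothetical names used only for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToySessionOptions:
    """Stand-in for real session options; memory patterns default to on."""
    enable_mem_pattern: bool = True


# Hypothetical table: option values each provider requires.
FORCED_OPTIONS = {
    "DMLExecutionProvider": {"enable_mem_pattern": False},
}


def finalize_options(options, providers):
    """Return a copy of `options` with provider-required values applied."""
    overrides = {}
    for provider in providers:
        overrides.update(FORCED_OPTIONS.get(provider, {}))
    return replace(options, **overrides)
```

In this model, `finalize_options(ToySessionOptions(), ["DMLExecutionProvider"])` yields options with `enable_mem_pattern` set to `False`, while a CPU-only provider list leaves the default untouched. The user never has to know which optimizations a provider supports, which is the argument made above for this route.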

hariharans29 · 2020-09-02T22:16:06Z

This error should go away now. Please rebase onto master and I can merge this PR. Thanks.

hariharans29 · 2020-09-03T17:57:36Z

/azp run Linux CPU CI Pipeline,Linux CPU x64 NoContribops CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,MacOS CI Pipeline,MacOS NoContribops CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline

hariharans29 · 2020-09-03T17:57:49Z

/azp run orttraining-linux-ci-pipeline,orttraining-mac-ci-pipeline,orttraining-linux-gpu-ci-pipeline,centos7_cpu,Linux OpenVINO CI Pipeline

azure-pipelines · 2020-09-03T17:58:14Z

Azure Pipelines successfully started running 9 pipeline(s).

azure-pipelines · 2020-09-03T17:58:15Z

Azure Pipelines successfully started running 5 pipeline(s).

hariharans29 · 2020-09-08T18:31:30Z

/azp run Linux CPU CI Pipeline,Linux CPU x64 NoContribops CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,MacOS CI Pipeline,MacOS NoContribops CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline

hariharans29 · 2020-09-08T18:31:41Z

/azp run orttraining-linux-ci-pipeline,orttraining-mac-ci-pipeline,orttraining-linux-gpu-ci-pipeline,centos7_cpu,Linux OpenVINO CI Pipeline

azure-pipelines · 2020-09-08T18:32:07Z

Azure Pipelines successfully started running 9 pipeline(s).

azure-pipelines · 2020-09-08T18:32:10Z

Azure Pipelines successfully started running 5 pipeline(s).

cameronmaske · 2020-09-09T09:34:06Z

@hariharans29 Many thanks for the help in getting this merged!

cameronmaske added 2 commits July 24, 2020 15:48

Ported @RobinKa's changes to expose DirectML to python (microsoft#3359)…

9db3d7a

… against latest master

Include directml dll. Register DML provider in GetAllProviders

82a8d53

cameronmaske requested a review from a team as a code owner July 27, 2020 09:38

cameronmaske mentioned this pull request Jul 30, 2020

Expose DirectML provider to Python #3359

Closed

fdwr previously approved these changes Aug 5, 2020

Merge branch 'master' into dml

b2a6dd6

hariharans29 dismissed fdwr’s stale review via b2a6dd6 August 13, 2020 20:55

cameronmaske added 2 commits August 24, 2020 11:12

Resolve conflicts in build.py

7790fb5

Copy DirectML.dll over to /capi directory to include in python wheel

1ab3fbc

hariharans29 mentioned this pull request Sep 1, 2020

Change session option values if they don't work with EPs being registered for the session #4991

Merged

Merge branch 'master' into dml

9eb8d96

hariharans29 requested review from BowenBao, liqunfu, spandantiwari and thiagocrepaldi as code owners September 1, 2020 02:35

Merge branch 'master' into dml

5bb951f

This was referenced Sep 5, 2020

Exclude registering CUDA EP in the Python wheel if DML is enabled #5071

Closed

Prevent registering both DML and CUDA EPs in an ML op test #5078

Merged

Merge branch 'master' into dml

dbea241

hariharans29 approved these changes Sep 8, 2020

hariharans29 merged commit 4553b2e into microsoft:master Sep 8, 2020
