Disable streams for the DML EP #19481

Merged
PatriceVignola merged 2 commits into main from user/pavignol/disable-streams-dml-ep
Feb 10, 2024

Conversation

@PatriceVignola
Contributor

There's currently a bug in the allocation planner: when buffers are reused and more than one stream is used, it is possible (although rare) for a buffer's reference count to reach 0 while the buffer is still in use. Since DML doesn't benefit from multiple streams, disabling them is the safest option for now.

This is a high-priority issue that we need to fix for 1.17.1 since it breaks Stable Diffusion. Identifying and implementing the perfect fix for the underlying issue would be too risky for a patch release, especially given the limited time that we have.

#19480
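
For readers unfamiliar with this class of failure, here is a toy sketch of what "reference count reaches 0 while the buffer is still in use" can look like. It is not ONNX Runtime code and does not reproduce the actual planner bug, whose root cause is not described in this PR. The `Buffer` class, the `run` helper, and the stream names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    """Hypothetical reference-counted buffer (illustration only)."""
    name: str
    ref_count: int
    freed: bool = False

    def release(self) -> None:
        # Drop one reference; the buffer is freed when the count hits 0.
        self.ref_count -= 1
        if self.ref_count == 0:
            self.freed = True

    def read(self) -> str:
        if self.freed:
            raise RuntimeError(f"{self.name} read after free")
        return f"data from {self.name}"


def run(planned_ref_count: int) -> list[str]:
    """Two consumers on different (hypothetical) streams read one buffer.

    planned_ref_count stands in for the count a planner assigned up front.
    """
    buf = Buffer("buf0", ref_count=planned_ref_count)
    log = []
    for stream in ("stream0", "stream1"):
        try:
            log.append(f"{stream}: {buf.read()}")
        except RuntimeError as err:
            log.append(f"{stream}: {err}")
        buf.release()
    return log


# Correct plan: both consumers are counted, so both reads succeed.
print(run(2))
# Undercounted plan: the first release frees the buffer too early.
print(run(1))
```

With a single stream there is only one ordering of consumers to account for, which is presumably why disabling multiple streams sidesteps the problem for DML.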

@PatriceVignola
Contributor Author

/azp run Linux GPU CI Pipeline (Linux_Test)

@azure-pipelines

No pipelines are associated with this pull request.

@PatriceVignola
Contributor Author

@snnn I keep getting `CUDA failure 100: no CUDA-capable device is detected ; GPU=0 ; hostname=12a729cd64de ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider_info.cc ; line=138 ; expr=cudaGetDeviceCount(&num_devices);` on the Linux build.

@PatriceVignola
Contributor Author

/azp run Linux GPU CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@PatriceVignola
Contributor Author

/azp run Linux GPU CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

PatriceVignola merged commit 1182b55 into main on Feb 10, 2024
PatriceVignola deleted the user/pavignol/disable-streams-dml-ep branch on February 10, 2024 08:34
YUNQIUGUO pushed a commit that referenced this pull request Feb 11, 2024

Contributor

@fdwr fdwr left a comment

Thank you Pat for restoring SD.