Skip to content

Remove NVFUSER_DISTRIBUTED. #2155

Merged
wujingyue merged 7 commits intomainfrom
wjy/use
Apr 30, 2024
Merged

Remove NVFUSER_DISTRIBUTED. #2155
wujingyue merged 7 commits intomainfrom
wjy/use

Conversation

@wujingyue
Copy link
Collaborator

@wujingyue wujingyue commented Apr 27, 2024

@xwang233 discussed this in the nvFuser-MultiGPU chatroom. At this moment, supporting non-distributed build of pytorch isn't worth the cost of additional CI and using knobs like NVFUSER_DISTRIBUTED or USE_DISTRIBUTED.

This PR reverts #1711

Feel free to revert this PR if non-distributed build becomes important.

@wujingyue
Copy link
Collaborator Author

!build --dist

@samnordmann
Copy link
Collaborator

Can you please explain the motivation for this change?

@wujingyue
Copy link
Collaborator Author

Can you please explain the motivation for this change?

@xwang233 talked about this in the nvFuser-MultiGPU chatroom. At this moment, there isn't much value to support non-distributed build of pytorch. The cost of additional CI and knobs like NVFUSER_DISTRIBUTED/USE_DISTRIBUTED just for that config isn't worthwhile.

@wujingyue wujingyue requested a review from samnordmann April 29, 2024 15:50
@wujingyue
Copy link
Collaborator Author

!build

@wujingyue
Copy link
Collaborator Author

!build --dist

@wujingyue
Copy link
Collaborator Author

!build --dist

@wujingyue
Copy link
Collaborator Author

Glad that CI has recovered, making this PR ready to review again. PTAL

Copy link
Collaborator

@cowanmeg cowanmeg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for multidevice code and tests. I would wait for @xwang233's stamp on the build changes.

Copy link
Collaborator

@xwang233 xwang233 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for working on this.

@wujingyue wujingyue merged commit 6e05931 into main Apr 30, 2024
@wujingyue wujingyue deleted the wjy/use branch April 30, 2024 20:01
zasdfgbnm pushed a commit that referenced this pull request May 2, 2024
@xwang233 discussed this in the nvFuser-MultiGPU chatroom. At this
moment, supporting non-distributed build of pytorch isn't worth the cost
of additional CI and using knobs like `NVFUSER_DISTRIBUTED` or
`USE_DISTRIBUTED`.

Feel free to revert this PR if non-distributed build becomes important.
Aidyn-A added a commit to Aidyn-A/Fuser that referenced this pull request May 6, 2024
@wujingyue wujingyue restored the wjy/use branch June 28, 2024 23:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants