Skip to content

Host irs: use Communication directly instead of wrapped inside PostOnStream#2328

Merged
wujingyue merged 7 commits intoNVIDIA:mainfrom
samnordmann:host_irs/simplify_comm_remove_post
Jun 9, 2024
Merged

Host irs: use Communication directly instead of wrapped inside PostOnStream#2328
wujingyue merged 7 commits intoNVIDIA:mainfrom
samnordmann:host_irs/simplify_comm_remove_post

Conversation

@samnordmann
Copy link
Collaborator

@samnordmann samnordmann commented Jun 3, 2024

Bootstrapping for simplifying the design. With this patch, the Communication class is used to post a collective, without the need of wrapping it inside a PostOnStream Ir.

There is no functional change, this is only a minor cleanup I would like to upstream before introducing the "collective wait" IR.

To achieve this, we need to add I/O TVs to the Communication's constructor, and use those I/O directly. This patch aligns with the discussion from #2250 and should lead to further simplification in MultiDeviceExecutor, that I would like to leave to another PR for avoiding conflicts.

Note that the Communication class is used in two places: Host IRs and MultiDeviceExecutor. Since the former case should replace the latter eventually, we maintain for now these two usages. In MultiDeviceExecutor, we don't set a Communication's I/O, that's why those arguments are made optional in the constructor

@samnordmann
Copy link
Collaborator Author

!build --dist

@samnordmann samnordmann requested review from cowanmeg and wujingyue June 3, 2024 13:13
@samnordmann
Copy link
Collaborator Author

!build

Copy link
Collaborator

@wujingyue wujingyue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice -- so I don't need to make the same change!

postCommunication(post_ir);
} else {
NVF_ERROR(false, "The op cannot be posted on a stream: ", op);
dispatch(op);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a change of behavior. Did you intend to do this in another PR?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any change in behavior, am I missing smth?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right -- there's no behavior change. There's a subtle difference that the new code won't check the type of the host op in PostOnStream, which I think is fine.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC the type of the host op is still checked, inside ´OptInDispatch::dispatch'

postCommunication(post_ir);
} else {
NVF_ERROR(false, "The op cannot be posted on a stream: ", op);
dispatch(op);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right -- there's no behavior change. There's a subtle difference that the new code won't check the type of the host op in PostOnStream, which I think is fine.

@samnordmann
Copy link
Collaborator Author

!build

@wujingyue
Copy link
Collaborator

!build

@wujingyue wujingyue merged commit 6a4156d into NVIDIA:main Jun 9, 2024
@samnordmann samnordmann changed the title Host irs: use Communicator directly instead of wrapped inside PostOnStream Host irs: use Communication directly instead of wrapped inside PostOnStream Jun 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants