@mbs-octoml mbs-octoml commented Oct 19, 2021

This is the first step in apache/tvm-rfcs#38 to bring devices, targets and memory scopes together when doing device planning as a new 'SEScope' class. For now memory scopes will be left as "", but will become significant in follow-on PRs.

Once device planning works in units of SEScopes it will be possible to directly read off the device and target for any Relay sub-expression without the need for TargetMaps or the construction of default Targets.

SEScopes also support 'Join' and 'Default' operations needed when constraint solving in the device planner.
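
To make 'Join' and 'Default' concrete, here is a minimal Python sketch of how such operations could behave on a four-field scope. The `Scope` class, the use of -1, None and "" as "unconstrained" markers, and the helper names are all assumptions made for illustration; this is not the TVM class or its API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for an SEScope. A field holding its "unconstrained"
# value (-1, None or "") places no constraint on that field.
_FIELDS = (
    ("device_type", -1),
    ("virtual_device_id", -1),
    ("target", None),
    ("memory_scope", ""),
)


@dataclass(frozen=True)
class Scope:
    device_type: int = -1
    virtual_device_id: int = -1
    target: Optional[str] = None
    memory_scope: str = ""


def join(lhs: Scope, rhs: Scope) -> Optional[Scope]:
    """Combine two scopes field by field; return None if any field conflicts."""
    merged = []
    for name, unconstrained in _FIELDS:
        left, right = getattr(lhs, name), getattr(rhs, name)
        if left == unconstrained:
            merged.append(right)
        elif right == unconstrained or left == right:
            merged.append(left)
        else:
            return None  # both sides constrained, and they disagree
    return Scope(*merged)


def default(lhs: Scope, rhs: Scope) -> Scope:
    """Keep the constrained fields of lhs and fill the rest from rhs."""
    filled = []
    for name, unconstrained in _FIELDS:
        left = getattr(lhs, name)
        filled.append(getattr(rhs, name) if left == unconstrained else left)
    return Scope(*filled)
```

In this sketch a join fails only when both sides constrain the same field differently, which is the situation a constraint solver would report as a planning error, while default never fails because the left-hand side always wins.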

This PR also brings some duplicated and ad-hoc 'default target' handling logic together into a CompilationConfig class. This came along with SEScope since it:
a) establishes the default SEScope for primitive ops, needed by device planning when sub-expressions are not otherwise constrained to a particular device/target.
b) establishes the SEScope for the 'host', which is needed by device planning and the VM for shape data, shape functions, and other non-tensor data/computations.
c) provides a way to 'canonicalize' SEScopes which may enter device planning via annotations which are missing their target.
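
The three roles listed above can be sketched as follows. This is only an illustration under stated assumptions: `ConfigSketch`, its constructor arguments and the tuple layout are hypothetical, targets are represented by plain strings, and the one-target-per-device-type lookup is a simplification.

```python
from typing import Dict, Optional, Tuple

# (device_type, virtual_device_id, target, memory_scope); hypothetical layout.
ScopeTuple = Tuple[int, int, Optional[str], str]


class ConfigSketch:
    """Hypothetical stand-in for a CompilationConfig-like object."""

    def __init__(
        self,
        targets_by_device_type: Dict[int, str],
        host_device_type: int,
        default_device_type: int,
    ) -> None:
        self._targets = dict(targets_by_device_type)
        # (b) the scope used for host-side data and computations
        self.host_scope = self.canonicalize((host_device_type, 0, None, ""))
        # (a) the scope used for otherwise unconstrained primitive ops
        self.default_primitive_scope = self.canonicalize(
            (default_device_type, 0, None, "")
        )

    def canonicalize(self, scope: ScopeTuple) -> ScopeTuple:
        """(c) Fill in a missing target from the known targets."""
        device_type, device_id, target, memory_scope = scope
        if target is None:
            if device_type not in self._targets:
                raise ValueError(f"no target for device type {device_type}")
            target = self._targets[device_type]
        return (device_type, device_id, target, memory_scope)


# 1 and 2 are DLPack's kDLCPU and kDLCUDA device type values.
config = ConfigSketch({1: "llvm", 2: "cuda"}, host_device_type=1, default_device_type=2)
```

An annotation that names only a device, such as `(2, 0, None, "")`, would then be completed to `(2, 0, "cuda", "")` by `config.canonicalize`.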

These classes will be used in the sequel PR #9326.

@mbs-octoml
Contributor Author

Downgrading to draft as I'm not happy with the global SEScope cache and would like to rejig to be more definitely scoped.

@mbs-octoml mbs-octoml marked this pull request as ready for review October 21, 2021 00:02
@mbs-octoml
Contributor Author

This is ready for review again.

@mbs-octoml
Contributor Author

Note the CI failure looks to be a flake (TVMCI-8).

@jroesch @electriclilies for review

@jroesch
Member

jroesch commented Oct 25, 2021

cc @csullivan @Lunderberg @adstraw

Contributor

@csullivan csullivan left a comment

Partial review, so far I have only focused on the introduced class defs. Will continue reviewing later this afternoon.

@mbs-octoml
Contributor Author

PTAL for the switch of SEScope to an AttrNode and the addition of structural eq/hash to Target necessary to make that work.

Please check that my impl of eq/hash on Target is coherent, since some fields are strings, some support eq/hash, and some must only be compared by pointer.

The only special case is that TargetKind does not support SEqualReduce/SHashReduce, so defer to pointer equality for those.
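
As a rough analogy for the mixed comparison described above, the Python sketch below compares most fields by value and one field by identity only, and keeps the hash consistent with that choice. `KindSketch` and `TargetSketch` are hypothetical classes for illustration and do not reproduce TVM's SEqualReduce/SHashReduce machinery.

```python
from typing import Dict, Tuple


class KindSketch:
    """Stand-in for a registry object that is only meaningful by identity."""

    def __init__(self, name: str) -> None:
        self.name = name


class TargetSketch:
    def __init__(
        self, kind: KindSketch, tag: str, keys: Tuple[str, ...], attrs: Dict[str, str]
    ) -> None:
        self.kind = kind
        self.tag = tag
        self.keys = keys
        self.attrs = attrs

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, TargetSketch):
            return NotImplemented
        return (
            self.kind is other.kind  # identity ("pointer") comparison
            and self.tag == other.tag  # plain string comparison
            and self.keys == other.keys  # value comparison
            and self.attrs == other.attrs  # value comparison
        )

    def __hash__(self) -> int:
        # Hash exactly what __eq__ compares, so equal objects hash equally.
        attr_items = tuple(sorted(self.attrs.items()))
        return hash((id(self.kind), self.tag, self.keys, attr_items))
```

The point of the sketch is the coherence check requested in the comment: whatever is compared by identity in equality must also be hashed by identity, otherwise two equal objects could land in different hash buckets.
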
@mbs-octoml mbs-octoml requested a review from icemelon as a code owner November 3, 2021 00:33
@mbs-octoml
Contributor Author

PTAL: Naive SEqualReduce/SHashReduce impls. Just noticed the MapNodeTrait in structural_hash.cc, so I see how it works now, thanks.

Member

@tqchen tqchen left a comment

Thanks, looks good for now. Let us continue to discuss the CompilationOption and composite target choices in the Discuss pre-RFC.

@tqchen
Member

tqchen commented Nov 3, 2021

Contributor

@Lunderberg Lunderberg left a comment

Changes look good, and without the special meaning of virtual_device_id==0, I don't see any additional points of confusion there.

(Thank you to @tqchen for the reminder to explicitly approve.)

@mbs-octoml
Contributor Author

A final word about the defaulting of the host device to kDLCPU for the record.

I carried that logic over from the 3 existing places where target maps & host targets defaulting is done.
Note that there are many places in the code which currently hard code kDLCPU in addition to this defaulting logic; the sequel will fix those to use the host's device type.
I disagree with this defaulting logic -- it should all go away in favor of a much clearer API. But that is a work in progress.
For targets which do not have a kDLCPU host, the user can make sure the host_target is always given. That will trigger a WARNING if it is not a kDLCPU, just because I'm unsure if the code base will support that at the moment (no doubt there are other kDLCPU assumptions buried in there).
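
The defaulting behaviour described in this comment can be sketched roughly as below. Everything here is an illustrative assumption: `HostTargetSketch`, `resolve_host_target` and the "llvm" fallback name are hypothetical, and only the two cases spelled out in the comment (explicit host given, no host given) are modelled.

```python
import logging
from typing import NamedTuple, Optional

K_DL_CPU = 1  # DLPack's kDLCPU device type value


class HostTargetSketch(NamedTuple):
    kind_name: str
    device_type: int


def resolve_host_target(host_target: Optional[HostTargetSketch]) -> HostTargetSketch:
    """Pick the host target, defaulting to a CPU target when none is given."""
    if host_target is None:
        # No explicit host: fall back to a CPU host (the fallback name is an assumption).
        return HostTargetSketch("llvm", K_DL_CPU)
    if host_target.device_type != K_DL_CPU:
        # An explicit non-CPU host is accepted, but flagged, since other parts
        # of the code base may still assume a CPU host.
        logging.warning(
            "host target '%s' has device type %d, not kDLCPU",
            host_target.kind_name,
            host_target.device_type,
        )
    return host_target
```

Passing the host explicitly is what keeps a non-CPU host from being silently replaced by the CPU default.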

Contributor

@csullivan csullivan left a comment

This looks great @mbs-octoml, many thanks for your attention to detail!

LOG(INFO) << "Using the given host target '" << host_target << "' of device type "
<< host_device_type << " for all host operations and data";
}
} else if (primitive_targets.size() == 1 &&
Contributor

As written, a host_target must be set for hosts whose target kind's device type is not kDLCPU; otherwise this will introduce default kDLCPU targets that may be undesirable.

Contributor

@csullivan csullivan Nov 3, 2021

Note that this is a lateral move of functionality that exists in TVM today. See @mbs-octoml's comment above.

@tqchen tqchen merged commit a6c948a into apache:main Nov 3, 2021
@tqchen
Member

tqchen commented Nov 3, 2021

Thanks @mbs-octoml @csullivan @Lunderberg @Mousius. This PR is now merged!

@mbs-octoml mbs-octoml deleted the mbs-sescope branch November 3, 2021 17:32
mbs-octoml added a commit to mbs-octoml/mbs-tvm that referenced this pull request Nov 9, 2021
CAUTION: Breaking VM executable serialization change. I needed a new 'virtual devices' array in the executable so that instructions can continue to refer to devices by a simple index yet the VM can respect both the device type and id for runtime devices.

Continuing from apache#9313, and as part of apache/tvm-rfcs#38, we switch PlanDevices to plan with respect to SEScopes instead of just DLDeviceTypes. Our ultimate goal is to be able to flow memory scopes between PrimFuncs by re-running PlanDevices after the LowerTE pass. This PR at least gets us to being able to flow the memory scopes, but the actual changes to PlanDevices to look inside PrimFuncs are still two PRs in the future.

However, we get two nice side effects right away:
 - Since SEScopes contain Targets we can isolate all the device-to-target resolution machinery within PlanDevices (with the help of CompilationConfig). After PlanDevices has run we can retrieve the Target for any sub-expression directly from that sub-expression's SEScope. For now we retain the one-Target-per-DLDeviceType constraint since it is baked into the public 'TargetMap' API, but the path to breaking that constraint is clearer.
 - Device ids are now respected all the way from annotation to executor. Previously, though we had a bit of plumbing using Devices, the device_id therein was ignored or defaulted to zero.

The Python "on_device" annotation helpers still work w.r.t. devices. Thus, though they now respect device ids, they do not allow the user to specify a Target or memory scope as supported by the underlying SEScope.
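
The 'virtual devices' array mentioned in the CAUTION note above can be pictured with a small sketch: instructions carry a plain index, and the executable holds the table that maps each index to a device type and id. The class and method names below are hypothetical and do not reflect the VM's actual serialization format.

```python
from typing import Dict, List, Tuple

Device = Tuple[int, int]  # (device_type, device_id)


class ExecutableSketch:
    """Hypothetical executable holding a table of virtual devices."""

    def __init__(self) -> None:
        self.virtual_devices: List[Device] = []
        self._index_of: Dict[Device, int] = {}

    def device_index(self, device_type: int, device_id: int) -> int:
        """Return the index instructions should use for this device."""
        device = (device_type, device_id)
        if device not in self._index_of:
            self._index_of[device] = len(self.virtual_devices)
            self.virtual_devices.append(device)
        return self._index_of[device]

    def lookup(self, index: int) -> Device:
        """Resolve an instruction's device index back to (type, id)."""
        return self.virtual_devices[index]
```

Because the table becomes part of the executable, adding it changes what gets serialized, which is consistent with the breaking-change warning.
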
junrushao pushed a commit that referenced this pull request Nov 12, 2021
…s. (#9326)

* Switch PlanDevices pass to be w.r.t. SEScopes instead of DLDeviceTypes.

CAUTION: Breaking VM executable serialization change. I needed a new 'virtual devices' array in the executable so that instructions can continue to refer to devices by a simple index yet the VM can respect both the device type and id for runtime devices.

Continuing from #9313, and as part of apache/tvm-rfcs#38, we switch PlanDevices to plan with respect to SEScopes instead of just DLDeviceTypes. Our ultimate goal is to be able to flow memory scopes between PrimFuncs by re-running PlanDevices after the LowerTE pass. This PR at least gets us to being able to flow the memory scopes, but the actual changes to PlanDevices to look inside PrimFuncs are still two PRs in the future.

However, we get two nice side effects right away:
 - Since SEScopes contain Targets we can isolate all the device-to-target resolution machinery within PlanDevices (with the help of CompilationConfig). After PlanDevices has run we can retrieve the Target for any sub-expression directly from that sub-expression's SEScope. For now we retain the one-Target-per-DLDeviceType constraint since it is baked into the public 'TargetMap' API, but the path to breaking that constraint is clearer.
 - Device ids are now respected all the way from annotation to executor. Previously, though we had a bit of plumbing using Devices, the device_id therein was ignored or defaulted to zero.

The Python "on_device" annotation helpers still work w.r.t. devices. Thus, though they now respect device ids, they do not allow the user to specify a Target or memory scope as supported by the underlying SEScope.

* [checkpoint] Revert emitter.py, must have run 'black .' by mistake.

* [checkpoint] Address PR comments

Also add back SplitArgs pass in build_module.cc which somehow got lost in the shuffle.

(try again -- flaky test_crt.py test_autotune?)

* [checkpoint] Fix after rebase on CallLowered.
mehrdadh pushed a commit to mehrdadh/tvm that referenced this pull request Dec 1, 2021
…g in 'device' planning. (apache#9313)

[Target] Adds SEScope (Storage/Execution Scope) for use as new unit of planning in 'device' planning

This is the first step in apache/tvm-rfcs#38 to bring devices
and targets together when doing device planning. I've gone ahead and also included a
memory scope in this object since we will also need to propagate memory scopes across
Relay expressions once this basic preparation is in place. In the meantime that field will be
left as "".

Once device planning works in units of SEScopes it will be possible to directly read off
the device and target for any Relay sub-expression without the need for TargetMaps or
the construction of default Targets.

SEScopes also support 'Join' and 'Default' operations needed when constraint solving in
the device planner. You can see those in use in my scratchpad branch:
  https://github.com/mbs-octoml/mbs-tvm/tree/mbs-scopes

This PR also brings some duplicated and ad-hoc 'default target' handling logic
together into a CompilationConfig class. (Again, see the scratchpad branch for how that
will end up being used). I've placed that next to SEScope since its main purpose is to
  a) establish the default SEScope for primitive ops
  b) establish the SEScope for the 'host'
  c) feed a definitive vector of Targets into device planning so it can resolve all
     "on_device" and "device_copy" device references to their full SEScope form.

* Reworked to avoid global SEScopeCache.

Realized while working through unit tests in the sequel that it's reasonable
for folks to call build multiple times with distinct Target objects, in which
case the global cache would grow without bound.

So instead placed the cache in the CompilationConfig class. Since that class
now has everything the device planner needs to do its job, promoted it to
be an FFI-able Object, which is now in compilation_config.{h,cc}.

I think we can do much better with CompilationConfig, but for now keeping it
to the minimum I needed to prepare for device planning from all the executor
compilation codepaths.
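
The move from a global cache to a per-config cache, as described in the commit message above, can be illustrated with a short sketch. The names `ScopeSketch`, `ConfigWithCache` and `make_scope` are hypothetical; the point is only that the memo table lives on the config object, so it is released together with that config instead of growing across repeated builds.

```python
from typing import Dict, NamedTuple, Optional


class ScopeSketch(NamedTuple):
    device_type: int
    virtual_device_id: int
    target: Optional[str]
    memory_scope: str


class ConfigWithCache:
    """Hypothetical config object that owns its own scope cache."""

    def __init__(self) -> None:
        self._scope_cache: Dict[ScopeSketch, ScopeSketch] = {}

    def make_scope(
        self,
        device_type: int,
        virtual_device_id: int = 0,
        target: Optional[str] = None,
        memory_scope: str = "",
    ) -> ScopeSketch:
        """Return the one shared instance for these field values."""
        candidate = ScopeSketch(device_type, virtual_device_id, target, memory_scope)
        # setdefault stores the candidate on first use and returns the
        # previously stored instance on every later call.
        return self._scope_cache.setdefault(candidate, candidate)

    def cache_size(self) -> int:
        return len(self._scope_cache)
```

Two configs built for two separate builds each hold their own table, so dropping a config drops its cached scopes as well.
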