From d7349ca61275d0376664118c066d0313bcdf204d Mon Sep 17 00:00:00 2001 From: Manupa Karunaratne Date: Fri, 13 Aug 2021 17:11:38 +0100 Subject: [PATCH 1/7] [RFC][TIR] TIR Non-scalar Constants * added the markdown * added a commit msg header Change-Id: I0fb3e6b97242ba219c157c9abe5184f14a9f8eff --- rfcs/000x-tir-non-scalar-constants.md | 107 ++++++++++++++++++++++++++ 1 file changed, 107 insertions(+) create mode 100644 rfcs/000x-tir-non-scalar-constants.md diff --git a/rfcs/000x-tir-non-scalar-constants.md b/rfcs/000x-tir-non-scalar-constants.md new file mode 100644 index 00000000..0eec7712 --- /dev/null +++ b/rfcs/000x-tir-non-scalar-constants.md @@ -0,0 +1,107 @@ + +- Feature Name: tir_non_scalar_constants +- Start Date: 2021-06-01 +- RFC PR: TBD +- GitHub Issue: TBD + +# 1. Summary + +This RFC proposes how non-scalar constants could be represented in TIR and used by passes in the lowering process. + +# 2. Motivation + +Currently, the non-scalar constants could be represented in Relay (relay.Constant) to be used by relay passes but not in TIR. Therefore, when performing lowering using TIR passes, we have to maintain a side-channel of tir::Var to constant non-scalar data mapping to perform transformations that could use the knowledge where some of the data are constants. + +Few example scenarios as further motivation : + +## Weight compression + +When lowering for accelerators (E.g. : [Arm(R) Ethos(TM)-U NPU](https://github.com/apache/tvm-rfcs/pull/11)), certain operations will need to get tiled to co-optimize performance and memory utilization. Such tiling patterns create slices of weights that need compressing that will end up with varying sizes. Therefore, the knowledge of some tir::Vars refer to constants are critical in the level of TIR to perform this. 
+ +## Memory Planning + +The TIR program has the ability to express both inter and intra operator memory requirement, post-scheduling as explained further by [Unified Static Memory Planning RFC](https://github.com/apache/tvm-rfcs/pull/9). It would be better if the constants could be embedded to the TIR PrimFunc. Moreover, this allows various [target-dependent lowerings](https://github.com/apache/tvm-rfcs/pull/10), to produce TIR PrimFuncs with constants in it. + +## Winograd Constants + +The Winograd transformation (used for fast GEMMs) involves multiplication by a hard-coded constant tensor. This is currently accomplished in TE using a complicated TE compute expression with many nested selects. Being able to directly express a constant tensor here would significantly simplify this code. + + +# 3. Guide-level explanation + +This is not particularly a user-facing feature and this will allow constants to be 'linked' to TIR. Initially, we are planning to use this with gated on '-link-params' argument for relay.build and TVMC. + +# 4. Reference-level explanation + +The proposal is quite simple and it could be explained as follows : + +``` +@tvm.script.tir +def myfunc(): + param = tir.allocate_const([1, 1, 1, 1, 1, 1, 1, 1, 1, 1], "int32", [10]) +``` + +This follows closely the semantics of tir.allocate and the difference being it represent a buffer filled with constants. + +There are mainly two ways of constants being created in the lowering : + +A1. Linking the params of the model (relay.Constants) + +A2. Creation of constants in the lowering. + +For A1, this should only be done if the target support codegeneration of the constant data as part of the operators. + +For A2, the lowering for targets that support constant as part of the operators, there can be new (differently sized) constants could be created due to optimizations such as weight compression as required by the target. + +# 5. 
Drawbacks + +Not all targets need/benefit from handling codegeneration differently for constants. + +If we have to 'link' constants to TIR all the time, a subsequent pass might be needed to pull them out. However, it is clearer if we just 'link' constants where the target supports and benefits from having them expressed in TIR. + +# 6. Alternatives and Discussion + +## Different ways of representation + +This was initiated by the discussion on [#8472](https://github.com/apache/tvm/pull/8472). + +C1 : +``` +@tvm.script.tir +def myfunc(): + tir.attrs({ + "link_params": {"model0": array} + }) + my_param_var = tir.get_link_param("model0") +``` +C2 : +``` +@tvm.script.tir +def myfunc(): + tir.attrs({ + "link_params": {my_param_var: array} + }) +``` +C3 : +``` +@tvm.script.tir +def myfunc(): + param = tir.allocate_const([1, 1, 1, 1, 1, 1, 1, 1, 1, 1], "int32", [10]) +``` + +C1 and C2 do not need a new IR node; however, they need special handling in the passes to figure out whether a variable is a constant. + +C3 adds a new IR node, but seems a straightforward way to represent constants close to the compute. + +## Different IR node names + +D1 : tir.constant +D2 : tir.allocate_const + +D1 matches relay.Constant more closely, while D2 shows the similarity to the tir.allocate node, the difference being that the data is constant. 
+ + + + + + From e4c8e108d75380c0afb8fb8c3c519e8d6d1edd09 Mon Sep 17 00:00:00 2001 From: Manupa Karunaratne Date: Tue, 17 Aug 2021 17:59:08 +0100 Subject: [PATCH 2/7] [RFC][TIR] TIR Non-scalar Constants * adding the PR number Change-Id: Ia84e39506934919c25f8265fc4f6d3bc6f3a5140 --- ...non-scalar-constants.md => 0022-tir-non-scalar-constants.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename rfcs/{000x-tir-non-scalar-constants.md => 0022-tir-non-scalar-constants.md} (98%) diff --git a/rfcs/000x-tir-non-scalar-constants.md b/rfcs/0022-tir-non-scalar-constants.md similarity index 98% rename from rfcs/000x-tir-non-scalar-constants.md rename to rfcs/0022-tir-non-scalar-constants.md index 0eec7712..e68c181c 100644 --- a/rfcs/000x-tir-non-scalar-constants.md +++ b/rfcs/0022-tir-non-scalar-constants.md @@ -1,7 +1,7 @@ - Feature Name: tir_non_scalar_constants - Start Date: 2021-06-01 -- RFC PR: TBD +- RFC PR: https://github.com/apache/tvm-rfcs/pull/22 - GitHub Issue: TBD # 1. Summary From 1a560d9c6a683f510183eae8071057b680ed8ad8 Mon Sep 17 00:00:00 2001 From: Manupa Karunaratne Date: Mon, 27 Sep 2021 18:46:24 +0100 Subject: [PATCH 3/7] [RFC][TIR] TIR Non-scalar Constants * adding how constants are stored in the IRModule Change-Id: Ie45b1c76e9c522595646fd3f49d5c253d7818e9b --- rfcs/0022-tir-non-scalar-constants.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/rfcs/0022-tir-non-scalar-constants.md b/rfcs/0022-tir-non-scalar-constants.md index e68c181c..fc57d344 100644 --- a/rfcs/0022-tir-non-scalar-constants.md +++ b/rfcs/0022-tir-non-scalar-constants.md @@ -53,6 +53,13 @@ For A1, this should only be done if the target support codegeneration of the con For A2, the lowering for targets that support constant as part of the operators, there can be new (differently sized) constants could be created due to optimizations such as weight compression as required by the target. 
+### Storage of constants + +Due to concerns of future expansions of centralized storage of constants and adding alternate methods to parse in constants (other than parsing TVMScript), we have decided to store the constants as an IRModule attribute. + +This will go as a "Constants" key in the DictAttrs where the value is a Array\<NDArray\>. The tir.allocate_const(...) will refer to the constant directly by index in the array at the time of the creation. Therefore, they are only meant to be accessed via tir.allocate_const(...) nodes in TIR. + + # 5. Drawbacks Not all targets need/benefit from handling codegeneration differently for constants. From 1cea327053d1a28d8cbb7c4a61b5b1c32b302d68 Mon Sep 17 00:00:00 2001 From: Manupa Karunaratne Date: Tue, 28 Sep 2021 07:06:39 +0100 Subject: [PATCH 4/7] [RFC][TIR] TIR Non-scalar Constants * adding IRNode definition * further explaining how constants gets added to IRModule --- rfcs/0022-tir-non-scalar-constants.md | 36 +++++++++++++++++++++++++-- 1 file changed, 34 insertions(+), 2 deletions(-) diff --git a/rfcs/0022-tir-non-scalar-constants.md b/rfcs/0022-tir-non-scalar-constants.md index fc57d344..db956e98 100644 --- a/rfcs/0022-tir-non-scalar-constants.md +++ b/rfcs/0022-tir-non-scalar-constants.md @@ -53,11 +53,43 @@ For A1, this should only be done if the target support codegeneration of the con For A2, the lowering for targets that support constant as part of the operators, there can be new (differently sized) constants could be created due to optimizations such as weight compression as required by the target. + +### IRNode Definition + +``` +class AllocateConstNode : public StmtNode { + public: + /*! \brief The buffer variable. */ + Var buffer_var; + /*! \brief The data associated to the constant. */ + NDArray data; + /*! \brief If the PrimFunc containing the Stmt is added to IRModule, + this is an optional index to indicate the index within + "Constants" attribute, that is a Array<NDArray> of IRModule. 
+ */ + Optional<Integer> irmod_storage_idx; + /*! \brief The type of the buffer. */ + DataType dtype; + /*! \brief The extents of the buffer. */ + Array<PrimExpr> extents; + /*! \brief The body to be executed. */ + Stmt body; +} +``` + + ### Storage of constants -Due to concerns of future expansions of centralized storage of constants and adding alternate methods to parse in constants (other than parsing TVMScript), we have decided to store the constants as an IRModule attribute. +Due to concerns of future expansions of centralized storage of constants and adding alternate methods to parse in constants (other than parsing TVMScript), we have decided to store the constants as an IRModule attribute, *when the PrimFunc is added to the IRModule*. + +This will go as a "Constants" key in the DictAttrs where the value is a Array\<NDArray\>. However, they are only meant to be accessed via tir.allocate_const(...) nodes in TIR. + + +* If the constants are created within passes, the IRModule::Add(...) for a PrimFunc needs to traverse the Stmts to pick the NDArray, add it to the "Constants" IRModule attribute (Array\<NDArray\>) and populate *irmod_storage_idx*. + +* If the constants are present in IRModule before the PrimFunc is created, then the ObjectRef (for NDArray) and the index of constants in "Constants" IRModule attribute (Array\<NDArray\>) have to be populated. + -This will go as a "Constants" key in the DictAttrs where the value is a Array\<NDArray\>. The tir.allocate_const(...) will refer to the constant directly by index in the array at the time of the creation. Therefore, they are only meant to be accessed via tir.allocate_const(...) nodes in TIR. # 5. Drawbacks From a8340de89a3b9b88262fc2296046b4703d39959f Mon Sep 17 00:00:00 2001 From: Manupa Karunaratne Date: Tue, 28 Sep 2021 07:20:58 +0100 Subject: [PATCH 5/7] [RFC][TIR] TIR Non-scalar Constants * Updating drawbacks related IRModule::Add(...) 
changes --- rfcs/0022-tir-non-scalar-constants.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/rfcs/0022-tir-non-scalar-constants.md b/rfcs/0022-tir-non-scalar-constants.md index db956e98..c469d140 100644 --- a/rfcs/0022-tir-non-scalar-constants.md +++ b/rfcs/0022-tir-non-scalar-constants.md @@ -94,9 +94,11 @@ This will go as a "Constants" key in the DictAttrs where the value is a Array\ Date: Tue, 28 Sep 2021 14:22:22 +0100 Subject: [PATCH 6/7] [RFC][TIR] TIR Non-scalar Constants * Addressing nits --- rfcs/0022-tir-non-scalar-constants.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/rfcs/0022-tir-non-scalar-constants.md b/rfcs/0022-tir-non-scalar-constants.md index c469d140..686cef0c 100644 --- a/rfcs/0022-tir-non-scalar-constants.md +++ b/rfcs/0022-tir-non-scalar-constants.md @@ -10,26 +10,26 @@ This RFC proposes how non-scalar constants could be represented in TIR and used # 2. Motivation -Currently, the non-scalar constants could be represented in Relay (relay.Constant) to be used by relay passes but not in TIR. Therefore, when performing lowering using TIR passes, we have to maintain a side-channel of tir::Var to constant non-scalar data mapping to perform transformations that could use the knowledge where some of the data are constants. +Currently, the non-scalar constants are represented in Relay (relay.Constant) to be used by relay passes but not in TIR. Therefore, when performing lowering using TIR passes, we have to maintain a side-channel of tir::Var to constant non-scalar data mapping to perform transformations that could use the knowledge where some of the data are constants. Few example scenarios as further motivation : ## Weight compression -When lowering for accelerators (E.g. : [Arm(R) Ethos(TM)-U NPU](https://github.com/apache/tvm-rfcs/pull/11)), certain operations will need to get tiled to co-optimize performance and memory utilization. 
Such tiling patterns create slices of weights that need compressing that will end up with varying sizes. Therefore, the knowledge of some tir::Vars refer to constants are critical in the level of TIR to perform this. +When lowering for accelerators (See [Arm(R) Ethos(TM)-U NPU](https://github.com/apache/tvm-rfcs/pull/11)), certain operations will need to get tiled to co-optimize performance and memory utilization. Such tiling patterns create slices of weights that need compressing that will end up with varying sizes. Therefore, the knowledge of some tir::Vars refer to constants are critical in the level of TIR to perform this. ## Memory Planning -The TIR program has the ability to express both inter and intra operator memory requirement, post-scheduling as explained further by [Unified Static Memory Planning RFC](https://github.com/apache/tvm-rfcs/pull/9). It would be better if the constants could be embedded to the TIR PrimFunc. Moreover, this allows various [target-dependent lowerings](https://github.com/apache/tvm-rfcs/pull/10), to produce TIR PrimFuncs with constants in it. +The TIR program has the ability to express both inter and intra operator memory requirement, post-scheduling as explained further by [Unified Static Memory Planning RFC](https://github.com/apache/tvm-rfcs/pull/9). It would be better if the constants could be embedded to the TIR PrimFunc because the memory for constants becomes visible for the memory planner. Moreover, this allows various [target-dependent lowerings](https://github.com/apache/tvm-rfcs/pull/10), to produce TIR PrimFuncs with target-specific constants in it. ## Winograd Constants -The Winograd transformation (used for fast GEMMs) involves multiplication by a hard-coded constant tensor. This is currently accomplished in TE using a complicated TE compute expression with many nested selects. Being able to directly express a constant tensor here would significantly simplify this code. 
+The Winograd transformation (used for fast GEMMs) involves multiplication by a hard-coded constant tensor. This is currently accomplished in TE using a complicated TE compute expression with many nested selects. Being able to directly express a constant tensor here would significantly simplify this code. See https://github.com/apache/tvm/blob/9df2ae8eaa8b394013182a7ad09ac57fe401f80e/python/tvm/topi/utils.py#L320-L350. # 3. Guide-level explanation -This is not particularly a user-facing feature and this will allow constants to be 'linked' to TIR. Initially, we are planning to use this with gated on '-link-params' argument for relay.build and TVMC. +This is not particularly a user-facing feature and this will allow constants to be 'linked' to TIR. Initially, tir.allocate_const nodes will only be created during scheduling when -link-params is included in the Target (e.g. to relay.build and to TVMC). # 4. Reference-level explanation @@ -45,11 +45,11 @@ This follows closely the semantics of tir.allocate and the difference being it r There are mainly two ways of constants being created in the lowering : -A1. Linking the params of the model (relay.Constants) +A1. Linking the params of the model (relay.Constants -- currently, the model params would be in Relay as relay.Constant nodes) -A2. Creation of constants in the lowering. +A2. Creation/Mutation of constants in the lowering -- these may be different from the original constants prior to scheduling the Relay into TIR. -For A1, this should only be done if the target support codegeneration of the constant data as part of the operators. +For A1, this should only be done if the target supports code generation of the constant data (i.e. supports --link-params) as part of the operator runtime.Module. Therefore, this is executor independent. 
For A2, the lowering for targets that support constant as part of the operators, there can be new (differently sized) constants could be created due to optimizations such as weight compression as required by the target. From 3424bda5b230ece6b9770b58fc402567e3bff1fa Mon Sep 17 00:00:00 2001 From: Manupa Karunaratne Date: Wed, 29 Sep 2021 19:03:45 +0100 Subject: [PATCH 7/7] [RFC][TIR] TIR Non-scalar Constants * making irmod_storage_idx and data mutually exclusive --- rfcs/0022-tir-non-scalar-constants.md | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/rfcs/0022-tir-non-scalar-constants.md b/rfcs/0022-tir-non-scalar-constants.md index 686cef0c..4101ca2d 100644 --- a/rfcs/0022-tir-non-scalar-constants.md +++ b/rfcs/0022-tir-non-scalar-constants.md @@ -61,8 +61,10 @@ class AllocateConstNode : public StmtNode { public: /*! \brief The buffer variable. */ Var buffer_var; - /*! \brief The data associated to the constant. */ - NDArray data; + /*! \brief The optional data associated to the constant. + This is mutually exclusive with irmod_storage_idx. + */ + Optional<NDArray> data; /*! \brief If the PrimFunc containing the Stmt is added to IRModule, this is an optional index to indicate the index within "Constants" attribute, that is a Array<NDArray> of IRModule. @@ -74,7 +76,18 @@ class AllocateConstNode : public StmtNode { Array<PrimExpr> extents; /*! \brief The body to be executed. */ Stmt body; + /*! \brief The constructor with data. */ } + +// The constructor to create an IRNode with constant data +// depending on the type of ObjectRef, it will either +// create AllocateConstNode with irmod_storage_idx or data +AllocateConst(Var buffer_var, + DataType dtype, + Array<PrimExpr> extents, + ObjectRef data_or_idx, + Stmt body, + Span span); ```
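
The interplay between the `data_or_idx` constructor argument and the "Constants" IRModule attribute described in this series can be sketched as a small standalone model. This is a minimal illustration, not TVM code: the `AllocateConst` and `ToyModule` classes below are hypothetical stand-ins written in plain Python, constant data is modelled as a list of ints rather than an NDArray, and clearing `data` once the index is populated is an assumption based on the "mutually exclusive" wording of PATCH 7/7.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class AllocateConst:
    """Hypothetical stand-in for AllocateConstNode (not the TVM class)."""
    buffer_var: str
    dtype: str
    extents: List[int]
    data: Optional[List[int]] = None         # inline constant data
    irmod_storage_idx: Optional[int] = None  # index into the module "Constants"

    @classmethod
    def create(cls, buffer_var: str, dtype: str, extents: List[int],
               data_or_idx: Union[List[int], int]) -> "AllocateConst":
        # Dispatch on the argument type, mirroring the ObjectRef-based
        # constructor: an int is treated as an index, anything else as data.
        if isinstance(data_or_idx, int):
            return cls(buffer_var, dtype, extents, irmod_storage_idx=data_or_idx)
        return cls(buffer_var, dtype, extents, data=list(data_or_idx))


@dataclass
class ToyModule:
    """Hypothetical stand-in for an IRModule carrying a "Constants" attribute."""
    attrs: dict = field(default_factory=lambda: {"Constants": []})
    functions: dict = field(default_factory=dict)

    def add(self, name: str, stmts: List[AllocateConst]) -> None:
        # Traverse the statements, move inline data into "Constants",
        # and populate irmod_storage_idx on each node.
        for stmt in stmts:
            if stmt.data is not None:
                self.attrs["Constants"].append(stmt.data)
                stmt.irmod_storage_idx = len(self.attrs["Constants"]) - 1
                stmt.data = None  # assumption: keep data and index exclusive
        self.functions[name] = stmts
```

In this sketch, a constant created within a pass starts out holding its data inline, and `ToyModule.add` plays the role the RFC assigns to IRModule::Add(...): it moves the data into the module-level array and records the index. A constant that already lives in the module is created directly from its index.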