@masahi masahi commented Oct 12, 2022

Currently, MS (MetaSchedule) uses StructuralEqual/Hash in task extraction / evo search / database. Sometimes we want to use different hashing and equality testing methods, for example (1) to ignore NDArray (#12706) or (2) to enable anchor-op-only tuning (treating conv2d and conv2d -> add subgraphs as equal).

To enable such flexibility, this PR consolidates raw calls to StructuralEqual/Hash into one place, which for now is named ModuleEquality. Since hashing is also done for equality testing, I think it is appropriate to call the component responsible for hashing / equality testing that way. But other suggestions are welcome.

Importantly, task extraction and the database now use the same hashing / equality method based on the TIR mod, while previously task extraction used a cache keyed on the Relay mod.
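To make the idea concrete, here is a minimal sketch, not the code in this PR. `ToyModule`, `ToyModuleEquality`, `ToyStructural`, and `ToyIgnoreConstants` are hypothetical stand-ins for an IRModule and for the ModuleEquality component described above. The sketch only illustrates why hashing and equality are swapped together: two modules that a strategy considers equal must also hash to the same value.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an IRModule: an op sequence plus constant data.
struct ToyModule {
  std::string ops;               // e.g. "conv2d -> add"
  std::vector<float> constants;  // plays the role of NDArray raw data
};

// Hypothetical interface: hashing and equality are provided together so that
// modules considered equal always hash to the same value.
class ToyModuleEquality {
 public:
  virtual ~ToyModuleEquality() = default;
  virtual std::size_t Hash(const ToyModule& mod) const = 0;
  virtual bool Equal(const ToyModule& lhs, const ToyModule& rhs) const = 0;
};

// Compares everything, including the constant data.
class ToyStructural : public ToyModuleEquality {
 public:
  std::size_t Hash(const ToyModule& mod) const override {
    std::size_t h = std::hash<std::string>()(mod.ops);
    for (float c : mod.constants) {
      h = h * 31 + std::hash<float>()(c);
    }
    return h;
  }
  bool Equal(const ToyModule& lhs, const ToyModule& rhs) const override {
    return lhs.ops == rhs.ops && lhs.constants == rhs.constants;
  }
};

// Ignores the constant data and looks only at the ops.
class ToyIgnoreConstants : public ToyModuleEquality {
 public:
  std::size_t Hash(const ToyModule& mod) const override {
    return std::hash<std::string>()(mod.ops);
  }
  bool Equal(const ToyModule& lhs, const ToyModule& rhs) const override {
    return lhs.ops == rhs.ops;
  }
};
```

With two modules that differ only in their constants, the first strategy reports them as different while the second treats them as the same workload.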

cc @junrushao @zxybazh @tqchen

   ObjectPtr<WorkloadNode> n = runtime::make_object<WorkloadNode>();
-  n->shash = tvm::StructuralHash()(mod);
   n->mod = mod;
+  n->shash = ModuleEquality::Create("structural")->Hash(mod);
Member Author

Since this constructor is not used anywhere, I'm hard coding the instance here to avoid passing the mod_eq_name string. I want to remove this constructor if there is no objection.

/*! \brief The default database implementation, which mimics two database tables with two files. */
class JSONDatabaseNode : public DatabaseNode {
public:
explicit JSONDatabaseNode(String mod_eq_name = "structural")
Member Author

@masahi masahi Oct 12, 2022

The default value is required, since we need to be able to default-construct JSONDatabaseNode and other database nodes due to the use of TVM_REGISTER_NODE_TYPE macro.

Member

Is it possible to bypass it with a trick such as a unique_ptr? (Sorry, I haven't thought about this very deeply, so feel free to object.)

Member Author

I didn't get what you meant here.

Member

Just noticed that DatabaseNode::mod_eq_ is already a unique_ptr, so we don't need an explicit default constructor DatabaseNode::DatabaseNode; instead we could use inline initialization:

-  std::unique_ptr<ModuleEquality> mod_eq_;
+  std::unique_ptr<ModuleEquality> mod_eq_{nullptr};

BTW, just to nitpick, shall we change mod_eq_ from private to protected so it could be accessed by derived classes?

Member Author

@masahi masahi Oct 14, 2022

Since JSONDatabaseNode is constructed by Database class,

Database Database::JSONDatabase(String path_workload, String path_tuning_record,
bool allow_missing) {
int num_threads = std::thread::hardware_concurrency();
ObjectPtr<JSONDatabaseNode> n = make_object<JSONDatabaseNode>();

and Database is not a subclass of DatabaseNode, setting mod_eq after JSONDatabaseNode is constructed requires making mod_eq a public member or introducing a setter method in DatabaseNode.

Moreover, it would also complicate the initialization of workloads2idx_ (see #13050 (comment)).

Member Author

Just tried: removing the explicit construction of workloads2idx_, i.e. workloads2idx_(/*bucket_count*/ 0, WorkloadHash(), WorkloadEqual(GetModuleEquality())), leads to a compilation failure, since workloads2idx_ cannot be default-constructed (WorkloadEqual requires mod_eq to be initialized).
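The constraint can be reproduced outside TVM. The following is a minimal sketch under stated assumptions: `ToyModuleEquality`, `ToyWorkloadEqual`, and `ToyDatabase` are hypothetical names, and strings stand in for workloads. It shows that a std::unordered_map whose key-equality functor has no default constructor must be built through the constructor that takes a bucket count, a hasher, and an equality object, and that the equality strategy has to be initialized first.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical equality strategy; a plain string comparison for illustration.
struct ToyModuleEquality {
  bool Equal(const std::string& a, const std::string& b) const { return a == b; }
};

// Hypothetical key-equality functor. It needs a strategy at construction
// time, so it deliberately has no default constructor.
struct ToyWorkloadEqual {
  explicit ToyWorkloadEqual(const ToyModuleEquality* mod_eq) : mod_eq_(mod_eq) {}
  bool operator()(const std::string& a, const std::string& b) const {
    return mod_eq_->Equal(a, b);
  }
  const ToyModuleEquality* mod_eq_;
};

using ToyMap = std::unordered_map<std::string, int, std::hash<std::string>,
                                  ToyWorkloadEqual>;

// Note: `ToyMap m;` would not compile, because the map would have to
// default-construct a ToyWorkloadEqual.

class ToyDatabase {
 public:
  // mod_eq_ is declared before workloads2idx_, so it is initialized first and
  // its address can safely be handed to the map's equality functor.
  ToyDatabase()
      : workloads2idx_(/*bucket_count=*/0, std::hash<std::string>(),
                       ToyWorkloadEqual(&mod_eq_)) {}

  // Copying would leave the functor pointing at the source object's strategy.
  ToyDatabase(const ToyDatabase&) = delete;
  ToyDatabase& operator=(const ToyDatabase&) = delete;

  // Returns the index of an existing workload, or assigns the next index.
  int GetOrInsert(const std::string& workload) {
    auto it = workloads2idx_.find(workload);
    if (it != workloads2idx_.end()) return it->second;
    int idx = static_cast<int>(workloads2idx_.size());
    workloads2idx_.emplace(workload, idx);
    return idx;
  }

 private:
  ToyModuleEquality mod_eq_;
  ToyMap workloads2idx_;
};
```

The declaration order of the two members matters here, since C++ initializes non-static members in declaration order regardless of the order in the initializer list.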

Member

I see. Then I'm fine with the current implementation

public:
explicit JSONDatabaseNode(String mod_eq_name = "structural")
: DatabaseNode(mod_eq_name),
workloads2idx_(/*bucket_count*/ 0, WorkloadHash(), WorkloadEqual(GetModuleEquality())) {}
Member Author

Since WorkloadEqual needs to be aware of the module hashing / equality testing being used, we need to construct this unordered_map after the ModuleEquality instance is constructed in DatabaseNode by L72.

@masahi masahi force-pushed the metasch-refactor-structural branch from 3c79ad5 to b71ca1b Compare October 12, 2022 00:55
shash = tvm::StructuralHash()(mod);
String recalc_shash = SHash2Str(shash);
CHECK_EQ(recalc_shash, str_shash) << "ValueError: Structural hash changed. Given: " << str_shash
<< "; Recalculated: " << recalc_shash;
Member Author

Since ModuleEquality information is not written to json (Workload doesn't have ModuleEquality instance), we cannot determine the appropriate hashing method here. So I had to remove this verification.

Member

Perhaps we could leave a TODO here in case of future changes

Member

I think the removal makes sense. On the other hand, we are always using the structural hash as the shash. I think we may not really need the ability to specify the shash directly in the constructor.

  • What if we construct it from the mod and the mod_eq_name string, so that we can obtain the customized hash result?
  • Maybe we can store the mod_eq_name in Workload, so that when we parse it we can still check the shash results?

What do you think?

Member Author

To keep things clean, I want to minimize the number of unique ModuleEquality instances scattered throughout MS infra. Currently, only task extraction and Database "own" a ModuleEquality instance.

Given this goal, I think having Database compute the hash of a Workload using its ModuleEquality is more natural than passing mod_eq_name to Workload and letting it compute the hash. Moreover, all Workload instances would end up having the same mod_eq_name, so it feels redundant.

Member Author

One way to restore hash verification is to do verification inside DatabaseNode after Workload::FromJSON is called, since the database does know what the ModuleEquality instance is supposed to be.

Member Author

Restored hash verification in JSONDatabase, the only place where Workload::FromJSON is used currently.
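The approach discussed in this thread can be sketched as follows. This is an illustration only, not the code in JSONDatabase: `ToyWorkload`, `ToyWorkloadFromText`, `ToyModuleEquality`, `ToyDatabase`, and the `<hash>:<module text>` line format are all assumptions made for the example. The parser trusts the stored hash because it has no equality strategy, and the database, which owns the strategy, recomputes and checks the hash after parsing.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical workload: module text plus the hash stored alongside it.
struct ToyWorkload {
  std::string mod;
  std::size_t shash;
};

// Hypothetical parser for lines of the form "<hash>:<module text>".
// It cannot verify the hash, since it does not know the hashing method.
ToyWorkload ToyWorkloadFromText(const std::string& line) {
  std::size_t sep = line.find(':');
  if (sep == std::string::npos) {
    throw std::invalid_argument("ValueError: malformed workload line");
  }
  std::size_t shash = static_cast<std::size_t>(std::stoull(line.substr(0, sep)));
  return ToyWorkload{line.substr(sep + 1), shash};
}

// Hypothetical hashing strategy owned by the database.
struct ToyModuleEquality {
  std::size_t Hash(const std::string& mod) const {
    return std::hash<std::string>()(mod);
  }
};

class ToyDatabase {
 public:
  // Parses a workload, then verifies the stored hash with the database's own
  // strategy, because only the database knows which strategy is in use.
  ToyWorkload Load(const std::string& line) const {
    ToyWorkload workload = ToyWorkloadFromText(line);
    std::size_t recalc = mod_eq_.Hash(workload.mod);
    if (recalc != workload.shash) {
      throw std::runtime_error("ValueError: Module hash changed. Given: " +
                               std::to_string(workload.shash) +
                               "; Recalculated: " + std::to_string(recalc));
    }
    return workload;
  }

  const ToyModuleEquality& GetModuleEquality() const { return mod_eq_; }

 private:
  ToyModuleEquality mod_eq_;
};
```

A line whose stored hash matches the recomputed one loads normally, while a tampered or stale hash is rejected at load time rather than surfacing later as a lookup miss.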

@masahi masahi force-pushed the metasch-refactor-structural branch from b71ca1b to 0c8ee7f Compare October 12, 2022 01:13
Member

@junrushao junrushao left a comment

Otherwise LGTM!

Member

@zxybazh zxybazh left a comment

Thanks for sending the new feature! It's very exciting to see such improvements!

Generally it's looking good. I have some minor suggestions about the Workload usage and a question about the hash map of workloads.

Other than these, as far as I can see, module equality functions can currently only be defined on the C++ side. Do you think we can also allow easier customization of the hash and equality functions by providing FFI functions or by overriding the class from the Python side (i.e., providing a PyModuleEquality class)?

Member Author

masahi commented Oct 14, 2022

currently I can see the only option of module equality functions defined on the c++ side, do you think we can also allow easier customization of the hash and equal function via providing FFI functions or by overriding the class from python side (i.e., providing a PyModuleEquality class)?

@zxybazh This can be considered in the future, but I don't expect that people will want to add a custom ModuleEquality often. So for now, ModuleEquality is not exposed to Python, which spares me the boilerplate and makes ModuleEquality creation as simple as

std::unique_ptr<ModuleEquality> ModuleEquality::Create(const std::string& mod_eq_name) {
  if (mod_eq_name == "structural") {
    return std::make_unique<ModuleEqualityStructural>();
  }
  LOG(FATAL) << "Unknown module equality " << mod_eq_name;
  return nullptr;
}
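As a rough illustration of how such a name-based factory could grow a second strategy, here is a self-contained sketch. Everything in it is hypothetical: `ToyModuleEquality`, `ToyStructural`, `ToyAnchorOnly`, `ToyCreate`, and the name "toy-anchor" are invented for the example, strings stand in for modules, and an exception replaces LOG(FATAL) so the snippet compiles without TVM. The second strategy mimics the anchor-op idea from the PR description by comparing only the text before the first " -> ".

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical equality interface over string-encoded subgraphs.
class ToyModuleEquality {
 public:
  virtual ~ToyModuleEquality() = default;
  virtual bool Equal(const std::string& lhs, const std::string& rhs) const = 0;
};

// Exact comparison of the whole subgraph text.
class ToyStructural : public ToyModuleEquality {
 public:
  bool Equal(const std::string& lhs, const std::string& rhs) const override {
    return lhs == rhs;
  }
};

// Compares only the anchor op, i.e. the text before the first " -> ".
class ToyAnchorOnly : public ToyModuleEquality {
 public:
  bool Equal(const std::string& lhs, const std::string& rhs) const override {
    return Anchor(lhs) == Anchor(rhs);
  }

 private:
  static std::string Anchor(const std::string& s) {
    // find() returns npos when there is no " -> ", and substr(0, npos)
    // then yields the whole string.
    return s.substr(0, s.find(" -> "));
  }
};

// Name-based factory in the same shape as the snippet above.
std::unique_ptr<ToyModuleEquality> ToyCreate(const std::string& mod_eq_name) {
  if (mod_eq_name == "structural") {
    return std::make_unique<ToyStructural>();
  }
  if (mod_eq_name == "toy-anchor") {
    return std::make_unique<ToyAnchorOnly>();
  }
  throw std::invalid_argument("Unknown module equality " + mod_eq_name);
}
```

Adding a strategy then touches only the factory and one new subclass, which is the kind of simplicity the comment above is arguing for.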

@masahi masahi force-pushed the metasch-refactor-structural branch from 3058a68 to 95c01b7 Compare October 14, 2022 19:07
Member

@zxybazh zxybazh left a comment

Thanks @masahi for addressing my questions and adding the verification, the PR looks good to me! Looking forward to the introduction of new module equality functions!

Member

@junrushao junrushao left a comment

LGTM

@junrushao junrushao merged commit cbca28d into apache:main Oct 17, 2022
junrushao pushed a commit that referenced this pull request Oct 17, 2022
…ng NDArray raw data (#13091)

A follow up to #13050, also builds on #13001. This PR enables the functionality in #12706 without changing the existing `StructuralEqual/Hash`.

A question for discussion: Should this be the default ModuleEquality used by MS? It has no effect for the `link-params = False` case, and it simplifies the MS tuning API usage for the `link-params = True` case (Hexagon etc).
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 10, 2022
…e#13050)
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 10, 2022
…ng NDArray raw data (apache#13091)
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
…e#13050)
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
…ng NDArray raw data (apache#13091)