@jroesch (Member) commented Oct 24, 2018

Add an implementation of structural hashing for Relay.

This PR also extends the alpha equality tests to check that the hashes match when two terms are equal and do not match when they are not.

cc @MarisaKirisame @tqchen @slyubomirsky

Should be good to go.

@slyubomirsky (Contributor) commented Oct 24, 2018

A general question, please pardon me if it is ignorant, but what would the structural hashes be used for? Just for better hash tables and the like?

Edit: Discussed with @jroesch; being able to check hash map membership by structural equality is exactly the point here. It might be helpful to have comments indicating this.
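
To make the use case concrete, here is a minimal sketch of hash-set membership by structural equality. It does not use Relay at all: ToyExpr, MakeConst, MakeAdd, MixHash, ToyStructuralHash and ToyStructuralEqual are hypothetical names invented for illustration, and the mixing step is a common Boost-style formula rather than the PR's Combine.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical toy expression: a constant leaf or a binary "add" node.
struct ToyExpr {
  std::string op;                     // "const" or "add"
  int value;                          // meaningful only when op == "const"
  std::shared_ptr<ToyExpr> lhs, rhs;  // meaningful only when op == "add"
};
using ToyExprPtr = std::shared_ptr<ToyExpr>;

inline ToyExprPtr MakeConst(int v) {
  return std::make_shared<ToyExpr>(ToyExpr{"const", v, nullptr, nullptr});
}
inline ToyExprPtr MakeAdd(ToyExprPtr a, ToyExprPtr b) {
  return std::make_shared<ToyExpr>(ToyExpr{"add", 0, std::move(a), std::move(b)});
}

// Boost-style mixing step; an assumption, not the PR's Combine.
inline std::size_t MixHash(std::size_t seed, std::size_t value) {
  return seed ^ (value + 0x9e3779b9 + (seed << 6) + (seed >> 2));
}

// Hashes by shape and contents, never by address.
struct ToyStructuralHash {
  std::size_t operator()(const ToyExprPtr& e) const {
    std::size_t h = std::hash<std::string>()(e->op);
    if (e->op == "const") return MixHash(h, std::hash<int>()(e->value));
    h = MixHash(h, (*this)(e->lhs));
    return MixHash(h, (*this)(e->rhs));
  }
};

// Must agree with the hash: equal terms always produce equal hashes.
struct ToyStructuralEqual {
  bool operator()(const ToyExprPtr& a, const ToyExprPtr& b) const {
    if (a->op != b->op) return false;
    if (a->op == "const") return a->value == b->value;
    return (*this)(a->lhs, b->lhs) && (*this)(a->rhs, b->rhs);
  }
};

using ToyExprSet =
    std::unordered_set<ToyExprPtr, ToyStructuralHash, ToyStructuralEqual>;

inline void DemoStructuralMembership() {
  ToyExprSet seen;
  seen.insert(MakeAdd(MakeConst(1), MakeConst(2)));
  // A separately allocated but structurally identical term is found.
  assert(seen.count(MakeAdd(MakeConst(1), MakeConst(2))) == 1);
  // A structurally different term is not.
  assert(seen.count(MakeAdd(MakeConst(2), MakeConst(1))) == 0);
}
```

With the default pointer-based hash and equality, the first lookup in DemoStructuralMembership would miss, because the two add nodes live at different addresses.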

}

size_t VisitExpr_(const VarNode* var) final {
return std::hash<int>()(var_map_[GetRef<Var>(var)]);
Contributor

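Use .at() instead of operator[], which silently inserts a default value for a missing key.
There is also the other case, hashing a term with a free var: it should either raise an error via at() or just hash by pointer.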
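
The difference between the two lookups can be sketched with a plain std::unordered_map. The string-keyed map below is only a stand-in for var_map_, and LookupThrows and DemoAtVersusSubscript are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Returns true if looking up `key` with at() throws, i.e. the key is absent.
// Unlike operator[], at() never inserts a value into the map.
inline bool LookupThrows(const std::unordered_map<std::string, int>& m,
                         const std::string& key) {
  try {
    (void)m.at(key);
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}

inline void DemoAtVersusSubscript() {
  std::unordered_map<std::string, int> var_map;  // stand-in for var_map_
  var_map["x"] = 0;

  // at(): the missing key is reported and the map is left unchanged.
  assert(LookupThrows(var_map, "free_var"));
  assert(var_map.size() == 1);

  // operator[]: the missing key is inserted with a value-initialized int (0),
  // which is indistinguishable from the index bound to "x".
  int silently_inserted = var_map["free_var"];
  assert(silently_inserted == 0);
  assert(var_map.size() == 2);
}
```

In this sketch the free variable ends up with the same index as a bound one, which is the kind of silent collision the suggestion avoids.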

Member

As a matter of fact, when VisitExpr is called, the var is not in hash_map_, so it is a free var. We could simply hash by type_annotation and name_hint to be safe.


using AttrsHashHandler::VisitAttr_;
size_t VisitAttr_(const Variable* lhs) final {
return 0; // return LeafNodeEqual(GetRef<NodeRef>(lhs), other);
Member Author

@tqchen I don't fully understand this case.

Member

There are two possible cases here:

  • Variable is a var defined in, say, TypeParams. In this case we need to record the hash value at the declaration site (likely you can put everything under the same hash_map_ and reuse it for both Attr and Expr).
  • Variable is a free variable. In that case, the choice depends on the equality semantics (graph equal vs. alpha equal):
    • In the case of alpha_equal, free variables do not map to each other, so hashing by pointer is a good choice.
    • In the case of graph_equal, free variables might match each other, so maybe we could hash by name_hint.
    • Considering both cases, hashing by name_hint might not be a bad choice.
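
The two options for free variables can be sketched as follows. ToyVar, HashByPointer and HashByNameHint are hypothetical stand-ins, not Relay types or functions. The name-hint hash is compatible with either equality: under pointer-based equality the same node always has the same name hint, and a hash only has to agree for terms that compare equal.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>

// Hypothetical stand-in for a variable node.
struct ToyVar {
  std::string name_hint;
};

// Option 1: identity hash. Two distinct nodes are distinct keys, which
// matches the alpha-equality view of free variables.
inline std::size_t HashByPointer(const std::shared_ptr<ToyVar>& v) {
  return std::hash<const ToyVar*>()(v.get());
}

// Option 2: name hash. Same-named free variables get the same hash on
// purpose, which also permits the graph-equality view.
inline std::size_t HashByNameHint(const std::shared_ptr<ToyVar>& v) {
  return std::hash<std::string>()(v->name_hint);
}

inline void DemoFreeVarHashing() {
  auto a = std::make_shared<ToyVar>(ToyVar{"x"});
  auto b = std::make_shared<ToyVar>(ToyVar{"x"});
  assert(a.get() != b.get());                      // two distinct nodes
  assert(HashByNameHint(a) == HashByNameHint(b));  // same name, same hash
  assert(HashByPointer(a) == HashByPointer(a));    // stable for one node
}
```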


private:
// whether to map open terms.
bool map_free_var_{false};
Member

map_free_var_ is less useful here. In the alpha equality case, it controls whether we also call BindVar when we meet free variables.

bool map_free_var_{false};
// renaming of NodeRef to indicate two nodes equals to each other
std::unordered_map<NodeRef, size_t, NodeHash, NodeEqual> hash_map_;
std::unordered_map<NodeRef, int, NodeHash, NodeEqual> var_map_;
Member

We can likely collapse var_map_ into hash_map_.

size_t VisitType_(const FuncTypeNode* func_type) final {
size_t hash = std::hash<std::string>()(func_type->_type_key);
for (auto type_param : func_type->type_params) {
hash = Combine(hash, TypeHash(type_param));
Member

This is a declaration site, so we need to record the hash value of each type_param in hash_map_.

}

size_t VisitExpr_(const GlobalVarNode* global) final {
return GetRef<GlobalVar>(global).hash();
Member

Consider hashing by global->name_hint instead, so that terms from two different environments might still be able to match.

@jroesch changed the title from "Add Relay hashing" to "[RELAY] Add structural hashing for Relay" on Oct 25, 2018
"""
return bool(_make._graph_equal(lhs, rhs))

def expr_hash(expr):
Member

We need a good name for this; expr_hash is a bit generic. Maybe structural_hash? Most users are used to the same function name being overloaded across types, so we could use it for both Type and Expr.

*
* \return the hash value.
*/
size_t HashType(const Type& type);
Member

Maybe we can have an overloaded name for both Type and Expr, StructuralHash?

}

size_t VisitType_(const IncompleteTypeNode* incomplete) final {
return GetRef<IncompleteType>(incomplete);
Member

Conversion of pointer to size_t? The simplest approach that works for both graph equality and renaming is to hash the Kind (ignoring the pointer for now).

Member

We can likely also use BindVar(Incomplete) directly and combine it with the kind and key.

Member Author

I think this is just a bug; I was intending to call hash. Will do that.

}

size_t VisitType_(const TypeVarNode* tyvar) final {
int index = BindVar(GetRef<TypeVar>(tyvar));
Member

TypeVar, Var, and Variable each have two possible kinds of occurrence:

  • The point of declaration
  • The point of visit

This function has certain implications that are not necessarily apparent from the code. Because Visit already checks the hash map, reaching this function means the TypeVar itself is unbound. We then hash it by the free variable index, which implies graph equality.
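
The declaration-versus-visit distinction can be sketched on a toy lambda calculus. Everything here (Term, Var, Lam, App, ToyAlphaHasher, Mix) is hypothetical and written only for illustration. Note one deliberate difference from the code under review: this sketch hashes an unbound variable by its name hint rather than by a free variable index.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical toy lambda term, standing in for Expr/Type nodes.
struct Term;
using TermPtr = std::shared_ptr<Term>;
struct Term {
  enum Kind { kVar, kLam, kApp } kind;
  std::string name_hint;  // kVar only
  TermPtr a, b;           // kLam: a = parameter, b = body; kApp: a = fn, b = arg
};

inline TermPtr Var(std::string name) {
  return std::make_shared<Term>(Term{Term::kVar, std::move(name), nullptr, nullptr});
}
inline TermPtr Lam(TermPtr param, TermPtr body) {
  return std::make_shared<Term>(Term{Term::kLam, "", std::move(param), std::move(body)});
}
inline TermPtr App(TermPtr fn, TermPtr arg) {
  return std::make_shared<Term>(Term{Term::kApp, "", std::move(fn), std::move(arg)});
}

class ToyAlphaHasher {
 public:
  std::size_t Hash(const TermPtr& t) {
    // Point of visit: a variable bound at its declaration reuses that hash.
    auto it = hash_map_.find(t.get());
    if (it != hash_map_.end()) return it->second;
    if (t->kind == Term::kVar) {
      // Not in hash_map_, so the variable is free; hash by name hint.
      return std::hash<std::string>()(t->name_hint);
    }
    if (t->kind == Term::kLam) {
      // Point of declaration: bind the parameter before visiting the body.
      std::size_t h = Mix(1, BindVar(t->a));
      return Mix(h, Hash(t->b));
    }
    std::size_t h = Mix(2, Hash(t->a));  // kApp: fn first, then arg
    return Mix(h, Hash(t->b));
  }

 private:
  // Assigns the next binder index to `var` and records its hash.
  std::size_t BindVar(const TermPtr& var) {
    assert(hash_map_.count(var.get()) == 0);  // each variable is declared once
    std::size_t h = std::hash<int>()(var_counter_++);
    hash_map_[var.get()] = h;
    return h;
  }
  // Boost-style mixing step; an assumption, not the PR's Combine.
  static std::size_t Mix(std::size_t seed, std::size_t value) {
    return seed ^ (value + 0x9e3779b9 + (seed << 6) + (seed >> 2));
  }
  std::unordered_map<const Term*, std::size_t> hash_map_;
  int var_counter_ = 0;
};

inline void DemoAlphaHash() {
  TermPtr x = Var("x"), y = Var("y");
  // (fn x => x) and (fn y => y) differ only in the bound name.
  assert(ToyAlphaHasher().Hash(Lam(x, x)) == ToyAlphaHasher().Hash(Lam(y, y)));
  // Same-named free variables hash alike.
  assert(ToyAlphaHasher().Hash(App(Var("f"), Var("z"))) ==
         ToyAlphaHasher().Hash(App(Var("f"), Var("z"))));
}
```

Each term is hashed with a fresh ToyAlphaHasher, since the binder counter and the map are per-traversal state in this sketch.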

Member

Consider leaving a comment here about its implications.


size_t BindVar(const NodeRef& var) {
size_t hash = std::hash<int>()(var_counter++);
CHECK(hash_map_.find(var) == hash_map_.end());
Member

hash_map_.count(var) == 0

return it->second;
}

return std::hash<std::string>()(var->name_hint);
Member

Alternatively, we can use BindVar here as well and combine it with name_hint and key.

}

size_t VisitType_(const TypeVarNode* tyvar) final {
int index = BindVar(GetRef<TypeVar>(tyvar));
Member

Should be size_t


size_t VisitType_(const TypeVarNode* tyvar) final {
int index = BindVar(GetRef<TypeVar>(tyvar));
return std::hash<int>()(index);
Member

No need to hash again, as BindVar already hashes.

Member Author

Yeah, this was a small holdover from a previous version. Fixed in commit.

@tqchen merged commit 0f7aa30 into apache:master on Oct 25, 2018
eqy pushed a commit to eqy/tvm that referenced this pull request Oct 29, 2018
FrozenGene pushed a commit to FrozenGene/tvm that referenced this pull request Dec 27, 2018
wweic pushed a commit to neo-ai/tvm that referenced this pull request Feb 20, 2019
@jroesch deleted the relay-eq-hash branch on February 4, 2021 04:36