Skip to content

Conversation

@mbaret
Copy link
Contributor

@mbaret mbaret commented Feb 11, 2020

Primitive functions are not to be modified by passes and as such should be ignored. This commit makes Visitor patterns ignore Primitive functions by default, although this can be overridden by setting enter_primitives=true. One case in which this overriding is necessary is in the type_infer pass.

This is motivated by the discussions external-codegen-ensure-fuseops-ignores-partitioned-subgraphs and external-codegen-how-to-prevent-batch-norm-from-being-decomposed.

Primitive functions are not to be modified by
passes and as such should be ignored. This
commit makes Visitor patterns ignore Primitive
functions by default, although this can be
overridden by setting enter_primitives=true.

One case in which this overriding is necessary is
in the type_infer pass.

Change-Id: Ib9cf7f44d76dd929617493369b6a9912134085b4
@tqchen
Copy link
Member

tqchen commented Feb 11, 2020

I feel that this is a change that should be bought to a sub-class of a visitor, instead of the visitor itself. Given that it is quite specific to the semantics of the problem.

@mbaret
Copy link
Contributor Author

mbaret commented Feb 11, 2020

I agree it's not ideal to modify the behaviour of the Visitors directly. However, if I sub-class to produce a 'PrimitiveSkippingVisitor' then all existing passes that need to skip primitives will need updating (and all future ones will need to know to use the subclass too). In practice, that's almost every pass. I can attempt to do this if that's desirable, but it doesn't seem like an elegant solution.

Alternatively, there may be an entirely different approach to resolve this issue that I haven't considered. Either way, I think this does need a resolution as it breaks quite a lot of the BYOC infrastructure.

@mbaret
Copy link
Contributor Author

mbaret commented Feb 11, 2020

@comaniac @zhiics Can you review please? I'm wondering whether I've interpreted the intended behaviour of primitive functions correctly.

@tqchen
Copy link
Member

tqchen commented Feb 11, 2020

A potentially better approach here would be to lift the primitive functions into the module-level, so that the function pass can safely skip these types of functions.

The previous behavior of keeping primitive functions inlined was mainly due the restriction that we were dealing with single function at a time in the past, instead of Module -> Module

@tqchen
Copy link
Member

tqchen commented Feb 11, 2020

It would also be great to understand how things would break in the case of external functions(e.g. was that due to inlining), and whether lifting the customized external function to the Module level will resolve the problem

@mbaret
Copy link
Contributor Author

mbaret commented Feb 11, 2020

An example to illustrate why this breaks is that we have external codegens that want to act directly on the relay 'qnn dialect'. However, this is all lowered away once QnnCanonicalize/Legalize is called. Therefore, if we leave the external functions inlined their contents also gets lowered and we lose all the qnn ops.

If we removed the external functions from 'main' altogether, I think that could work but might require some restructuring of the build process. I'll have a look into it.

@comaniac
Copy link
Contributor

Specifically for removing all functions from main, would this help? #4847
This PR should support empty Relay program to be lowered.

@zhiics
Copy link
Member

zhiics commented Feb 11, 2020

I think there are at least two solutions to prevent Relay passes from touching the external functions:

  • S0: lifting the external functions to the module scope, i.g. making them as a global function, and marking them as external. The pass infra will skip optimization of them automatically. This is okay for VM. However, for graph runtime, we may need to have an inline pass to inline these global functions, as build_module and graph_runtime_codegen really only focus on the main function and globalvar is not allowed.
  • S1: Look at the passes and skip the optimizing FunctionNode if the function doesn't use default compiler from TVM
    https://github.com/apache/incubator-tvm/blob/4fce5137ddd30ea56d1951703de43727043090b9/src/relay/ir/expr.cc#L168
    This looks a little easier because some passes can ignore this, but it doesn't seem that clean.

@tqchen
Copy link
Member

tqchen commented Feb 11, 2020

I think S0 is a better approach here

@mbaret
Copy link
Contributor Author

mbaret commented Feb 11, 2020

I'll investigate the global function approach in the context of the graph runtime which is what I'm using at the moment. In thinking about this problem, it has occured to me that we could easily run into a problem where different external compilers want to see different stages of the lowering. I'm not sure if that impacts this PR, I'll give it some more consideration.

@masahi
Copy link
Member

masahi commented Feb 11, 2020

Primitive functions are also created during the op fusion pass. https://github.com/apache/incubator-tvm/blob/e4d817d4c63b1f9881e5085e6a18c856770bae14/src/relay/pass/fuse_ops.cc#L937

I think it makes sense for primitive functions created during op fusion to be subject to further optimization pass. Does this PR prevent such possibility?

@mbaret
Copy link
Contributor Author

mbaret commented Feb 11, 2020

It would prevent the contents of the fused functions from being further modified (except by passes which specifically exempt themselves from the restriction). I suppose one question I have if we do want primitive functions to be further modified is, what actually is the intended use for kPrimitive? I've been working on the assumption that once a function is marked as primitive it shouldn't be lowered further and you should just manipulate the call to the function instead.

@zhiics
Copy link
Member

zhiics commented Feb 11, 2020

hmm, it should be okay for the primitive functions created in fusion as we actually annotate external functions with kCompiler and check if it uses default compiler. We outline and inline these functions but not the ones created in fusion.

@masahi
Copy link
Member

masahi commented Feb 11, 2020

I don't know what exactly "Primitive" function is supposed mean either, but functions created during op fusion and partitioning have different semantics and treating them in the same way as "Primitive" functions doesn't seem right to me.

Why not checking kExternalSymbol or kCompiler instead?

@tqchen
Copy link
Member

tqchen commented Feb 11, 2020

Given the current discussion, I still think it is better to close this PR for now and take the lift to global approach(sorry I am being picky here because it is related to the core infra).

There is a tradeoff between introducing a new feature, and technical debt in discussing the related solutions

While the current solution certainly resolves some of the problem in the short term, it introduces a reverse dependency of the function convention itself. Users of the API may not be aware of such subtle details in the base-class. In the long run, such kind of changes can become a technical debt in the project. Having a specialized sub-class visitor might be better for limited passes that need this behavior.

Given that we have an alternative path(lift to global) is certainly cleaner, and will likely become the ultimate solution, let us push for that direction instead :)

@slyubomirsky
Copy link
Contributor

I want to second @tqchen's comments: there are visitors (like the Python converter or the Relay AoT compiler) that do operate on primitive functions and the base visitor should not ignore them

@mbaret
Copy link
Contributor Author

mbaret commented Feb 21, 2020

I'll abandon this for now - it looks like it will be best addressed as part of changes to the graph partitioning flow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants