Skip to content

Conversation

@zhiics
Copy link
Member

@zhiics zhiics commented Jan 2, 2019

This PR adds passes in Relay to annotate expressions to indicate which device/context each operator should be executed at. #2296 is the proposed RFC. The following changes are made in this PR.

  • Extend target to accept a dictionary of device/context to target and add the fallback_device argument in build of build_module.py
    build(func, target=None, target_host=None, params=None, fallback_device=None)

  • Slightly Modify compile engine to stop lowering and generating schedules for device copy operators since the real data transferring will be only performed at runtime and no schedule is needed for this type of operators.

  • Modify memory plan to return both storage ids and device ids.

  • Add on_device(expr, dev_id) and device_copy(expr, src_dev_id, dst_dev_id) operators as synthetic op as @jroesch suggested. The former takes an expr and a device_id as inputs which indicate where an expression should be annotated. The latter will be used to perform data copy between different devices.

  • Write several passes to validate the annotated program, rewrite the program (e.g. insert device copy operators), and propagate the device information from device copy operators to the other expression, etc.

  • Add unit tests to test the functionality of different annotation schemes.

@tqchen @jroesch @yidawang @yzhliu @tmoreau89

@jroesch
Copy link
Member

jroesch commented Jan 3, 2019

I just got back from vacation, will review this PR tonight, looks like great work 👍

@zhiics
Copy link
Member Author

zhiics commented Jan 6, 2019

@jroesch Thank you. Please take a look when you have time.

@jroesch
Copy link
Member

jroesch commented Jan 6, 2019

@zhiics sorry I had done a partial review and forgot to hit submit, will finish rest and post comments right now.

Copy link
Member

@jroesch jroesch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks like a good first pass, most of my comments are small nits about the comments. It would be good to solicit review from someone who is familiar with heterogeneous-execution.

Copy link
Contributor

@tmoreau89 tmoreau89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @zhiics for this excellent contribution. Quick question: how hard would it be to construct a heterogeneous test case (as test_pass_annotation.py) to run on the VTA simulator? It would be interesting to be able to provide explicit control over what components of the graphs get offloaded to CPU vs. VTA with this approach.

@zhiics
Copy link
Member Author

zhiics commented Jan 7, 2019

@tmoreau89 It shouldn't be hard if you only want to offload most of ops to VTA and only keep a few to CPU. Otherwise, I think it might be a little tedious to traverse the program and add on_device ops. Another way might make annotation easier is probably to allow users to just provide relay op names instead of adding on_device primitives directly, and we can add these primitives in the backend. How do you think?

@icemelon
Copy link
Member

icemelon commented Jan 7, 2019

@zhiics Could you use git rebase so that it will include your commits?

@zhiics
Copy link
Member Author

zhiics commented Jan 7, 2019

oops. Sorry. I will rebase now.

Copy link
Member

@yzhliu yzhliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haven't finished yet, will continue tonight. A high level question: for now users do need to write relay.on_device explicitly for annotation?

@zhiics
Copy link
Member Author

zhiics commented Jan 8, 2019

@yzhliu Yes, we need to use relay.on_device to do annotation. I think we can also allow users to provide the op names for annotation as well, but that probably is worthy a separate PR.

@tqchen
Copy link
Member

tqchen commented Jan 8, 2019

related RFC #2391

@jroesch
Copy link
Member

jroesch commented Jan 8, 2019

@zhiics @yzhliu my suggestion to use the explicit op is to give us flexibility. We can build approaches which are automated/use user input/etc by just writing a pass which annotates the program with the appropriate on_device calls.

@zhiics
Copy link
Member Author

zhiics commented Jan 8, 2019

@jroesch Yes, I actually agree with you. We can add other passes for different annotation schemes, but users should have the flexibility to annotate expressions from the language directly.

std::vector<StorageToken*> tokens;
int device_id = node_device_map_.count(GetRef<Expr>(op))
? node_device_map_[GetRef<Expr>(op)]->value
: 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we add 0 as a placeholder in DLDeviceType so that others will not use it for other special purpose by mistake.
also device_id -> device_type

Copy link
Member Author

@zhiics zhiics Jan 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yzhliu Yes, I also thought that it's probably necessary to have a field in DLDeviceType, like 'kDLUNDEFINED = 0'. Let's keep this for now. I will send a RFC later to hear from more people because it needs a slight change in dlpack.

Zhi Chen and others added 4 commits January 9, 2019 17:55
add fallback cpptest

fix lint

accept both nn.op_name and op_name

accept both nn.op_name and op_name

use expr annotation instead of op names

fix test_back_graph_runtime unit test
add fallback cpptest

fix lint

accept both nn.op_name and op_name

accept both nn.op_name and op_name

use expr annotation instead of op names

fix test_back_graph_runtime unit test

address @jroesch's comments
@zhiics
Copy link
Member Author

zhiics commented Jan 11, 2019

@jroesch Please take another look. We can probably bring it in if everything looks good to you. Thanks.

Copy link
Member

@jroesch jroesch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the delta, looks good to me 👍

@icemelon icemelon merged commit 09236bf into apache:master Jan 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants