@slyubomirsky
Contributor

In response to a comment (#2232 (comment)) in #2232, I've added a few words on why traditional deep learning frameworks use the computation graph representation in the first place. Since I was making changes anyway, I also made some stylistic edits.

Unfortunately, I've somehow messed up the line endings, so the diff is blown up, and I couldn't figure out how to fix it.

Contributor

@yidawang left a comment

It's bad that I cannot tell what you have changed. Can you at least point out the line numbers roughly?

I also suggested some grammar fixes.

surprise a PL researcher in the first place. If we implement a simple visitor to print out the result and
treat the result as nested Call expression, it becomes ``log(%x) + log(%x)``.

Such ambiguity is caused by different interpretation of program semantics when there is a shared node in the DAG.
Contributor

interpretations

fact that the ``%1`` is actually reused twice in ``%2``.

The Relay IR is mindful of this difference. Usually, deep learning framework users build the computational
graph in this fashion, where a DAG node reuse often occur. As a result, when we print out the Relay program in
Contributor

occurs
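
To make the shared-node ambiguity in the quoted passage concrete, here is a minimal, self-contained sketch. It does not use the Relay API: `Var`, `Call`, `naive_print`, and `dag_print` are hypothetical names invented for illustration. The naive printer expands the shared `log` node twice, while the second printer names each node once so the reuse stays visible.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


# Hypothetical toy IR (not Relay's classes). eq=False keeps identity
# semantics, so "the same node used twice" is distinguishable from
# "two equal-looking nodes".
@dataclass(eq=False)
class Var:
    name: str


@dataclass(eq=False)
class Call:
    op: str
    args: Tuple["Expr", ...]


Expr = Union[Var, Call]


def naive_print(e: Expr) -> str:
    """Print as a nested expression; a shared node is expanded at every use."""
    if isinstance(e, Var):
        return "%" + e.name
    if e.op == "add":
        return " + ".join(naive_print(a) for a in e.args)
    return f"{e.op}({', '.join(naive_print(a) for a in e.args)})"


def dag_print(root: Expr) -> str:
    """Print one numbered line per Call node, so a shared node appears once."""
    names: Dict[int, str] = {}  # id(node) -> "%N"
    lines: List[str] = []

    def visit(e: Expr) -> str:
        if isinstance(e, Var):
            return "%" + e.name
        if id(e) in names:  # already printed: refer to it by name
            return names[id(e)]
        args = ", ".join(visit(a) for a in e.args)
        name = f"%{len(names) + 1}"
        names[id(e)] = name
        lines.append(f"{name} = {e.op}({args})")
        return name

    visit(root)
    return "\n".join(lines)


x = Var("x")
shared = Call("log", (x,))          # one log node ...
root = Call("add", (shared, shared))  # ... used as both arguments of add

print(naive_print(root))
print(dag_print(root))
```

The second form records that the `log` result is computed once and used twice, which is the information the nested printout loses.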

Now, please take a close look at the AST structure. While the two programs are semantically identical
(so are their textual representations, except that A-normal form has let prefix), their AST structures are different.

Since program optimizations take these AST data structures and transform them, the two different structure will
Contributor

structures

affect the compiler code we are going to write. For example, if we want to detect a pattern ``add(log(x), y)``:

- In the data-flow form, we can first access the add node, then directly look at its first argument to see if it is a log
- In the A-normal form, we cannot directly do the check anymore, because the first input to add is ``%v1`` -- we will need to keep a map from variable to its bound values and lookup that map, in order to know that ``%v1`` is a log.
Contributor

look up
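
The difference between the two bullets in the quoted passage can be sketched as follows. This is an illustrative toy, not Relay's pass infrastructure: the `Var`, `Call`, and `Let` classes and both matcher functions are hypothetical. In the data-flow form the matcher inspects the first argument directly; in the A-normal form it has to record each `let` binding and look the variable up.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


# Hypothetical toy AST (names chosen for illustration only).
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Call:
    op: str
    args: Tuple["Expr", ...]


@dataclass(frozen=True)
class Let:
    var: Var
    value: "Expr"
    body: "Expr"


Expr = Union[Var, Call, Let]


def matches_dataflow(e: Expr) -> bool:
    """Data-flow form: look straight at the first argument of add."""
    return (
        isinstance(e, Call) and e.op == "add" and len(e.args) == 2
        and isinstance(e.args[0], Call) and e.args[0].op == "log"
    )


def matches_anf(e: Expr) -> bool:
    """A-normal form: keep a map from variables to their bound values."""
    bound: Dict[str, Expr] = {}

    def resolve(x: Expr) -> Expr:
        # Follow variable references back to the expression they name.
        while isinstance(x, Var) and x.name in bound:
            x = bound[x.name]
        return x

    def is_add_of_log(x: Expr) -> bool:
        x = resolve(x)
        if not (isinstance(x, Call) and x.op == "add" and len(x.args) == 2):
            return False
        first = resolve(x.args[0])
        return isinstance(first, Call) and first.op == "log"

    while isinstance(e, Let):
        bound[e.var.name] = e.value
        if is_add_of_log(e.value):
            return True
        e = e.body
    return is_add_of_log(e)


# add(log(%x), %y) written both ways.
dataflow = Call("add", (Call("log", (Var("x"),)), Var("y")))
anf = Let(
    Var("v1"), Call("log", (Var("x"),)),
    Let(Var("v2"), Call("add", (Var("v1"), Var("y"))), Var("v2")),
)

print(matches_dataflow(dataflow), matches_dataflow(anf), matches_anf(anf))
```

The direct matcher fails on the A-normal form because the first argument of `add` is the variable `%v1`, not a `log` call; only the version that consults the binding map finds the pattern.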

@slyubomirsky
Contributor Author

Yeah, I'm terribly sorry about the line endings being messed up in the diff. The primary change was two added lines after the first paragraph in the first section. The other changes were mostly one-word edits.

@yidawang
Contributor

@slyubomirsky OK. Please address the other minor issues I left; then I think it will be good to merge.

@tqchen merged commit 14acb80 into apache:master on Dec 24, 2018
@tqchen
Member

tqchen commented Dec 24, 2018

Thanks, @yidawang @slyubomirsky, this is merged.
