Avoid symbolic output extent in newOutputDomain by jacobhinkle · Pull Request #358 · NVIDIA/Fuser

jacobhinkle · 2023-05-17T19:04:11Z

Fixes #357

There are a combination of issues address by this PR. First, as noted in the issue, we should not exact map between Symbolic domains. However, there is still an issue where the output extent, computed at Fusion definition, is sometimes set equal to the symbolic extent even though that extent might concretize to 1 and then become a resolved broadcast. In those cases we should either use any Iteration extents if available, or if we have an unambiguous Symbolic extent, we can use it.

jacobhinkle · 2023-05-17T23:45:45Z

I've encountered an interesting limitation. For a dynamic resize resulting in a broadcast, we may have a concretized root domain of type Iteration that's resized to a Broadcast rfactor domain. For example, as of now the example in #357 concretizes (properly?) to

Inputs:
  T0_g[ iS0{i0}, iS1{i2} ], float
  T1_g[ iS2{i3}, iS3{i4} ], float
  i5, int64_t
  i6, int64_t
Outputs:
  T3_g[ iS8{i3}, iS9{i4} ], float
  T2_g[ bS10{( i0 + i5 )}rf, iS11{( i2 + i6 )}rf ], float

%kernel_math {
T2_g[ bS10{( i0 + i5 )}rf, iS11{( i2 + i6 )}rf ]
   = pad( T0_g[ iS0{i0}, iS1{i2} ], {0, i5, 0, i6} )
T3_g[ iS8{i3}, iS9{i4} ]
   = T2_g[ bS10{( i0 + i5 )}rf, iS11{( i2 + i6 )}rf ]
   * T1_g[ iS2{i3}, iS3{i4} ];

This currently fails when building ComputeAtMap while building the structure used to define concrete IDs, ConcretizedBroadcastDomains at this check: https://github.com/NVIDIA/Fuser/blob/main/csrc/device_lower/analysis/trivial_broadcast.cpp#L99. I believe this assumes that a broadcast will come from a BroadcastOp, and tries to map every Broadcast root ID to its "original" root ID, meaning the ID in the root of the input to a BroadcastOp. In the case of Resize we don't have a separate TV op representing the broadcast, so to satisfy this assumption we would need to insert an op at concretization to separately broadcast after concretizing all resized IterDomains as Iteration. That's a bit invasive, so I'm checking to see if in this case it might be safe to use the resize itself as the concrete ID.

jacobhinkle · 2023-05-18T00:24:14Z

Note that reshape does not trigger this error because during concretization/static reshape, broadcasts are detected and actual broadcast ops are inserted. This may be the way to go; I can detect broadcast output domains that come from Resize op while processing a TV and ideally remove them then insert the broadcast on that TV. I'm not exactly sure how best to remove that ID without it being a Broadcast. One option is to Merge that ID with its neighbor assuming it has any. I will continue thinking about this.

naoyam · 2023-05-18T03:51:27Z

It seems to me that the problem lies in ConcretizedBroadcastDomains. If we extend it to handle broadcast IDs produced by IterDomain expressions, should the problem be fixed?

jacobhinkle · 2023-05-18T16:06:05Z

If we extend it to handle broadcast IDs produced by IterDomain expressions, should the problem be fixed?

I believe so.

naoyam · 2023-05-18T17:36:58Z

OK, then I think that would be preferred over adding broadcast ops. Let me know if you need any help.

With this, now failing at [E thread_pool.cpp:109] Exception in thread pool task: it != bcast_map_.end() INTERNAL ASSERT FAILED at "/opt/pytorch/nvfuser/csrc/root_domain_map.cpp":630, please report a bug to PyTorch. Not found: iS4{i0}rf in T2[ iS4{i0}rf iS6{i2}rf ] (Rfactor: [ bS10{( i0 + i5 )}rf iS11{( i2 + i6 )}rf ])

Using this, we can enable mapping of Symbolic IterDomains when doing p2c propagation in concretization, and still keep it disabled for the purposes of exact mapping. This fixes some errors but I am now seeing errors like > [E thread_pool.cpp:110] Exception in thread pool task: is_overriden || index_map.find(alloc_dom[i]) != index_map.end() INTERNAL ASSERT FAILED at "/opt/pytorch/nvfuser/csrc/index_compute.cpp":1634,

jacobhinkle · 2023-06-02T19:51:53Z

This is being addressed in #420

jacobhinkle added 4 commits May 17, 2023 14:58

Initial try at fixing output domain extent

4e91567

Change how output iter_types are computed in newOutputDomain

7ae44a4

Skip double mutating while concretizing root->rfactor

19aca3e

Merge remote-tracking branch 'origin/main' into fix_symbolic_new_domain

504d72c

jacobhinkle mentioned this pull request May 18, 2023

Fix getPointwiseHeuristics with empty tensors #369

Closed

jacobhinkle added 8 commits May 23, 2023 11:45

Clean up logic in newOutputDomain

b581c13

Merge remote-tracking branch 'origin/main' into fix_symbolic_new_domain

dcb2b43

Clean up clang-tidy warnings on vector of optionals

e304aed

Undo change to promoteIterType

bd0d7c3

Merge remote-tracking branch 'origin/main' into fix_symbolic_new_domain

321cdc7

Remove debug print stmts

89be4e4

jacobhinkle added a commit that referenced this pull request Jun 2, 2023

Grab iter_type fix in newOutputDomain from #358

596ceab

jacobhinkle closed this Jun 2, 2023

jacobhinkle deleted the fix_symbolic_new_domain branch July 25, 2023 16:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Avoid symbolic output extent in newOutputDomain#358

Avoid symbolic output extent in newOutputDomain#358
jacobhinkle wants to merge 12 commits intomainfrom
fix_symbolic_new_domain

jacobhinkle commented May 17, 2023

Uh oh!

jacobhinkle commented May 17, 2023

Uh oh!

jacobhinkle commented May 18, 2023

Uh oh!

naoyam commented May 18, 2023

Uh oh!

jacobhinkle commented May 18, 2023

Uh oh!

naoyam commented May 18, 2023

Uh oh!

jacobhinkle commented Jun 2, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jacobhinkle commented May 17, 2023

Uh oh!

jacobhinkle commented May 17, 2023

Uh oh!

jacobhinkle commented May 18, 2023

Uh oh!

naoyam commented May 18, 2023

Uh oh!

jacobhinkle commented May 18, 2023

Uh oh!

naoyam commented May 18, 2023

Uh oh!

jacobhinkle commented Jun 2, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants