Skip to content

Conversation

@thaisacs
Copy link
Contributor

@thaisacs thaisacs commented Mar 19, 2024

When Ansor does not find a schedule for a task in warm up, Ansor gradient gets stuck in this task, because there is no optimized schedule for this task. Hence, Ansor does not optimize any task.

Behavior before correction

bug

Behavior after correction

bug1

@cbalint13
Copy link
Contributor

Cc @comaniac , @merrymercy , @vinx13 , @jcf94

@comaniac
Copy link
Contributor

This change looks a bit hacky and unsafe as you override the cost before calculating the gradients and recover it back afterwards. Intuitively if a task has no schedules, should we just mark it as a dead task and never consider it in the rest of the process?

@thaisacs thaisacs force-pushed the ansor-gradient-bug branch from 3a0ea0e to 7b1a19c Compare March 29, 2024 01:01
Comment on lines 363 to 365
for task_idx in range(len(self.tasks)):
if(self.best_costs[task_idx] == 1e10):
self.dead_tasks.add(task_idx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for task_idx in range(len(self.tasks)):
if(self.best_costs[task_idx] == 1e10):
self.dead_tasks.add(task_idx)
for task_idx, cost in enumerate(self.best_costs):
if cost == 1e10:
self.dead_tasks.add(task_idx)

self.best_score = self.cur_score

# put task without schedule on warm up to dead state
if self.strategy == "gradient":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this logic applies to all strategies instead of just gradient, so this condition may not be necessary. Could you help confirm?

Copy link
Contributor

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaisacs thaisacs force-pushed the ansor-gradient-bug branch from bc003bf to c472a87 Compare March 29, 2024 03:04
@cbalint13
Copy link
Contributor

@thaisacs

Could please git force push to trigger tests from scratch, some tests failed for no reason.
I tried "@tvm-bot rerun" for you, but one particular (the hexagon) test still does not wanting to restart.

@thaisacs thaisacs force-pushed the ansor-gradient-bug branch from c472a87 to 6c1c39a Compare March 29, 2024 13:51
@comaniac
Copy link
Contributor

@tvm-bot rerun

@comaniac comaniac merged commit ffa9cfd into apache:main Apr 1, 2024
@comaniac
Copy link
Contributor

comaniac commented Apr 1, 2024

Thanks @thaisacs

thaisacs added a commit to thaisacs/tvm that referenced this pull request Apr 3, 2024
* Fixing ansor gradient bug

* Changing to dead_task

* Applying reviews
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants