[feat](ai): add accuracy debug skill for nightly test #607
Open
PerryZhang01 wants to merge 1 commit into main
Motivation
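This PR adds a skill for locating accuracy issues, meant to be used together with the nightly accuracy test (https://rocm.github.io/ATOM/benchmark-dashboard/#tab=accuracy). When a model shows an accuracy problem, edit the config in the md file and the skill is ready to use:
Give the command "Please follow the description in accuracy_debug.md to help me locate the accuracy issue", and the skill will then bisect the commit history to locate the offending commit.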
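The bisection the skill performs can be illustrated with a minimal sketch. It is not the skill's actual implementation: `find_first_bad` and the `is_good` callback are hypothetical names, and the sketch assumes the commits are ordered oldest to newest, that the first is known good and the last known bad, and that accuracy stays broken after the offending commit. In practice `is_good` would check out the commit, rebuild, and run the accuracy test.

```python
from typing import Callable, Sequence


def find_first_bad(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the first bad commit in an oldest-to-newest list.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid  # regression was introduced after mid
        else:
            hi = mid  # regression was introduced at or before mid
    return commits[hi]


# Hypothetical history c0..c7 in which c5 introduced the regression.
history = [f"c{i}" for i in range(8)]
print(find_first_bad(history, lambda c: int(c[1:]) < 5))  # prints c5
```

Each call to `is_good` stands for one full test round, so a range of N commits needs roughly log2(N) rounds rather than N.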
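Drawbacks:
- Slow: every test round requires recompiling aiter, so locating one issue takes several hours;
- Limited automation: some config values still have to be set manually;
- Depends on Claude Code: with Cursor, every commit switch requires manually clicking run;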
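To be optimized:
- Skip re-pulling aiter when only the ATOM commit changes, and build aiter incrementally when switching aiter commits, to cut down on from-scratch JIT compilation time;
- Integrate directly with the dashboard page: fill in the config automatically from the page, and after each daily dashboard refresh automatically detect failing models and locate the offending commit.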