
fix model compression bug of nan output #1575

Merged
wanghan-iapcm merged 1 commit into deepmodeling:devel from denghuilu:fix-compression-bug on Mar 16, 2022
Conversation

@denghuilu
Member

This PR should also fix #1444 .

Member

@njzjz njzjz left a comment


I approve this PR. In addition, could you explain which situation causes the bug, given that it is not even covered by the unit tests?

@njzjz njzjz linked an issue Mar 16, 2022 that may be closed by this pull request
@denghuilu
Member Author

> I approve this PR. In addition, could you explain which situation causes the bug, given that it is not even covered by the unit tests?

This problem was caused by a previous optimization that reused GPU registers to reduce global-memory access time. In that implementation, however, when the first table_idx value obtained was 0, the initialization of the registers was skipped. This is a programming logic error that manifests only in rare cases.

See #1274.

@wanghan-iapcm wanghan-iapcm merged commit 1503fc3 into deepmodeling:devel Mar 16, 2022


Development

Successfully merging this pull request may close these issues.

[BUG] compressed model produces NaN forces

3 participants