Fill in missing CPU detection logic #9017
Changes from all commits
4e3f882
f751e14
212a23d
15d2fe3
702dc66
3533b4a
0bd19c0
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -146,3 +146,11 @@ define weak_odr void @x64_cpuid_halide(i32* %info) nounwind uwtable { | |
| call void asm sideeffect inteldialect "xchg rbx, rsi\0A\09mov eax, dword ptr $$0 $0\0A\09mov ecx, dword ptr $$4 $0\0A\09cpuid\0A\09mov dword ptr $$0 $0, eax\0A\09mov dword ptr $$4 $0, ebx\0A\09mov dword ptr $$8 $0, ecx\0A\09mov dword ptr $$12 $0, edx\0A\09xchg rbx, rsi", "=*m,~{eax},~{ebx},~{ecx},~{edx},~{esi},~{dirflag},~{fpsr},~{flags}"(i32* elementtype(i32) %info) | ||
| ret void | ||
| } | ||
|
|
||
| ; xgetbv: info[0] is ECX (input), output is info[0]=EAX, info[1]=EDX. | ||
| ; Unlike cpuid, xgetbv does not clobber ebx/rbx, so one definition | ||
| ; works for both 32-bit and 64-bit. | ||
| define weak_odr void @xgetbv_halide(i32* %info) nounwind uwtable { | ||
|
Contributor
I realize this is a "me" thing, but I think we should limit the inline ASM to just the minimal required. I know this works across both 32- and 64-bit, but it requires quite a bit of understanding (e.g. that the struct has been prefilled with the XCR register ID, that loading into ECX is ok because the top bits would be zeros anyway, etc). In other words, can this be written so that just the
Member
Author
I'm guilty of cargo-culting the cpuid function above it. I figured, "if cpuid couldn't be written using inline asm, I'm sure I'll hit the same issue here."
||
| call void asm sideeffect inteldialect "mov ecx, dword ptr $$0 $0\0A\09xgetbv\0A\09mov dword ptr $$0 $0, eax\0A\09mov dword ptr $$4 $0, edx", "=*m,~{eax},~{ecx},~{edx},~{dirflag},~{fpsr},~{flags}"(i32* elementtype(i32) %info) | ||
| ret void | ||
| } | ||
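For readers following the thread, here is a rough C++ sketch of the "minimal inline asm" idea raised in the review: the asm body contains only the `xgetbv` instruction, and register constraints handle getting the XCR index into ECX and the result out of EAX/EDX. This is not code from the PR. The names `read_xcr`, `combine_xcr`, and `os_saves_avx_state` are hypothetical, and the sketch assumes a GCC- or Clang-compatible compiler targeting x86.

```cpp
#include <cstdint>

// Combine the two 32-bit halves that xgetbv returns (EDX:EAX) into one value.
constexpr uint64_t combine_xcr(uint32_t eax, uint32_t edx) {
    return (static_cast<uint64_t>(edx) << 32) | eax;
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
// Hypothetical helper, not from the PR. The asm string is just the one
// instruction; the "c", "=a", and "=d" constraints bind ECX, EAX, and EDX,
// so no memory operands or manual moves are needed.
// Only call this after CPUID leaf 1 reports OSXSAVE (ECX bit 27);
// otherwise xgetbv raises an invalid-opcode fault.
static inline uint64_t read_xcr(uint32_t xcr_index) {
    uint32_t eax, edx;
    __asm__ volatile("xgetbv" : "=a"(eax), "=d"(edx) : "c"(xcr_index));
    return combine_xcr(eax, edx);
}
#endif

// XCR0 bit 1 (SSE state) and bit 2 (AVX state) must both be set before
// the OS can be trusted to save and restore the YMM registers.
constexpr bool os_saves_avx_state(uint64_t xcr0) {
    return (xcr0 & 0x6) == 0x6;
}
```

A caller would typically check OSXSAVE first and then evaluate `os_saves_avx_state(read_xcr(0))`. Keeping the loads, stores, and bit tests outside the asm leaves them visible to the optimizer and easier to review, which seems to be the concern raised above.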
My .ll comment is to do basically this