This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Set isInternalRegDelayFree for several of the x86 hwintrinsics #16649

Merged
tannergooding merged 1 commit into dotnet:master from tannergooding:hwintrin-internalRegDelayFree
Feb 28, 2018

Conversation

@tannergooding
Member

FYI. @CarolEidt, @mikedn, @fiigii. As per the discussion here: #16558 (comment)

Comment thread src/jit/hwintrinsiccodegenxarch.cpp Outdated
emit->emitDataGenEnd();

// Ensure we aren't overwriting offsReg, baseReg, or nonConstImmReg
assert(offsReg != baseReg);
Member Author

Is there any guarantee that two temporary registers won't be the same? Do I need to set isInternalRegDelayFree to guarantee that?

Temporary registers have identical live ranges so they cannot be the same. If their live ranges would be disjoint then you'd probably need only one temporary register to begin with.

Comment thread src/jit/hwintrinsiccodegenxarch.cpp Outdated
regNumber offsReg = node->GetSingleTempReg();

// Ensure we aren't overwriting op1Reg
assert(baseReg != op1Reg);
Member Author

@CarolEidt, you mentioned:

"It's only needed if the tempReg needs to be different from the target. The internal registers never conflict with incoming sources."

Is the second part (never conflict with incoming sources) also true in the case where a source register is the same as the target register?

"...in the case where a source register is the same as the target register?"

Isn't this the same as being RMW?

Member Author

Basically.

But my question is this:

  • since tmpReg can equal targetReg (when isInternalRegDelayFree=false)
  • and since opReg can equal targetReg (when isDelayFree=false)
  • is it possible that tmpReg can equal opReg (when opReg == targetReg)?
    • or is it a hard limitation that tmpReg != opReg (based on @CarolEidt's statement)?

No, tmpReg can't equal opReg. Temporary registers are treated as if they are defined before operands are used so they have to be different.

I think I've seen these details described somewhere in the documentation but I can't find that right now.

Ah, it wasn't in the documentation, it's in the comment at the start of lsra.cpp:

coreclr/src/jit/lsra.cpp

Lines 24 to 39 in f1fee6d

"Internal registers" are registers used during the code sequence generated for the node.
The register lifetimes must obey the following lifetime model:
- First, any internal registers are defined.
- Next, any source registers are used (and are then freed if they are last use and are not identified as
  "delayRegFree").
- Next, the internal registers are used (and are then freed).
- Next, any registers in the kill set for the instruction are killed.
- Next, the destination register(s) are defined (multiple destination registers are only supported on ARM)
- Finally, any "delayRegFree" source registers are freed.
There are several things to note about this order:
- The internal registers will never overlap any use, but they may overlap a destination register.
- Internal registers are never live beyond the node.
- The "delayRegFree" annotation is used for instructions that are only available in a Read-Modify-Write form.
  That is, the destination register is one of the sources. In this case, we must not use the same register for
  the non-RMW operand as for the destination.
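
To make the ordering above concrete, here is a toy C++ model of it. This is only a sketch, not the LSRA implementation: the `Step` enum, the `Lifetime` struct, and all helper names are made up for illustration. Treating isInternalRegDelayFree as "keep the internal register occupied through the destination definition" is an assumption drawn from this thread, not from the quoted comment.

```cpp
#include <cassert>

// Toy model of the lifetime order quoted from lsra.cpp; NOT the real LSRA code.
// Step numbers follow the six bullets of the lifetime model.
enum Step : int {
    BeforeNode       = 0, // source registers are already live on entry
    InternalDef      = 1,
    SourceUse        = 2,
    InternalUse      = 3,
    Kill             = 4,
    DestDef          = 5,
    DelayFreeRelease = 6
};

// Inclusive range of steps during which a register is occupied (hypothetical type).
struct Lifetime {
    int first;
    int last;
};

// A last-use source is freed right after it is used unless it is delay-free.
constexpr Lifetime sourceLifetime(bool delayFree) {
    return Lifetime{BeforeNode, delayFree ? DelayFreeRelease : SourceUse};
}

// Assumption: a delay-free internal register stays occupied past the destination def.
constexpr Lifetime internalLifetime(bool internalDelayFree) {
    return Lifetime{InternalDef, internalDelayFree ? DelayFreeRelease : InternalUse};
}

constexpr Lifetime destLifetime() {
    return Lifetime{DestDef, DelayFreeRelease};
}

// Two registers may be the same physical register only if their lifetimes are disjoint.
constexpr bool mayShareRegister(Lifetime a, Lifetime b) {
    return a.last < b.first || b.last < a.first;
}
```

Under this model, `mayShareRegister(internalLifetime(false), sourceLifetime(false))` is false, so a temp never aliases an operand, while `mayShareRegister(internalLifetime(false), destLifetime())` is true and becomes false once the internal register is marked delay-free.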

@mikedn - thanks for adding all the clarification (I'm currently out of the country, so my responses are slow and off-timezone).

@CarolEidt CarolEidt left a comment

LGTM

Comment thread src/jit/hwintrinsiccodegenxarch.cpp Outdated
assert(baseReg != op1Reg);
assert(baseReg != op2Reg);
assert(offsReg != op1Reg);
assert(offsReg != op2Reg);

IMO these assertions are a bit of overkill, but I guess they can't hurt.

Member Author

I had initially added them because I was unsure if tmpReg can ever be opReg (#16649 (comment)).

I am going to remove these "unneeded" asserts as part of the rebase.

@tannergooding
Member Author

Removed the unnecessary asserts (for the case where tmpReg == srcReg).
Kept the asserts checking that tmpReg != targetReg.
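
For illustration, the hazard that the remaining asserts guard against can be reproduced with a toy register file. This is a sketch under assumptions, not the code in hwintrinsiccodegenxarch.cpp: `RegFile`, `runSequence`, and the three-step sequence are hypothetical, chosen only because the sequence writes the target before its last read of the temporary.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy register file; indices stand in for regNumber values (hypothetical).
using RegFile = std::array<int, 4>;

// Hypothetical sequence that writes targetReg before the last read of tmpReg:
//   tmpReg    = tableBase
//   targetReg = srcReg + 1
//   targetReg = targetReg + tmpReg   (tmpReg must still hold tableBase here)
int runSequence(RegFile regs, std::size_t srcReg, std::size_t tmpReg,
                std::size_t targetReg, int tableBase) {
    regs[tmpReg]    = tableBase;
    regs[targetReg] = regs[srcReg] + 1;
    regs[targetReg] = regs[targetReg] + regs[tmpReg];
    return regs[targetReg];
}
```

With a source value of 10 and a table base of 100, distinct registers give 111. If tmpReg and targetReg are the same register, the second step clobbers the temporary and the result is 22. Reusing the source register as the target is harmless here, because the source is fully consumed before the target is written.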

@tannergooding
Member Author

test Windows_NT x64 Checked jitincompletehwintrinsic
test Windows_NT x64 Checked jitx86hwintrinsicnoavx
test Windows_NT x64 Checked jitx86hwintrinsicnoavx2
test Windows_NT x64 Checked jitx86hwintrinsicnosimd
test Windows_NT x64 Checked jitnox86hwintrinsic

test Windows_NT x86 Checked jitincompletehwintrinsic
test Windows_NT x86 Checked jitx86hwintrinsicnoavx
test Windows_NT x86 Checked jitx86hwintrinsicnoavx2
test Windows_NT x86 Checked jitx86hwintrinsicnosimd
test Windows_NT x86 Checked jitnox86hwintrinsic

test Ubuntu x64 Checked jitincompletehwintrinsic
test Ubuntu x64 Checked jitx86hwintrinsicnoavx
test Ubuntu x64 Checked jitx86hwintrinsicnoavx2
test Ubuntu x64 Checked jitx86hwintrinsicnosimd
test Ubuntu x64 Checked jitnox86hwintrinsic

test OSX10.12 x64 Checked jitincompletehwintrinsic
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx2
test OSX10.12 x64 Checked jitx86hwintrinsicnosimd
test OSX10.12 x64 Checked jitnox86hwintrinsic

@tannergooding tannergooding merged commit 76c9ccf into dotnet:master Feb 28, 2018
@tannergooding tannergooding deleted the hwintrin-internalRegDelayFree branch May 30, 2018 04:16

3 participants