Skip to content

Comments

fix issue 18315 - wrong code for int.min > 0#7841

Merged
MartinNowak merged 2 commits intodlang:stablefrom
aG0aep6G:18315
Feb 7, 2018
Merged

fix issue 18315 - wrong code for int.min > 0#7841
MartinNowak merged 2 commits intodlang:stablefrom
aG0aep6G:18315

Conversation

@aG0aep6G
Copy link
Contributor

@aG0aep6G aG0aep6G commented Feb 4, 2018

Removing the wrong optimization, falling back to cmp instruction.


I have no idea what I'm doing when it comes to dmd code, and this seems like a place where one can introduce subtle bugs. Please review carefully.

@dlang-bot
Copy link
Contributor

Thanks for your pull request, @aG0aep6G! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.

Some tips to help speed things up:

  • smaller, focused PRs are easier to review than big ones

  • try not to mix up refactoring or style changes with bug fixes or feature enhancements

  • provide helpful commit messages explaining the rationale behind each change

Bear in mind that large or tricky changes may require multiple rounds of review and revision.

Please see CONTRIBUTING.md for more information.

Bugzilla references

Auto-close Bugzilla Description
18315 wrong code for i > 0

@timotheecour
Copy link
Contributor

I think a benchmark showing effect on performance would be nice to know how much this affects dmd.
Since there was no bug with -O (and ldc2 isn't affected) my guess is shouldn't affect much.

@aG0aep6G
Copy link
Contributor Author

aG0aep6G commented Feb 5, 2018

I think a benchmark showing effect on performance would be nice to know how much this affects dmd.

For this D code:

bool f(int i) { return i > 0; }

dmd 2.078.0 generates this:

   0:   push   rbp
   1:   mov    rbp,rsp
   4:   mov    rax,rdi
   7:   neg    eax
   9:   shr    eax,0x1f
   c:   pop    rbp
   d:   ret

with this PR:

   0:   push   rbp
   1:   mov    rbp,rsp
   4:   sub    rsp,0x10
   8:   mov    DWORD PTR [rbp-0x8],edi
   b:   cmp    DWORD PTR [rbp-0x8],0x0
   f:   setg   al
  12:   mov    rsp,rbp
  15:   pop    rbp
  16:   ret

LDC:

   0:   test   edi,edi
   2:   setg   al
   5:   ret

(All with -O.)

I'm not sure if a fair comparison includes the prologue/epilogue in this case. Anyway, here are some times with function calls:

# Compile old version.
echo 'bool f_old(int i) { return i > 0; }' | dmd2.078 - -c -ofold.o -O -betterC
# Compile new version.
echo 'bool f_new(int i) { return i > 0; }' | dmd2.git - -c -ofnew.o -O -betterC
# Try LDC too, for reference.
echo 'bool f_ldc(int i) { return i > 0; }' | ldc2 - -c -ofldc.o -O -betterC
# Compile timing code, link, run.
dmd -O old.o new.o ldc.o -run - <<EOF
bool f_old(int);
bool f_new(int);
bool f_ldc(int);
void main()
{
    import std.conv;
    import std.datetime;
    import std.datetime.stopwatch: benchmark; /* Really? */
    import std.stdio;
    auto ts = benchmark!(
        () { f_old(int.min); },
        () { f_new(int.min); },
        () { f_ldc(int.min); },
    )(10_000_000);
    writeln("old: ", ts[0].to!Duration);
    writeln("new: ", ts[1].to!Duration);
    writeln("LDC: ", ts[2].to!Duration);
}
EOF

Typical output:

old: 28 ms, 140 μs, and 2 hnsecs
new: 31 ms, 107 μs, and 2 hnsecs
LDC: 23 ms, 451 μs, and 6 hnsecs

Since there was no bug with -O (and ldc2 isn't affected) my guess is shouldn't affect much.

The bug exists with -O. It's just a bit harder to trigger:

void main(string[] args)
{
    int i = args.length != 0 ? int.min : 1;
    bool* b = new bool(i > 0);
    if (*b) assert(false);
}

@schveiguy
Copy link
Member

IMO, the performance loss is irrelevant (well, if it was like 10x slower, maybe we could work on it a bit). A compiler should never emit code that doesn't do what the source is telling it to do.

@wilzbach wilzbach added this to the 2.079.0 milestone Feb 5, 2018
Copy link
Contributor

@wilzbach wilzbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot. I absolutely agree that DMD shouldn't emit the optimization until its bugs have been fixed.

@wilzbach
Copy link
Contributor

wilzbach commented Feb 5, 2018

(The Travis failure can be safely ignored - we have disabled it for master already)

@WalterBright
Copy link
Member

Don't pull this until I have a chance to work on it.

// lower INT_MIN by 1? See test exe9.c
// BUG: fix later
code_orflag(cdb.last(), CFpsw);
cdb.genc2(0x81,grex | modregrmx(3,3,reg),0); // SBB reg,0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why this section was #if'd only for Windows. The SBB instruction makes it work. No need to disable the optimization.

/*NOTE: OPgt is left out because there is currently no
optimization implemented for it. There used to be one but it
was wrong.
https://issues.dlang.org/show_bug.cgi?id=18315 */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to change this.

assert(!b);
b = 0 < i;
assert(!b);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please just add this test case to test/runnable/mars1.d, instead of putting it in its own file. The autotester runs faster with fewer files.

`SBB reg,0` is needed to make this work. There's nothing platform specific
about it.

Also add comments explaining how this works.
@aG0aep6G
Copy link
Contributor Author

aG0aep6G commented Feb 6, 2018

Followed @WalterBright's advice and fixed the optimization by enabling the SBB instruction. I've also added some comments to explain how all this works. Also moved the test to mars1.d.

@MartinNowak
Copy link
Member

Why don't we just implement the test Xdi, Xdi; setg al solution?
https://godbolt.org/g/8gCBrV

@MartinNowak MartinNowak merged commit 4ccde91 into dlang:stable Feb 7, 2018
@MartinNowak MartinNowak changed the title fix issue 18315 - wrong code for i > 0 fix issue 18315 - wrong code for int.min > 0 Feb 7, 2018
@aG0aep6G aG0aep6G deleted the 18315 branch February 7, 2018 17:48
@WalterBright
Copy link
Member

Why don't we just implement the test Xdi, Xdi; setg al solution?

Great question.

  1. The fix here is simple and not risky, and it gets the bug fixed.
  2. The code gen still supports processors that don't have the setg instruction.
  3. I plan to support setg when I have some time to spend on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants