Fix issue 16211 - Cyclic dependencies broken again #1602
MartinNowak merged 11 commits into dlang:master
Why was this closed? The cycle detection is still broken. |
|
@JackStouffer correct. This should now pass |
|
Apparently, GitHub auto-closed this when the auto-tester merged the Phobos PR. Note all the "auto"s – I'm totally innocent. ;P |
|
Yay, a Windows-specific cycle! I'll have to fix that first.
src/rt/minfo.d (outdated)
    {
        // save to the error message. Note that if we are throwing an
        // exception, we don't care about being careful with using
        // stack memory. Just use the GC/runtime.
Can you be sure that everything is already brought up at this point when using druntime as a shared library itself?
I expect that @MartinNowak already solved that issue when he changed the code to use module groups. I think cycle detection is run on each shared object's group, which can't have upstream dependencies.
Wait, I get what you are saying, the GC may not have been set up yet? In the old version, when it was decided to throw an Error with the cycle, the code used GC concatenation, so I assumed this was OK to use. Isn't the GC inside druntime, so I would think it is available by now?
Yes this will work. Modules are only initialized/sorted after rt_init, i.e. shared libraries aren't constructed in dso_registry until the runtime is initialized.
|
Seems like Win32 is still broken. My go-to codebase to get an idea about the impact of this on real-world code would usually be Weka's, but they have disabled cycle detection anyway since the issues with "uncontrollable" template emission were impractical for them to circumvent – they opted for disabling the checks entirely over basically making sure not to use static constructors with templates ever (this depends of course also on the dependencies between your modules – if you strictly avoid any cycles, working around any template issues would be manageable too). In other words, this definitely fixes a real bug (and should go in), but I'm curious how many more projects will discover issues with cycles now. |
I expect there will be some. But it may be fixing bugs in their code! However, based on how blunt a tool cycle detection is, it's likely not helping at all (most cycle detection problems are frivolous). This is going to be controversial, but it's still a bug fix. Don't know how to do it any better, open to ideas. |
I suspected so, because the comment was weird. I wonder how that happened? |
|
I'm concerned here -- this cycle doesn't seem to be Windows-specific. I need to investigate why it doesn't fail on OSX or the other platforms.
|
There appears to be a bug in dmd or druntime. On OSX, I can fix the cycle pretty easily I think. But we need to figure out how to reduce this other bug... |
|
Needs dlang/phobos#4571 |
|
Nuts. New shiny object to chase. I'm going to get a proper Windows test build going before going any further |
Maybe |
|
OK, I think dlang/phobos#4580 will make this pass. |
|
I'm a bit concerned about this going in the wrong direction. The cycle detection is already overly conservative and annoys people, because it forces them through weird hoops even though their code works fine. Just look through Bugzilla to see how many issues we have related to this topic.
|
There were two reasons to replace the old cycle detector. The code was very hard to maintain, and the algorithm had a fairly bad runtime (O(N^2) IIRC). We shouldn't go back to that if possible.
(Embedded code reference: lines 180 to 195 in 90bd014.)
That should definitely work, the only addition we need to make this work is accepting cycles between modules without dtors. This could either be done by the backtracking code that wants to print the cycle, or if that is too slow, by adding more state.
|
I had a realization last night -- imports in unit tests shouldn't play any role in cycles, because they cannot be called from static ctors. If we remove unit test imports from the cycle detection, then we can work around most of the problems I have fixed in Phobos. We still need to handle imports inside functions, because static ctors can call them. In terms of how complex the algorithm has to be, it just has to be. Otherwise the cycle detection is annoying and incorrect. The |
No, this is how it was before I fixed it in 2010. It doesn't find cycles that involve a non-ctor-containing node twice. Such as a -> b -> c -> b -> a, where a and c have ctors. You can't count the irrelevant nodes during cycle detection, they are just a way to get to the next relevant node. |
this would be awesome. |
I think we need to figure this out anyhow, an option to disable this being fatal was requested a couple of times (and can be supported via the -DRT- options, e.g.
That's a tricky question and I haven't yet thought long enough about this to give an answer. My intuition says we should run ddmd.traits first because it's at the bottom of the dependency tree (before the cycle ascends back up). Though a much simpler resolution would be to mark the end of the cycle
That's only because the module order happened to put ddmd.cond first in the list of modules :) If it was the other way around, the cycle would be traits -> cond -> traits. I wonder if it would be worth sorting the module names first to create a deterministic ordering, even in the event of cycles. We are already putting the names into a hash table, instead if we just put it in a sorted list, and then use binary search to take the place of the hashing, we could get similar performance and more determinism. The fundamental problem here is that the cycle detection means there is a real determinism problem, where there is no right answer. All we can do is run the constructors in whichever order we pick (I think from an algorithm standpoint, it makes sense to print the cycle and prune that branch as normal, so in the cycle above ddmd.traits would run first), and hope that it will not create issues. We for sure need a switch to enable this behavior. |
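The sorted-list idea in the comment above can be illustrated with a small model. This is only a sketch in Python, not druntime code: `build_index` and `find_module` are hypothetical helper names, and `app.main` is a made-up module name added next to the two from the thread. It shows that sorting the names once gives every module a deterministic index and that lookup by binary search needs no hash table.

```python
from bisect import bisect_left


def build_index(module_names):
    """Sort the names once; a module's position in the list is its index."""
    return sorted(module_names)


def find_module(sorted_names, name):
    """Binary search for name; return its index, or -1 if it is absent."""
    i = bisect_left(sorted_names, name)
    if i < len(sorted_names) and sorted_names[i] == name:
        return i
    return -1


# Whatever order the modules arrive in, the resulting indices are the same.
names = build_index(["ddmd.traits", "app.main", "ddmd.cond"])
```

Lookup is O(log N) per query instead of the expected O(1) of a hash table, and the up-front sort is O(N log N); that sort is the extra startup cost the reply below objects to.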
Not too keen on presorting the modules: it adds even more complexity, startup time, and code to every binary, and we already print the cycle, so people know that something doesn't work as expected.
|
Don't want to push, but we should include such a switch in 2.072.0. |
I will try to get to it by tomorrow. |
Obviously didn't get there, I've been crazy busy. I am traveling today, and will not get to this until at least Saturday, and if not, it will be next week. If you start releasing betas, I'll put it into stable branch. |
|
Let's use |
|
I had settled on this scheme: I can switch easily to your scheme. Almost done with the PR. |
|
@MartinNowak having issues building stable branch of dmd: Any ideas? I've tried several versions of dmd, including 2.069 and 2.071 |
Looks great, let's not start with camel cases for command lines and stick to
You don't need to build dmd-stable yourself, do you? |
|
@MartinNowak I created the PR: #1668. I didn't see your message about downloading dmd stable. I'll try that. |
|
Unfortunately this change broke more projects than initially tested, at least Higgs, vibe.d/libasync, and an intermediate state of ddmd. I'd expect that it hit a lot more projects not tested by us.
|
I may be able to "run" the old algorithm, or at least adapt the new algorithm to run like the old one did, if errors are seen. Alternatively, we can resurrect the entire original code, and just call it if the corrected algorithm fails. It's an unfortunate perception that this is "breaking" projects -- those projects are already broken in terms of having a cycle. Whether the cycle is harmless or not, the truth is that the compiler doesn't give us enough information to determine whether it's truly broken or not. |
No, that's a fallacy. You could say "broken" about a lot of things that are written in ways that under some circumstances can end up being buggy, but that's just not a useful definition. Relying on that makes for an unusable language because each version makes most existing code unusable as-is. If a piece of software has been tested, known to work correctly for all foreseeable inputs and circumstances, and has been successfully put in production, it is not broken, it's working as expected. A compiler update that causes correct software to stop working is broken. I don't want to start a debate, but this is an argument that has been brought up a few times. D needs to be useful foremost, facilitating creation of correct programs is just one side of that. |
|
@CyberShadow There is no need to debate, we definitely can go through a deprecation stage. D claims to handle the proper initialization of globals via dependency sorting at runtime. The sorting was broken. The answer to running the program should have been (and is now), "I don't know how to properly initialize things". Essentially, we can embrace undefined behavior or we can reject it. Which is more useful to a user? In my experience a broken compiler that spits out UB is far worse than one that doesn't compile (or in this case run) my broken code. The most unfortunate thing in this whole mess is the fact that cycles are left for runtime with minimal information (the import dependencies) to go on. It's really a problem that should be solved by the compiler/linker.
Yes - but only for new programs. Making working software stop working is the problem. This is why we can't reinvent things every year and are likely stuck with the http://wiki.dlang.org/Language_issues list forever. |
|
We can agree to disagree. I personally don't want my programs blowing up at some future date due to a foreseeable initialization ambiguity. Others might be OK with that I guess. This is not a reinvention. I fixed this problem in 2010, and it resurfaced within a year. I should have put a test in for it at the time, we are much better about that now. In any case, we can stop the pointless philosophical discussion and just put in the deprecation mechanism.
Just to clarify, this is not just about a debate of personal preferences. It became clear a long time ago that the point where we can break working code willy-nilly in D is long past. As such, this is not negotiable: we all must take code breakage seriously. A method of notifying the user without actually breaking their code is the correct solution here. For this we usually have deprecation paths, or in this particular case, a message printed on startup is good too. |
|
Again, not willy-nilly. The compiler was generating code that creates undefined behavior. I think we can all agree that any fix to any broken code of this nature has the chance to "break" code that depended on the brokenness that's being fixed. It's a question of where we draw the line. This is why it's not a winnable argument either way. Either you have code that crashes the mars lander, or you have fixes that set the project back months. Which is worse depends on what your job is. The issues surrounding this fix are very woolly, and we have a path forward, let's just implement it. |
@schveiguy - what information would be useful to you? |
|
Well, an actual dependency map for the ctors. Right now you have things that are listed as dependencies simply because they are imported somewhere in the module. But the compiler knows which variables depend on which other modules because it can see what functions are called in the ctor. |
|
With that, the cycle detection would be closer to reality. Most cycles are harmless. |
Well, it is just an array of imported modules, which may not be the best indicator of dependencies between them. I can't think of any technical blockers that would prevent adding new information about which imported modules are used in the construction/destruction phases; it could even be an int that just masks which indexes in the imported modules array are used by the ctor.
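To make the bitmask suggestion concrete, here is a rough Python model (not compiler or druntime code). The function names and the `used_by_ctor` set are assumptions for illustration; the thread does not say how the compiler would actually compute which imports a ctor uses.

```python
def used_import_mask(imports, used_by_ctor):
    """Set bit i when imports[i] is actually used by the module's ctor."""
    mask = 0
    for i, name in enumerate(imports):
        if name in used_by_ctor:
            mask |= 1 << i
    return mask


def ctor_dependencies(imports, mask):
    """Recover the real ctor dependencies from the import list and mask."""
    return [name for i, name in enumerate(imports) if (mask >> i) & 1]
```

With such a mask, the runtime sort would only need to follow edges whose bit is set, so imports that are never touched during construction would no longer contribute to reported cycles.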
|
All the ideas and thoughts on how to improve this are fairly spread over multiple Bugzilla tickets, forum entries, and GitHub comments. Can we please aggregate all of this in a single Bugzilla ticket (or don't we already have one)? One related idea was that ctors only using private variables can be marked as
- only print a deprecation warning for the fixed cycle detection cases (cycles over modules without ctor/dtor)
|
See #1717 |
Note: you may want to look just at the first commit. I left the old algorithm there to prevent the diff from comparing two unrelated functions. The second commit removes the old algorithm from the source.
No longer the case: there are a lot of commits, and the code has changed significantly since the first commit.
The new algorithm finds cycles that have an intermediate module in the cycle twice. For example:
a -> b -> c -> b -> a, where only a and c have module ctors, but b does not. The old algorithm would short-circuit the evaluation at b the second time, thinking it was already visited. This unit test has been added.
Essentially, the real graph of modules is between only the modules with static ctors (relevant modules), and the "edges" are really which other relevant modules can be reached. The fact that you have to go through non-relevant modules isn't important, and the non-relevant modules can be used more than once. For this reason, the algorithm needs to be very different from a straight cycle detection algorithm.
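The "relevant modules" view described above can be sketched as follows. This is a simplified Python model, not the code in src/rt/minfo.d: `relevant_edges` and `has_cycle` are hypothetical names, dtors and shared ctors are ignored, and a module reaching itself through ctor-less modules is assumed not to count as a cycle.

```python
def relevant_edges(imports, has_ctor):
    """For each module with a ctor, collect the ctor modules it can reach
    by passing only through modules without ctors."""
    edges = {}
    for start in imports:
        if not has_ctor[start]:
            continue
        # 'seen' is local to this walk, so a ctor-less module such as b
        # can be traversed again when walking from another start module.
        reached, seen, stack = set(), {start}, list(imports[start])
        while stack:
            mod = stack.pop()
            if mod in seen:
                continue
            seen.add(mod)
            if has_ctor[mod]:
                reached.add(mod)            # relevant: record it and stop
            else:
                stack.extend(imports[mod])  # irrelevant: pass through
        edges[start] = reached
    return edges


def has_cycle(edges):
    """Ordinary three-color DFS on the reduced graph of relevant modules."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {m: WHITE for m in edges}

    def visit(m):
        color[m] = GREY
        for n in edges[m]:
            if color[n] == GREY or (color[n] == WHITE and visit(n)):
                return True
        color[m] = BLACK
        return False

    return any(color[m] == WHITE and visit(m) for m in edges)
```

For the example above (a imports b, b imports c and a, c imports b, ctors in a and c only), the reduced graph is a -> c and c -> a, so the cycle is found even though b appears on both legs.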
The new algorithm is a copy of the original algorithm I wrote in 2010, but enhanced to deal with the current module system.
I also implemented a crude Dijkstra search to try to print the shortest cycle possible given the two offending modules. Works rather well, actually.
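A minimal sketch of how such a shortest-cycle report could be produced, in Python rather than D. Note the substitution: since every import edge has the same cost, this sketch uses a plain breadth-first search, which finds the same shortest paths as Dijkstra on an unweighted graph. `shortest_path` and `shortest_cycle` are hypothetical names, and the PR's actual implementation may differ.

```python
from collections import deque


def shortest_path(imports, src, dst):
    """Breadth-first search; returns the shortest import chain src..dst."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        mod = queue.popleft()
        if mod == dst:
            path = []
            while mod is not None:  # walk predecessors back to src
                path.append(mod)
                mod = prev[mod]
            return path[::-1]
        for nxt in imports.get(mod, ()):
            if nxt not in prev:
                prev[nxt] = mod
                queue.append(nxt)
    return None  # dst is not reachable from src


def shortest_cycle(imports, first, second):
    """Join the shortest chain first -> second with second -> first."""
    out = shortest_path(imports, first, second)
    back = shortest_path(imports, second, first)
    if out is None or back is None:
        return None
    return out + back[1:]  # drop the duplicated 'second'
```

Given the two offending modules a and c from the example, this yields the chain a, b, c, b, a, which is the kind of cycle listing the error message is meant to print.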
Unfortunately, there is still an ugly wart of a linear search whenever a module index has to be looked up, because the module info is now all immutable (previous version stored the index inside the module info). Also note that a module can import another module multiple times (see https://issues.dlang.org/show_bug.cgi?id=16208), so I try to minimize the looking up, but I can't avoid it there without allocating an edge map.
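For comparison, the edge-map alternative mentioned above might look like this in a Python model. It is only an illustration of the trade-off, not what the PR does: `build_edge_map` is a hypothetical name, and the model uses a dictionary for the one-time name lookup, whereas the runtime would have to allocate its own table.

```python
def build_edge_map(modules, imports):
    """Resolve every import to a module index once, dropping duplicates."""
    index_of = {name: i for i, name in enumerate(modules)}
    edge_map = []
    for name in modules:
        seen, row = set(), []
        for dep in imports.get(name, ()):
            j = index_of[dep]
            if j not in seen:  # a module may import another more than once
                seen.add(j)
                row.append(j)
        edge_map.append(row)
    return edge_map
```

After this single pass, the sort can follow integer indices directly instead of searching for each imported module again, at the price of the extra allocation.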
Finally, I added a bit better mechanism for figuring out which unit test fails when doing module sorting, because it's all done within a wrapper (and the stack trace doesn't always show line numbers).
pinging @MartinNowak since you are most familiar with this code (and wrote the old version).
This requires dlang/phobos#4493
Note: even though this is a regression fix, this will likely cause code breakage outside phobos, which is why I did not target stable.