-
Notifications
You must be signed in to change notification settings - Fork 847
[Experiment/Discussion] Erase List.iter #3662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@0x53A Interesting idea. Kinda sneaky :-) The debugability is a general issue with the arrow operators. We should probably spend some thought on making that more debugable. |
Agree, but the arrow is not the cause in this case. Rather it is because the call to List.iter and then the subsequent call into the callback means that the outer variables and the inner variables are on different stackframes. You would get the same annoyance with Also one clarification (I also edited the op): Not all instances of So
let f s cb =
s |> List.iter cb=> should not be rewritten, instead kept-as-is. The current code matches on |
|
I'm not in favour of hardwiring things like this into the compiler - there is no end to the sort of things like this we could do and doing this sort of thing inevitably causes a tail of quite severe bugs. If we have a problem like debuggability or performance, we should solve it more generally through a set of techniques that can also apply to user-written code. |
Note we do this via inlining, which can also apply to user code
inlining should help here. And if not we should work out if there is a general scheme where closures should capture more of their environment in Debug code order to improve debugging (at the cost of possible space leaks - but that occurs already with tailcalls off in Debug code) |
|
So it may be enough to just mark *.iter as inline? Will try that in the evening ...
Yeah. that would also vastly improve
I agree, but I see this as a good thing - the exact same user-code will become faster and faster with each optimization we add. That doesn't mean we shouldn't start somewhere, or that we should only add it as one big package.
Ignoring this PR, and looking more at forkis, I think the Seq.map rewriting step is relatively clear cut. It is small and easily reviewable. Sure, there may always be bugs, but that is imo just a fact of programming. The more complex the rewriting, the higher the chance a bug gets overlooked, but I don't think that there is such a big difference compared to the other optimization passes the compiler already does. The big difference is that such optimizations are only possible in the compiler, because only it has enough semantic information. The JIT doesn't know the difference between Seq and List, but fsc does. |
Yes, it's just that I'm not in favour of embedding optimizations into the compiler for specific functions in FSharp.Core, except in very rare circumstances. Optimizations should always implement a general scheme and apply to the equivalent user-written code/types. There are exceptions - notably the sequence expression state machine rewriting. However ideally that would also be generalized to user-defined control types, though that is hard to achieve |
|
But that is not possible. The compiler doesn't know the semantics of The only way to enable equivalent optimizations for user-code would be to expose the optimizer as an api, and allow the user to register custom optimization passes. |
Yes that is really a nice goal to have.
That should have it's own language suggestion (if there isn't) and pushed forward. With this we could already use @forki's fuse implementation in the real world and it might have pushed other people into trying some optimizations for their particular code-style/application. Once stable&fast things can still move to the core compiler. This would be like Type Providers/Computation Expressions a good "enabler" feature for the community.
Yes this is IMHO the way forward here. The problem is that there shouldn't be a need to optimize the code-gen for the debugger. The debugger should work properly or nicely no matter the code I write (and as @dsyme said I could use my own user-code for some reason). We should look if we should "inject" information about locals from higher call stacks into such local captures? Can somebody tell if this is technically feasible? This is at least MHO about this. |
|
@0x53A What you are looking for is staging in F#. It would be really good to have staging support in F# as it would provide custom specialization / optimization for almost everything. Unfortunately F# nowhere near what is available in Scala LMS or in OCaml. Stream fusion (Oleg Kiselyov, Aggelos Biboudis, Nick Palladinos, Yannis Smaragdakis: Stream Fusion, to Completeness: https://strymonas.github.io/ ) F# implementation was blocked due the lack of staging infrastructure in F#. Take a look at these papers and presentations, what code specialization / optimization could provide (usually multiple orders of magnitude performance improvement): https://scala-lms.github.io//publications.html Lightweight Modular Staging and Embedded Compilers: Abstraction Without Regret for High-Level High-Performance Programming. Tiark Rompf. PhD Thesis, LAMP, EPFL, 2012 http://infoscience.epfl.ch/record/180642/files/EPFL_TH5456.pdf Flare: Native Compilation for Heterogeneous Workloads in Apache Spark |
|
Well, trying it out was fun, but I have accepted that this won't be accepted ;-) May be continued in #3670 @zpodlovics Yes, staging would push all this on a different level. I doubt F# will ever have anything like it (but please surprise me!), so I still think that hardcoding a few optimizations (cough Seq.map cough) directly in the compiler would make sense. |
|
Like I said I think it is quite valuable to extend the compiler in this direction. Please don't just throw it away and push further.
Of course not if nobody does it or at least pushes it forward ;) @dsyme Would a language suggestion in that direction be considered? |
EDIT: similar behavior can be gained by just marking *.iter as
inline, I will create a second pr with performance / binary size comparisons in the evening.I propose we erase all instances of [Seq/Array/List].iter into a boring foreach loop.
Why?
debuggability
s |> List.iter (fun i -> ...)versusfor i in s do ...have equal semantics, but vastly different debuggability.In the first case, you can only see the loop-local variables in the debugger (
iin this case), in the second, you can debug like normal.For a slightly more complicated example compare for loop vs List.iter
performance
I doubt it would make much of a difference, but eliminating the repeated callback invokes, and possibly removing closures could slightly improve performance. Hoisting the callback into the outer function may (or may not) enable other optimizations to take hold.
Precedent:
My first thought was that this would surely not be accepted. But there is actually already something similar done: All calls to
Array.mapare hoisted into the calling function and converted to a for-loop, which iterates over the source array and writes into the copy. In the compiled code, the lambda is erased, similar to the goal of this pr.State of this PR:
This PR rewrites only List.iter into a for-loop. It seems to work well, for what it is.
I actually don't think implementing this cleanly would take too much work, but please don't look too hard at the current code. For example, at the moment I simply ignore all sequence points, where as they would need to be properly converted.
Future
Should you agree with me that such transformations are beneficial, then I am sure we can come up with some more. I would also ask to to reconsider accepting forkis PR.
I did not create a lang-suggestion / rfc, because this does not change the semantics of the code. It is purely a ergonomic and performance optimization.
EDIT: clarification:
Not all instances of
List.iterare rewritten, but only the ones where you directly pass a lambda into it.So
s |> List.iter (fun i -> ...)=> rewritten tofor i in s do ...List.iter (fun i -> ...) s=> rewritten tofor i in s do ...=> should not be rewritten, instead kept-as-is.
The current code matches on
Apply(List.iter, Lambda), so it should not match if you pass it a function variable instead of a lambda.