Conversation
- update vendored parsers to mirror upstream at 5.4: * introduce locations for Longident.t components * distinguish (module M:S) and ((module M):(module S)) for expressions - support for new syntaxes: * bivariance * labelled tuples
eabfcee to
201949e
Compare
Julow
left a comment
There was a problem hiding this comment.
Sorry for the slow reply. This is awesome :) Thanks a lot!
| | BANG MINUS { [ mkvarinj "!" $loc($1); mkvarinj "-" $loc($2) ] } | ||
| | INFIXOP2 | ||
| { if ($1 = "+!") || ($1 = "-!") then [ mkvarinj $1 $sloc ] | ||
| { if $1 = "+!" || $1 = "-!" || $1 = "+-"|| $1 = "-+" |
There was a problem hiding this comment.
Do you think we could accept anything starting in +, - or ! here ? The standard parser will catch errors there anyway.
There was a problem hiding this comment.
From the parser point of view, we could accept any INFIXOP2. And I agree that it probably better to accept this when building the CST.
| in | ||
| let outer_wrap = has_attr && parens in | ||
| let inner_wrap = has_attr || parens in | ||
| let with_label (lbl, exp) = |
There was a problem hiding this comment.
Instead handling special cases for ( ~foo, ~(bar : t) ), etc.. do you think we could represent the concrete syntax in the parsetree ?
I mean something like type tuple_label = Tl_pun of string loc | Tl_constraint of string loc * core_type | Tl_normal of string loc
There was a problem hiding this comment.
Are case where people don't want to normalize ~x:x, ~y to ~x, ~y ? If yes, it is probably better to separate the two forms in the parsetree. Otherwise, I am not sure if this is that useful?
There was a problem hiding this comment.
I think it's useful to make the formatting code simpler and less buggy. Decoding the AST was a major source of bugs in the past.
We generally do the normalization by modifying the AST: https://github.com/ocaml-ppx/ocamlformat/blob/main/lib/Extended_ast.ml#L68
I'd be happy to do the changes if you don't have the time ?
There was a problem hiding this comment.
After experimenting a bit more, I agree that it feels better to not have a straight mapping of locations with punning. I will send my updated commits after some more tests.
There was a problem hiding this comment.
I know I am late, but my current opinion is that every token should be represent in our CST, with its location included. This is the only way to have a bunch of features, like proper comment formatting.
~x: (* blabla *) x, ~yis valid and should be allowed.
fced3c8 to
7cc3534
Compare
| - Support for OCaml 5.4 (#2717, #2720, @Julow, @Octachron) | ||
| OCamlformat now supports OCaml 5.4 syntax. | ||
| Module packing of the form `((module M) : (module S))` are no longer | ||
| rewritten to `(module M : S)` because these are now two different syntaxes. |
There was a problem hiding this comment.
You could mention that this in preparation of modular explicits.
|
I explain the diffs in CI (https://github.com/ocaml-ppx/ocamlformat/actions/runs/17588187443/job/49961988518?pr=2720) by a reduced number of errors (from 88 to 78 errors). Thank you for making this ! |
….28.1) CHANGES: ### Highlight - \* Support for OCaml 5.4 (ocaml-ppx/ocamlformat#2717, ocaml-ppx/ocamlformat#2720, ocaml-ppx/ocamlformat#2732, ocaml-ppx/ocamlformat#2733, ocaml-ppx/ocamlformat#2735, @Julow, @Octachron, @cod1r, @EmileTrotignon) OCamlformat now supports OCaml 5.4 syntax. Module packing of the form `((module M) : (module S))` are no longer rewritten to `(module M : S)` because these are now two different syntaxes. - \* Reduce indentation after `|> map (fun` (ocaml-ppx/ocamlformat#2694, @EmileTrotignon) Notably, the indentation no longer depends on the length of the infix operator, for example: ```ocaml (* before *) v |>>>>>> map (fun x -> x ) (* after *) v |>>>>>> map (fun x -> x ) ``` `@@ match` can now also be on one line. ### Added - Added option `module-indent` option (ocaml-ppx/ocamlformat#2711, @HPRIOR) to control the indentation of items within modules. This affects modules and signatures. For example, module-indent=4: ```ocaml module type M = sig type t val f : (string * int) list -> int end ``` - `exp-grouping=preserve` is now the default in `default` and `ocamlformat` profiles. This means that its now possible to use `begin ... end` without tweaking ocamlformat. (ocaml-ppx/ocamlformat#2716, @EmileTrotignon) ### Deprecated - Starting in this release, ocamlformat can use cmdliner >= 2.0.0. When that is the case, the tool no longer accepts unambiguous option names prefixes. For example, `--max-iter` is not accepted anymore, you have to pass the full option `--max-iters`. This does not apply to the keys in the `.ocamlformat` configuration files, which have always required the full name. See dbuenzli/cmdliner#200. (ocaml-ppx/ocamlformat#2680, @emillon) ### Changed - \* The formatting of infix extensions is now consistent with regular formatting by construction. This reduces indentation in `f @@ match%e` expressions to the level of indentation in `f @@ match`. Other unknown inconsistencies might also be fixed. (ocaml-ppx/ocamlformat#2676, @EmileTrotignon) - \* The spacing of infix attributes is now consistent across keywords. Every keyword but `begin` `function`, and `fun` had attributes stuck to the keyword: `match[@A]`, but `fun [@A]`. Now its also `fun[@A]`. (ocaml-ppx/ocamlformat#2676, @EmileTrotignon) - \* The formatting of`let a = b in fun ...` is now consistent with other contexts like `a ; fun ...`. A check for the syntax `let a = fun ... in ...` was made more precise. (ocaml-ppx/ocamlformat#2705, @EmileTrotignon) - \* `|> begin`, `~arg:begin`, `begin if`, `lazy begin`, `begin match`, `begin fun` and `map li begin fun` can now be printed on the same line, with one less indentation level for the body of the inner expression. (ocaml-ppx/ocamlformat#2664, ocaml-ppx/ocamlformat#2666, ocaml-ppx/ocamlformat#2671, ocaml-ppx/ocamlformat#2672, ocaml-ppx/ocamlformat#2681, ocaml-ppx/ocamlformat#2685, ocaml-ppx/ocamlformat#2693, @EmileTrotignon) For example : ```ocaml (* before *) begin fun x -> some code end (* after *) begin fun x -> some code end ``` - \* `break-struct=natural` now also applies to `sig ... end`. (ocaml-ppx/ocamlformat#2682, @EmileTrotignon) ### Fixed - Fixed `wrap-comments=true` not working with the janestreet profile (ocaml-ppx/ocamlformat#2645, @Julow) Asterisk-prefixed comments are also now formatted the same way as with the default profile. - Fixed `nested-match=align` not working with `match%ext` (ocaml-ppx/ocamlformat#2648, @EmileTrotignon) - Fixed the AST generated for bindings of the form `let pattern : type = function ...` (ocaml-ppx/ocamlformat#2651, @v-gb) - Print valid syntax for the corner case (1).a (ocaml-ppx/ocamlformat#2653, @v-gb) - `Ast_mapper.default_mapper` now iterates on the location of `in` in `let+ .. in ..` (ocaml-ppx/ocamlformat#2658, @v-gb) - Fix missing parentheses in `let+ (Cstr _) : _ = _` (ocaml-ppx/ocamlformat#2661, @Julow) This caused a crash as the generated code wasn't valid syntax. - Fix bad indentation of `let%ext { ...` (ocaml-ppx/ocamlformat#2663, @EmileTrotignon) with `dock-collection-brackets` enabled. - ocamlformat is now more robust when used as a library to print modified ASTs (ocaml-ppx/ocamlformat#2659, @v-gb) - Fix crash due to edge case with asterisk-prefixed comments (ocaml-ppx/ocamlformat#2674, @Julow) - Fix crash when formatting `mld` files that cannot be lexed as ocaml (e.g. containing LaTeX or C code) (ocaml-ppx/ocamlformat#2684, @emillon) - \* Fix double parens around module constraint in functor application : `module M = F ((A : T))` becomes `module M = F (A : T)`. (ocaml-ppx/ocamlformat#2678, @EmileTrotignon) - Fix misplaced `;;` due to interaction with floating doc comments. (ocaml-ppx/ocamlformat#2691, @EmileTrotignon) - The formatting of attributes of expression is now aware of the attributes infix or postix positions: `((fun [@A] x -> y) [@b])` is formatted without moving attributes. (ocaml-ppx/ocamlformat#2676, @EmileTrotignon) - `begin%e ... end` and `begin [@A] ... end` nodes are always preserved. (ocaml-ppx/ocamlformat#2676, @EmileTrotignon) - `begin end` syntax for `()` is now preserved. (ocaml-ppx/ocamlformat#2676, @EmileTrotignon) - Fix a crash on `type 'a t = A : 'a. {a: 'a} -> 'a t`. (ocaml-ppx/ocamlformat#2710, @EmileTrotignon) - Fix a crash where `type%e nonrec t = t` was formatted as `type nonrec%e t = t`, which is invalid syntax. (ocaml-ppx/ocamlformat#2712, @EmileTrotignon) - Fix commandline parsing being quadratic in the number of arguments (ocaml-ppx/ocamlformat#2724, @let-def) - \* Fix `;;` being added after a documentation comment (ocaml-ppx/ocamlformat#2683, @EmileTrotignon) This results in more `;;` being inserted, for example: ```ocaml (* before *) print_endline "foo" let a = 3 (* after *) print_endline "foo" ;; let a = 3 ``` - Fix dropped comment in `if then (* comment *) begin .. end` (ocaml-ppx/ocamlformat#2734, @Julow)
This PR updates the vendored parsers and related types to mirror the 5.4 version of the OCaml parser and adds a basic support for labeled tuples and bivariance.
The parser updates means that it is possible to locate comments between
Longident.tcomponent, and it is always possible to distinguish between(module M): (module S)at least for expressions.The support for labeled expressions is bit ad-hoc and hackish in this version due to the way that labeled tuple element interact with parenthesis, for instance inside:
and it could definitively be improved.