Menhir is superior to ocamlyacc on almost every point, so it may be worthwhile, at some point in the future, to consider using it as the parser generator for building OCaml.
Of course, this comes with a bootstrap problem that I have no idea how to address, so I am not proposing to switch the parser now.
However, a second problem comes from differences in the grammar syntax and in the relevant APIs.
Menhir introduces the $startpos, $endpos, $startpos($id|ident), etc. keywords to refer to the positions of grammatical items in the production being reduced.
In contrast, ocamlyacc relies on the user manually querying the Parsing module to fetch the positions. This depends on shared global state, which is incompatible with Menhir's approach.
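To make the contrast concrete, here is a minimal sketch of the same production written both ways. The two rules belong in separate grammar files; `mk_paren` is a hypothetical AST constructor used only for illustration, while `Parsing.rhs_start_pos` and `Parsing.rhs_end_pos` are the standard library functions that ocamlyacc users call today.

```ocaml
/* ocamlyacc style: positions are read from the global state of the
   Parsing module, indexed by the position of the item in the rule. */
expr:
  | LPAREN expr RPAREN
      { mk_paren $2 (Parsing.rhs_start_pos 1) (Parsing.rhs_end_pos 3) }
;

/* Menhir style: positions are named by keywords, so the action does not
   depend on the Parsing module. mk_paren is hypothetical. */
expr:
  | LPAREN e = expr RPAREN
      { mk_paren e $startpos($1) $endpos($3) }
;
```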
A first step
This pull request extends ocamlyacc to support a few Menhir features, either to help a potential migration or simply to allow a grammar to be shared between users of the two parser generators.
Sugar
Explicit names bound to RHS values
expr:
| LPAREN e = expr RPAREN { e }
This is now accepted, and the name e is bound to $2 in the action.
Remove "=" as a valid character to enter an action
expr:
| LPAREN expr RPAREN = $2 }
This was valid ocamlyacc code. Who would want to use that?! And of course, it is incompatible with the previous feature.
Allow OCaml-style comments in the grammar
Until now, ocamlyacc only supported C-style /* ... */ comments in the grammar. (Actions can of course embed OCaml-style comments.)
Nested (* ... (* ... *) ... *) comments are now supported, with limited support for strings inside comments: escaped characters are simply skipped, and the "tagged" string literals introduced in the OCaml 4.02 lexer are not supported.
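The comment handling described above can be sketched in OCaml as follows. This is an illustration of the technique only, not the code of this pull request (ocamlyacc itself is written in C); the function name `skip_comment` is hypothetical, and like the feature described above it skips escaped characters inside strings and ignores "tagged" string literals. Character literals are not handled either.

```ocaml
(* Hypothetical sketch: [skip_comment s start] expects "(*" at index
   [start] and returns the index just past the matching "*)". *)
let skip_comment (s : string) (start : int) : int =
  let n = String.length s in
  if start + 1 >= n || s.[start] <> '(' || s.[start + 1] <> '*' then
    invalid_arg "skip_comment: no comment opener at this index";
  (* Skip a string literal; [i] is the index just after the opening quote. *)
  let rec skip_string i =
    if i >= n then failwith "unterminated string in comment"
    else
      match s.[i] with
      | '"' -> i + 1
      | '\\' -> skip_string (i + 2) (* escaped character: just skip it *)
      | _ -> skip_string (i + 1)
  in
  (* [depth] counts the comments currently open. *)
  let rec go depth i =
    if i >= n then failwith "unterminated comment"
    else if i + 1 < n && s.[i] = '(' && s.[i + 1] = '*' then
      go (depth + 1) (i + 2)
    else if i + 1 < n && s.[i] = '*' && s.[i + 1] = ')' then
      if depth = 1 then i + 2 else go (depth - 1) (i + 2)
    else if s.[i] = '"' then go depth (skip_string (i + 1))
    else go depth (i + 1)
  in
  go 0 start
```

For example, `skip_comment "(* a (* b *) c *) rest" 0` returns 17, the index just after the outer comment. A `*)` that appears inside a string literal does not close the comment.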
Enable %start <type> TERM
The following code:
%start main
%type <Ast.t> main
Can now be written:
%start <Ast.t> main
Bridging the gap
Most Menhir keywords can be used.
$startpos, $endpos, $startpos($id|ident), $startofs($id|ident), $endpos($id|ident) and $endofs($id|ident) are bound to the equivalent call to Parsing.<…>.
$syntaxerror is equivalent to raise Parsing.Parse_error.
$previouserror fails at compile time because there is no way, AFAIK, to emulate this feature.
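As a small illustration of the $syntaxerror equivalence, the sketch below raises `Parsing.Parse_error` (a real exception of the standard Parsing module) from a helper of the kind a semantic action might call. The names `check_nonempty` and `describe` are hypothetical. Inside a generated parser the exception is handled by the parsing engine, so the explicit `try` here only stands in for that engine.

```ocaml
(* Hypothetical helper: reject an empty list the way $syntaxerror would,
   by raising Parsing.Parse_error. *)
let check_nonempty (items : 'a list) : 'a list =
  match items with
  | [] -> raise Parsing.Parse_error
  | _ -> items

(* Stand-in for the parser engine, which normally handles the exception. *)
let describe (items : 'a list) : string =
  try Printf.sprintf "%d item(s)" (List.length (check_nonempty items))
  with Parsing.Parse_error -> "syntax error"
```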
P.-S.
In case of failure, these features try as much as possible to print a relevant error or warning message to the user.
And finally, having a grammar at the intersection of this ocamlyacc and Menhir will also greatly help Merlin in supporting new versions of the grammar :).
I find it a bit strange to modify ocamlyacc to support more features rather than trying to extend Menhir (probably under control of a command-line option) to support legacy grammars.
Original bug ID: 6369
Reporter: @gasche
Status: acknowledged (set by @damiendoligez on 2014-07-16T14:08:13Z)
Resolution: open
Priority: normal
Severity: feature
Version: 4.02.0+dev
Category: tools (ocaml{lex,yacc,dep,debug,...})
Tags: github, patch
Monitored by: @jmeber
Bug description
#33
Menhir vs Yacc