This site is updated infrequently. For up-to-date information, please visit the new OCaml website at ocaml.org.

Parsing
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
 Date: 2007-06-22 (10:15) From: Bruno De Fraine Subject: Re: [Caml-list] Parsing
```On 22 Jun 2007, at 01:14, Jon Harrop wrote:

> However, you must factor heads in the pattern matches. So instead of:
>
>   and parse_expr = parser
>     | [< e1=parse_factor; 'Kwd "+"; e2=parse_expr >] -> e1 +: e2
>     | [< e1=parse_factor; 'Kwd "-"; e2=parse_expr >] -> e1 -: e2
>     | [< e1=parse_factor  >] -> e1
>
> you must write:
>
>   and parse_expr = parser
>     | [< e1=parse_factor; stream >] ->
>         (parser
>          | [< 'Kwd "+"; e2=parse_expr >] -> e1 +: e2
>          | [< 'Kwd "-"; e2=parse_expr >] -> e1 -: e2
>          | [< >] -> e1) stream

Actually you must write something like:

and parse_expr = parser
| [< e1=parse_factor; stream >] ->
let rec aux e1 = parser
| [< 'Kwd "+"; e2=parse_factor; stream >] -> aux (e1 +: e2)
stream
| [< 'Kwd "-"; e2=parse_factor; stream >] -> aux (e1 -: e2)
stream
| [< >] -> e1
in aux e1 stream

Although you get the precedences right by organizing different levels
for expr/factor/atom, you've made the operations right associative by
greedily parsing "e2" as an expression. "+" and "-" should be left
associative, unless you want 1-2+3 to evaluate to -4.

> Why is this? Is it because the streams are destructive? Could the
> camlp4 macro
> not factor the pattern match for me?

camlp4's extensible grammar system can handle levels and
http://caml.inria.fr/pub/docs/tutorial-camlp4/tutorial003.html

This is your example in the new camlp4:

\$ ocaml camlp4of.cma
Objective Caml version 3.10.0

Camlp4 Parsing version 3.10.0

# type expr = Num of int | Add of expr * expr | Sub of expr * expr |
Mult of expr * expr ;;
type expr =
Num of int
| Add of expr * expr
| Sub of expr * expr
| Mult of expr * expr
# let (+:) e1 e2 = Add(e1,e2) ;;
val ( +: ) : expr -> expr -> expr = <fun>
# let (-:) e1 e2 = Sub(e1,e2) ;;
val ( -: ) : expr -> expr -> expr = <fun>
# let ( *: ) e1 e2 = Mult(e1,e2) ;;
val ( *: ) : expr -> expr -> expr = <fun>
# open Camlp4.PreCast ;;
# let expr = Gram.Entry.mk "expr" ;;
val expr : '_a Camlp4.PreCast.Gram.Entry.t = <abstr>
# EXTEND Gram
expr:
[ e1 = expr; "+"; e2 = expr -> e1 +: e2
| e1 = expr; "-"; e2 = expr -> e1 -: e2 ]
| "mult" LEFTA
[ e1 = expr; "*"; e2 = expr -> e1 *: e2 ]
| "simple" NONA
[ n = INT -> Num(int_of_string n)
| "("; e = expr; ")" -> e ] ]
;
END ;;
- : unit = ()
# Gram.parse expr Loc.ghost (Stream.of_string "1-2+3*4") ;;
- : expr = Add (Sub (Num 1, Num 2), Mult (Num 3, Num 4))

Another reason to use camlp4 Grammars is that you get automatic
syntax error descriptions. However, this seems to contain bugs in the
current camlp4:

# Gram.parse expr Loc.ghost (Stream.of_string "1+2*(3+)-4") ;;
- : expr = Sub (Add (Num 1, Mult (Num 2, Num 3)), Num 4)

(This should not be accepted. It used to give: Stream.Error "[expr]
expected after '+' (in [expr])")

# Gram.parse expr Loc.ghost (Stream.of_string "1+2*(3+!)-4") ;;
Exception:
Loc.Exc_located (<abstr>,
Stream.Error "[expr] expected after \"(\" (in [expr])").

(This is wrong, because there is a legal expression after "(", namely
"3".)

Regards,
Bruno

--
Bruno De Fraine
Vrije Universiteit Brussel
Faculty of Applied Sciences, DINF - SSEL
Room 4K208, Pleinlaan 2, B-1050 Brussels
tel: +32 (0)2 629 29 75
fax: +32 (0)2 629 28 70
e-mail: Bruno.De.Fraine@vub.ac.be

```