Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing problem #2957

Closed
vicuna opened this issue Sep 6, 2001 · 2 comments
Closed

Parsing problem #2957

vicuna opened this issue Sep 6, 2001 · 2 comments
Labels

Comments

@vicuna
Copy link

vicuna commented Sep 6, 2001

Original bug ID: 515
Reporter: administrator
Status: closed
Resolution: not a bug
Priority: normal
Severity: minor
Category: ~DO NOT USE (was: OCaml general)

Bug description

Hi,
I'm not sure if this is a bug or a misunderstanding on my part. I
translated the combinator parsing examples from John Harrison's online
text, which uses a pre-CSL version of Caml, into OCaml, and for combinator
symbols I used these operators

val ( +++ ) :
('a, 'b) parser_t -> ('a, 'c) parser_t -> ('a, 'b * 'c) parser_t
val ( >>> ) :
('a, 'b) parser_t -> ('b -> 'c) -> ('a, 'c) parser_t
val ( ||| ) :
('a, 'b) parser_t -> ('a, 'b) parser_t -> ('a, 'b) parser_t

with the expectation that the above is in descending precedence order,
i.e., that the following

(foo +++ bar +++ baz +++ quux
   >>> f0

||| name
>>> f1
||| numeral
>>> f2
||| tweedle +++ deedle +++ dum
>>> snd)

would behave as though parenthesized like

((foo +++ bar +++ baz +++ quux >>> f0))
|||
(name >>> f1)
|||
(numeral >>> f2)
|||
((tweedle +++ deedle +++ dum) >>> snd)

Apparently this isn't so since in the function atom that I've appended
below I get a type error when I leave it unparenthesized but it works
just fine with parens. I've commented out the buggy one (look for BUG).
Sorry if this is trivial, but I've stared at this for a while and I don't
see why it doesn't work without the parens. Just ocamlc syntaxbug.ml after
uncommenting the buggy version to test.

(* Begin appended file syntaxbug.ml *)

let ( << ) f g x = f (g x) (* or let (<<) f g = fun x -> f (g x) *)

type term = Var of string
| Const of string
| Fn of string * (term list);;

exception Parse_failure;;

(* parser_t since parser is a reserved word *)

type ('a, 'b) parser_t = 'a list -> 'b * 'a list;;

let ( ||| ) : ('a, 'b) parser_t -> ('a, 'b) parser_t -> ('a, 'b) parser_t =
fun p1 p2 input ->
try p1 input
with Parse_failure -> p2 input;;

let ( +++ ) : ('a, 'b) parser_t -> ('a, 'c) parser_t -> ('a, 'b * 'c) parser_t =
fun p1 p2 input ->
let result1,rest1 = p1 input in
let result2,rest2 = p2 rest1 in
(result1,result2),rest2;;

let rec many : ('a, 'b) parser_t -> ('a, 'b list) parser_t =
fun p input ->
try let result,next = p input in
let results,rest = many p next in
(result::results),rest
with Parse_failure -> [],input;;

let ( >>> ) : ('a, 'b) parser_t -> ('b -> 'c) -> ('a, 'c) parser_t =
fun p f input ->
let result,rest = p input in
f result,rest;;

let some : ('a -> bool) -> 'a list -> 'a * 'a list =
fun p l -> match l with
[] -> raise Parse_failure
| (x::xs) -> if p x then (x,xs) else raise Parse_failure;;

let same_as : 'a -> ('a, 'a) parser_t =
fun tok -> some (fun item -> item = tok);;

let finished : ('a,int) parser_t = fun input ->
if input = [] then 0,input else raise Parse_failure;;

type lexeme = Name of string | Num of string | Other of string;;

let name =
function (Name s::rest) -> s,rest
| _ -> raise Parse_failure;;

let numeral =
function (Num s::rest) -> s,rest
| _ -> raise Parse_failure;;

let other =
function (Other s::rest) ->s,rest
| _ -> raise Parse_failure;;

(* BUG?: This one doesn't work!
let rec atom : lexeme list -> term * lexeme list = fun input ->
(name +++ same_as (Other "(") +++ termlist +++ same_as (Other ")")
>>> (fun (((name,),args),) -> Fn(name,args))

||| name
>>> (fun s -> Var s)

||| numeral
>>> (fun s -> Const s)

||| same_as (Other "(") +++ term +++ same_as (Other ")")
>>> (snd << fst)

||| same_as (Other "-") +++ atom
>>> snd) input
(* val mulexp : lexeme list -> term * lexeme list )
and mulexp : (lexeme,term) parser_t = fun input ->
(atom +++ same_as (Other "
") +++ mulexp
>>> (fun ((a,_),m) -> Fn("*",[a;m]))

||| atom) input
and term : (lexeme,term) parser_t = fun input ->
(mulexp +++ same_as (Other "+") +++ term
>>> (fun ((a,_),m) -> Fn("+",[a;m]))

||| mulexp) input
and termlist : (lexeme, term list) parser_t = fun input ->
(term +++ same_as (Other ",") +++ termlist
>>> (fun ((h,_),t) -> h::t)

||| (term >>> (fun h -> [h]))) input;;
*)

let rec atom : lexeme list -> term * lexeme list = fun input ->
(name +++ same_as (Other "(") +++ termlist +++ same_as (Other ")")
>>> (fun (((name,),args),) -> Fn(name,args))

||| (name
>>> (fun s -> Var s))

||| (numeral
>>> (fun s -> Const s))

||| (same_as (Other "(") +++ term +++ same_as (Other ")")
>>> (snd << fst))

||| (same_as (Other "-") +++ atom
>>> snd)) input
(* val mulexp : lexeme list -> term * lexeme list )
and mulexp : (lexeme,term) parser_t = fun input ->
(atom +++ same_as (Other "
") +++ mulexp
>>> (fun ((a,_),m) -> Fn("*",[a;m]))

||| atom) input
and term : (lexeme,term) parser_t = fun input ->
(mulexp +++ same_as (Other "+") +++ term
>>> (fun ((a,_),m) -> Fn("+",[a;m]))

||| mulexp) input
and termlist : (lexeme, term list) parser_t = fun input ->
(term +++ same_as (Other ",") +++ termlist
>>> (fun ((h,_),t) -> h::t)

||| (term >>> (fun h -> [h]))) input;;

(* end of appended file *)

-- Brian

@vicuna
Copy link
Author

vicuna commented Sep 10, 2001

Comment author: administrator

Hello,

I'm not sure if this is a bug or a misunderstanding on my part. I 

translated the combinator parsing examples from John Harrison's online
text, which uses a pre-CSL version of Caml, into OCaml, and for combinator
symbols I used these operators

val ( +++ ) :
('a, 'b) parser_t -> ('a, 'c) parser_t -> ('a, 'b * 'c) parser_t
val ( >>> ) :
('a, 'b) parser_t -> ('b -> 'c) -> ('a, 'c) parser_t
val ( ||| ) :
('a, 'b) parser_t -> ('a, 'b) parser_t -> ('a, 'b) parser_t

with the expectation that the above is in descending precedence order,

OCaml determines precedence from the first characters of the infix
identifiers, as shown in section 6.7 of the reference manual. In your
example, +++ has highest precedence, and >>> and ||| have equal
precedence, lower than that of +++.

You can get the desired precedence by replacing >>> by something
starting with @ or ^

Hope this helps,

  • Xavier Leroy

@vicuna
Copy link
Author

vicuna commented Sep 10, 2001

Comment author: administrator

Works as documented.

@vicuna vicuna closed this as completed Sep 10, 2001
@vicuna vicuna added the bug label Mar 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant