mixing lexers with camlp4
Date: -- (:)
From: Pietro Abate <Pietro.Abate@a...>
Subject: Re: [Caml-list] mixing lexers with camlp4
In the best tradition, I am partially answering my own question (below), but
I have a new one:

> - Does camlp4 allow me to mix lexers for different productions in the same
>   extension ?
Well, it seems it doesn't. Now I get this error:

Error: entries "psymbol" and "symbol" do not belong to the same grammar.
Fatal error: exception Failure("Grammar.extend error")

- Is there a deep reason why I cannot mix different grammars ?
- Is there a way of forcing this behaviour ?
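
To make the error message concrete, here is a toy model of the check it suggests: every entry remembers the grammar it was created for, and an extension is rejected when two entries have different owners. This is only a sketch in plain OCaml. The types and the functions `fresh_grammar` and `check_extend` are hypothetical and are not camlp4's actual implementation.

```ocaml
(* Toy model only: these types are hypothetical, not camlp4's own. *)
type grammar = { id : int; lexer_name : string }
type entry = { entry_name : string; owner : grammar }

(* Each call yields a grammar with a fresh id, mimicking the fact that
   two separate gcreate calls give two unrelated grammars. *)
let fresh_grammar =
  let counter = ref 0 in
  fun lexer_name -> incr counter; { id = !counter; lexer_name }

(* An extension may only combine entries owned by the same grammar. *)
let check_extend target used =
  if target.owner.id = used.owner.id then Ok ()
  else
    Error
      (Printf.sprintf "entries %S and %S do not belong to the same grammar"
         target.entry_name used.entry_name)

let g_plexer = fresh_grammar "plexer"
let g_genlex = fresh_grammar "genlex"

let gram_list = { entry_name = "gram_list"; owner = g_plexer }
let psymbol = { entry_name = "psymbol"; owner = g_plexer }
let symbol = { entry_name = "symbol"; owner = g_genlex }
```

In this model `check_extend psymbol gram_list` succeeds, while `check_extend psymbol symbol` fails because `symbol` is owned by the second grammar.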

On Fri, Feb 02, 2007 at 12:40:11PM +1100, Pietro Abate wrote:
> Hi all,
> I want to parse a language like this one:
> l := l & l | l % l | Id
[...]
> of course the Genlex module is not immediately compatible with the Plexer
> interface so I'm a bit lost...
> 
> - Is this the best way of doing it ?
don't know, maybe not.

> - How can I make the Genlex module compatible with the Plexer 
>   interface (example ?) ?
This should do the job (I think), even though it ignores the location...

open Genlex
let lexer = Genlex.make_lexer [
    "+";"-";"*";"/";"=";
    "[";"]";"<";">";
    "%";"&";"*";"?";"~"
];;
let getkwd = function Kwd s -> s | _ -> failwith "aa" ;;
let rec glexer = parser
    [< 'Kwd ("+" | "-" | "*" | "/"
            |"=" | "[" | "]" | "<"
            |">" | "%" | "&" | "?" | "~" ) as s >] -> ("", getkwd s)
    | [< 'Ident s >] -> ("LIDENT",s)
    | [< >] -> ("EOI","")
;;
let lexer_gmake () = {
    Token.tok_func =
    Token.lexer_func_of_parser (fun s -> (glexer (lexer s), Token.dummy_loc));
    Token.tok_using = (fun _ -> ());
    Token.tok_removing = (fun _ -> ());
    Token.tok_match = Token.default_match;
    Token.tok_text = Token.lexer_text;
    Token.tok_comm = None
}
;;
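
The core of the adapter above is the mapping from a Genlex token to a (constructor name, value) pair. The sketch below isolates that mapping in plain OCaml so it can be tried without camlp4. The `token` type is a local stand-in for the Genlex constructors used here, and `Other` and `to_pair` are hypothetical names. One difference is deliberate: as far as I can tell, the stream parser above reports any token that is neither `Kwd` nor `Ident` as EOI, whereas this sketch turns it into an explicit error.

```ocaml
(* Local stand-in for the Genlex token constructors used above;
   Other is a hypothetical catch-all for everything else. *)
type token = Kwd of string | Ident of string | Other of string

(* Map an optional token to a (constructor name, value) pair.
   None stands for the end of the input. *)
let to_pair = function
  | None -> Ok ("EOI", "")
  | Some (Kwd s) -> Ok ("", s)            (* keywords: empty constructor name *)
  | Some (Ident s) -> Ok ("LIDENT", s)    (* every identifier becomes LIDENT *)
  | Some (Other d) -> Error ("unhandled token: " ^ d)
```

Note that, like the original, this labels every identifier as LIDENT, including capitalised ones.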

Here is the full code of my example.

To compile:
#> camlp4o pa_extend.cmo pr_o.cmo pa_test.ml >> test.ml
#> ocamlfind ocamlc -package camlp4 camlp4.cma str.cma test.ml 

------------ pa_test.ml ------------
open Genlex
type stype = Lid | Symbol of string ;;

let lexer = Genlex.make_lexer [
    "+";"-";"*";"/";"=";
    "[";"]";"<";">";
    "%";"&";"*";"?";"~"
];;
let getkwd = function Kwd s -> s | _ -> failwith "fail getkwd" ;;
let rec glexer = parser
    [< 'Kwd ("+" | "-" | "*" | "/"
            |"=" | "[" | "]" | "<"
            |">" | "%" | "&" | "?" | "~" ) as s >] -> ("", getkwd s)
    | [< 'Ident s >] -> ("LIDENT",s)
    | [< >] -> ("EOI","")
;;
let lexer_gmake () = {
    Token.tok_func =
    Token.lexer_func_of_parser (fun s -> (glexer (lexer s), Token.dummy_loc));
    Token.tok_using = (fun _ -> ());
    Token.tok_removing = (fun _ -> ());
    Token.tok_match = Token.default_match;
    Token.tok_text = Token.lexer_text;
    Token.tok_comm = None
}
;;

let symbgrammar = Grammar.gcreate (lexer_gmake ());;
let symbol strm =
    match Stream.peek strm with
    |Some("",s) -> Stream.junk strm; s
    |Some("LIDENT",s) -> Stream.junk strm; s
    | _ -> raise Stream.Failure
;;
let symbol = Grammar.Entry.of_parser symbgrammar "symbol" symbol ;;
let grammar = Grammar.gcreate (Plexer.gmake ());;
let gram_list = Grammar.Entry.create grammar "gram_list";;

EXTEND
GLOBAL: gram_list;

gram_list: [[ grams = LIST1 gram; EOI -> grams ]];

gram: [[ p = LIDENT; ":="; rules = LIST1 rule SEP "|" -> (p,rules) ]];

rule: [[ psl = LIST1 psymbol -> psl ]];

psymbol: [[
     "VAR" -> Lid
    | e = symbol -> Symbol(e)
]];

END
;;

let apply s = Grammar.Entry.parse gram_list (Stream.of_string s);;
(apply "l := VAR");;
(apply "l := VAR & VAR");;
(apply "l := VAR U VAR");;
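
One possible way around the two-grammar problem, assuming a single lexer can cover both token sets, is to have one tokenizer emit the identifiers, the keywords and the operator symbols, so that every entry can live in the same grammar. The sketch below is plain OCaml and does not use camlp4. `tokenize`, its `keywords` argument and the symbol set are hypothetical, and wiring the result into `Token.tok_func` is left out.

```ocaml
let is_alpha c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c = '_'
let is_space c = c = ' ' || c = '\t' || c = '\n'

(* Single-character operator symbols accepted by this sketch. *)
let symbols = "+-*/=[]<>%&?~|"

(* Turn a string into a list of (constructor name, value) pairs,
   ending with ("EOI", ""). Keywords and symbols use the empty name. *)
let tokenize ~keywords s =
  let n = String.length s in
  let rec go i acc =
    if i >= n then List.rev (("EOI", "") :: acc)
    else if is_space s.[i] then go (i + 1) acc
    else if i + 1 < n && s.[i] = ':' && s.[i + 1] = '=' then
      go (i + 2) (("", ":=") :: acc)
    else if is_alpha s.[i] then begin
      (* Scan to the end of the identifier. *)
      let j = ref i in
      while !j < n && is_alpha s.[!j] do incr j done;
      let id = String.sub s i (!j - i) in
      let tok =
        if List.mem id keywords then ("", id)
        else if s.[i] >= 'A' && s.[i] <= 'Z' then ("UIDENT", id)
        else ("LIDENT", id)
      in
      go !j (tok :: acc)
    end
    else if String.contains symbols s.[i] then
      go (i + 1) (("", String.make 1 s.[i]) :: acc)
    else
      failwith (Printf.sprintf "unexpected character %C at offset %d" s.[i] i)
  in
  go 0 []
```

With `~keywords:["VAR"]`, the input "l := VAR U VAR" yields an LIDENT for `l`, keyword tokens for `:=` and `VAR`, and a UIDENT for `U`, so a `symbol` rule in the same grammar could accept UIDENT and the operator symbols directly.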

Thank you very much for your help.

:)
p


-- 
++ Blog: http://blog.rsise.anu.edu.au/?q=pietro
++ 
++ "All great truths begin as blasphemies." -George Bernard Shaw
++ Please avoid sending me Word or PowerPoint attachments.
   See http://www.fsf.org/philosophy/no-word-attachments.html