| Date: | -- (:) |
| From: | Pietro Abate <Pietro.Abate@p...> |
| Subject: | camlp4 and lexers |
Hi all,
This question was asked a few weeks ago, and again last week. However, I
still don't really get how to proceed. I hope we can boil this down to a
small example to understand the camlp4 internals a bit better.
Say I want to write a small parser for regexps (or an arithmetic
calculator), but I don't want to extend the ocaml grammar to do that. I
just want to create a minimal lexer and a minimal grammar to parse
expressions like (aaa*|b?);c
The parser part is easy (below). The part I don't understand is how to
create a lexer. I had a look at the ocsigen xmlcaml lexer and the camlp4
lexer, but I still haven't found a minimal example I can use without
getting confused.
In particular, the problem below is that I want my lexer to return
CHAR tokens carrying a single char (unlike camlp4's own CHAR of
char * string), not strings. I could get the same result with the camlp4
lexer, but then every regexp would have to be written as ('a''a''a' *)
and so on, which does not look good.
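To make the goal concrete, here is a rough sketch in plain OCaml of the
token stream I have in mind. It is NOT a camlp4 lexer and does not
implement the camlp4 lexer signature; the token type, the operator set
and the names is_operator and tokenize are all made up for illustration.

```ocaml
(* Hypothetical token type: one token per input character. *)
type token =
  | CHAR of char        (* a literal character of the regexp *)
  | KEYWORD of string   (* one of the operator characters below *)
  | EOI                 (* end of input *)

(* Characters treated as operators rather than literals (an assumption). *)
let is_operator c = String.contains "|;?*+.()" c

(* Turn a string into a list of tokens, ending with EOI. *)
let tokenize (s : string) : token list =
  let rec go i acc =
    if i >= String.length s then List.rev (EOI :: acc)
    else
      let c = s.[i] in
      let tok = if is_operator c then KEYWORD (String.make 1 c) else CHAR c in
      go (i + 1) (tok :: acc)
  in
  go 0 []
```

With this, "aaa*" would come out as three CHAR 'a' tokens followed by
KEYWORD "*", which is what the `CHAR(s) pattern in the grammar below is
meant to match.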
A while ago I did something similar with the old camlp4 [1] using
plexer, but this is not possible anymore...
A while ago Nicolas suggested copying the Camlp4.PreCast module and the
lexer module and customizing them. I think it should be possible just
to use Struct.Grammar.Static.Make with a new lexer instead... but, as I
said, I'm not able to write a very minimal lexer for this example...
Maybe I'm confused about this.
I think a minimal example will help more than one person here.
thanks :)
p
-------------------------- This is my parser...
module RegExGram = Struct.Grammar.Static.Make(RegExpLexer)
let regex = RegExGram.Entry.mk "regex"
EXTEND RegExGram
  GLOBAL: regex;
  regex:  [[ e1 = SELF ; "|" ; e2 = concat -> Alt(e1,e2)
           | e1 = concat -> e1 ]
          ];
  concat: [[ e1 = SELF ; ";"; e2 = seq -> Seq(e1,e2)
           | e1 = SELF ; e2 = seq -> Seq(e1,e2)
           | e1 = seq -> e1 ]
          ];
  seq:    [[ e1 = simple ; "?" -> Opt e1
           | e1 = simple ; "*" -> Star e1
           | e1 = simple ; "+" -> Plus e1
           | e1 = simple -> e1 ]
          ];
  simple: [[ "." -> Dot
           | "("; e1 = regex; ")" -> e1
           | `CHAR(s) -> Sym s ]
          ];
END
----------------------
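For reference, this is the AST I expect for the example expression,
written as a hand-rolled recursive-descent parser in plain OCaml rather
than with camlp4. It is only a sketch: the regex type (in particular
Sym of char), the Parse_error exception and the parse function are
assumptions, and it lets the regex rule fall through to concat so that
(aaa*|b?);c parses.

```ocaml
(* Assumed AST behind the constructors used in the grammar. *)
type regex =
  | Alt of regex * regex
  | Seq of regex * regex
  | Opt of regex
  | Star of regex
  | Plus of regex
  | Dot
  | Sym of char

exception Parse_error of int  (* offset of the offending character *)

let parse (s : string) : regex =
  let n = String.length s in
  let pos = ref 0 in
  let peek () = if !pos < n then Some s.[!pos] else None in
  let advance () = incr pos in
  (* regex: regex "|" concat | concat   (left-associative) *)
  let rec regex () =
    let rec loop left =
      match peek () with
      | Some '|' -> advance (); loop (Alt (left, concat ()))
      | _ -> left
    in
    loop (concat ())
  (* concat: concat ";" seq | concat seq | seq *)
  and concat () =
    let rec loop left =
      match peek () with
      | Some ';' -> advance (); loop (Seq (left, seq ()))
      | Some ('|' | ')') | None -> left
      | Some _ -> loop (Seq (left, seq ()))
    in
    loop (seq ())
  (* seq: simple with at most one postfix operator *)
  and seq () =
    let e = simple () in
    match peek () with
    | Some '?' -> advance (); Opt e
    | Some '*' -> advance (); Star e
    | Some '+' -> advance (); Plus e
    | _ -> e
  (* simple: "." | "(" regex ")" | any non-operator character *)
  and simple () =
    match peek () with
    | Some '.' -> advance (); Dot
    | Some '(' ->
      advance ();
      let e = regex () in
      (match peek () with
       | Some ')' -> advance (); e
       | _ -> raise (Parse_error !pos))
    | Some ('|' | ';' | '?' | '*' | '+' | ')') | None ->
      raise (Parse_error !pos)
    | Some c -> advance (); Sym c
  in
  let e = regex () in
  if !pos <> n then raise (Parse_error !pos);
  e
```

So parse "(aaa*|b?);c" should give a Seq whose left side is the Alt of
the two branches and whose right side is Sym 'c'.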
[1] http://groups.google.com/group/fa.caml/browse_thread/thread/e26569427cc8879d