Re: lexer, parser

From: Markus Mottl (mottl@miss.wu-wien.ac.at)
Date: Sat Jun 12 1999 - 14:02:58 MET DST


From: Markus Mottl <mottl@miss.wu-wien.ac.at>
Message-Id: <199906121102.NAA30678@miss.wu-wien.ac.at>
Subject: Re: lexer, parser
To: skaller@maxtal.com.au (John Skaller)
Date: Sat, 12 Jun 1999 13:02:58 +0100 (MET DST)
In-Reply-To: <3.0.6.32.19990603215554.0097b100@triode.net.au> from "John Skaller" at Jun 3, 99 09:55:54 pm

> Is there an 'object' version of the lexer and parser,
> or can I use or adapt the existing code?
>
> I need to maintain state, but the component also need
> to be re-entrant.

I guess you want to have just the syntactic parts in the scanner
and parser files, but semantics shall be kept within objects (very
convenient).

My solution to this is to have the parser return streams of functions.
These functions accept as final argument an object which implements the
appropriate semantics.

E.g.:

module (file) "Foo":

  class semantics = object
    method do_something param = ...
    ...
  end

  type 'a trans_fun = semantics -> 'a -> unit
  and 'a trans_strm = 'a trans_fun Stream.t

  let do_something param obj = obj#do_something param
  ...

The parser (e.g.):

  %start main
  %type <'a Foo.trans_strm> main

  main
    : left_part A_TOKEN { [< $1; 'Foo.do_something $2 >] }
    | A_TOKEN { [< $1 >] }
    | ...

  left_part
    : ANOTHER_TOKEN { [< $1 >] }
    | ...

The main program (e.g.):

  let main () =
    let lexbuf = Lexing.from_channel stdin
    and sm = new semantics in
    let msg_stream = Parser.main Scanner.start lexbuf
    Stream.iter (fun f -> f sm) msg_stream (* this executes semantics *)

It is possible to "rearrange" streams very fast, because substreams
can be combined arbitrarily (e.g. concatenated) without loss of
efficiency. This is very important in parsers. Thus, I prefer them over
lists (of functions).

Execution of "semantics" is also much faster than versions that use
algebraic datatypes and pattern matching, because here we only have to
call methods (wrapped in the functions of the stream) on objects instead
of match abstract syntax trees or else.

Another advantage: the streams can be directed to any kind of object that
matches the interface of "semantics". Thus, we could have "different"
semantics and have the parsed program evaluated over them. This adds
greatly to the modularity of the semantics implementation.

I have found this approach very expressive, efficient and easy to maintain
- I hope it will also help you.

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl



This archive was generated by hypermail 2b29 : Sun Jan 02 2000 - 11:58:23 MET