Version française
Home     About     Download     Resources     Contact us    
Browse thread
OCamlLex-Patch for Rule Parameters
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Christian Lindig <lindig@i...>
Subject: OCamlLex-Patch for Rule Parameters


This patch provides an enhancement to OCamlLex from OCaml 2.01.  It
allows to pass additional values to rules.  Currently a user can only
access the lexbuf parameter inside rules but no user provided
parameters:

    rule this = parse
        eof                 { .. }
      | "#"                 { that lexbuf }

    and that = parse
        '\n'                { access lexbuf }
      | [^'\n']*            { that lexbuf }

The patch provides an extended syntax for OCamlLex specification files
that allows to pass user defined parameters:

    rule this = parse
        eof                 { .. }
      | "#"                 { that true 2 lexbuf }  (* pass true and 2 *)

    and that x y = parse    (* x,y are additional parameters *)
        '\n'                { access x, y, and lexbuf }
      | [^'\n']*            { that lexbuf }

The number of parameters is variable. Typically at least one rule will
have no because OCamlYacc generated parsers don't pass additional
parameters.  When a rule calls another rule it must pass the
additional parameters first and then lexbuf as usual:

    rule_name x y lexbuf

The patch is backward compatible: all OCamlLex files will work with
the patched OcamlLex. 
  
What are these additional parameters good for? They come in handy
whenever a semantic value is collected across many lexer calls. A
good example is the file ocaml-2.01/parsing/lexer.mll from the OCaml
parser/lexer:  while scanning a string, escape sequences must be decoded. 
Without additional parameters the result string must be hold in a global
variable which makes the lexer no longer reentrant. 

Here is another silly example:  It scans a file and replaces C style
comments by Pascal style comments.  The comment string is collected first
using a parameter and then printed as a whole. 

    (* 
     * ocamllex example.mll
     * ocamlc -o example example.ml
     * ./example < foo.c
     *)

    {
      let get   = Lexing.lexeme
    }

    (* lexer definitions *)

    rule scanner = parse
            eof                     { ()                                    }
        |   "/*"                    { comment "(*" lexbuf; scanner lexbuf   }
        |   [^'/' '\n']+            { print_string (get lexbuf); 
                                      scanner lexbuf                  }
        |   '\n'                    { print_char '\n' ; scanner lexbuf      }
        |   '/'                     { print_char '/'  ; scanner lexbuf      }

    and comment str = parse         
            eof                     { print_string str                      }
        |   "*/"                    { print_string (str ^ "*)")             }
        |   [^'*' '\n']+            { comment (str ^ (get lexbuf)) lexbuf   }
        |   '*'                     { comment (str ^ "*")  lexbuf           }
        |   '\n'                    { comment (str ^ "\n") lexbuf           }

    {
            let _ =
                let lexbuf = Lexing.from_channel stdin in
                scanner lexbuf
    }

The small patch (3k gzip'ed) can be downloaded from the following web
page:

    http://www.cs.tu-bs.de/softech/people/lindig/software/lex-patch.html

To apply the patch copy the lexer source from ocaml-2.01/lex into a
fresh directory lex (don't miss the .depend file) and cd into it. 
Then apply the patch and call make: 

        patch -p5 < lex-2.01.patch ; make

-- Christian

-- 
 Christian Lindig   Technische Universitaet Braunschweig, Germany
                    http://www.cs.tu-bs.de/softech/people/lindig
                    mailto:lindig@ips.cs.tu-bs.de