From: Pierre Weis <Pierre.Weis@inria.fr>
Message-Id: <199706021535.RAA22238@pauillac.inria.fr>
Subject: Re: lexing strings
In-Reply-To: <199706012353.SAA06495@kimbark.uchicago.edu> from Lyn A Headley at "Jun 1, 97 06:53:12 pm"
To: laheadle@midway.uchicago.edu (Lyn A Headley)
Date: Mon, 2 Jun 1997 17:35:43 +0200 (MET DST)
> [^'\n']*[^'\\']'\''
>
> which should match any sequence of non-newlines until it reaches a '
> not preceded by a backslash. slurp returns the token: STRING(!build)).
>
> My intent, when reading a string, is for the lexer to see the first ',
> jump into 'slurp,' eat up the string and return it as the STRING token,
> then have the parser read a newline and return EOL, thus matching the
> main grammar rule and printing the result. This almost works, but not
> until the user types _two_ newlines will the "interpreter" respond
> by printing the expression value! i.e., typing
>
> 'hi' [newline]
>
> at the prompt is not enough; two newlines are required. Other than
> that, the expected value is returned. Does this mean that the first
> newline is interpreted as part of the STRING? Why would my regex match
> the newline?
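Yes, 'hi'\n' matches your regexp: [^'\\'] accepts any character but a
backslash, newline included, so after 'hi' the lexer must read one
character past the first newline before it can settle on the longest
match. I guess you want something along the lines of: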
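and slurp = parse
    "'"
      { (* closing quote: the string is complete *)
        STRING(rev !build) }
  | '\\' "'"
      { (* escaped quote: keep the quote, drop the backslash *)
        build := '\'' :: !build;
        slurp lexbuf }
  | eof
      { raise(Lexical_error "unterminated slurp") }
  | _
      { (* any other character, taken one at a time *)
        build := Lexing.lexeme_char lexbuf 0 :: !build;
        slurp lexbuf }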
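
To see the difference outside the lexer generator, here is a minimal sketch in plain OCaml that works on ordinary strings. It is not the code that the lexer generator produces; the function names longest_match and slurp (as a string function) and the Buffer-based accumulator are assumptions made for illustration. longest_match imitates the longest-match rule for the original regexp, and slurp imitates the character-at-a-time rule, which stops at the first unescaped quote and leaves the newline for the next token.

```ocaml
(* Illustrative sketch only: all names here are hypothetical. *)
exception Lexical_error of string

(* Length of the longest prefix of [s] matching [^'\n']*[^'\\']'\'',
   or None if no prefix matches. *)
let longest_match (s : string) : int option =
  let n = String.length s in
  let best = ref None in
  let prefix_ok = ref true in (* s.[0 .. i-2] contains no newline *)
  let i = ref 1 in
  while !prefix_ok && !i < n do
    (* s.[i] is the closing quote, s.[i-1] the non-backslash before it *)
    if s.[!i] = '\'' && s.[!i - 1] <> '\\' then best := Some (!i + 1);
    (* s.[i-1] joins the [^'\n']* part on the next iteration *)
    if s.[!i - 1] = '\n' then prefix_ok := false;
    incr i
  done;
  !best

(* Scan [s] from [pos], just after the opening quote, one character at a
   time. Returns the string contents and the index just past the closing
   quote. *)
let slurp (s : string) (pos : int) : string * int =
  let buf = Buffer.create 16 in
  let rec go i =
    if i >= String.length s then raise (Lexical_error "unterminated slurp")
    else match s.[i] with
      | '\'' -> (Buffer.contents buf, i + 1)
      | '\\' when i + 1 < String.length s && s.[i + 1] = '\'' ->
          Buffer.add_char buf '\''; go (i + 2)
      | c -> Buffer.add_char buf c; go (i + 1)
  in
  go pos

let () =
  (* The regexp can swallow the newline and a following quote... *)
  assert (longest_match "hi'\n'" = Some 5);
  (* ...so after hi' and a newline, one more character is needed to decide. *)
  assert (longest_match "hi'\n" = Some 3);
  (* The character-at-a-time scan stops at the quote; index 4 is the newline. *)
  assert (slurp "'hi'\n" 1 = ("hi", 4))
```

With the second approach the decision is made as soon as the closing quote is read, so no input beyond it is needed.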
Hope this helps,
(Note: you need to define the exception Lexical_error of string in order
to signal the error "unterminated slurp".)
Pierre Weis
INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://pauillac.inria.fr/~weis/