lexing strings
Date: -- (:)
From: Pierre Weis <Pierre.Weis@i...>
Subject: Re: lexing strings
> [^'\n']*[^'\\']'\''
> 
> which should match any sequence of non-newlines until it reaches a '
> not preceded by a backslash.  slurp returns the token: STRING(!build)).
> 
> My intent, when reading a string, is for the lexer to see the first ',
> jump into 'slurp,' eat up the string and return it as the STRING token,
> then have the parser read a newline and return EOL, thus matching the
> main grammar rule and printing the result.  This almost works, but not
> until the user types _two_ newlines will the "interpreter" respond
> by printing the expression value! i.e., typing
> 
> 'hi' [newline]
> 
> at the prompt is not enough; two newlines are required.  Other than
> that, the expected value is returned.  Does this mean that the first
> newline is interpreted as part of the STRING?  Why would my regex match
> the newline?

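Yes: hi' followed by a newline and another ' still matches your regexp,
because [^'\\'] accepts any character other than a backslash, newline
included. Since the lexer looks for the longest match, it has to read
past the first newline before it can settle for hi'. I guess you want
something along the lines of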

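and slurp = parse
    "'"                     (* closing quote: the string is complete *)
    { STRING (List.rev !build) }
  | '\\' "'"                (* escaped quote: keep a literal quote *)
    { build := '\'' :: !build;
      slurp lexbuf }
  | eof
    { raise (Lexical_error "unterminated slurp") }
  | _ as c                  (* any other single character *)
    { build := c :: !build;
      slurp lexbuf }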
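
To see the difference outside the lexer generator, here is a minimal
sketch in plain OCaml. It is not ocamllex output: matches_original and
this string-based slurp are hypothetical helpers written only for
illustration, and they use a Buffer rather than the build reference from
the original question. The first function checks whether a whole string
matches the original pattern; the second scans one character at a time,
as the rule above does, and stops at the first unescaped quote.

```ocaml
(* Stand-in for the exception mentioned in the note further down. *)
exception Lexical_error of string

(* True when the whole of s matches the original pattern: any number of
   non-newline characters, then one non-backslash character, then a
   closing quote. *)
let matches_original s =
  let n = String.length s in
  n >= 2
  && s.[n - 1] = '\''
  && s.[n - 2] <> '\\'
  && not (String.contains (String.sub s 0 (n - 2)) '\n')

(* Character-by-character scan mirroring the slurp rule. pos is the
   index just after the opening quote. Returns the string contents and
   the index just after the closing quote. *)
let slurp s pos =
  let buf = Buffer.create 16 in
  let n = String.length s in
  let rec go i =
    if i >= n then raise (Lexical_error "unterminated slurp")
    else if s.[i] = '\'' then (Buffer.contents buf, i + 1)
    else if s.[i] = '\\' && i + 1 < n && s.[i + 1] = '\'' then begin
      (* escaped quote: keep a literal quote and skip both characters *)
      Buffer.add_char buf '\'';
      go (i + 2)
    end else begin
      Buffer.add_char buf s.[i];
      go (i + 1)
    end
  in
  go pos
```

With these definitions, matches_original accepts both "hi'" and
"hi'\n'", which is why a longest-match lexer cannot stop at the first
quote, whereas slurp "hi'\n" 0 stops right after the closing quote and
leaves the newline for the main rule to turn into EOL.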

Hope this helps,

(Note: You should have defined the exception Lexical_error of string, in order
to signal the error "unterminated slurp".)

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://pauillac.inria.fr/~weis/