Date: 2005-01-11 (19:58)
From: Chris King <colanderman@g...>
Subject: Re: [Caml-list] Thread safe Str
Forgot to CC this to the list, sorry if anyone gets a dupe....

> I'm sure you'd agree there are separable issues here:
> (1) using a string encoding of a regexp as opposed to
> a lex like one -- this has nothing to do with
> captures.

Yes.  As I stated in another e-mail in this thread, I'd love to see an
API that exposes the parse tree of a regexp.
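To make the idea concrete, here is a hypothetical sketch of what such a parse-tree API might look like; the type and constructor names are invented for illustration and don't correspond to any existing library:

```ocaml
(* A hypothetical regexp AST: the structure of the pattern is built
   directly as data, so there is no string syntax to mis-parse. *)
type regexp =
  | Char of char               (* a literal character *)
  | Any                        (* . *)
  | Seq of regexp list         (* concatenation *)
  | Alt of regexp * regexp     (* alternation *)
  | Star of regexp             (* Kleene star *)
  | Capture of regexp          (* \( ... \) *)

(* "foo\(.*\)bar" expressed as a tree rather than a string: *)
let foo_bar =
  let lit s = Seq (List.init (String.length s) (fun i -> Char s.[i])) in
  Seq [lit "foo"; Capture (Star Any); lit "bar"]
```

A malformed pattern simply cannot be constructed, so the "raises an exception at compile time" question disappears.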

> (2) Captures

I'd like to add (3) Parsing vs. substitution.  You can't effectively
do the latter without captures (of course it can be done, but it's clumsy).

> The fact that the regexp syntax is not checked statically
> isn't relevant in a dynamic language since the typing
> of the rest of the program isn't either.

I think we're talking about different things... I used the "s//g"
syntax to represent the substitution function in whatever language is
being used, not as an example of something to be compiled.  (regexp
"foo(.*)bar") certainly has a static type, and if it's malformed it
simply raises an exception.
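With OCaml's actual Str library (link with the str library, e.g. via `ocamlfind`), the point looks like this — the pattern string type-checks either way, and a malformed one is only rejected at runtime, when Str.regexp raises Failure:

```ocaml
(* Str.regexp : string -> Str.regexp has a perfectly ordinary static
   type; only the *contents* of the pattern string are checked at
   runtime. *)
let () =
  let s = "fooXYZbar" in
  let re = Str.regexp "foo\\(.*\\)bar" in
  if Str.string_match re s 0 then
    print_endline (Str.matched_group 1 s);   (* prints "XYZ" *)
  (* a malformed pattern still type-checks, but fails at runtime: *)
  (try ignore (Str.regexp "foo\\(bar")
   with Failure msg -> print_endline ("rejected: " ^ msg))
```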

> I'm trying to provide that in Felix. It has Python style literals,
> and Python style substrings. However it is still clumsy compared
> with Perl (I guess .. I can't write Perl ..)

Perl string mangling is clumsy compared with Python.  The key is that
Python treats strings as arrays (or lists) of characters.

> > True.  Performing multiple replacements on a single string with a
> > regexp is retarded.  But so is writing a lexer for a simple one-shot
> > replacement job.
> That depends on how hard it is to write a lexer.

Specifically, a lexer whose input and output are both strings and
which performs substitution.  Not pretty, unless captures are
available.
> At present, Felix regexps have to be constants. It will be possible
> in the bootstrapped compiler to generate them as text, and then
> compile and link (i.e. there will be a function sort of like 'eval').

That's not acceptable for, say, an incremental search, though, where a
new regexp must be generated on each keystroke.  (Yes, I know regexps
aren't the best way to go about that, but I'm sure there are better
examples.)
> >  My point is just that regexps are useful enough to co-exist with lexers.
> But they're the same thing. Lexer provide regular definitions,
> which is just a way of naming regexps, and reusing the regexps
> by providing the name.

Not in the form of lex/flex/ocamllex.  Yes, lexer token definitions
are equivalent to regexps, but everything else about them is different
(specifically, lexers are event-based, and don't provide captures).
I'd love to see a regexp engine allowing dynamic creation of
token-based regexps, complete with captures.  It could easily serve as
both the base of a lexer and a substitution engine.  Heck, that sounds
like a fun project... what am I doing this weekend? :P
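For reference, the capture-driven substitution that this thread keeps returning to is what Str.global_replace provides today: \1, \2, ... in the template refer back to the \( ... \) groups of the pattern.

```ocaml
(* Substitution driven by captures: rewrite "name@host" as
   "host at name" everywhere in the string. *)
let () =
  let re = Str.regexp "\\([a-z]+\\)@\\([a-z]+\\)" in
  let out = Str.global_replace re "\\2 at \\1" "user@example, admin@host" in
  print_endline out   (* example at user, host at admin *)
```

A lexer-with-captures engine of the kind proposed above would subsume this as a special case.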