[Caml-list] ANNOUNCE: mod_caml 1.0.6 - includes security patch
Date: 2004-01-16 (18:52)
From: Yutaka OIWA <oiwa@y...>
Subject: Re: [Caml-list] ANNOUNCE: mod_caml 1.0.6 - includes security patch
Hello.

>> On Fri, 16 Jan 2004 09:34:54 +0000, Richard Jones <rich@annexia.org> said:

Richard> Being able to write:
Richard>   var ~ /ab+/
Richard> and similar certainly makes string handling and simple parsing a lot
Richard> easier.

>> On Fri, 16 Jan 2004 13:05:15 -0600 (CST), Brian Hurt <bhurt@spnz.org> said:

Brian> What I'd like to see is to be able to pattern match on regexs, like:
Brian> match str with
Brian>   | /ab+/ -> ...
Brian>   | /foo(bar)*/ -> ...
Brian> etc.

My camlp4 macro package, Regexp/OCaml, may address most of these requests;
try it from http://www.yl.is.s.u-tokyo.ac.jp/~oiwa/caml/ .

Using Regexp/OCaml, you can write code like

  Regexp.match str with
    "^(\d+)-(\d+)$" as f : int, t : int ->
      for i = f to t do printf "%d\n" i done
  | "^(\d+)$" as s : int ->
      printf "%d\n" s

to branch on multiple regular patterns and to extract the matched
substrings automatically (bound to f, t and s respectively, after
conversion to int by int_of_string).
See http://www.yl.is.s.u-tokyo.ac.jp/~oiwa/pub/caml/regexp-pp-0.9.3/README.match-regexp
for further details.

Brian> The compiler could then combine all the matchings into a single DFA,
Brian> improving performance over code like:
Brian> if (regex_match str "ab+") then
Brian>   ...
Brian> else if (regex_match str "foo(bar)*") then
Brian>   ...
Brian> else
Brian>   ...

The code currently generated by Regexp/OCaml is similar to the above
(although each pattern is compiled only once per execution). However, if
the backend regexp engine (currently Regexp/OCaml uses PCRE/OCaml)
supports optimized matching against multiple regular expressions,
Regexp/OCaml can easily make use of it. The patterns could also be
analyzed at the compilation (camlp4 translation) phase, if required.

Brian> The regex matching would also let the compiler know if there were possible
Brian> unmatched strings (these would show up as transitions to the error state
Brian> in the DFA).
This feature is not currently implemented in Regexp/OCaml, but since the
macro package has its own parser for regular patterns, it can be
implemented once I have enough time. (It is on my personal to-do list
for Regexp/OCaml.)

-- 
Yutaka Oiwa              Yonezawa Lab., Dept. of Computer Science,
Graduate School of Information Sci. & Tech., Univ. of Tokyo.
<oiwa@yl.is.s.u-tokyo.ac.jp>, <yutaka@oiwa.shibuya.tokyo.jp>
PGP fingerprint = C9 8D 5C B8 86 ED D8 07 EA 59 34 D8 F4 65 53 61
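For readers who want to picture the "if / else if" shape with each pattern compiled only once, here is a minimal hand-written sketch. It is not the code Regexp/OCaml actually emits. It assumes the Str library from the OCaml distribution in place of PCRE/OCaml (so the pattern syntax differs from the `\d+` form used in the message), and the names `re_range`, `re_single`, and `expand` are hypothetical, chosen only for illustration.

```ocaml
(* Sketch only: Str is swapped in for PCRE/OCaml; link with the str library.
   All names below are hypothetical. *)

(* Each pattern is compiled once, when the module is initialised,
   not on every call to [expand]. *)
let re_range = Str.regexp "^\\([0-9]+\\)-\\([0-9]+\\)$"
let re_single = Str.regexp "^\\([0-9]+\\)$"

(* Returns the numbers that the two-branch example would print:
   "f-t" gives f..t, a single number gives itself, anything else gives []. *)
let expand (str : string) : int list =
  if Str.string_match re_range str 0 then begin
    (* Groups are read right after the successful match and converted
       with int_of_string, as the "as f : int" annotation describes. *)
    let f = int_of_string (Str.matched_group 1 str) in
    let t = int_of_string (Str.matched_group 2 str) in
    List.init (max 0 (t - f + 1)) (fun i -> f + i)
  end else if Str.string_match re_single str 0 then
    [ int_of_string (Str.matched_group 1 str) ]
  else
    []

let () = List.iter (Printf.printf "%d\n") (expand "3-5")
```

The patterns are tried in order, so a string is tested against the second pattern only after the first has failed; a backend that matched several expressions at once could replace the chain without changing the calling code.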