yet another silly question on PXP
Date: | 2005-02-27 (19:05) |
From: | Gerd Stolpmann <gerd@g...> |
Subject: | Re: :pxp_evpull notation (was: yet another silly question on PXP) |
On Friday, 25.02.2005, at 19:14 +0300, Paul Argentoff wrote:
> Dear Gerd Stolpmann,
>
> Let GS = "Gerd Stolpmann" in
> written_by GS =>
>
> GS> See the file doc/PREPROCESSOR which is part of the distribution
> GS> tarball.
>
> Thanks again for a reference. My next question is about :pxp_evpull
> notation. Can I make such a construct:
>
> let pile = <:pxp_evpull<
>   <foo> (: some_fun () :) >>
>
> where some_fun generates a further "subtree" using the same pxp_evpull
> notation.

Yes, this works. some_fun is called when the events for the children of
foo are generated. You must have

    some_fun : unit -> Pxp_types.event option

and some_fun is repeatedly called until it returns None.

pxp_evpull generates automata where every state returns an event.
External functions like some_fun are represented as loops, i.e. the next
state is the same state when the function returns Some _, and the
following state for None.

For your example, <:pxp_evpull< <foo> (: some_fun () :) >>, the automaton
is:

    let _ =
      let _eid = Pxp_dtd.Entity.create_entity_id () in
      let rec _generator =
        let _state = ref 0 in
        fun _arg ->
          match !_state with
            0 ->
              let ev = Pxp_types.E_start_tag ("foo", [], None, _eid) in
              _state := 1; Some ev
          | 1 ->
              begin match some_fun () _arg with
                None -> _state := 2; _generator _arg
              | Some Pxp_types.E_end_of_stream -> _generator _arg
              | Some ev -> Some ev
              end
          | 2 ->
              let ev = Pxp_types.E_end_tag ("foo", _eid) in
              _state := 3; Some ev
          | 3 -> None
          | _ -> assert false
      in
      _generator

(output generated with "camlp4 -I ... pa_o.cmo pa_op.cmo pcre.cma
unix.cma netstring.cma pxp_pp.cma pr_o.cmo sample.ml")

some_fun can even be another pxp_evpull automaton.

> My task really is to build a converter from a huge (>100M) text file (or
> string Stream.t) to a huge xml file. Of course, I need to do all job with
> lazy streams to avoid out-of-memory exceptions.

Pull parsers are your friend. They were created with such applications
in mind.

Gerd
--
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
------------------------------------------------------------
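
The loop-state idea from the reply, applied to the text-to-XML conversion task, can be sketched without PXP installed. This is only an illustration under stated assumptions: the `event` type below is a simplified stand-in, not `Pxp_types.event`, and `lines_to_events`, `wrap`, `write` and `render` are hypothetical helper names that do not exist in PXP. As in the generated automaton, where `some_fun ()` is applied to the extra `_arg`, the embedded generator is modelled here as a function that is called repeatedly until it returns None.

```ocaml
(* Simplified stand-in for Pxp_types.event; NOT the real PXP type. *)
type event =
  | Start_tag of string
  | Data of string
  | End_tag of string

(* A pull function: each call yields the next event, None when exhausted. *)
type puller = unit -> event option

(* Hypothetical helper: turn a lazy sequence of text lines into a puller
   that emits <line>text</line> for every input line. Only one line is
   forced at a time, so the input never has to fit in memory. *)
let lines_to_events (lines : string Seq.t) : puller =
  let rest = ref lines in
  let pending = ref [] in
  fun () ->
    match !pending with
    | ev :: tl -> pending := tl; Some ev
    | [] ->
      match !rest () with
      | Seq.Nil -> None
      | Seq.Cons (line, tl) ->
        rest := tl;
        pending := [Data line; End_tag "line"];
        Some (Start_tag "line")

(* Hand-written analogue of the automaton for <name> (: inner :) </name>.
   State 1 is the loop state: it repeats while inner returns Some _. *)
let wrap (name : string) (inner : puller) : puller =
  let state = ref 0 in
  let rec gen () =
    match !state with
    | 0 -> state := 1; Some (Start_tag name)
    | 1 ->
      (match inner () with
       | Some ev -> Some ev            (* stay in state 1 *)
       | None -> state := 2; gen ())   (* inner exhausted: go to state 2 *)
    | 2 -> state := 3; Some (End_tag name)
    | _ -> None
  in
  gen

(* Minimal escaping of character data. *)
let escape (s : string) : string =
  let b = Buffer.create (String.length s) in
  String.iter
    (function
      | '<' -> Buffer.add_string b "&lt;"
      | '>' -> Buffer.add_string b "&gt;"
      | '&' -> Buffer.add_string b "&amp;"
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

(* Drain a puller, handing each serialized event to out as it arrives. *)
let write (out : string -> unit) (pull : puller) : unit =
  let rec loop () =
    match pull () with
    | None -> ()
    | Some (Start_tag n) -> out ("<" ^ n ^ ">"); loop ()
    | Some (Data d) -> out (escape d); loop ()
    | Some (End_tag n) -> out ("</" ^ n ^ ">"); loop ()
  in
  loop ()

(* Small in-memory driver, for demonstration only. *)
let render (lines : string list) : string =
  let b = Buffer.create 64 in
  write (Buffer.add_string b) (wrap "foo" (lines_to_events (List.to_seq lines)));
  Buffer.contents b
```

For a large input, `render` would be replaced by a call to `write` with `output_string` on an output channel and a sequence that reads the input file line by line, so that neither the input nor the output is held in memory as a whole. With real PXP, the same shape would use the preprocessor-generated automaton and the library's own event type instead of these stand-ins.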