yet another silly question on PXP
| Date: | -- (:) |
| From: | Gerd Stolpmann <gerd@g...> |
| Subject: | Re: :pxp_evpull notation (was: yet another silly question on PXP) |
On Friday, 25.02.2005, at 19:14 +0300, Paul Argentoff wrote:
> Dear Gerd Stolpmann,
>
> Let GS = "Gerd Stolpmann" in
> written_by GS =>
>
> GS> See the file doc/PREPROCESSOR which is part of the distribution
> GS> tarball.
>
> Thanks again for a reference. My next question is about :pxp_evpull
> notation. Can I make such a construct:
>
> let pile = <:pxp_evpull<
> <foo> (: some_fun () :) >>
>
> where some_fun generates a further "subtree" using the same pxp_evpull
> notation.
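Yes, this works. some_fun is called when the events for the children of
foo are generated. The expression some_fun () must evaluate to a pull
function,
  some_fun () : 'a -> Pxp_types.event option
(the generated code below applies it to the generator's own argument),
and it is called repeatedly until it returns None.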
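As a rough illustration of that contract, here is a minimal sketch of a function that could be plugged in as some_fun. It does not use PXP itself: the event type is a cut-down stand-in for Pxp_types.event (the real constructors carry more arguments), of_list is a hypothetical helper, and the unit argument of the pull function is an assumption of the sketch. Because the generated code further down evaluates some_fun () on every pull and applies the result to the generator's argument, the sketch creates the stateful generator once and returns that same function each time.

```ocaml
(* Stand-in for a few Pxp_types.event constructors; the real ones carry
   additional arguments (attributes, entity ids, ...). *)
type event =
  | E_start_tag of string
  | E_char_data of string
  | E_end_tag of string

(* Hypothetical helper: turn a list of events into a pull function that
   returns Some ev for each element and None once the list is used up. *)
let of_list (evs : event list) : unit -> event option =
  let rest = ref evs in
  fun () ->
    match !rest with
    | [] -> None
    | ev :: tl -> rest := tl; Some ev

(* The generator is built once, outside the function body, so every call
   to some_fun () returns the same stateful pull function instead of
   restarting the sequence. *)
let some_fun =
  let gen = of_list [E_start_tag "bar"; E_char_data "hi"; E_end_tag "bar"] in
  fun () -> gen
```

If some_fun built a fresh generator on each call, the enclosing loop would see the first event again and again and never reach None.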
pxp_evpull generates automata where every state returns an event.
External functions like some_fun are represented as loops, i.e. the
automaton stays in the same state while the function returns Some _,
and advances to the following state when it returns None.
For your example, <:pxp_evpull< <foo> (: some_fun () :) >>, the
automaton is:
let _ =
  let _eid = Pxp_dtd.Entity.create_entity_id () in
  let rec _generator =
    let _state = ref 0 in
    fun _arg ->
      match !_state with
        0 ->
          let ev = Pxp_types.E_start_tag ("foo", [], None, _eid) in
          _state := 1; Some ev
      | 1 ->
          begin match some_fun () _arg with
            None -> _state := 2; _generator _arg
          | Some Pxp_types.E_end_of_stream -> _generator _arg
          | Some ev -> Some ev
          end
      | 2 ->
          let ev = Pxp_types.E_end_tag ("foo", _eid) in _state := 3; Some ev
      | 3 -> None
      | _ -> assert false
  in
  _generator
(output generated with "camlp4 -I ... pa_o.cmo pa_op.cmo pcre.cma
unix.cma netstring.cma pxp_pp.cma pr_o.cmo sample.ml")
some_fun can even be another pxp_evpull automaton.
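To make the nesting concrete, the following is a hand-written sketch of the same state machine, not camlp4 output. The event type is again a simplified stand-in for Pxp_types.event, wrap and leaf are hypothetical names, and the special handling of E_end_of_stream seen in the generated code is left out.

```ocaml
(* Simplified stand-in for Pxp_types.event. *)
type event =
  | E_start_tag of string
  | E_char_data of string
  | E_end_tag of string

(* Analogue of the automaton above: state 0 emits the start tag, state 1
   loops on the inner generator, state 2 emits the end tag, and any later
   state reports exhaustion. *)
let wrap (name : string) (inner : unit -> event option) : unit -> event option =
  let state = ref 0 in
  let rec pull () =
    match !state with
    | 0 -> state := 1; Some (E_start_tag name)
    | 1 ->
      (match inner () with
       | None -> state := 2; pull ()
       | some_ev -> some_ev)
    | 2 -> state := 3; Some (E_end_tag name)
    | _ -> None
  in
  pull

(* A generator that yields a single text event and then None. *)
let leaf (text : string) : unit -> event option =
  let sent = ref false in
  fun () -> if !sent then None else (sent := true; Some (E_char_data text))

(* An automaton used as the child generator of another automaton. *)
let nested = wrap "foo" (wrap "bar" (leaf "hi"))
```

Pulling from nested until it returns None yields the start tag of foo, the start tag of bar, the text, and then the two end tags in reverse order; no intermediate list of events is ever built.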
> My task really is to build a converter from a huge (>100M) text file (or
> string Stream.t) to a huge XML file. Of course, I need to do the whole job
> with lazy streams to avoid out-of-memory exceptions.
Pull parsers are your friend. They were created with such applications
in mind.
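A minimal sketch of how such a converter might be organised, assuming one XML element per input line. It is not PXP code: the event type is a simplified stand-in for Pxp_types.event, and events_of_lines, escape, write_event and drain are hypothetical names introduced for illustration. Only the current line and at most two pending events are held in memory at any time.

```ocaml
(* Simplified stand-in for Pxp_types.event. *)
type event =
  | E_start_tag of string
  | E_char_data of string
  | E_end_tag of string

(* Pull function producing <line>text</line> for each input line.
   next_line returns None at end of input. *)
let events_of_lines (next_line : unit -> string option) : unit -> event option =
  let pending = ref [] in
  fun () ->
    match !pending with
    | ev :: tl -> pending := tl; Some ev
    | [] ->
      (match next_line () with
       | None -> None
       | Some s ->
         pending := [E_char_data s; E_end_tag "line"];
         Some (E_start_tag "line"))

(* Escape the characters that are significant in XML character data. *)
let escape (s : string) : string =
  let b = Buffer.create (String.length s) in
  String.iter
    (function
      | '<' -> Buffer.add_string b "&lt;"
      | '>' -> Buffer.add_string b "&gt;"
      | '&' -> Buffer.add_string b "&amp;"
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

(* Serialise one event through the given output function. *)
let write_event (out : string -> unit) = function
  | E_start_tag n -> out ("<" ^ n ^ ">")
  | E_char_data s -> out (escape s)
  | E_end_tag n -> out ("</" ^ n ^ ">")

(* Pull events one at a time and write each immediately (tail-recursive). *)
let rec drain (pull : unit -> event option) (out : string -> unit) : unit =
  match pull () with
  | None -> ()
  | Some ev -> write_event out ev; drain pull out
```

With real files, next_line could wrap input_line on an in_channel (returning None on End_of_file) and out could be output_string on an out_channel, so the input is never loaded as a whole.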
Gerd
--
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
gerd@gerd-stolpmann.de http://www.gerd-stolpmann.de
------------------------------------------------------------