| From: | Daniel_Bünzli <daniel.buenzli@e...> |
| Subject: | Re: [Caml-list] xpath or alternatives |
Sorry for the late reply.
On Wed, Sep 30, 2009 at 01:00:15AM +0200, Mikkel Fahnøe Jørgensen wrote:
> Otherwise there is xmlm which is self-contained in single xml file,
> and as I recall, has some sort of zipper navigator. (I initially
> intended to use it before deciding on the json format):
The cursor api was removed from the library in 1.0.0.
On Wed, Sep 30, 2009 at 6:16 PM, Richard Jones <rich@annexia.org> wrote:
> It's interesting you mention xmlm, because I couldn't write
> the code using xmlm at all.
Why? That doesn't feel like an insurmountable task.
Below is a function that, given the sequence of signals of a
(sub)tree, extracts the data of the given attributes on the elements
matching an absolute path (i.e. the particular xpath pattern you're
after, if I understand correctly). Each attribute's data is stored in
a separate list. The function is simpler than it looks; in essence
it's just a recursive case analysis on signals. In the function [aux],
[pos] maintains the current path in the parse tree and [mismatch]
counts the levels of mismatch w.r.t. the [path] we are looking for.
let absolute_path_atts i path atts =
  let rec aux i pos mismatch path accs = match Xmlm.input i with
  | `El_start (tag, atts) ->
      if mismatch > 0 then aux i (tag :: pos) (mismatch + 1) path accs else
      begin match path with
      | n :: path' when n = tag ->
          if path' <> [] then aux i (tag :: pos) 0 path' accs else
          let update_acc ((att, acc) as v) =
            try att, (List.assoc att atts) :: acc with Not_found -> v
          in
          aux i (tag :: pos) 0 [] (List.map update_acc accs)
      | _ -> aux i (tag :: pos) (mismatch + 1) path accs
      end
  | `El_end ->
      begin match pos with
      | _ :: [] -> List.rev_map (fun (att, acc) -> List.rev acc) accs
      | tag :: pos' ->
          if mismatch > 0 then aux i pos' (mismatch - 1) path accs else
          aux i pos' 0 (tag :: path) accs
      | [] -> assert false
      end
  | `Data _ -> aux i pos mismatch path accs
  | `Dtd _ -> assert false
  in
  let accs = List.rev_map (fun att -> att, []) atts in
  begin match Xmlm.peek i with
  | `El_start _ -> aux i [] 0 path accs
  | `Dtd _ | `El_end | `Data _ -> invalid_arg "no subtree here"
  end
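To see the [pos]/[mismatch] bookkeeping in isolation, here is a
minimal sketch of the same case analysis that runs without the
library. The [signal] type, the [path_att] function and the sample
[doc] are hypothetical stand-ins of mine, not part of Xmlm: signals
come from a plain list rather than an Xmlm.input, and only one
attribute is collected.

```ocaml
(* Hypothetical stand-in for Xmlm's signals, so the sketch needs no library. *)
type name = string * string
type signal =
  | El_start of name * (name * string) list
  | El_end
  | Data of string

(* Collect the values of [att] on elements reached by the absolute [path]. *)
let path_att signals path att =
  let rec aux signals pos mismatch path acc = match signals with
  | [] -> List.rev acc
  | El_start (tag, atts) :: rest ->
      (* Inside a mismatching subtree: just track the depth. *)
      if mismatch > 0 then aux rest (tag :: pos) (mismatch + 1) path acc else
      begin match path with
      | n :: path' when n = tag ->
          if path' <> [] then aux rest (tag :: pos) 0 path' acc else
          (* Whole path matched: record the attribute if present. *)
          let acc = match List.assoc_opt att atts with
          | Some v -> v :: acc
          | None -> acc
          in
          aux rest (tag :: pos) 0 [] acc
      | _ -> aux rest (tag :: pos) (mismatch + 1) path acc
      end
  | El_end :: rest ->
      begin match pos with
      | [] -> invalid_arg "unbalanced signals"
      | tag :: pos' ->
          if mismatch > 0 then aux rest pos' (mismatch - 1) path acc else
          (* Leaving a matched element: put its name back on the path. *)
          aux rest pos' 0 (tag :: path) acc
      end
  | Data _ :: rest -> aux rest pos mismatch path acc
  in
  aux signals [] 0 path []

(* Made-up document: only the first <source> is on the path and has "dev". *)
let doc = [
  El_start (("", "domain"), []);
    El_start (("", "devices"), []);
      El_start (("", "disk"), []);
        El_start (("", "source"), [("", "dev"), "/dev/sda"]); El_end;
      El_end;
      El_start (("", "interface"), []);
        El_start (("", "source"), [("", "dev"), "eth0"]); El_end;
      El_end;
      El_start (("", "disk"), []);
        El_start (("", "source"), [("", "file"), "/img"]); El_end;
      El_end;
    El_end;
  El_end ]

let devs =
  path_att doc ["", "domain"; "", "devices"; "", "disk"; "", "source"]
    ("", "dev")
(* devs = ["/dev/sda"] *)
```

The <source> under <interface> is skipped because [mismatch] stays
positive for the whole <interface> subtree, even though its tag name
and attribute would otherwise match.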
Now your function becomes something like this:
let get_devices_from_xml xml =
  try
    let i = Xmlm.make_input (`String (0, xml)) in
    ignore (Xmlm.input i); (* `Dtd signal *)
    let path = ["", "domain"; "", "devices"; "", "disk"; "", "source"] in
    match absolute_path_atts i path ["", "dev"; "", "file"] with
    | [devs; files] when Xmlm.eoi i -> devs @ files
    | _ -> failwith "xml document not well-formed"
  with
  | Xmlm.Error ((l,c), e) ->
      failwith (Printf.sprintf "%d:%d: %s" l c (Xmlm.error_message e))
I know this is still more effort than you'd like, but Xmlm is
purposely low-level and will remain so. It provides only a robust xml
parser, convenient (I believe) for developing higher-level
abstractions to process the insane uses of this standard. It would be
nice to develop a module using xmlm to provide a (non-camlp4) dsl for
xml queries. Unfortunately I do not have the time for that at the
moment (unless someone wants to fund me to do that...).
Best,
Daniel