[OSR] Suggested topic - XML processing API
| Date: | -- (:) |
| From: | Bünzli_Daniel <daniel.buenzli@e...> |
| Subject: | Re: [Caml-list] [OSR] Suggested topic - XML processing API |
On 5 Feb 08 at 06:02, Alain Frisch wrote:

> As suggested before, you really need to say something, at least,
> about:
[...]
> - Whether character references and predefined entity references
> must be resolved.

Hint: yes.

On 5 Feb 08 at 06:02, Alain Frisch wrote:

> - having a common spec for several libs makes more sense if they can
> share common types; maybe you should use polymorphic variants
> instead of regular ones?

Agreed. In xmlm these variants become polymorphic in the next version.

Other comments.

* IMHO, do not use camel casing. Underscores are more caml-like, e.g. xml_node, etc.

* Regarding naming, I would call xmlNode xml_tree and in general drop the xml prefix from the cases.

* "combine" argument: in my opinion the parser should always combine adjacent pcdata nodes.

* As others may now know, I don't like to raise exceptions; the next version of xmlm doesn't raise exceptions (but given recent discussions it seems others do like exceptions).

* Regarding the way the parser is invoked, I don't like the way it is done:
  (1) The function "parse" can only be used with channels; this is not good.
  (2) A convenience function like parse_file is always useless to me, since it is hard to know the exact kind of error handling performed by such functions without looking at their source.

The way I do this kind of thing is to define an input abstraction type. First you create an input abstraction from a data source (e.g. in_channel, strings, and a callback source) and then you invoke the parser with the input abstraction (actually I started an OSR on devising IO modules with non-object-oriented IO sources and destinations reflecting this view, but I'm reluctant to publish it).

In general I'd like to say that I'm a little bit dubious about this effort. Actually I would refrain from formalizing the actual way the parser is invoked; clients can also perform their bit of work.
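
To make the input abstraction idea concrete, here is a minimal sketch. All names (input, input_of_string, input_of_channel, input_of_fun, count_open_angles) are hypothetical and are not taken from xmlm or from the proposed spec; the "parser" is only a stand-in that counts '<' characters, to show that the consumer never needs to know where the bytes come from.

```ocaml
(* Hypothetical input abstraction: a function that returns the next
   character, or None at end of input. It does not raise. *)
type input = unit -> char option

(* Build an input from a string. *)
let input_of_string (s : string) : input =
  let pos = ref 0 in
  fun () ->
    if !pos >= String.length s then None
    else begin
      let c = s.[!pos] in
      incr pos;
      Some c
    end

(* Build an input from a channel. End_of_file is turned into None;
   the channel is neither opened nor closed here, that is left to
   the client. *)
let input_of_channel (ic : in_channel) : input =
  fun () -> try Some (input_char ic) with End_of_file -> None

(* Build an input from an arbitrary callback source. *)
let input_of_fun (f : unit -> char option) : input = f

(* Stand-in for a parser: it only sees the abstraction, not the
   underlying data source. Here it just counts '<' characters. *)
let count_open_angles (i : input) : int =
  let rec loop n =
    match i () with
    | None -> n
    | Some '<' -> loop (n + 1)
    | Some _ -> loop n
  in
  loop 0
```

With this shape, reading from a file is the client's business: it opens the channel, wraps it with input_of_channel, calls the parser, and handles errors and closing itself, so no parse_file convenience function is needed.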
I would concentrate on defining:

1) Parsing _result_ types and a precise definition of the actual _form_ of the data they contain. More than one form may be defined. This is the most important thing if you would like to be able to switch implementations; the actual input procedure can easily be isolated from the rest of your source.

2) A minimal list of input sources (e.g. in_channel and string) from which the parser should be able to read, without going into further detail on how the actual input procedure should be performed. Just specify the state in which sources are accepted for input and left after output.

Best,

Daniel
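
As an illustration of what a shared result type and a precise statement about its form could look like, here is a small sketch. The type and constructor names (tree, `El, `Data, combine) are assumptions made for this example only; they come neither from the proposal under discussion nor from xmlm. The sketch uses polymorphic variants so that several libraries could share the type without depending on a common module, and it spells out one possible "form" guarantee: no two adjacent character data nodes.

```ocaml
(* Hypothetical result type, for illustration only. *)
type name = string
type attribute = name * string
type tree =
  [ `El of name * attribute list * tree list  (* element, attributes, children *)
  | `Data of string ]                         (* character data *)

(* Normalize a list of children so that adjacent `Data nodes are
   merged, recursively in all sub-elements. *)
let rec combine (children : tree list) : tree list =
  match children with
  | `Data a :: `Data b :: rest -> combine (`Data (a ^ b) :: rest)
  | `El (n, atts, cs) :: rest -> `El (n, atts, combine cs) :: combine rest
  | (`Data _ as d) :: rest -> d :: combine rest
  | [] -> []
```

A spec could then state that every conforming parser returns trees on which combine is the identity, which is the kind of precise definition of form that lets clients switch implementations.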