[OSR] Suggested topic - XML processing API
| Date: | -- (:) |
| From: | Alain Frisch <alain@f...> |
| Subject: | Re: [Caml-list] [OSR] Suggested topic - XML processing API |
Jim Miller wrote:
> type xmlNode =
>   | XmlElement of (namespace: string * tagName: string * attributes:
>       (string * string) list * (children:xmlNode list) )
>   | XmlPCData of (text:string)

There have been some discussions here a while ago about standardizing
XML types across OCaml libraries. You might want to look up the
archives. Here are some random remarks.

First, you need to specify several things in the type above.

- the encoding of strings; if the parser cannot be configured, I guess
  that normalizing everything to utf-8 is the most natural choice.

- the handling of namespaces; does the first argument to XmlElement
  refer to the namespace prefix as used in the document (it'd make
  matching impossible because the document can use arbitrary prefixes),
  a normalized version (you'd need to provide the parser with more
  info), or the namespace URI (which makes pattern matching quite
  tedious)? Also, it is sometimes necessary to keep the [prefix->uri]
  dictionary available at every node (e.g. to deal with XML Schema
  documents, where prefixes can be used in attribute values). Moreover,
  some XML documents may be valid w.r.t. the XML spec without
  conforming to the XML Namespaces one.

- whether adjacent XmlPCData nodes are allowed or not.

- whether the parser performs whitespace normalization (and how).

Also, in many cases, the client of the parser might want to get more
information, like locations in the source document.

If you intend to use the same type to produce XML documents from an
internal representation, I think you might want to add an extra
constructor:

  | XmlMany of xmlNode list

This makes it much easier to build and compose XML fragments in a
modular way. Also, you need to specify how the XML printer is supposed
to deal with namespaces.

-- Alain
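
To make the XmlMany suggestion concrete, here is a minimal sketch in valid OCaml syntax. It is not taken from any existing library: the record field names, the choice of storing the namespace URI (rather than the prefix), the UTF-8 assumption, and the normalize_list helper are all hypothetical. The helper flattens XmlMany nodes and merges adjacent XmlPCData nodes, which is one possible answer to the "adjacent XmlPCData" question above.

```ocaml
(* Hypothetical sketch; names and conventions are assumptions. *)
type xml_node =
  | XmlElement of element
  | XmlPCData of string            (* assumed to be UTF-8 *)
  | XmlMany of xml_node list       (* fragment: a sequence of sibling nodes *)

and element = {
  ns : string;                     (* namespace URI, "" if none (assumption) *)
  tag : string;                    (* local name *)
  attrs : (string * string) list;
  children : xml_node list;
}

(* Flatten XmlMany nodes (recursively, including inside elements) and
   merge adjacent XmlPCData nodes, yielding a canonical list of siblings. *)
let rec normalize_list (nodes : xml_node list) : xml_node list =
  merge (List.concat (List.map flatten nodes))

(* Expand one node into the list of siblings it stands for. *)
and flatten = function
  | XmlMany l -> List.concat (List.map flatten l)
  | XmlPCData _ as n -> [ n ]
  | XmlElement e ->
      [ XmlElement { e with children = normalize_list e.children } ]

(* Concatenate runs of adjacent text nodes into a single text node. *)
and merge = function
  | XmlPCData a :: XmlPCData b :: rest -> merge (XmlPCData (a ^ b) :: rest)
  | n :: rest -> n :: merge rest
  | [] -> []

(* Usage: fragments built independently compose without list plumbing. *)
let item s =
  XmlElement { ns = ""; tag = "li"; attrs = []; children = [ XmlPCData s ] }

let fragment = XmlMany [ item "a"; XmlMany [ item "b"; item "c" ] ]

let doc =
  XmlElement { ns = ""; tag = "ul"; attrs = []; children = [ fragment ] }
```

With this design, functions that produce XML can return a single xml_node even when they generate zero or several siblings, and a printer (or a consumer that wants to pattern-match) can call normalize_list first so that it never sees XmlMany or two text nodes in a row.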