XML library for validating MathML
From: Till Varoquaux <till.varoquaux@g...>
Subject: Re: [Caml-list] XML library for validating MathML

PXP is tough to work with and feels a bit crazy, but it is good with
standards (it can sort out any DTD I have ever thrown at it). xml-light
is, well, very broken (it doesn't even support charcode switching).

There are several XML parsers in OCaml and I've had a stint with a few of
them; the only two I would consider using are expat and PXP, with a marked
preference for the latter. PXP can be very confusing and feels
over-engineered at times, but it does the job. And remember, parsing XML is
a hard job, much harder than we often give it credit for...

Hats off to Gerd for providing us with a proper parser.

Till

On Thu, Sep 18, 2008 at 9:38 AM, Vincent Hanquez <tab@snarc.org> wrote:
> On Wed, Sep 17, 2008 at 11:58:05AM -0700, Dario Teixeira wrote:
>> Given a string containing a mathematical expression in the MathML
>> markup, I need to verify that the expression is indeed valid MathML.
>> I am therefore looking for an XML library that can verify an expression
>> against a given DTD.
>>
>> Now, I have tried Xml-light, and the code I used is listed below.
>> Unfortunately, it fails when trying to parse MathML's DTD (it's the
>> standard DTD from the W3C). I have tried simpler DTDs, and it does work
>> with them; am I therefore correct in assuming that Xml-light can only
>> handle a particular version/subset of DTD features?
>
> I don't know about validation (I'd probably suggest looking at PXP, though),
> but xml-light is very bad for XML compliance. The library is (happily) parsing
> XML files that it shouldn't, which tells a lot about its validation
> abilities ...
>
> For example, the XML supported character range is not even checked:
>
> XML 1.0 specification -- 2.2 Characters
>
> Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] |
>          [#xE000-#xFFFD] | [#x10000-#x10FFFF]
>
> Other problems include (incomplete list):
> - complete Unicode unawareness
> - funny & wrong entity handling
>
> --
> Vincent
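
For illustration, the character-range check that the quoted Char production calls for can be sketched in a few lines of OCaml. This is only a sketch: the names is_xml_char and first_illegal are hypothetical and do not come from xml-light, PXP, or expat, and the input is assumed to be already decoded into Unicode code points.

```ocaml
(* Sketch only: is_xml_char and first_illegal are hypothetical names,
   not part of xml-light, PXP, or expat. *)

(* True if the code point is allowed by the XML 1.0 Char production:
   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] *)
let is_xml_char (cp : int) : bool =
  cp = 0x9 || cp = 0xA || cp = 0xD
  || (cp >= 0x20 && cp <= 0xD7FF)
  || (cp >= 0xE000 && cp <= 0xFFFD)
  || (cp >= 0x10000 && cp <= 0x10FFFF)

(* Return the first code point that violates the production, if any.
   Assumes the input has already been decoded into code points. *)
let first_illegal (cps : int list) : int option =
  List.find_opt (fun cp -> not (is_xml_char cp)) cps
```

A parser that performs this check would reject, for instance, a document containing a NUL character or a lone surrogate such as #xD800, both of which fall outside every range in the production.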