XML library for validating MathML
From: Till Varoquaux <till.varoquaux@g...>
Subject: Re: [Caml-list] XML library for validating MathML
PXP is tough to work with and feels a bit crazy, but it is good with
standards (it can sort out any DTD I have ever thrown at it).
xml-light is, well, very broken (it doesn't even support charcode
switching). There are several XML parsers in OCaml and I've had a
stint with a few of them; the only two I would consider using are
expat and PXP, with a marked preference for the latter. PXP can be very
confusing and feels over-engineered at times, but it does the job. And
remember that parsing XML is a hard job, much harder than we often give
it credit for.

Hats off to Gerd for providing us with a proper parser.

Till

On Thu, Sep 18, 2008 at 9:38 AM, Vincent Hanquez <tab@snarc.org> wrote:
> On Wed, Sep 17, 2008 at 11:58:05AM -0700, Dario Teixeira wrote:
>> Given a string containing a mathematical expression in the MathML
>> markup, I need to verify that the expression is indeed valid MathML.
>> I am therefore looking for an XML library that can verify an expression
>> against a given DTD.
>>
>> Now, I have tried Xml-light, and the code I used is listed below.
>> Unfortunately, it fails when trying to parse MathML's DTD (it's the
>> standard DTD from the W3C).  I have tried simpler DTDs, and it does work
>> with them; am I therefore correct in assuming that Xml-light can only
>> handle a particular version/subset of DTD features?
>
> I don't know about validation (I'd probably suggest looking at PXP, though),
> but xml-light is very bad at XML compliance. The library (happily) parses
> XML files that it shouldn't, which tells you a lot about its validation
> abilities ...
>
> For example, the character range allowed by XML is not even checked:
>
> XML 1.0 specification, section 2.2 (Characters)
>
> Char       ::=          #x9 | #xA | #xD | [#x20-#xD7FF] |
>                [#xE000-#xFFFD] | [#x10000-#x10FFFF]
>
> Other problems include (incomplete list):
> - complete lack of Unicode awareness
> - odd and incorrect entity handling
>
> --
> Vincent
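
The Char production quoted above translates almost directly into a range check. Here is a minimal sketch in plain OCaml, assuming the input has already been decoded into Unicode code points (the check is meaningless on raw bytes). The names is_xml_char and all_xml_chars are made up for illustration and do not come from xml-light, PXP, or expat.

```ocaml
(* Sketch only: is this code point allowed by the XML 1.0 Char production?
   Hypothetical helper, not part of any XML library. *)
let is_xml_char (cp : int) : bool =
  cp = 0x9 || cp = 0xA || cp = 0xD
  || (cp >= 0x20 && cp <= 0xD7FF)
  || (cp >= 0xE000 && cp <= 0xFFFD)
  || (cp >= 0x10000 && cp <= 0x10FFFF)

(* True when every code point in the list is a legal XML 1.0 character.
   Assumes the caller has already decoded the document's encoding. *)
let all_xml_chars (cps : int list) : bool =
  List.for_all is_xml_char cps
```

A conforming parser would have to reject, for instance, a document containing code point 0x1 or an unpaired surrogate such as 0xD800; a parser that skips this check accepts both silently.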