Serialisation of PXP DTDs
Date: -- (:)
From: Gerd Stolpmann <info@g...>
Subject: Re: [Caml-list] Re: Serialisation of PXP DTDs

On Thursday, 23.10.2008 at 23:05 +0200, Mauricio Fernandez wrote:
> I have been working for a while on a self-describing, compact, extensible
> binary protocol, along with an OCaml implementation which I intend to release
> before too long.
> 
> It differs from sexplib and bin-prot in two main ways:
> * the data model is deliberately more limited, as the format is meant to be
>   de/encodable in multiple languages.
> * it is extensible at several levels, achieving both forward and backward
>   compatibility across changes in the data type
> 
> You can think of it as an extensible Protocol Buffers[1] with a richer data
> model (albeit not in 1:1 accordance with OCaml's for the above mentioned
> reason).

Have you looked at ICEP (see zeroc.com)? It has bindings for many
languages, even for OCaml (http://oss.wink.com/hydro/).

It is, however, not self-describing. Still, you may find ideas for
portability there.

Gerd

> In the criteria you gave in another message, namely
> (1) ease of use
> (2) "future-proofness"
> (3) portability
> (4) human-readability,
> 
> it does fairly well at the first three, especially at (2) and (3), which
> were poorly supported by existing solutions (I looked into bin-prot, sexplib,
> Google's Protocol Buffers, Thrift and XDR; I also referred to IIOP and ITU-T
> X.690 DER during the design). Being a binary format, it obviously doesn't do
> that well at (4), but it is possible to get a human-readable dump of the
> binary data even in the absence of the interface definition, making
> reverse-engineering no harder than sexplib (and arguably easier in some ways).
> 
> For example, here's a bogus message definition to illustrate (2) and (4).
> This protocol definition is fed to the compiler, which generates the OCaml
> type definitions, as well as the encoders/decoders and pretty-printers (as you
> can see, the specification uses a mix of OCaml, Haskell and C++ syntax, but
> it's pretty clear IMO)
> 
>     type sum_type 'a 'b 'c = A 'a | B 'b | C 'c
> 
>     message complex_rtt =
>       A {
>         a1 : [(int * [|bool|])];
>         a2 : [ sum_type<int, string, long> ]
>       }
>     | B {
>         b1 : bool;
>         b2 : (string * [int])
>       }
> 
> The protocol is extensible in the sense that you can add new constructors to a
> sum or message type, add new elements to a tuple, and replace any primitive
> type by a sum type including the original type. For instance, if at some point
> in time we find that the b1 field should have a different type, we can do
> 
>     type bool_or_something 'a = Orig unboxed_bool | New_constructor 'a
> 
> and then 
>    ...
>    | B { b1 : bool_or_something<some_type>; ... }
> 
> This, along with a way to specify default values, allows both forward and
> backward compatibility.
> 
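As a rough illustration of the widening step described above, here is a minimal OCaml sketch of a decoder for the b1 field that accepts both the old and the new encoding. The `value` type, the constructor tags 0 and 1, and the `decode_b1` function are assumptions made for this example only; they are not taken from the actual format or from the generated code.

```ocaml
(* Hypothetical wire-level value; NOT the real format from the post. *)
type value =
  | Vbool of bool
  | Vint of int
  | Tagged of int * value list  (* constructor tag and its arguments *)

type 'a bool_or_something = Orig of bool | New_constructor of 'a

(* Decode field b1.
   - A bare bool is what an old writer would have sent: map it to Orig.
   - Tag 0 / tag 1 are assumed tags for Orig / New_constructor.
   - Anything else (e.g. a constructor added later) yields the default. *)
let decode_b1 ~(decode_new : value -> 'a option)
    ~(default : 'a bool_or_something) (v : value) : 'a bool_or_something =
  match v with
  | Vbool b -> Orig b
  | Tagged (0, [ Vbool b ]) -> Orig b
  | Tagged (1, [ x ]) -> (
      match decode_new x with
      | Some y -> New_constructor y
      | None -> default)
  | _ -> default

(* Usage: the new payload type is int in this example. *)
let decode_int = function Vint n -> Some n | _ -> None

let from_old_writer =
  decode_b1 ~decode_new:decode_int ~default:(Orig false) (Vbool true)

let from_new_writer =
  decode_b1 ~decode_new:decode_int ~default:(Orig false)
    (Tagged (1, [ Vint 42 ]))

let from_future_writer =
  decode_b1 ~decode_new:decode_int ~default:(Orig false) (Tagged (7, []))
```

The first case gives backward compatibility (new readers accept old data), while the default branch is one possible way to get forward compatibility (old readers tolerate constructors they do not know).
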
> The compiler generates a pretty printer for these structures, useful for
> debugging. Here's a message generated randomly:
> 
> {
>   Complex_rtt.a1 =
>    [ ((-5378), [| false; false; false; true; true |]);
>      (3942717140522000971, [| false; true; true; true; false |]);
>      ((-6535386320450295), [| false |]); ((-238860767206), [|  |]);
>      (1810196202, [| false; false; true; true |]) ];
>   Complex_rtt.a2 =
>    [ Sum_type.A (-13830); Sum_type.A 369334576; Sum_type.A 83;
>      Sum_type.A (-3746796577167465774); Sum_type.A (-1602586945) ] }
> 
> Now, this is the information decoded in the absence of the above definitions
> (in other words, what you'd have to work with if you were
> reverse-engineering the protocol):
> 
> T0 {
>      T0 [
>           T0 { Vint_t0 (-5378);
>                T0 [ Vint_t0 0; Vint_t0 0; Vint_t0 0; Vint_t0 (-1);
>                     Vint_t0 (-1)]};
>           T0 { Vint_t0 3942717140522000971;
>                T0 [ Vint_t0 0; Vint_t0 (-1); Vint_t0 (-1); Vint_t0 (-1);
>                     Vint_t0 0]};
>           T0 { Vint_t0 (-6535386320450295); T0 [ Vint_t0 0]};
>           T0 { Vint_t0 (-238860767206); T0 [ ]};
>           T0 { Vint_t0 1810196202;
>                T0 [ Vint_t0 0; Vint_t0 0; Vint_t0 (-1); Vint_t0 (-1)]}];
>      T0 [ T0 { Vint_t0 (-13830)}; T0 { Vint_t0 369334576}; T0 { Vint_t0 83};
>           T0 { Vint_t0 (-3746796577167465774)}; T0 { Vint_t0 (-1602586945)}]}
> 
> (I'm still changing some details so it might look better than this shortly.)
> 
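To make the relation between the typed pretty-printer output and the schema-less dump more concrete, here is a small OCaml sketch that lowers one `(int * [|bool|])` element to a generic tree and prints it in a layout resembling the dump above. The `generic` type, the names `Tuple` and `Htuple`, and the encoding of booleans as 0 / -1 are guesses inferred from the sample output, not the real implementation.

```ocaml
(* Hypothetical generic tree, inferred from the dump; names are invented. *)
type generic =
  | Vint of int
  | Tuple of int * generic list   (* tag, fields: printed as T<tag> { ... } *)
  | Htuple of int * generic list  (* tag, elements: printed as T<tag> [ ... ] *)

(* Print a generic tree without any knowledge of the interface definition. *)
let rec show = function
  | Vint n when n < 0 -> Printf.sprintf "Vint_t0 (%d)" n
  | Vint n -> Printf.sprintf "Vint_t0 %d" n
  | Tuple (tag, fields) ->
      Printf.sprintf "T%d { %s}" tag
        (String.concat "; " (List.map show fields))
  | Htuple (tag, elems) ->
      Printf.sprintf "T%d [ %s]" tag
        (String.concat "; " (List.map show elems))

(* Assumed lowering: false -> 0, true -> -1, as the sample dump suggests. *)
let of_bool b = Vint (if b then -1 else 0)

(* Lower one element of field a1, i.e. an (int * bool array) pair. *)
let of_pair (n, flags) =
  Tuple (0, [ Vint n; Htuple (0, List.map of_bool (Array.to_list flags)) ])

let example = show (of_pair (-5378, [| false; true |]))
(* example = "T0 { Vint_t0 (-5378); T0 [ Vint_t0 0; Vint_t0 (-1)]}" *)
```

Note how the type information (bool versus int, array versus list) is gone after lowering, which is why the schema-less dump shows only integers and nested T0 groups.
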
> It's not a drop-in solution like sexplib's "with sexp", by design (since it is
> meant to allow interoperability between different languages), but it's still
> fairly easy to use.
> 
> If you're interested in this, tell me and I'll let you know when it's ready for
> serious usage.
> 
> [1] http://code.google.com/p/protobuf/
> 
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------