Serialisation of PXP DTDs
From: Gerd Stolpmann <info@g...>
Subject: Re: [Caml-list] Re: Serialisation of PXP DTDs
On Thursday, 23.10.2008, at 23:05 +0200, Mauricio Fernandez wrote:
> I have been working for a while on a self-describing, compact, extensible
> binary protocol, along with an OCaml implementation which I intend to release
> before too long.
>
> It differs from sexplib and bin-prot in two main ways:
> * the data model is deliberately more limited, as the format is meant to be
> de/encodable in multiple languages.
> * it is extensible at several levels, achieving both forward and backward
> compatibility across changes in the data type
>
> You can think of it as an extensible Protocol Buffers[1] with a richer data
> model (albeit not in 1:1 accordance with OCaml's, for the above-mentioned
> reason).
Have you looked at ICEP (see zeroc.com)? It has bindings for many
languages, even for OCaml (http://oss.wink.com/hydro/).
It is not self-describing, however. Still, you may find ideas for
portability there.
Gerd
> Regarding the criteria you gave in another message, namely
> (1) ease of use
> (2) "future-proofness"
> (3) portability
> (4) human-readability,
>
> it does fairly well at the first three, especially at (2) and (3), which
> were poorly supported by existing solutions (I looked into bin-prot, sexplib,
> Google's Protocol Buffers, Thrift and XDR; I also referred to IIOP and ITU-T
> X.690 DER during the design). Being a binary format, it obviously doesn't do
> that well at (4), but it is possible to get a human-readable dump of the
> binary data even in the absence of the interface definition, making
> reverse-engineering no harder than sexplib (and arguably easier in some ways).
>
> For example, here's a bogus message definition to illustrate (2) and (4).
> This protocol definition is fed to the compiler, which generates the OCaml
> type definitions, as well as the encoders/decoders and pretty-printers (as you
> can see, the specification uses a mix of OCaml, Haskell and C++ syntax, but
> it's pretty clear IMO)
>
> type sum_type 'a 'b 'c = A 'a | B 'b | C 'c
>
> message complex_rtt =
> A {
> a1 : [(int * [|bool|])];
> a2 : [ sum_type<int, string, long> ]
> }
> | B {
> b1 : bool;
> b2 : (string * [int])
> }
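
To picture what the compiler might emit for the definition above, here is a rough guess at the corresponding OCaml types. It is only a sketch based on the names visible in the pretty-printed message further down (Complex_rtt.a1, Sum_type.A). The module layout, the record type names a and b, and the mapping of long to int64 are assumptions, not the actual generated code.

```ocaml
(* Hypothetical rendering of the protocol definition as OCaml types.
   Module and record names are assumptions made for illustration. *)
module Sum_type = struct
  type ('a, 'b, 'c) t = A of 'a | B of 'b | C of 'c
end

module Complex_rtt = struct
  (* [x] in the definition is taken to mean a list, [|x|] an array. *)
  type a = {
    a1 : (int * bool array) list;
    a2 : (int, string, int64) Sum_type.t list;  (* long assumed to be int64 *)
  }
  type b = { b1 : bool; b2 : string * int list }
  type t = A of a | B of b
end

(* A small hand-written value of the A variant. *)
let sample : Complex_rtt.t =
  Complex_rtt.A
    { Complex_rtt.a1 = [ (-5378, [| false; false; false; true; true |]);
                         (83, [||]) ];
      a2 = [ Sum_type.A (-13830); Sum_type.A 83 ] }

(* Number of entries in a1, or 0 for the B variant. *)
let count_a1 = function
  | Complex_rtt.A r -> List.length r.Complex_rtt.a1
  | Complex_rtt.B _ -> 0
```

With types of this shape, encoders, decoders and pretty-printers can be written per module, which would fit the qualified names in the printed output.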
>
> The protocol is extensible in the sense that you can add new constructors to a
> sum or message type, add new elements to a tuple, and replace any primitive
> type by a sum type including the original type. For instance, if at some point
> in time we find that the b1 field should have a different type, we can do
>
> type bool_or_something 'a = Orig unboxed_bool | New_constructor 'a
>
> and then
> ...
> | B { b1 : bool_or_something<some_type>; ... }
>
> This, along with a way to specify default values, allows both forward and
> backward compatibility.
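
One way such an extension could behave at decode time is sketched below. The wire representation here is invented for illustration: the value type, the tag numbers (0 for the original primitive, 1 for the new constructor) and both function names are assumptions, since the message does not specify the encoding. The only detail taken from the message is that a bool shows up on the wire as an integer, 0 for false and -1 for true.

```ocaml
(* Hypothetical generic wire value: every node carries a numeric tag. *)
type value =
  | Vint of int * int            (* tag, integer payload *)
  | Tagged of int * value list   (* tag, nested elements *)

type 'a bool_or_something = Orig of bool | New_constructor of 'a

(* New reader: accepts data from old writers (a bare bool, assumed tag 0)
   as well as the new constructor (assumed tag 1). *)
let decode_b1 (decode_new : value -> 'a option) (v : value)
  : 'a bool_or_something option =
  match v with
  | Vint (0, n) -> Some (Orig (n <> 0))
  | Tagged (1, [ payload ]) ->
      Option.map (fun x -> New_constructor x) (decode_new payload)
  | _ -> None

(* Old reader: only knows the bool; anything else falls back to a default. *)
let decode_b1_old ~(default : bool) (v : value) : bool =
  match v with
  | Vint (0, n) -> n <> 0
  | _ -> default
```

The new reader still understands old data (backward compatibility), and the old reader survives new data by substituting the default (forward compatibility).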
>
> The compiler generates a pretty printer for these structures, useful for
> debugging. Here's a message generated randomly:
>
> {
> Complex_rtt.a1 =
> [ ((-5378), [| false; false; false; true; true |]);
> (3942717140522000971, [| false; true; true; true; false |]);
> ((-6535386320450295), [| false |]); ((-238860767206), [| |]);
> (1810196202, [| false; false; true; true |]) ];
> Complex_rtt.a2 =
> [ Sum_type.A (-13830); Sum_type.A 369334576; Sum_type.A 83;
> Sum_type.A (-3746796577167465774); Sum_type.A (-1602586945) ] }
>
> Now, this is the information decoded in the absence of the above definitions
> (in other words, what you'd have to work with if you were reverse-engineering
> the protocol):
>
> T0 {
> T0 [
> T0 { Vint_t0 (-5378);
> T0 [ Vint_t0 0; Vint_t0 0; Vint_t0 0; Vint_t0 (-1);
> Vint_t0 (-1)]};
> T0 { Vint_t0 3942717140522000971;
> T0 [ Vint_t0 0; Vint_t0 (-1); Vint_t0 (-1); Vint_t0 (-1);
> Vint_t0 0]};
> T0 { Vint_t0 (-6535386320450295); T0 [ Vint_t0 0]};
> T0 { Vint_t0 (-238860767206); T0 [ ]};
> T0 { Vint_t0 1810196202;
> T0 [ Vint_t0 0; Vint_t0 0; Vint_t0 (-1); Vint_t0 (-1)]}];
> T0 [ T0 { Vint_t0 (-13830)}; T0 { Vint_t0 369334576}; T0 { Vint_t0 83};
> T0 { Vint_t0 (-3746796577167465774)}; T0 { Vint_t0 (-1602586945)}]}
>
> (I'm still changing some details so it might look better than this shortly.)
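
The shape of that untyped dump can be reproduced with a small generic value tree and a printer. This is a sketch that assumes only what the dump itself shows: every tag is 0, tuples print in braces, lists and arrays in brackets, and booleans arrive as the integers 0 and -1. The constructor names Tuple and Array and the functions dump and of_bool are hypothetical.

```ocaml
(* Hypothetical untyped tree, enough to mimic the dump format above. *)
type value =
  | Vint of int64
  | Tuple of value list   (* printed as T0 { ...; ... } *)
  | Array of value list   (* printed as T0 [ ...; ... ] *)

(* Booleans are not distinguishable from integers without the definition. *)
let of_bool b = Vint (if b then (-1L) else 0L)

let rec dump = function
  | Vint n when Int64.compare n 0L < 0 -> Printf.sprintf "Vint_t0 (%Ld)" n
  | Vint n -> Printf.sprintf "Vint_t0 %Ld" n
  | Tuple vs -> "T0 { " ^ String.concat "; " (List.map dump vs) ^ "}"
  | Array vs -> "T0 [ " ^ String.concat "; " (List.map dump vs) ^ "]"

(* The pair ((-6535386320450295), [| false |]) from the typed output. *)
let example =
  Tuple [ Vint (-6535386320450295L); Array [ of_bool false ] ]

let () = print_endline (dump example)
```

Because field names and the bool/int distinction are lost in this form, the typed pretty-printer remains the more readable view whenever the interface definition is available.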
>
> It's not a drop-in solution like sexplib's "with sexp", by design (since it is
> meant to allow interoperability between different languages), but it's still
> fairly easy to use.
>
> If you're interested in this, tell me and I'll let you know when it's ready for
> serious usage.
>
> [1] http://code.google.com/p/protobuf/
>
--
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
gerd@gerd-stolpmann.de http://www.gerd-stolpmann.de
Phone: +49-6151-153855 Fax: +49-6151-997714
------------------------------------------------------------