Date: 2008-07-11 (20:39)
From: Dario Teixeira <darioteixeira@y...>
Subject: Troublesome nodes

This problem was originally raised in a thread on the ocaml-beginners
list [1], but since polymorphic variants, covariant constraints, and
recursive knots were brought into the discussion, I reckon it deserves
the attention of some heavyweights.  Moreover, the problem is trickier
than it first appears.

So, what's the situation?  I want to create a data structure holding
document nodes.  There are four different kinds of nodes, two of which
are terminals (Text and See), and two of which are defined recursively
(Bold and Mref).  Moreover, both See and Mref produce links, and there
is an additional constraint that a link node may *not* be the immediate
ancestor of another link node.  Using conventional union types, a node
could be modelled like this:

module Old_node =
struct
    type seq_t = super_node_t list
    and super_node_t =
        | Nonlink_node of nonlink_node_t
        | Link_node of link_node_t
    and nonlink_node_t =
        | Text of string
        | Bold of seq_t
    and link_node_t =
        | Mref of string * nonlink_node_t list
        | See of string
end

The problem with this representation is that it introduces an unwanted
scaffolding for nodes.  Moreover, it prevents the use of constructor
functions for nodes, since non-link nodes may be represented in the
tree in a context-dependent fashion: either directly such as Bold [...],
or as Nonlink_node (Bold [...]).  Note that preserving the link/nonlink
distinction in the structure is helpful for pattern matching purposes,
but the extra scaffolding is just a pain.
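
To make the context dependence concrete, here is a minimal sketch that
repeats the Old_node definition and builds the same bold node in two
positions.  The values in_seq and in_mref are names made up for this
illustration.

```ocaml
module Old_node = struct
  type seq_t = super_node_t list
  and super_node_t =
    | Nonlink_node of nonlink_node_t
    | Link_node of link_node_t
  and nonlink_node_t =
    | Text of string
    | Bold of seq_t
  and link_node_t =
    | Mref of string * nonlink_node_t list
    | See of string
end

open Old_node

(* Inside a seq_t, a bold node must be wrapped in Nonlink_node. *)
let in_seq : seq_t =
  [ Nonlink_node (Bold [ Nonlink_node (Text "hi") ]) ]

(* As a direct child of Mref, the same bold node appears bare. *)
let in_mref : link_node_t =
  Mref ("target", [ Bold [ Nonlink_node (Text "hi") ] ])
```

A single function "bold" cannot return both shapes, which is why
constructor functions do not fit this representation.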

One alternative is to use polymorphic variants, and to take advantage
of the fact that new types can be built as the union of existing ones.
Ideally, one could do something like this:

type seq_t = super_node_t list
 and nonlink_node_t =
    [ `Text of string
    | `Bold of seq_t ]
 and link_node_t =
    [ `Mref of string * nonlink_node_t list
    | `See of string ]
 and super_node_t = [nonlink_node_t | link_node_t]

However, this fails with an error "The type constructor nonlink_node_t is
not yet completely defined".  Jon Harrop suggested untying the recursive
knot, but the solution has a few drawbacks of its own [2].
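
The error comes from the inclusion [nonlink_node_t | link_node_t], which
needs both types to be fully known.  As a rough sketch only (this is not
the solution proposed in [2]), writing every tag out by hand avoids the
inclusion and compiles, at the price of duplicating the tag lists.  The
values strong and doc are hypothetical names used for illustration.

```ocaml
(* No type is built by including another one, so the recursive
   definition is accepted.  Every tag is repeated by hand. *)
type super_node_t =
  [ `Text of string
  | `Bold of super_node_t list
  | `Mref of string * nonlink_node_t list
  | `See of string ]

and nonlink_node_t =
  [ `Text of string
  | `Bold of super_node_t list ]

and link_node_t =
  [ `Mref of string * nonlink_node_t list
  | `See of string ]

let strong : nonlink_node_t = `Bold [ `Text "hello" ]

(* A non-link node is coerced explicitly to the super type; inside
   `Mref it is used as is.  Writing `Mref ("target", [ `See "other" ])
   would be rejected, since `See is not a tag of nonlink_node_t. *)
let doc : super_node_t list =
  [ (strong :> super_node_t); `Mref ("target", [ strong ]); `See "other" ]
```

The duplicated tag lists have to be kept in sync manually, so this is
more a way of seeing what the compiler objects to than a tidy fix.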

Another alternative is to flatten the structure altogether and to annotate
the constructor functions with phantom types to prevent the violation of
the no-parent constraint:

module Node:
sig
    type seq_t = node_t list
    and node_t =
        | Text of string
        | Bold of seq_t
        | Mref of string * seq_t
        | See of string

    type +'a t

    val text: string -> [> `Nonlink] t
    val bold: 'a t list -> [> `Nonlink] t
    val mref: string -> [< `Nonlink] t list -> [> `Link] t
    val see: string -> [> `Link] t
end =
struct
    type seq_t = node_t list
    and node_t =
        | Text of string
        | Bold of seq_t
        | Mref of string * seq_t
        | See of string

    type +'a t = node_t

    let text txt = Text txt
    let bold inl = Bold inl
    let mref ref inl = Mref (ref, inl)
    let see ref = See ref
end
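
A short usage sketch of the phantom-type approach follows.  It repeats
the Node module so that it stands alone, and adds one accessor, raw,
which is an assumption of this example (it is not part of the signature
above) and exists only so that the resulting tree can be inspected.

```ocaml
module Node : sig
  type seq_t = node_t list
  and node_t =
    | Text of string
    | Bold of seq_t
    | Mref of string * seq_t
    | See of string

  type +'a t

  val text : string -> [> `Nonlink ] t
  val bold : 'a t list -> [> `Nonlink ] t
  val mref : string -> [< `Nonlink ] t list -> [> `Link ] t
  val see : string -> [> `Link ] t

  (* Hypothetical accessor, added for this example only. *)
  val raw : 'a t -> node_t
end = struct
  type seq_t = node_t list
  and node_t =
    | Text of string
    | Bold of seq_t
    | Mref of string * seq_t
    | See of string

  type +'a t = node_t

  let text txt = Text txt
  let bold inl = Bold inl
  let mref ref inl = Mref (ref, inl)
  let see ref = See ref
  let raw node = node
end

(* Accepted: every direct child of mref is a non-link node. *)
let link =
  Node.mref "target" [ Node.text "plain"; Node.bold [ Node.text "strong" ] ]

(* Accepted: bold may hold link and non-link children alike. *)
let mixed = Node.bold [ Node.text "see also "; Node.see "other" ]

(* Rejected by the type checker, because `Link does not fit [< `Nonlink]:
   let bad = Node.mref "outer" [ Node.see "inner" ] *)
```

Note that the constraint is enforced only on values built through these
four functions; the exposed node_t constructors are not checked.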

This works fine, but because the link/nonlink distinction is lost from the
structure, even a simple Node_to_Node translator becomes a mess:

module Node_to_Node =
struct
    let rec convert_nonlink_node = function
        | Node.Text txt          -> Node.text txt
        | Node.Bold inl          -> Node.bold (List.map convert_super_node inl)
        | _                      -> failwith "oops"

    and convert_link_node = function
        | Node.Mref (ref, inl)   -> Node.mref ref (List.map convert_nonlink_node inl)
        | Node.See ref           -> Node.see ref
        | _                      -> failwith "oops"

    and convert_super_node node = match node with
        | Node.Text _
        | Node.Bold _            -> (convert_nonlink_node node :> [`Link | `Nonlink] Node.t)
        | Node.See _
        | Node.Mref _            -> convert_link_node node
end
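
The catch-all branches are the ugly part: with a single flat type, a
function meant for link nodes only must still accept every constructor,
and a misuse surfaces at run time rather than at compile time.  A
stripped-down sketch, where describe_link is a made-up function and
node_t is copied from the Node module:

```ocaml
type node_t =
  | Text of string
  | Bold of node_t list
  | Mref of string * node_t list
  | See of string

(* Intended for link nodes only, but the type cannot express that,
   so a wildcard branch is needed to silence the exhaustiveness
   warning. *)
let describe_link = function
  | Mref (target, _) -> "mref " ^ target
  | See target -> "see " ^ target
  | _ -> failwith "oops"

(* Passing a non-link node type-checks and fails only when run. *)
let misuse_fails =
  try
    ignore (describe_link (Text "plain"));
    false
  with Failure _ -> true
```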

So, I am looking for a solution that meets the following conditions:

 - it enforces the "no link node shall be the parent of another" constraint;
 - the structure is pattern-matchable;
 - nodes are nevertheless created via constructor functions.

Any ideas?

Thanks in advance and sorry for the long post!
Dario Teixeira

