Version française
Home     About     Download     Resources     Contact us    
Browse thread
new library modules
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Xavier Leroy <xavier@T...>
Subject: new library modules
Hello, mailing list,

I'm considering to add the following modules to the standard library
in a future release of Caml Light:

        genlex          generic lexical analyzer
        set             applicative sets over totally ordered types
        map             applicative maps over totally ordered types

The interfaces are given at the end of this message. I'm interested in
any comments you might have on these modules, such as missing
operations, poor names, or vague documentation. 

Also, it's the right time to call for extensions or corrections in the
library (``Gee, I really miss a frobnicate_list function''), even
though what can go in the standard library is severely constrainted
both for ideological reasons (I believe no combinator should be part
of the standard library, including function composition) and practical
reasons (we're about to hit various size limitations in the PC 8086
version).

- Xavier Leroy

---------------------------- Module genlex ----------------------------

(* A generic lexical analyzer *)

(* This module implements a simple ``standard'' lexical analyzer, presented
   as a function from character streams to token streams. It implements
   roughly the lexical conventions of Caml, but is parameterized by the
   set of keywords of your language. *)

#open "stream";;

type token =
    Kwd of string
  | Ident of string
  | Int of int
  | Float of float
  | String of string
  | Char of char;;
        (* The type of tokens. The lexical classes are: [Int] and [Float]
           for integer and floating-point numbers; [String] for
           string literals, enclosed in double quotes; [Char] for
           character literals, enclosed in backquotes; [Ident] for
           identifiers (either sequences of letters, digits, underscores
           and quotes, or sequences of ``operator characters'' such as
           [+], [*], etc); and [Kwd] for keywords (either identifiers or
           single ``special characters'' such as [(], [}], etc). *)
           
value make_lexer: string list -> (char stream -> token stream);;
        (* Construct the lexer function. The first argument is the list of
           keywords. An identifier [s] is returned as [Kwd s] if [s]
           belongs to this list, and as [Ident s] otherwise.
           A special character [s] is returned as [Kwd s] if [s]
           belongs to this list, and cause a lexical error (exception
           [Parse_error]) otherwise. Blanks and newlines are skipped.
           Comments delimited by [(*] and [*)] are skipped as well,
           and can be nested.

           Example: a lexer suitable for a desk calculator is obtained by
-          [  let lexer = make_lexer ["+";"-";"*";"/";"let";"="; "("; ")"]
           ]
           The associated parser would be a function from [token stream]
           to, for instance, [int], and would have rules such as:
           [  let parse_expr = function
                  [< 'Int n >] -> n
                | [< parse_expr n1; (parse_end n1) n2 >] -> n2
                | [< 'Kwd "("; parse_expr n; 'Kwd ")" >] -> n
              and parse_end n1 = function
                  [< 'Kwd "+"; parse_expr n2 >] -> n1+n2
                | ...
           ] *)

---------------------------- Module set ----------------------------

(* Sets over ordered types *)

(* This module implements the set data structure, given a total ordering
   function over the set elements. All operations over sets
   are purely applicative (no side-effects).
   The implementation uses balanced binary trees, and is therefore
   reasonably efficient: membership takes time logarithmic in the size
   of the set, for instance. *)

type 'a t;;
        (* The type of sets containing elements of type ['a]. *)

type 'a ordering == 'a -> 'a -> int;;
        (* Most operations over sets take as argument a
           total ordering function over the set elements.
           This is a two-argument function [f] such that
           [f e1 e2] is zero if the elements [e1] and [e2] are equal,
           [f e1 e2] is strictly negative if [e1] is smaller than [e2],
           and [f e1 e2] is strictly positive if [e1] is greater than [e2].

           For instance, a suitable ordering function for type [int]
           is [fun i j -> i-j]. For type [string], you could use
           [compare_string].

           The user is responsible for always using the same ordering
           function to operate on a given set. (This is what we mean
           by ``the set [s] is ordered according to the ordering [ord]''
           in the following.) Otherwise, the functions
           below could produce incorrect results. *)

value empty: 'a t
        (* The empty set. *)
  and is_empty: 'a t -> bool
        (* Test whether a set is empty or not. *)
  and mem: 'a ordering -> 'a -> 'a t -> bool
        (* [mem ord x s] tests whether [x] is equal to an element
           of [s], in the sense that [ord x y = 0] for some element [y]
           of [s]. *)
  and add: 'a ordering -> 'a -> 'a t -> 'a t
        (* [add ord x s] returns a set containing all elements of [s],
           plus [x]. If [x] was already in [s], [s] is returned unchanged.
           [s] and the returned set are ordered according to [ord]. *)
  and remove: 'a ordering -> 'a -> 'a t -> 'a t
        (* [add ord x s] returns a set containing all elements of [s],
           except [x]. If [x] was not in [s], [s] is returned unchanged.
           [s] and the returned set are ordered according to [ord]. *)
  and union: 'a ordering -> 'a t -> 'a t -> 'a t
  and inter: 'a ordering -> 'a t -> 'a t -> 'a t
  and diff: 'a ordering -> 'a t -> 'a t -> 'a t
        (* Union, intersection and set difference. The two set arguments
           and the returned set are ordered according to
           the first argument. *)
  and equal: 'a ordering -> 'a t -> 'a t -> bool
        (* [equal ord s1 s2] tests whether the sets [s1] and [s2] are
           equal, that is, contain equal elements according to [ord]. *) 
  and compare: 'a ordering -> 'a t -> 'a t -> int
        (* Total ordering between sets. Can be used as the ordering function
           for doing sets of sets. *)
  and elements: 'a t -> 'a list
        (* Return the list of all elements of the given set.
           The elements appear in the list in some non-specified order. *)
  and iter: ('a -> 'b) -> 'a t -> unit
        (* [iter f s] applies [f] in turn to all elements of [s], and
           discards the results. The elements of [s] are presented to [f]
           in a non-specified order. *)
  and fold: ('a -> 'b -> 'b) -> 'a t -> 'b -> 'b
        (* [fold f s a] computes [(f xN ... (f x2 (f x1 a))...)],
           where [x1 ... xN] are the elements of [s].
           The order in which elements of [s] are presented to [f] is
           not specified. *)
;;

---------------------------- Module map ----------------------------

(* Association tables over ordered types *)

(* This module implements applicative association tables, also known as
   finite maps, given a total ordering function over the keys.
   All operations over maps are purely applicative (no side-effects).
   The implementation uses balanced binary trees, and therefore searching
   and insertion take time logarithmic in the size of the map. *)

type ('a, 'b) t;;
        (* The type of maps from type ['a] to type ['b]. *)


type 'a ordering == 'a -> 'a -> int;;
        (* Most operations over maps take as argument a
           total ordering function over the keys of the map.
           This is a two-argument function [f] such that
           [f k1 k2] is zero if the keys [k1] and [k2] are equal,
           [f k1 k2] is strictly negative if [k1] is smaller than [k2],
           and [f k1 k2] is strictly positive if [k1] is greater than [k2].

           For instance, a suitable ordering function for type [int]
           is [fun i j -> i-j]. For type [string], you could use
           [compare_string].

           The user is responsible for always using the same ordering
           function to operate on a given map. (This is what we mean
           by ``the map [m] is ordered according to the ordering [ord]''
           in the following.) Otherwise, the functions
           below could produce incorrect results. *)

value empty: ('a, 'b) t
        (* The empty map. *)
  and add: 'a ordering -> ('a, 'b) t -> 'a -> 'b -> ('a, 'b) t
        (* [add ord m x y] returns a map containing the same bindings as
           [m], plus a binding of [x] to [y]. Previous bindings for [x] 
           in [m] are not removed, but simply hidden: they reappear
           after performing [remove ord map x].
           (This is the semantics of association lists.)
           [m] and the returned map are ordered according to [ord]. *)
  and find: 'a ordering -> ('a, 'b) t -> 'a -> 'b
        (* [find ord m x] returns the current binding of [x] in [m],
           or raises [Not_found] if no such binding exists.
           [m] is ordered according to [ord]. *)
  and remove: 'a ordering -> ('a, 'b) t -> 'a -> ('a, 'b) t
        (* [remove ord m x] returns a map containing the same bindings
           as [m], minus the current binding for [x]. The previous
           binding for [x] is restored if it exists. [m] is returned
           unchanged if [x] is not bound in [m].
           [m] and the returned map are ordered according to [ord]. *)
  and iter: ('a -> 'b -> 'c) -> ('a, 'b) t -> unit
        (* [iter f m] applies [f] to all bindings in map [m],
	   discarding all the results.
           [f] receives the key as first argument, and the associated value
           as second argument. The order in which the bindings are passed to
           [f] is unspecified.
           A given binding will be presented to [f] only once. *)
  and fold: ('a -> 'b -> 'c -> 'c) -> ('a, 'b) t -> 'c -> 'c
        (* [fold f m a] computes [(f kN dN ... (f k2 d2 (f k1 d1 a))...)],
           where [(k1,d1) ... (kN, dN)] are the bindings of map [m].
           The order in which bindings are presented to [f] is not
           specified. *)
;;