]> Hello, mailing list, I'm considering to add the following modules to the standard library in a future release of Caml Light: genlex generic lexical analyzer set applicative sets over totally ordered types map applicative maps over totally ordered types The interfaces are given at the end of this message. I'm interested in any comments you might have on these modules, such as missing operations, poor names, or vague documentation. Also, it's the right time to call for extensions or corrections in the library (``Gee, I really miss a frobnicate_list function''), even though what can go in the standard library is severely constrainted both for ideological reasons (I believe no combinator should be part of the standard library, including function composition) and practical reasons (we're about to hit various size limitations in the PC 8086 version). - Xavier Leroy ---------------------------- Module genlex ---------------------------- (* A generic lexical analyzer *) (* This module implements a simple ``standard'' lexical analyzer, presented as a function from character streams to token streams. It implements roughly the lexical conventions of Caml, but is parameterized by the set of keywords of your language. *) #open "stream";; type token = Kwd of string | Ident of string | Int of int | Float of float | String of string | Char of char;; (* The type of tokens. The lexical classes are: [Int] and [Float] for integer and floating-point numbers; [String] for string literals, enclosed in double quotes; [Char] for character literals, enclosed in backquotes; [Ident] for identifiers (either sequences of letters, digits, underscores and quotes, or sequences of ``operator characters'' such as [+], [*], etc); and [Kwd] for keywords (either identifiers or single ``special characters'' such as [(], [}], etc). *) value make_lexer: string list -> (char stream -> token stream);; (* Construct the lexer function. The first argument is the list of keywords. An identifier [s] is returned as [Kwd s] if [s] belongs to this list, and as [Ident s] otherwise. A special character [s] is returned as [Kwd s] if [s] belongs to this list, and cause a lexical error (exception [Parse_error]) otherwise. Blanks and newlines are skipped. Comments delimited by [(*] and [*)] are skipped as well, and can be nested. Example: a lexer suitable for a desk calculator is obtained by - [ let lexer = make_lexer ["+";"-";"*";"/";"let";"="; "("; ")"] ] The associated parser would be a function from [token stream] to, for instance, [int], and would have rules such as: [ let parse_expr = function [< 'Int n >] -> n | [< parse_expr n1; (parse_end n1) n2 >] -> n2 | [< 'Kwd "("; parse_expr n; 'Kwd ")" >] -> n and parse_end n1 = function [< 'Kwd "+"; parse_expr n2 >] -> n1+n2 | ... ] *) ---------------------------- Module set ---------------------------- (* Sets over ordered types *) (* This module implements the set data structure, given a total ordering function over the set elements. All operations over sets are purely applicative (no side-effects). The implementation uses balanced binary trees, and is therefore reasonably efficient: membership takes time logarithmic in the size of the set, for instance. *) type 'a t;; (* The type of sets containing elements of type ['a]. *) type 'a ordering == 'a -> 'a -> int;; (* Most operations over sets take as argument a total ordering function over the set elements. This is a two-argument function [f] such that [f e1 e2] is zero if the elements [e1] and [e2] are equal, [f e1 e2] is strictly negative if [e1] is smaller than [e2], and [f e1 e2] is strictly positive if [e1] is greater than [e2]. For instance, a suitable ordering function for type [int] is [fun i j -> i-j]. For type [string], you could use [compare_string]. The user is responsible for always using the same ordering function to operate on a given set. (This is what we mean by ``the set [s] is ordered according to the ordering [ord]'' in the following.) Otherwise, the functions below could produce incorrect results. *) value empty: 'a t (* The empty set. *) and is_empty: 'a t -> bool (* Test whether a set is empty or not. *) and mem: 'a ordering -> 'a -> 'a t -> bool (* [mem ord x s] tests whether [x] is equal to an element of [s], in the sense that [ord x y = 0] for some element [y] of [s]. *) and add: 'a ordering -> 'a -> 'a t -> 'a t (* [add ord x s] returns a set containing all elements of [s], plus [x]. If [x] was already in [s], [s] is returned unchanged. [s] and the returned set are ordered according to [ord]. *) and remove: 'a ordering -> 'a -> 'a t -> 'a t (* [add ord x s] returns a set containing all elements of [s], except [x]. If [x] was not in [s], [s] is returned unchanged. [s] and the returned set are ordered according to [ord]. *) and union: 'a ordering -> 'a t -> 'a t -> 'a t and inter: 'a ordering -> 'a t -> 'a t -> 'a t and diff: 'a ordering -> 'a t -> 'a t -> 'a t (* Union, intersection and set difference. The two set arguments and the returned set are ordered according to the first argument. *) and equal: 'a ordering -> 'a t -> 'a t -> bool (* [equal ord s1 s2] tests whether the sets [s1] and [s2] are equal, that is, contain equal elements according to [ord]. *) and compare: 'a ordering -> 'a t -> 'a t -> int (* Total ordering between sets. Can be used as the ordering function for doing sets of sets. *) and elements: 'a t -> 'a list (* Return the list of all elements of the given set. The elements appear in the list in some non-specified order. *) and iter: ('a -> 'b) -> 'a t -> unit (* [iter f s] applies [f] in turn to all elements of [s], and discards the results. The elements of [s] are presented to [f] in a non-specified order. *) and fold: ('a -> 'b -> 'b) -> 'a t -> 'b -> 'b (* [fold f s a] computes [(f xN ... (f x2 (f x1 a))...)], where [x1 ... xN] are the elements of [s]. The order in which elements of [s] are presented to [f] is not specified. *) ;; ---------------------------- Module map ---------------------------- (* Association tables over ordered types *) (* This module implements applicative association tables, also known as finite maps, given a total ordering function over the keys. All operations over maps are purely applicative (no side-effects). The implementation uses balanced binary trees, and therefore searching and insertion take time logarithmic in the size of the map. *) type ('a, 'b) t;; (* The type of maps from type ['a] to type ['b]. *) type 'a ordering == 'a -> 'a -> int;; (* Most operations over maps take as argument a total ordering function over the keys of the map. This is a two-argument function [f] such that [f k1 k2] is zero if the keys [k1] and [k2] are equal, [f k1 k2] is strictly negative if [k1] is smaller than [k2], and [f k1 k2] is strictly positive if [k1] is greater than [k2]. For instance, a suitable ordering function for type [int] is [fun i j -> i-j]. For type [string], you could use [compare_string]. The user is responsible for always using the same ordering function to operate on a given map. (This is what we mean by ``the map [m] is ordered according to the ordering [ord]'' in the following.) Otherwise, the functions below could produce incorrect results. *) value empty: ('a, 'b) t (* The empty map. *) and add: 'a ordering -> ('a, 'b) t -> 'a -> 'b -> ('a, 'b) t (* [add ord m x y] returns a map containing the same bindings as [m], plus a binding of [x] to [y]. Previous bindings for [x] in [m] are not removed, but simply hidden: they reappear after performing [remove ord map x]. (This is the semantics of association lists.) [m] and the returned map are ordered according to [ord]. *) and find: 'a ordering -> ('a, 'b) t -> 'a -> 'b (* [find ord m x] returns the current binding of [x] in [m], or raises [Not_found] if no such binding exists. [m] is ordered according to [ord]. *) and remove: 'a ordering -> ('a, 'b) t -> 'a -> ('a, 'b) t (* [remove ord m x] returns a map containing the same bindings as [m], minus the current binding for [x]. The previous binding for [x] is restored if it exists. [m] is returned unchanged if [x] is not bound in [m]. [m] and the returned map are ordered according to [ord]. *) and iter: ('a -> 'b -> 'c) -> ('a, 'b) t -> unit (* [iter f m] applies [f] to all bindings in map [m], discarding all the results. [f] receives the key as first argument, and the associated value as second argument. The order in which the bindings are passed to [f] is unspecified. A given binding will be presented to [f] only once. *) and fold: ('a -> 'b -> 'c -> 'c) -> ('a, 'b) t -> 'c -> 'c (* [fold f m a] computes [(f kN dN ... (f k2 d2 (f k1 d1 a))...)], where [(k1,d1) ... (kN, dN)] are the bindings of map [m]. The order in which bindings are presented to [f] is not specified. *) ;;