Version française
Home     About     Download     Resources     Contact us    

This site is updated infrequently. For up-to-date information, please visit the new OCaml website at

Browse thread
[Caml-list] Efficiency of 'a list
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2003-05-05 (13:18)
From: John Max Skaller <skaller@o...>
Subject: Re: [Caml-list] Two types of efficiency (Was Efficiency of 'a list)
Mattias Waldau wrote:

> It would be interesting to see how a typed scripting language (maybe
> built on top of ocaml) with advanced built-in datastructures (with
> syntax support) would look like.

There are two things you can look at:

(1) FISh 2 when it finally comes out,
will support general polynomial data structures
with functorial polymorphism: at the moment
only FISh 1 is available but gives the flavour:

(2) Felix:

isn't nearly as ambitious, indeed, its
a stock Algol like language with a couple
of twists which includes heavy support for
purely functional programming with eager

At the moment, the only 'advanced' data structure
built-in to the language is the ability to
build O(n) regular expression matching used like:

	regmatch string_expr with
	| regexp1 => ...
	| regexp2 => ..

[and there is some hackery here ..]

Contrary to providing 'built-in' advanced
data structures, Felix is exactly the opposite:
it has no built-in data types at all.

Instead, it provides powerful data constructors
such as anonymous sums and products, as well as
functions of course. These are strong enough to
construct bool in the standard library.

In addition, it provides a way to lift data types
from C/C++. For example:

	type int = "int";
	fun add: int * int -> int = "$1 + $2";

Each of these "user defined primitives" is defined
in C/C++ and provided as an abstract data type,
accessed by user specified primitive functions
(like "add" above). Primitives can now also be generic:

	type vector[t] = "std::vector<?1>";

I'm explaining all this because in my opinion,
providing some fixed "efficient/advanced" data structure
"built-in" to the language is in fact suboptimal,
and unnecessary.

The trick, instead, is to provide CORE functionality
which can be used to conveniently construct and access
such data structures. For example you can write:

	s[1 to 20]

to subscript a string. It's just syntactic sugar for:


and subscript is just a function: for a given string like
type, the user has to define it.

Some of the other features that make Felix friendly
include an advanced overloading scheme which fixes
many of the problems in the C++ mechanism (and contrary
to GCaml, is even more "open" than the C++ mechanism)

The most advanced feature "builtin" is, in fact,
not a data structure but a control structure:
Felix performs blocking reads on a central message queue
by yielding control (without using hardware threading).
[I call the suspended states continuations, but technically
I think they're resumptions]

It is likely I will provided synactic sugar for
some other concepts such as iterators. What's important
in scripting languages, apart from automatic storage
management, is that the syntax be convenient.

There is nothing in Python dictionaries or Perl hashes
that can't be done in C++ or Ocaml: the primary advantage
is NOT that these data structures are built in.
What's important is the easy use which specific
syntax provides.

Felix will try to provide the convenience without actually
building anything in: instead, it provides a way to bind
the syntax to user defined types. Operator overloading
and the like are the simplest and easiest way to do this,
although it's painful compared with what FISh 2 seeks to do:
allow functorially polymorphic algorithms to be written.

I have to recommend consideration of the protocols
definitions in Python, and the container and iterator
definitions in the C++ Standard. Both of these
systems handle, in a different way, the notion
of Sequence and Associative Container fairly well.
Similarly, SQL embodies well the notions of relational
algebra (and in some sense is a generalisation of
sequences and associative containers).

Its clear, however, that no one has much idea
what to do with trees and graphs (yacc like parser tools
are the closest thing to tree management syntaxes, and they're
usual extrinsic rather than embedded in the language).

In the end, convenient use of advanced data structures --
and i don't  mean simply sequences and associative containers --
depends on theoreticians developing the right general
calculi to deal with them in a compact way.

Only then can language designers provide generalised
syntax that goes beyond overloading and stuff like

	address ["fred"] = "20 Smith St Paris"

John Max Skaller,
snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia.

To unsubscribe, mail Archives:
Bug reports: FAQ:
Beginner's list: