Version française
Home     About     Download     Resources     Contact us    

This site is updated infrequently. For up-to-date information, please visit the new OCaml website at

Browse thread
[Caml-list] @, List.append, and tail recursion
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2003-01-31 (02:18)
From: james woodyatt <jhw@w...>
Subject: Re: [Caml-list] @, List.append, and tail recursion

Earlier in this thread, I suggested using a queue if you are spending 
too much time in List.append.  Lack of optimized tail recursion is not 
really a factor compared to the waste of cycles involved in 
constructing a whole new list just to append a single element on the 

Apparently, nobody seemed to notice what I was talking about.  So I'm 
going to try to make my point again.  Sorry if you got it the first 

On Thursday, Jan 30, 2003, at 13:57 US/Pacific, Brian Hurt wrote:
> On Thu, 30 Jan 2003, Olivier Andrieu wrote:
>>> list1: 1.462s
>>> list2: 1.757s
>>> list3: 1.824s
>> There's an assert in setcdr : it's important because the first
>> argument mustn't be an empty list. It's never the case here, so you
>> can safely compile with -noassert.
> Doh!  OK- now, compiling with -noassert drops the time to 1.457 seconds
> (same machine, same environment)- to slightly better than the recursive
> version.

For grins, I wrote an equivalent test program.  It uses a functional 
deque instead of a list.  (I have written one.  It's a component of my 
Cf library, which remains unreleased at the moment.  Markus Mottl has 
translated several of Chris Okasaki's functional queue implementations 
into Ocaml, and you can find them on the Hump.)

To get started, I timed the 'benchmarks' by running them on my iBook 
(the 700 MHz G3 model) so I could get a baseline.  My little iBook is 
nowhere near as fast as your cool 1.6GHz P4, but I'm not complaining.

The results of my tests were:

	$ time ./list1
	3.690u 0.070s 0:04.14 90.8%     0+0k 0+0io 0pf+0w

	$ time ./list2
	4.180u 0.020s 0:05.01 83.8%     0+0k 0+0io 0pf+0w
	$ time ./list3
	3.700u 0.000s 0:04.49 82.4%     0+0k 0+0io 0pf+0w

Not real fast, but fast enough that I don't mind waiting for results.  
So, what difference does my functional deque implementation make?  Glad 
you asked.

My Cf_deque module matches the following signature:

	(* begin cf_deque.mli *)
	type 'a t

	val nil: 'a t

	module type Direction_T = sig
	    val pop: 'a t -> ('a * 'a t) option
	    val push: 'a -> 'a t -> 'a t

	module A: Direction_T
	module B: Direction_T

	val cat: 'a t -> 'a t -> 'a t
	(* end file *)

(Actually, this isn't the complete interface.  I've written a variety 
of ancillary functions that make it convenient to work with the objects 
in a deque without having to translate them into lists, e.g. fold, 
iterate, etc.)

All the functions above are purely functional, and they perform with 
O(1) average complexity.  (Or at least, that's my untrained analysis.  
I'd like to provide proof of that assertion, and I'm working on getting 
some help in that effort-- but, I'll have more news on that when I have 

Here's a variant of the test above, which uses my Cf_deque 
module instead:

	(* begin *)
	open Cf_deque

	let rec makelist_aux c accum =
	  let accum = B.push c accum in
	  if c > 0 then makelist_aux (pred c) accum else accum

	let makelist c = makelist_aux c nil

	let _ = makelist 5000;;
	(* end file *)

Here are the timing results on that same iBook:

	$ time ./t-opt.deque
	0.010u 0.010s 0:00.02 100.0%    0+0k 0+0io 0pf+0w

In other words, it's done before enough time passes even to measure it 

> And for the record, I just tested with appending to a list of 500,000
> elements, and it worked OK.

Here is the result of running my version of the test with 500,000 

	$ time ./t-opt.deque
	0.450u 0.080s 0:00.75 70.6%     0+0k 0+0io 0pf+0w

It took under a second of wall-clock time.  On the other hand, when I 
modified for 500,000 elements, it took *forever* in wall-clock 
time.  I gave up after almost an hour and a half.  Ultimately, I killed 
it with SIGINT before it finished.  I have no idea how far it got.

Clearly, I need to push a lot more elements into my deque before I will 
get a timing result that I can measure in heartbeats.  Here is the 
result for 5,000,000 elements:

	$ time ./t-opt.deque
	5.160u 0.510s 0:06.69 84.7%     0+0k 0+0io 0pf+0w

At this stage, I noticed my little iBook was just starting to thrash 
when the program finished.    So, I bumped it up to 50,000,000 
elements, because I like punishing my computer from time to time-- just 
to teach it respect.

At around 15 seconds into the run, the program turned into the 
psychedelic pizza delivery service: the pager went into fear and 
loathing mode, and the pretty Aqua GUI started acting like it was 
sniffing glue.  If I had let it run to completion, it probably would 
have wedged the machine.  (Mac OS X is pretty stable, but it hates 
resource bombs as much as any other operating system.)

None of the files were able to bring down the machine like 
that, no matter how long I let them run-- which should make sense, 
right?  The APPEND function is O(N) for lists.  Once N gets beyond a 
few hundred, the program spends almost all its cycles just copying list 
cells inside its innermost loop, only occasionally reaching the end of 
a list and tacking a new cell onto the end before starting over again.

The problem is not the language, or the compiler.  The problem is the 
design of the program.  The moral of this story: you really should 
consider using a queue if you find your programs are spending a lot of 
cycles appending things to very long lists.

j h woodyatt <>
that's my village calling... no doubt, they want their idiot back.

To unsubscribe, mail Archives:
Bug reports: FAQ:
Beginner's list: