[Caml-list] OCaml on G4
Date: 2004-06-23 (06:03)
From: David McClain <David.McClain@A...>
Subject: [Caml-list] OCaml on G4
I have to congratulate the OCaml team! While I don't yet read PPC 
Assembly fluently enough to make sense of the generated OCamlOpt 
output, I have run some indirect math tests that verify its generation 
of optimal floating point math code.

Of course, optimal is in the eye of the beholder in this case; cf. 
"MatLab's Loss is Nobody's Gain" by Prof. William Kahan of UC Berkeley, 
where he discusses the advantages and the pitfalls of using Fused MAC 
(fused multiply-accumulate) hardware when the host processor makes it 
available.

But at any rate, using some of the tests he outlines in that paper, it 
is easy to see that simple loops like the following:

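	let vdot v1 v2 =
	  (* assumes v2 is at least as long as v1: unsafe_get is unchecked *)
	  let len1 = Array.length v1 in
	  let rec iter sum n =
	    if n < 0 then sum
	    else
	      iter (sum +. Array.unsafe_get v1 n *.
		      Array.unsafe_get v2 n)
	        (pred n)
	  in
	  iter 0.0 (pred len1)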

generate results consistent with the use of the Fused MAC on the PPC G4 
under Mac OS X Panther.

Given the loose structure of this code, as written, I might have 
expected the storage of intermediate sums to memory. But that does not 
appear to be happening. Instead, when I perform a computation such as 
the vdot of two vectors given by [1-eps; eps-1] and [1+eps; eps+1], 
where eps = (nextafter 1.0 2.0 - 1.0), or the smallest discernible 
delta in the double precision numbers at magnitudes around 1.0, I get 
the answer for the dot product of -4.93e-32, instead of zero. That 
(slightly erroneous) value indicates the use of a Fused MAC instruction.
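
For anyone who wants to reproduce the effect on a machine where the 
compiler does not fuse the multiply and add by itself, here is a minimal 
sketch that requests the fused operation explicitly. It assumes a recent 
OCaml whose standard library provides Float.fma (not available in 2004), 
and it uses epsilon_float in place of (nextafter 1.0 2.0 - 1.0). The 
names vdot_fused, a and b are my own, for illustration only.

```ocaml
(* eps: gap between 1.0 and the next double above it (2^-52 for IEEE
   doubles); the same value as (nextafter 1.0 2.0 - 1.0). *)
let eps = epsilon_float

(* Hypothetical helper: dot product with an explicit fused multiply-add,
   walking from the last index down to 0 like the loop in the post.
   Assumes v2 is at least as long as v1. *)
let vdot_fused v1 v2 =
  let rec iter sum n =
    if n < 0 then sum
    else iter (Float.fma v1.(n) v2.(n) sum) (pred n)
  in
  iter 0.0 (pred (Array.length v1))

(* The two test vectors from the post. *)
let a = [| 1.0 -. eps; eps -. 1.0 |]
let b = [| 1.0 +. eps; eps +. 1.0 |]

let () =
  (* With a correctly rounded fma this should be -(eps *. eps),
     roughly -4.93e-32, rather than zero. *)
  Printf.printf "fused dot product: %g\n" (vdot_fused a b)
```

The reasoning: the last pair is processed first, and (eps-1)*(eps+1) = 
eps^2 - 1 rounds to -1.0. The fused step for the first pair then computes 
(1 - eps^2) - 1.0 with a single rounding, which leaves -eps^2. With a 
separately rounded multiply, 1 - eps^2 would first round to 1.0 and the 
sum would be exactly zero.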

Now I don't mind the slight incorrectness as long as I'm aware of how 
and why it occurs, because I can certainly compensate for this effect. 
But in general the use of a Fused MAC is desirable in computing long 
dot products because it implies a lower than typical accumulation of 
roundoff errors.

But more than this, I just wanted to say how impressed I am at the 
prowess of the OCaml team in constructing an elegant compiler back end 
where the code generation occurs. That is a very challenging part of 
any compiler and the OCaml team has earned considerable respect for 
their efforts!

Long live OCaml!!

David McClain
Senior Corporate Scientist
Avisere, Inc.
+1.520.390.7738 (USA)
