From: Thomas Fischbacher <Thomas.Fischbacher@P...>
Subject: Re: [Caml-list] The boon of static type checking

On Sat, 12 Feb 2005, Brian Hurt wrote:

> Note that with N=1.5, a 100,000-line program takes 1,000 man-days, or 50
> man-months (4 man-years), and a 1,000,000-line program takes man-centuries.
> 
> The important point is the scaling is exponential, not linear.

Sorry, that's power-law scaling.

Exponential would mean that it is of the form a*exp(x/b), with dimensionful 
constants a and b - in particular, b fixes a characteristic scale.
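
To make the difference concrete, here is a small OCaml sketch. The constants 
0.05, 1.8 and 30 are invented so that both curves pass through the same 
100-kLOC / 50-man-month point quoted above; they are not anybody's measured 
cost model:

  (* effort in man-months as a function of size in kLOC *)
  let power_law kloc = 0.05 *. (kloc ** 1.5)          (* c * L^1.5    *)
  let exponential kloc = 1.8 *. exp (kloc /. 30.0)    (* a * exp(L/b) *)

  let () =
    List.iter
      (fun kloc ->
         Printf.printf "%6.0f kLOC   power law: %10.0f mm   exponential: %10.3g mm\n"
           kloc (power_law kloc) (exponential kloc))
      [10.; 100.; 1000.]

Both curves agree at 100 kLOC, but at 1,000 kLOC the power law gives roughly 
1,600 man-months (a bit over a man-century), while the exponential explodes 
to about 5e14 man-months. The two assumptions are not remotely the same.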

Power-law scaling is the only behaviour that can be expected from 
scale-invariant systems. That is, take some complex system which may well 
be determined by staggeringly complicated short-distance effects (say, 
intra-orbital quantum interactions between magnetic atoms - can it get 
any worse than that?).

Then look at this system at larger and larger scales. Try to find a way 
to describe it which allows you to perform a scale transformation - that 
is, look at interactions/interrelations between clusters in such a way 
that the scale-transformed system can be described in a formally 
equivalent way to the original system, only with different interaction 
parameters.

Then study the flow of your system parameters under many re-scalings. 
There are different types of behaviour; the simplest cases one can 
expect are:

(i) Some parameters blow up.

(ii) Some parameters vanish.

(iii) Some parameters reach non-trivial fixed points.

If you have (i), this means - more or less - your system doesn't scale. We 
will not consider this any further.

(ii) means that at very large scales a lot of the special characteristics 
drop out; only those that can exhibit proper scaling behaviour survive, 
namely those of (iii). Interestingly, one frequently finds that one can 
easily enumerate and fully classify those few effects that survive 
scaling. This leads to the observation that a great many systems that 
are determined by wildly different micro-effects show precisely the 
same macroscopic behaviour.
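
A toy numerical illustration of the three cases (the maps below are purely 
invented, not any physical RG recursion): iterating g -> g^2 blows up or 
dies depending on the starting coupling, while g -> 2g/(1+g) flows to a 
non-trivial fixed point g* = 1 from any positive start:

  (* iterate a one-parameter rescaling map f, n times, starting from g *)
  let rec iterate f n g = if n = 0 then g else iterate f (n - 1) (f g)

  let square g = g *. g                      (* g' = g^2        *)
  let attract g = 2.0 *. g /. (1.0 +. g)     (* g' = 2g / (1+g) *)

  let () =
    Printf.printf "(i)   square,  g0 = 1.1: %g\n" (iterate square 50 1.1);
    Printf.printf "(ii)  square,  g0 = 0.9: %g\n" (iterate square 50 0.9);
    Printf.printf "(iii) attract, g0 = 0.1: %g\n" (iterate attract 50 0.1)

The first run overflows to infinity, the second collapses to zero, and the 
third settles on the fixed point 1 - cases (i), (ii) and (iii) respectively.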

All in all, power-law behaviour is what should normally be expected for 
questions that do not have an obvious characteristic scale. What's the 
typical size of a software project? Here I'd expect power-law behaviour, 
as there is no natural answer.

Indeed:

# bucket Debian package Installed-Size (kB) into bins of 100, then plot the
# histogram log-log against a power law and an exponential:
grep Installed-Size /var/lib/dpkg/available \
  | perl -pe 'm/: (\d+)/;$h[$1/100]++}{$n=0;$_=join "\n",map{$n++." ".(0+$_)}@h[1..100]' > /tmp/a
echo -e 'set logscale x\nset logscale y\nplot "/tmp/a" with impulses, 2600*x**-1.25,1000*exp(-x/20)\npause 20' | gnuplot /proc/self/fd/0

See what I mean? Exponential behaviour *does* have a characteristic scale: 
the size increase over which the number of projects is reduced by 1/2.
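
For what it's worth, that scale is easy to read off the comparison curves in 
the one-liner above (assuming, as there, bins of 100 kB of Installed-Size) - 
a quick OCaml check:

  let () =
    (* 1000*exp(-x/20) halves whenever x grows by 20*ln 2 bins *)
    Printf.printf "exponential halves every %.1f bins (~%.1f MB)\n"
      (20.0 *. log 2.0) (20.0 *. log 2.0 /. 10.0);
    (* 2600*x**-1.25 halves per constant factor in size, namely 2**(1/1.25);
       it has no absolute scale at all *)
    Printf.printf "power law halves per factor %.2f in size\n"
      (2.0 ** (1.0 /. 1.25))

The exponential halves every ~1.4 MB of package size; the power law only 
halves per fixed ratio (about 1.74x), no matter where on the axis you look.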

-- 
regards,               tf@cip.physik.uni-muenchen.de              (o_
 Thomas Fischbacher -  http://www.cip.physik.uni-muenchen.de/~tf  //\
(lambda (n) ((lambda (p q r) (p p q r)) (lambda (g x y)           V_/_
(if (= x 0) y (g g (- x 1) (* x y)))) n 1))                  (Debian GNU)