Browse thread
[Caml-list] Executable size?
[
Home
]
[ Index:
by date
|
by threads
]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
| Date: | -- (:) |
| From: | Brian Hurt <bhurt@s...> |
| Subject: | Re: [Caml-list] Executable size? |
On Wed, 12 Nov 2003, John J Lee wrote:
> On Wed, 12 Nov 2003, Brian Hurt wrote:
>
> > On Wed, 12 Nov 2003, Richard Jones wrote:
> [...]
> > > This is not a criticism of OCaml, but the executables do tend to be
> > > quite large. This seems mainly down to the fact that OCaml links the
> > > runtime library in statically. There was previous discussion on this
> [...]
> > This isn't as bad as it sounds. A simplistic "hello world!" application
> > in Ocaml weighs in at 112K, versus 11K for the equivelent (dynamically
> > linked) C program- almost entirely either statically linked standard
> > libraries and infrastructure (garbage collections, etc.)- stuff that
> > doesn't expand with larger programs.
>
> OK. Is that 100K difference for "hello world" (which doesn't necessarily
> stay the same for larger programs, as you say below) simply a result of
> the fact that C has the "unfair" advantage of already having its runtime
> sitting on everyone's hard drive already?
Actually, I think Ocaml uses C's runtime libraries and builds on top of
them. For example, if I understand things correctly, Ocaml's printf is a
wrapper which calls C's printf. Which is why I haven't bothered comparing
Ocaml's size to C programs being statically linked. Ocaml is at least
nice enough to only link libraries you are actually using (see the
print_string v. printf results).
In addition to a more complicated and complete standard library and
bultins, Ocaml also has garbage collection, which is non-trivial to
implement. I wouldn't be surprised if half or more of that 100K of
overhead is just the GC. Currying, exceptions, etc. also have small size
penalties.
On the other hand, I would argue that these features, while bloating the
application. Which is exactly the sort of thing small "benchmark"
programs don't show. I don't know how many times I've read or written C
code like:
int copy_file(char * src, char * dst) {
char * buf;
FILE * inf;
FILE * outf;
if ((src == NULL) || (dst == NULL)) {
return EINVAL;
}
inf = fopen(src, "rb");
if (inf == NULL) {
return errno;
}
outf = fopen(dst, "wb");
if (outf == NULL) {
fclose(inf);
return errno;
}
buf = (char *) malloc(4096);
if (buf == NULL) {
fclose(outf);
fclose(inf);
return errno;
}
blah blah blah you get the idea
Vr.s the same code in Ocaml:
let copyfile src dst =
let inf = open_in_bin src
and outf = open_out_bin dst
and buf = String.make 4096 ' '
in
let rec loop () =
let c = input inf buf 0 4096 in
if (c > 0) then
begin
output outf buf 0 c;
loop ()
end
else
()
in
loop ()
The ocaml executable code for copyfile function will be smaller than the C
version, simply because the ocaml version takes advantage of various
features of the larger ocaml library and infrastructure- especially (in
this case) exceptions and garbage collection.
>
>
> > A naive assumption would be that an Ocaml program is about 100K or so
> > larger than the equivelent C program. Not much, considering how easy it
> > is to get executables multiple megabytes in size.
>
> [...]
> > Ocaml gets a lot more code reuse, and thus can actually lead to smaller
> > executables.
>
> I don't understand what you mean by that (probably my fault). What do you
> mean by "code reuse" here? I usually understand that phrase to mean using
> code written by people other than me, but you seem to mean it in a
> different sense.
>
I was using it in the most literal sense- using code more than once, in
more than one way. In general, it's much better to have only one copy of
a function, used in two places, than two copies of the function. The
trick is that generally the two copies are not exactly identical- if
the functions are, for example, the length of a linked list, one function
might operate on a linked list of integers, another a linked list of
floats. Ocaml encourages you to program in a generic way- you actually
have to work at it to write a linked list length routine that *isn't*
generic, the naive implementation is (so is the optimized version).
Again, this generally isn't a problem in small programs, which easily fit
into you brain as a whole. Code reuse becomes more of a trick on moderate
to large programs, especially moderate to large programs with more than
one programmer. How many times have we reimplemented linked lists in C?
>
> > Unless you have special constraints, the difference between C program
> > sizes and Ocaml program sizes are not enough to be worth worrying about.
>
> I don't really agree that the problem of distributing simple (few lines of
> code) applications in small executables is all that "special". Certainly
> there are *many* applications where you don't need that; equally, there
> are quite a few where you do need/want that.
I was thinking of special cases where the difference of a 100K or 1M or so
is the difference between working and not working. If you are, for
example, trying to fit your program on a 512K ROM, Ocaml's overhead might
be a problem.
--
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
- Gene Spafford
Brian
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners