[Caml-list] Executable size?
Date: | 2003-11-12 (19:44) |
From: | Brian Hurt <bhurt@s...> |
Subject: | Re: [Caml-list] Executable size? |
On Wed, 12 Nov 2003, John J Lee wrote:

> On Wed, 12 Nov 2003, Brian Hurt wrote:
>
> > On Wed, 12 Nov 2003, Richard Jones wrote:
> [...]
> > > This is not a criticism of OCaml, but the executables do tend to be
> > > quite large.  This seems mainly down to the fact that OCaml links the
> > > runtime library in statically.  There was previous discussion on this
> [...]
> > This isn't as bad as it sounds.  A simplistic "hello world!" application
> > in OCaml weighs in at 112K, versus 11K for the equivalent (dynamically
> > linked) C program- almost entirely statically linked standard
> > libraries and infrastructure (garbage collection, etc.)- stuff that
> > doesn't expand with larger programs.
>
> OK.  Is that 100K difference for "hello world" (which doesn't necessarily
> stay the same for larger programs, as you say below) simply a result of
> the fact that C has the "unfair" advantage of already having its runtime
> sitting on everyone's hard drive already?

Actually, I think OCaml uses C's runtime libraries and builds on top of
them.  For example, if I understand things correctly, OCaml's printf is a
wrapper which calls C's printf.  That is why I haven't bothered comparing
OCaml's size to statically linked C programs.  OCaml is at least nice
enough to link only the libraries you are actually using (see the
print_string v. printf results).

In addition to a more complicated and complete standard library and
builtins, OCaml also has garbage collection, which is non-trivial to
implement.  I wouldn't be surprised if half or more of that 100K of
overhead is just the GC.  Currying, exceptions, etc. also have small size
penalties.

On the other hand, I would argue that these features, while bloating small
applications, let the code built on top of them be smaller, which is
exactly the sort of thing small "benchmark" programs don't show.
I don't know how many times I've read or written C code like:

    int copy_file(char * src, char * dst) {
        char * buf;
        FILE * inf;
        FILE * outf;
        int err;  /* errno saved before cleanup, since fclose may change it */

        if ((src == NULL) || (dst == NULL)) {
            return EINVAL;
        }
        inf = fopen(src, "rb");
        if (inf == NULL) {
            return errno;
        }
        outf = fopen(dst, "wb");
        if (outf == NULL) {
            err = errno;
            fclose(inf);
            return err;
        }
        buf = (char *) malloc(4096);
        if (buf == NULL) {
            err = errno;
            fclose(outf);
            fclose(inf);
            return err;
        }
        blah blah blah you get the idea

versus the same code in OCaml:

    let copyfile src dst =
        let inf = open_in_bin src
        and outf = open_out_bin dst
        and buf = Bytes.create 4096
        in
        let rec loop () =
            (* input returns 0 at end of file *)
            let c = input inf buf 0 4096 in
            if (c > 0) then
                begin
                    output outf buf 0 c;
                    loop ()
                end
            else
                ()
        in
        loop ();
        close_in inf;
        close_out outf

The OCaml executable code for the copyfile function will be smaller than
the C version, simply because the OCaml version takes advantage of various
features of the larger OCaml library and infrastructure- especially (in
this case) exceptions and garbage collection.

> >
> > A naive assumption would be that an OCaml program is about 100K or so
> > larger than the equivalent C program.  Not much, considering how easy it
> > is to get executables multiple megabytes in size.
> > [...]
> > OCaml gets a lot more code reuse, and thus can actually lead to smaller
> > executables.
>
> I don't understand what you mean by that (probably my fault).  What do you
> mean by "code reuse" here?  I usually understand that phrase to mean using
> code written by people other than me, but you seem to mean it in a
> different sense.
>

I was using it in the most literal sense- using code more than once, in
more than one way.  In general, it's much better to have only one copy of
a function, used in two places, than two copies of the function.  The
trick is that generally the two copies are not exactly identical- if the
functions compute, for example, the length of a linked list, one might
operate on a linked list of integers, another on a linked list of floats.
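To make the list-length example concrete, here is a minimal sketch. The
names length and length_tr are made up for illustration (the standard
library already provides List.length); the point is only that both the
naive and the tail-recursive version get the type 'a list -> int without
any extra effort, so one definition covers integer lists, float lists,
and anything else:

```ocaml
(* Naive version: the compiler infers 'a list -> int. *)
let rec length = function
  | [] -> 0
  | _ :: rest -> 1 + length rest

(* Tail-recursive version with an accumulator: still 'a list -> int. *)
let length_tr lst =
  let rec go acc = function
    | [] -> acc
    | _ :: rest -> go (acc + 1) rest
  in
  go 0 lst

let () =
  (* The same two definitions are applied to int, float and string lists. *)
  Printf.printf "%d %d %d\n"
    (length [1; 2; 3])
    (length [1.5; 2.5])
    (length_tr ["a"; "b"; "c"; "d"])
(* prints "3 2 4" *)
```

In C, the corresponding routine would typically be written once per node
type, or once over void pointers with casts at each call site.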
OCaml encourages you to program in a generic way- you actually have to
work at it to write a linked list length routine that *isn't* generic; the
naive implementation is generic (and so is the optimized version).  Again,
this generally isn't a problem in small programs, which easily fit into
your brain as a whole.  Code reuse becomes more of a trick in moderate to
large programs, especially those with more than one programmer.  How many
times have we reimplemented linked lists in C?

>
> > Unless you have special constraints, the difference between C program
> > sizes and OCaml program sizes is not enough to be worth worrying about.
>
> I don't really agree that the problem of distributing simple (few lines of
> code) applications in small executables is all that "special".  Certainly
> there are *many* applications where you don't need that; equally, there
> are quite a few where you do need/want that.

I was thinking of special cases where a difference of 100K or 1M or so is
the difference between working and not working.  If you are, for example,
trying to fit your program on a 512K ROM, OCaml's overhead might be a
problem.

-- 
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
                                - Gene Spafford
Brian

-------------------
To unsubscribe, mail caml-list-request@inria.fr
Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs
FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners