Date: 2003-11-17 (19:16)
From: skaller <skaller@o...>
Subject: Re: [Caml-list] GC and file descriptors
On Mon, 2003-11-17 at 06:19, Brian Hurt wrote:
> On 16 Nov 2003, skaller wrote:

> I wish the mainstream would crawl out of the sixties era computer science 
> and get at least into the eighties. 

Lol, they're up to the sixties already?

>  The only "mainstream" programming 
> language I know of with *excusable* garbage collection is Java. 

I think Ocaml is very close to mainstream now.
[And I think the FP contest and Bagley are responsible,
and the O'Reilly book in English will be the clincher]

Very hard to be the second (and third) fastest programming
language around and not be noticed :-)

>  Of 
> course, Java's type system is state of the art- for 1968. 

Err.. since when is downcasting everything from Object
a type system??

> > Because of this problem, my own major Python program
> > interscript would not run on my interpreter, since
> > it relied on orderly finalisation.

> You were abusing finalization.  

Well no. I was *trying* to abuse it but failed :-)

In fact, partly my prompting on this was responsible
for Ocaml coded finalisers (instead of just C ones).

> A much better way to do this would be to 
> have a central command queue. 

This would have been impossible in interscript.
It works by executing user commands directly
(but in a restricted environment).

> > 
> > Anyhow .. there do exist cases where explicit finalisation
> > is difficult. 
> IIRC, in the general case it's equivalent to solving the halting problem.

If I can make a general comment here: there are things
like comparing functions which are equivalent to the halting
problem. Another is verifying program correctness :-)

But it is irrelevant, because programmers do not
intend to write arbitrary code. They intend to
write comprehensible code, whose correctness
is therefore necessarily easy to prove.

This is also true of finalisation. Necessarily.
The general case is irrelevant, because people
don't intend to write arbitrary code.

Even when multiple people work on an Open Source
project (just about the worst environment I can
think of .. ) there is at least an intent to
follow a design.

> On a more general note, I will agree that having the GC free non-memory
> resources raises problems.  I agree (and strongly believe) that everything
> that isn't memory should have some way for the program to free them
> without invoking the GC- for example, file descriptors should have a close 
> routine.  And I agree that programs *should* use them instead of relying 
> on the GC in all but the most intractable cases.

If I may modify that: *even* in the most intractable cases :-)

However, two things spring to mind. First: if close is a primitive
operation, we need a high level way of not only tracking
what to close, but also of doing that tracking efficiently.

Secondly, since the resource is represented in memory,
and much of the time the dependencies which are *not*
represented in memory could be, using the code
of the garbage collector (and not just the same algorithm),
makes some sense.
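
On the first point, one way to track what to close is to tie the
close to a scope. This is only a sketch: with_out_file is a
hypothetical helper, not something from interscript or the standard
library, and it covers only the easy case where the lifetime of the
file matches a function call.

```ocaml
(* Sketch: tie a channel's lifetime to a scope, so the program structure
   tracks what has to be closed. [with_out_file] is a hypothetical name. *)
let with_out_file path f =
  let oc = open_out path in
  match f oc with
  | result ->
    (* Normal exit: close_out flushes, and a flush error is reported. *)
    close_out oc;
    result
  | exception e ->
    (* [f] failed: close quietly and let the original exception through. *)
    close_out_noerr oc;
    raise e
```

Usage would look like: with_out_file "out.txt" (fun oc -> output_string oc "text").
Cases where the handle escapes the scope are exactly the ones this
does not cover.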

> The question is, is it worse to have the GC try to reclaim the resources
> than it is to have a guaranteed leak? 

You are making an incorrect assumption here. You're assuming
that finalisers only release resources. Consider again
the case where the final action in generating a document
is to create a table of contents (knowing now that all chapters
have been seen).

That is a quite non-trivial task which can take a lot
of time and involve a very large amount of code,
build new objects, etc.

The time to do this operation is "when the user cannot
add any more chapters to the document" and the easiest
time to specify that is "when the user no longer
has any copy of the document handle".

This finalisation task isn't a destructor and it isn't
releasing anything, but it is, by specification,
a finalisation task.
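
A minimal sketch of such a finaliser in OCaml, using Gc.finalise.
The document type and the names new_document, add_chapter and
build_toc are made up for illustration, and build_toc is a trivial
stand-in for the real work. Note that OCaml does not promise when,
or even whether, the finaliser runs before the program exits.

```ocaml
(* Sketch: a finaliser that does real work instead of releasing anything.
   [document], [new_document], [add_chapter] and [build_toc] are
   hypothetical names. *)
type document = {
  title : string;
  mutable chapters : string list;   (* most recently added first *)
}

(* Stand-in for the expensive step: number the chapters in order. *)
let build_toc doc =
  List.mapi
    (fun i name -> Printf.sprintf "%d. %s" (i + 1) name)
    (List.rev doc.chapters)

let new_document title =
  let doc = { title; chapters = [] } in
  (* Called some time after the last reference to [doc] is dropped;
     the timing is up to the collector. *)
  Gc.finalise
    (fun d ->
       print_endline ("Contents of " ^ d.title);
       List.iter print_endline (build_toc d))
    doc;
  doc

let add_chapter doc name = doc.chapters <- name :: doc.chapters
```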

>  Consider the case where close 
> returns EINTR and doesn't close the file descriptor.  Since the descriptor 
> has already leaked- for it to reach this point the program no longer has 
> any way to reach the descriptor (if the program did, the object wouldn't 
> be being collected).  So what's the difference in this case between the 
> collector silently ignoring the error, and the collector not even trying 
> to free the resource?

Probably none, BUT you have again made an assumption and this
time it is a fatal mistake. Closing a file is not usually done
to free the resource, but to flush the buffers.

In interscript, this proved difficult, because sometimes
client code would expect to reopen and read the file
after the last reference: unfortunately, the open actually
works and reads 90% of the file (the last 10% still in a buffer).

I fixed that problem by always writing a temporary file, and copying
it when the file is closed. So here, the finalisation is vital
(it copies a file, as well as closing one).
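
The temporary-file approach can be sketched in OCaml roughly as
follows. This is not the interscript code (which was Python); the
names safe_out, open_safe, write_safe and close_safe are
hypothetical, and the sketch assumes the target only needs to appear
once the writer is closed.

```ocaml
(* Sketch: write to a temporary file and publish it only on close.
   [safe_out], [open_safe], [write_safe] and [close_safe] are
   hypothetical names. *)
type safe_out = {
  target : string;       (* final path that readers will open *)
  temp : string;         (* scratch path that receives the writes *)
  chan : out_channel;
}

let open_safe target =
  let temp = Filename.temp_file "safe_out" ".tmp" in
  { target; temp; chan = open_out_bin temp }

let write_safe f s = output_string f.chan s

(* Copy [src] to [dst] in fixed-size chunks. *)
let copy_file src dst =
  let ic = open_in_bin src in
  let oc = open_out_bin dst in
  let buf = Bytes.create 4096 in
  let rec loop () =
    let n = input ic buf 0 (Bytes.length buf) in
    if n > 0 then begin
      output oc buf 0 n;
      loop ()
    end
  in
  loop ();
  close_in ic;
  close_out oc

let close_safe f =
  close_out f.chan;              (* flushes the last buffered data *)
  copy_file f.temp f.target;     (* only now does the target appear *)
  Sys.remove f.temp
```

A reader that opens the target before close_safe finds no file at
all, rather than the first 90% of one.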

> Actually, an idea I've had is to add Java-style "throws" information- 
> basically what exceptions a function can throw- to the type information of 
> a function.  With type inference, this shouldn't be the headache in Ocaml 
> it is in Java.  The advantage here is that you could enlist the type 
> system to guarantee that a destructor doesn't throw any exceptions- i.e. 
> the code for a destructor should handle all exceptions it generates.  

Sigh. I argued strongly in C++ that exception specifications should
be statically enforced. They're not. It's a total pain, because
they ALMOST are. Unfortunately, if you throw an exception you're
not allowed to .. a run time check detects the fault
and calls std::unexpected(), which by default terminates the
program. You can rely on it .. and
it totally defeats any attempt to do static analysis (that one leak
can be turned into a complete defeat of the exception specifications).

One thing is for sure .. I *hate* seeing

"program terminated with uncaught "Not_found" exception"

because I do hundreds of Hashtbl.find calls....
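
A sketch of one way to keep Not_found from escaping: Hashtbl.find_opt,
which was added to the standard library later (OCaml 4.05) and
returns an option instead of raising. The table contents and the
names lookup, lookup_exn and describe are made up for illustration.

```ocaml
(* Sketch: look up keys without letting Not_found escape.
   The table contents and the function names are hypothetical. *)
let table : (string, int) Hashtbl.t = Hashtbl.create 16

let () =
  Hashtbl.replace table "chapter1" 1;
  Hashtbl.replace table "chapter2" 2

(* Hashtbl.find raises Not_found when the key is missing ... *)
let lookup_exn key = Hashtbl.find table key

(* ... whereas Hashtbl.find_opt returns an option, so the type checker
   makes the caller deal with the missing case. *)
let lookup key = Hashtbl.find_opt table key

let describe key =
  match lookup key with
  | Some n -> Printf.sprintf "%s -> %d" key n
  | None -> Printf.sprintf "%s is not in the table" key
```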
