Date: -- (:)
From: Xavier Leroy <Xavier.Leroy@i...>
Subject: Re: [Caml-list] in_channel_length fails for files longer than max_int
> The pervasive function in_channel_length fails when the file size is
> too large for an int, but it doesn't raise an exception.  The code in
> io.c just checks for lseek errors.  Would a check for end > max_int be
> worthwhile?

That's one possibility.  The code for channel_size is already less
portable than it ought to be:

> long channel_size(struct channel *channel)
> {
>   long end;
>   end = lseek(channel->fd, 0, SEEK_END);

The return type of "lseek" is actually off_t, which is specified to be
"an extended signed integral type".  So, there is no guarantee that a
"long" can hold a file offset without losing bits, although it happens
to work on most Unix systems.  The check you suggest would handle this
case as well.
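
To make the suggested check concrete, here is a minimal sketch in C. It keeps the result of lseek in an off_t and rejects sizes that would not fit in an OCaml "int". The names SKETCH_MAX_OCAML_INT, fits_in_ocaml_int and checked_channel_size are hypothetical, not part of io.c, and the limit assumes that an OCaml int is exactly one bit narrower than a C long. Unlike a real implementation, the sketch does not restore the file position afterwards.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for OCaml's max_int: one bit of the machine word
   is reserved as a tag, so the largest value is assumed to be LONG_MAX / 2. */
#define SKETCH_MAX_OCAML_INT (LONG_MAX >> 1)

/* Pure range check, kept separate so it can be exercised without a file.
   The comparison is done in intmax_t so that it does not depend on the
   relative widths of off_t and long. */
int fits_in_ocaml_int(off_t size)
{
    return size >= 0 && (intmax_t)size <= (intmax_t)SKETCH_MAX_OCAML_INT;
}

/* Returns the file size, or -1 if lseek fails or the size is out of range. */
long checked_channel_size(int fd)
{
    off_t end = lseek(fd, 0, SEEK_END);      /* keep the result in off_t */
    if (end == (off_t)-1) return -1;         /* lseek error */
    if (!fits_in_ocaml_int(end)) return -1;  /* too large for an OCaml int */
    return (long)end;                        /* now known to fit */
}

/* Usage sketch: an invalid descriptor makes lseek fail, so -1 comes back. */
void checked_channel_size_example(void)
{
    assert(fits_in_ocaml_int(0));
    assert(!fits_in_ocaml_int((off_t)-1));
    assert(checked_channel_size(-1) == -1);
}
```

A real fix would presumably raise an exception rather than return -1; the return code here only keeps the sketch self-contained.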

Then, the conversion of the long into an OCaml "int" loses one more
bit of precision.  Since off_t is signed and a valid file size is never
negative, we don't actually lose bits, but the top remaining bit becomes
the sign bit, so we may end up with a negative file size, which is
tolerable for printing, but will cause an error if it is passed to "seek".
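
The effect can be simulated in plain C for the 32-bit case. The sketch below assumes a 32-bit long and a 31-bit OCaml int; the function as_ocaml_int31 is hypothetical and only mimics the round trip through the tagged representation (drop the top bit, then sign-extend from bit 30), without using the OCaml runtime.

```c
#include <assert.h>
#include <stdint.h>

/* Simulates storing a 32-bit C long in a 31-bit OCaml int and reading it
   back: the top bit is dropped and bit 30 takes over as the sign bit. */
int32_t as_ocaml_int31(int32_t x)
{
    uint32_t low31 = (uint32_t)x & 0x7FFFFFFFu;   /* keep the low 31 bits */
    if (low31 & 0x40000000u)                      /* bit 30 set: negative */
        return (int32_t)((int64_t)low31 - INT64_C(0x80000000));
    return (int32_t)low31;
}

/* Usage sketch: sizes below 2^30 survive, larger ones come back negative. */
void as_ocaml_int31_example(void)
{
    assert(as_ocaml_int31(1073741823) == 1073741823);   /* 2^30 - 1 */
    assert(as_ocaml_int31(1073741824) == -1073741824);  /* 2^30     */
    assert(as_ocaml_int31(2147483647) == -1);           /* 2^31 - 1 */
}
```

No information is lost for sizes between 2^30 and 2^31 - 1, since the low 31 bits are intact, but the value is read back as negative.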

Chris Hecker suggests:

> Shouldn't all of the file size stuff be converted to int64s now anyway?

That's another option, especially since it would allow "lseek64" or
"llseek" to be used internally instead of "lseek" when available, thus
making it possible to work with files bigger than 2 GB on a 32-bit
platform.  However, we can't break backward compatibility, so we'd
have to add new functions "long_seek" and "long_{in,out}_channel_length"
to the OCaml API.
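
As a rough illustration of the C side of such a function, here is a sketch that returns the size as a 64-bit integer. It does not call "lseek64" or "llseek" directly; instead it assumes a glibc-style _FILE_OFFSET_BITS switch that makes off_t 64 bits wide, which is a different way to reach the same goal. The name long_channel_size is hypothetical.

```c
#define _FILE_OFFSET_BITS 64   /* assumption: makes off_t 64-bit on 32-bit glibc */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper behind a "long_in_channel_length"-style function:
   returns the file size as a 64-bit value, or -1 on error.  The current
   file position is saved first and restored before returning. */
int64_t long_channel_size(int fd)
{
    off_t here = lseek(fd, 0, SEEK_CUR);     /* remember where we are */
    if (here == (off_t)-1) return -1;
    off_t end = lseek(fd, 0, SEEK_END);      /* offset of the end = size */
    if (end == (off_t)-1) return -1;
    if (lseek(fd, here, SEEK_SET) == (off_t)-1) return -1;
    return (int64_t)end;
}

/* Usage sketch: a temporary file holding five bytes reports a size of 5. */
void long_channel_size_example(void)
{
    FILE *f = tmpfile();
    assert(f != NULL);
    assert(fputs("hello", f) >= 0);
    assert(fflush(f) == 0);
    assert(long_channel_size(fileno(f)) == 5);
    fclose(f);
}
```

On the OCaml side the result would then be boxed as an int64 rather than converted to an "int", so neither the 31-bit limit nor the 2 GB limit applies.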

- Xavier Leroy