Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change of behavior of demarshalling an empty file #7142

Closed
vicuna opened this issue Feb 10, 2016 · 10 comments
Closed

Change of behavior of demarshalling an empty file #7142

vicuna opened this issue Feb 10, 2016 · 10 comments
Assignees
Labels
Milestone

Comments

@vicuna
Copy link

vicuna commented Feb 10, 2016

Original bug ID: 7142
Reporter: @mlasson
Assigned to: @alainfrisch
Status: closed (set by @xavierleroy on 2017-09-24T15:33:09Z)
Resolution: fixed
Priority: normal
Severity: minor
Target version: 4.03.1+dev
Fixed in version: 4.04.0 +dev / +beta1 / +beta2
Category: ~DO NOT USE (was: OCaml general)
Monitored by: @gasche @ygrek @jmeber @hcarty

Bug description

The program in Section "Steps to Reproduce" used to work in ocaml 4.01 and fails with the exception "input_value: truncated object" on the current trunk.

It is due to the recent change in intern.c, more precisely the merge of #224 [https://github.com//pull/224] (commit 4fd254e).

Before that commit, the function caml_input_val was raising the EOF exception while trying to read the magic number starting marshaled representation, and now the function call below fails with "input_value: truncated object" (is the object really truncated if there is no object ?):
if (caml_really_getblock(chan, header, 20) == 0) {
caml_failwith("input_value: truncated object");
}

I don't know if this is a bug or if the original behavior was an abuse the demarshaling function. In the later case, it should be documented.

Steps to reproduce

let write oc n =
for k = 1 to n do
Marshal.to_channel oc k []
done

let read ic =
try while true do
Printf.printf "%d\n%!" (Marshal.from_channel ic)
done with End_of_file -> ()

let () = begin
let filename, oc = Filename.open_temp_file ~mode:[Open_binary] "integers" "tmp" in
write oc 10;
close_out oc;
let ic = open_in_bin filename in
read ic;
close_in ic;
end

@vicuna
Copy link
Author

vicuna commented Feb 25, 2016

Comment author: @damiendoligez

The empty file is definitely a truncated object, and I don't see a good reason to special-case it.

I vote for documenting the new behaviour (and maybe also that it changed from previous versions).

@vicuna
Copy link
Author

vicuna commented May 3, 2016

Comment author: @zoggy

When reading from a pipe or a network communication, with the new implementation there is no way to distinguish between a closed channel (end of communication) from a bad transmitted value.

This change of behaviour has already broken Stog. Ok, I should have tested it before the release, but since 4.03.0 broke a lot of libraries Stog depends on (and not only mine), this was not possible in a reasonable amount of time.

@vicuna
Copy link
Author

vicuna commented May 6, 2016

Comment author: gabelevi

I'm also hitting this change in behavior when trying to compile Flow (github.com/facebook/flow) with 4.03.0.

@vicuna
Copy link
Author

vicuna commented Jun 2, 2016

Comment author: berenger

This bug also hits clangml.
My dirty fix looks like;
code before:

try io_loop ()
with End_of_file ->
close_in input;
[...]

code after:

try
try
io_loop ()
with Failure msg ->
begin
if msg = "input_value: truncated object" then
raise End_of_file
else
raise (Failure msg)
end
with
| End_of_file ->
close_in input;
[...]

@vicuna
Copy link
Author

vicuna commented Jun 2, 2016

Comment author: berenger

Thanks for the initial report: it allowed me to google the error message
and quickly find where and what to fix.

@vicuna
Copy link
Author

vicuna commented Jun 3, 2016

Comment author: @alainfrisch

When reading from a pipe or a network communication, with the new implementation there is no way to distinguish between a closed channel (end of communication) from a bad transmitted value.

This is an interesting argument in favor of restoring the previous behavior.

Damien: would you agree to that?

@vicuna
Copy link
Author

vicuna commented Jun 27, 2016

Comment author: @xavierleroy

I am probably the one to blame for this change of behavior, although it was not on purpose!

The old implementation of the unmarshaler would raise End_of_file if it hit end-of-file while reading the first 20 bytes of data (the header), and Failure "input_value: truncated object" if it hit end-of-file while reading the remainder of the marshaled data.

The 4.03 implementation raise Failure "input_value: truncated object" in both cases.

An arguably sensible behavior would be to raise End_of_file if the input is already at end-of-file, and Failure "input_value: truncated object" if end-of-file is hit later.

@vicuna
Copy link
Author

vicuna commented Jun 27, 2016

Comment author: @gasche

(I would also be in favor of un-breaking user's code by reverting to the old behavior. Both make sense, let's pick the convenient one.)

@vicuna
Copy link
Author

vicuna commented Jun 27, 2016

Comment author: gabelevi

All of my use cases are for two processes communicating with each other via Unix.select followed by input_value. The End_of_file exceptions that we expect should not come in the middle of reading an object, so xleroy's suggestion would work for us.

@vicuna
Copy link
Author

vicuna commented Jun 28, 2016

Comment author: @mlasson

I've opened the GPR #639 to fix this problem; the current solution raises end_of_file when we fail to read the first byte of the header, but I'd be happy to implement another solution.

#639

@vicuna vicuna closed this as completed Sep 24, 2017
@vicuna vicuna added this to the 4.03.1 milestone Mar 14, 2019
@vicuna vicuna added the bug label Mar 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants